Place: Online Seminar: Please sign up for our mailing list at www.physicsmeetsml.org for the Zoom link. We will also livestream the talk in Chamberlin 5280.
Speaker: Nadav Cohen, Tel Aviv University
Abstract: The mysterious ability of neural networks to generalize is believed to stem from an implicit regularization, a tendency of gradient-based optimization to fit training data with predictors of low “complexity.” Despite vast efforts, a satisfying formalization of this intuition is lacking. In this talk I will present a series of works theoretically analyzing the implicit regularization in quantum tensor networks, known to be equivalent to certain (non-linear) neural networks. Through dynamical characterizations, I will establish an implicit regularization towards low tensor ranks, different from any type of norm minimization, in contrast to prior beliefs. I will then discuss implications of this finding for both theory (a potential explanation for generalization over natural data) and practice (compression of neural network layers, novel regularization schemes). An underlying theme of the talk will be the potential of quantum tensor networks to unravel mysteries behind deep learning.
Works covered in the talk were in collaboration with Sanjeev Arora, Wei Hu, Yuping Luo, Asaf Maman and Noam Razin.