BEGIN:VCALENDAR
VERSION:2.0
CALSCALE:GREGORIAN
PRODID:UW-Madison-Physics-Events
BEGIN:VEVENT
SEQUENCE:2
UID:UW-Physics-Event-6445
DTSTART:20210519T160000Z
DTEND:20210519T171500Z
DTSTAMP:20260313T225421Z
LAST-MODIFIED:20210519T145340Z
LOCATION:Online Seminar: Please sign up for our mailing list at www.ph
 ysicsmeetsml.org for the Zoom link
SUMMARY:Are wider nets better given the same number of parameters?\, P
 hysics ∩ ML Seminar\, Anna Golubeva\, Perimeter Institute
DESCRIPTION:Empirical studies demonstrate that the performance of neur
 al networks improves with an increasing number of parameters. In most of t
 hese studies\, the number of parameters is increased by increasing the net
 work width. This raises the question: Is the observed improvement due to t
 he larger number of parameters\, or is it due to the larger width itsel
 f? We compare different ways of increasing model width while keeping the n
 umber of parameters constant. We show that for models initialized with a r
 andom\, static sparsity pattern in the weight tensors\, network width is t
 he determining factor for good performance\, while the number of weights i
 s secondary\, as long as the model achieves high training accuracy. As a s
 tep towards understanding this effect\, we analyze these models in the fra
 mework of Gaussian Process kernels. We find that the distance between th
 e sparse finite-width model kernel and the infinite-width kernel at initia
 lization is indicative of model performance.
URL:https://www.physics.wisc.edu/events/?id=6445
END:VEVENT
END:VCALENDAR
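
For readers who want a concrete picture of the setup described in the abstract, the following is a minimal NumPy sketch of the general idea only, not the speaker's code or results: it builds a wide ReLU layer with a fixed random sparsity mask (so the number of nonzero weights is much smaller than the width suggests) and compares its empirical kernel at initialization with the infinite-width arc-cosine kernel. The widths, sparsity level, variance rescaling, and the relative Frobenius distance used here are illustrative assumptions.

# Hypothetical sketch: empirical kernel of a wide, randomly sparsified ReLU
# layer vs. the infinite-width kernel at initialization. All parameter
# choices are illustrative assumptions, not taken from the talk.
import numpy as np

rng = np.random.default_rng(0)

d = 32            # input dimension
width = 4096      # "wide" hidden layer
density = 0.125   # fraction of weights kept (static sparsity pattern)

# A small batch of unit-norm inputs.
X = rng.standard_normal((8, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)

def infinite_width_relu_kernel(X):
    """Arc-cosine kernel of degree 1 for weights w ~ N(0, I/d): the
    width -> infinity limit of the post-ReLU feature kernel."""
    G = X @ X.T
    norms = np.sqrt(np.diag(G))
    cos = np.clip(G / np.outer(norms, norms), -1.0, 1.0)
    theta = np.arccos(cos)
    return (np.outer(norms, norms) / (2 * np.pi * d)) * (
        np.sin(theta) + (np.pi - theta) * cos
    )

def empirical_sparse_kernel(X, width, density, n_inits=20):
    """Average (1/width) * phi(x) . phi(y) over random inits, with a fixed
    random sparsity mask per init; weight variance is rescaled by 1/density
    so the pre-activation scale matches the dense case."""
    K = np.zeros((len(X), len(X)))
    for _ in range(n_inits):
        mask = rng.random((width, d)) < density        # static sparsity pattern
        W = rng.standard_normal((width, d)) * np.sqrt(1.0 / (d * density)) * mask
        phi = np.maximum(X @ W.T, 0.0)                 # ReLU features
        K += phi @ phi.T / width
    return K / n_inits

K_inf = infinite_width_relu_kernel(X)
K_emp = empirical_sparse_kernel(X, width, density)

# A rough "kernel distance" diagnostic: relative Frobenius norm of the gap.
dist = np.linalg.norm(K_emp - K_inf) / np.linalg.norm(K_inf)
print(f"relative kernel distance at init: {dist:.4f}")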
