Upcoming Events

More information about the event will be posted here when available. You can also visit the event's main website at https://quantum.uchicago.edu/events/
Friday, May 29, 2020 - 12:00 to 13:00
One of the most surprising properties of deep neural networks (DNNs) is that they typically perform best in the overparameterised regime. Physicists are taught from a young age that having more parameters than datapoints is a terrible idea. This intuition can be formalised in standard learning theory approaches, based, for example, on model capacity, which also predict that DNNs should heavily over-fit in this regime and therefore not generalise at all. So why do DNNs work so well? We use a version of the coding theorem from Algorithmic Information Theory to argue that DNNs are generically biased towards simple solutions. Such an inbuilt Occam’s razor means that they are biased towards solutions that typically generalise well. We further explore the interplay between this simplicity bias and the error spectrum on a dataset to develop a detailed Bayesian theory of training and generalisation that explains why and when SGD-trained DNNs generalise, and when they should not. This picture also allows us to derive tight PAC-Bayes bounds that closely track DNN learning curves and can be used to rationalise differences in performance across architectures. Finally, we will discuss some deep analogies between the way DNNs explore function space and the biases in the arrival of variation that explain certain trends observed in biological evolution.
Wednesday, June 3, 2020 - 11:00 to 12:00