PHYSICS PH.D. DISSERTATION DEFENSE: Stanislav Fort
Ph.D. Candidate: Stanislav Fort
Research Advisor: Surya Ganguli
Date: Wednesday November 10, 2021
Time: 11 AM
Zoom Link: https://stanford.zoom.us/j/99318496379
Zoom Password: email nickswan [at] stanford.edu at least 24 hours in advance for the password
Title: Geometric Aspects of Deep Learning
Abstract: Large deep neural networks trained with gradient descent have been extremely successful at learning solutions to a broad suite of difficult problems across a wide range of domains. Despite their tremendous success, we still do not have a detailed, predictive understanding of how they work and what makes them so effective. In this talk, I will describe recent efforts to understand the structure of deep neural network loss landscapes and how gradient descent navigates them during training. In particular, I will discuss a phenomenological approach to modeling their large-scale structure using high-dimensional geometry [1], the role of their nonlinear nature in the early phases of training [2], and its effects on ensembling, calibration, and approximate Bayesian techniques [3].
[1] Stanislav Fort and Stanislaw Jastrzebski. "Large Scale Structure of Neural Network Loss Landscapes." NeurIPS 2019. arXiv:1906.04724.
[2] Stanislav Fort et al. "Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel." NeurIPS 2020. arXiv:2010.15110.
[3] Stanislav Fort, Huiyi Hu, and Balaji Lakshminarayanan. "Deep Ensembles: A Loss Landscape Perspective." arXiv:1912.02757.