ESE & Statistics Seminar: “Large Neural Networks: Insights from Linearized Models”

September 17, 2019 at 11:00 AM - 12:00 PM
Details
Date: September 17, 2019
Time: 11:00 AM - 12:00 PM
Event Category: Seminar
  • Organizer
    Electrical and Systems Engineering
    Phone: 215-898-6823
    Venue
    PICS Conference Room 534 – A Wing, 5th Floor, 3401 Walnut Street
    Philadelphia
    PA 19104

    Abstract: Modern machine learning models, and in particular multilayer neural networks, exhibit a broad range of puzzling phenomena. Training them requires minimizing a highly non-convex, high-dimensional cost function, and yet this is done efficiently using simple gradient descent (GD) or stochastic gradient descent (SGD) algorithms. These models contain more parameters than there are training samples, and indeed they are often able to achieve zero training error, i.e. to perfectly interpolate or classify the training data. In fact, they can achieve zero training error even if the true labels are replaced by random ones. Despite this, they can generalize well beyond the training set. Finally, far from being a nuisance or limitation, this massive overparameterization appears to play an important role in explaining the power of these models.

    I will discuss these phenomena and how we can make sense of them using some simple linear models. Finally, I will discuss the limitations of these "linear explanations" and the open challenges.
    [Based on joint work with: Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, and with Ryan Tibshirani, Saharon Rosset, Trevor Hastie]