
FOLDS SEMINAR: The Hidden Width of Deep ResNets

November 10, 2025 at 2:30 PM - 3:30 PM
Details
Date: November 10, 2025
Time: 2:30 PM - 3:30 PM
Event Category: Seminar, Colloquium
  • Organizer
    IDEAS Center
    Venue
    Amy Gutmann Hall, Room 414
    3333 Chestnut Street
    Philadelphia, PA 19104

    Zoom link: https://upenn.zoom.us/j/6130182858

     

    We present a mathematical framework to analyze the training dynamics of deep ResNets that rigorously captures practical architectures (including Transformers) trained from standard random initializations. Our approach combines stochastic approximation of ODEs with propagation-of-chaos arguments to obtain tight convergence rates to the “infinite size” limit of the dynamics. It yields the following insights:
    1/ Depth begets width: infinite-depth ResNets of any hidden width behave throughout training as if they were infinitely wide;
    2/ Phase diagram: we derive the phase diagram of the training dynamics, which singles out an “ideal” scaling of hyper-parameters (initialization scale and learning rates), extending “CompleteP” to more general architectures;
    3/ Optimal shape scaling: our analysis suggests how to scale depth, hidden width and embedding dimension of a ResNet when scaling up parameter count. With the optimal shape and a parameter budget P, we argue that the model converges to its limiting dynamics at rate P^{-1/6}.
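
As a rough illustration of what the P^{-1/6} rate in point 3/ would imply if it held exactly, the short Python sketch below computes how the rate term changes as the parameter budget grows. The abstract only argues for the rate and gives no constants, so this is arithmetic on the exponent alone. The helper names `rate_ratio` and `budget_factor_for` are hypothetical and introduced here for illustration; they are not part of the talk.

```python
import math

# Exponent taken from the abstract's claimed rate P^{-1/6}.
# Assumption: the rate holds exactly, with no constants or lower-order terms.
RATE_EXPONENT = -1.0 / 6.0


def rate_ratio(budget_factor: float, exponent: float = RATE_EXPONENT) -> float:
    """Factor by which P**exponent changes when P is multiplied by budget_factor."""
    if budget_factor <= 0:
        raise ValueError("budget_factor must be positive")
    return budget_factor ** exponent


def budget_factor_for(target_ratio: float, exponent: float = RATE_EXPONENT) -> float:
    """Factor by which P must grow so that P**exponent is multiplied by target_ratio."""
    if target_ratio <= 0:
        raise ValueError("target_ratio must be positive")
    return target_ratio ** (1.0 / exponent)


if __name__ == "__main__":
    # Doubling the parameter budget multiplies the rate term by 2**(-1/6), about 0.891.
    print(f"2x parameters -> rate term x {rate_ratio(2.0):.3f}")
    # Halving the rate term requires 0.5**(-6) = 64 times more parameters.
    print(f"halving the rate term -> {budget_factor_for(0.5):.0f}x parameters")
```

Under that assumption, doubling P shrinks the rate term by only about 11%, while halving it takes a 64-fold larger budget.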