ASSET Seminar: “Scaling Your Large Language Models on a Budget” (Atlas Wang, University of Texas at Austin)

January 17, 2024 at 12:00 PM - 1:15 PM
Details
Date: January 17, 2024
Time: 12:00 PM - 1:15 PM
  • Venue
    Raisler Lounge (Room 225), Towne Building, 220 South 33rd Street
    Philadelphia, PA 19104

    ABSTRACT:

    As Large Language Models (LLMs) continue to grow exponentially in size, it becomes imperative to explore novel computing paradigms that address the dual challenge of scaling these models while respecting constraints on compute and data resources. This presentation will cover several strategies for easing this dilemma: (1) refraining from training models entirely from scratch, and instead using readily available pre-trained models to optimize the training starting point of a new, larger model; (2) leveraging this concept of progressive initialization to improve compute and data efficiency during the neural scaling process; (3) integrating hardness-aware data sampling and more memory-efficient optimizers (work in progress). The talk will conclude with a few (informal) thoughts and reflections.