ASSET Seminar: “Demystifying the Inner Workings of Language Models”

February 19, 2025 at 12:00 PM - 1:15 PM
Details
  • Venue
    Amy Gutmann Hall, Room 414, 3333 Chestnut Street
    Philadelphia
    19104

    Abstract:

    Large language models (LLMs) power a rapidly growing and increasingly impactful suite of AI technologies. However, due to their scale and complexity, we lack a fundamental scientific understanding of much of LLMs’ behavior, even when they are open source. The “black-box” nature of LLMs not only complicates model debugging and evaluation, but also limits trust and usability. In this talk, I will describe how my research on interpretability (i.e., understanding models’ inner workings) has answered key scientific questions about how models operate. I will then demonstrate how deeper insights into LLMs’ behavior enable both 1) targeted performance improvements and 2) the production of transparent, trustworthy explanations for human users.

    Zoom Link (if unable to attend in-person): https://upenn.zoom.us/j/99113576305