BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Penn Engineering Events - ECPv6.16.3//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Penn Engineering Events
X-ORIGINAL-URL:https://seasevents.nmsdev7.com
X-WR-CALDESC:Events for Penn Engineering Events
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20240310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20241103T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20250309T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20251102T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20260308T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20261101T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20250924T120000
DTEND;TZID=America/New_York:20250924T131500
DTSTAMP:20260602T005907Z
CREATED:20250821T203123Z
LAST-MODIFIED:20250821T203123Z
UID:20913-1758715200-1758719700@seasevents.nmsdev7.com
SUMMARY:ASSET Seminar: "How do LLMs generalize on out-of-distribution tasks? Insights from model's internal representations"
DESCRIPTION:A mystery of large language models (LLMs) is their ability to solve novel tasks\, notably through a few demonstrations in the prompt (in-context learning). Such tasks often require the model to generalize far beyond its training distribution\, raising the question: how do LLMs achieve this form of out-of-distribution (OOD) generalization? For example\, in symbolized language reasoning\, names/labels are replaced by arbitrary symbols\, yet the model can infer the correct name-label mapping without any finetuning.\nIn this talk\, I will open the black box of LLMs and reveal how three facets of LLM behavior are interconnected: emergent phenomena during training\, OOD generalization\, and a model’s representation of compositions. Focusing on induction heads\, I will show that learning the right compositional structure is key to OOD generalization\, and that this learning process exhibits sharp transitions in training dynamics. Further\, I propose the “common bridge representation hypothesis”: a latent subspace in the embedding space that acts as a bridge to align multiple attention heads across early and later layers may be the key geometric structure underlying the success of transformers.\n\nZoom: https://upenn.zoom.us/j/91447341103
URL:https://seasevents.nmsdev7.com/event/asset-seminar-title-tbd-3/
LOCATION:Amy Gutmann Hall\, Room 414\, 3333 Chestnut Street\, Philadelphia\, 19104\, United States
CATEGORIES:Seminar
ORGANIZER;CN="AI-enabled Systems: Safe, Explainable, and Trustworthy (ASSET) Center":MAILTO:asset-info@seas.upenn.edu
END:VEVENT
END:VCALENDAR