BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Penn Engineering Events - ECPv6.16.3//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Penn Engineering Events
X-ORIGINAL-URL:https://seasevents.nmsdev7.com
X-WR-CALDESC:Events for Penn Engineering Events
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20240310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20241103T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20250309T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20251102T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20260308T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20261101T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20251030T120000
DTEND;TZID=America/New_York:20251030T130000
DTSTAMP:20260601T213110
CREATED:20250828T204046Z
LAST-MODIFIED:20250828T204046Z
UID:20943-1761825600-1761829200@seasevents.nmsdev7.com
SUMMARY:FOLDS seminar: Weak-to-Strong Generalization in Random Feature Models
DESCRIPTION:Zoom link: https://upenn.zoom.us/j/98220304722\n\nWeak-to-Strong Generalization (Burns et al.\, 2023) is the phenomenon whereby a strong student\, say GPT-4\, learns a task from a weak teacher\, say GPT-2\, and ends up significantly outperforming the teacher. We show that this phenomenon requires neither a strong and complex learner like GPT-4 nor pretraining. We consider students and teachers that are random feature models\, described by two-layer networks with a random\, fixed bottom layer and a trained top layer. A ‘weak’ teacher\, with a small number of units (i.e.\, random features)\, is trained on the population\, and a ‘strong’ student\, with a much larger number of units (i.e.\, random features)\, is trained only on labels generated by the weak teacher. We demonstrate\, prove\, and understand how the student can outperform the teacher\, even though it is trained only on data labeled by the teacher\, with no pretraining or other knowledge or data advantage over the teacher. We explain how such weak-to-strong generalization is enabled by early stopping. Importantly\, we also show the quantitative limits of weak-to-strong generalization in this model.\nJoint work with Marko Medvedev\, Kaifeng Lyu\, Dingli Yu\, Sanjeev Arora\, and Zhiyuan Li.
URL:https://seasevents.nmsdev7.com/event/folds-seminar-tba-6/
LOCATION:Amy Gutmann Hall\, Room 414\, 3333 Chestnut Street\, Philadelphia\, 19104\, United States
CATEGORIES:Seminar,Colloquium
END:VEVENT
END:VCALENDAR