BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Penn Engineering Events - ECPv6.16.3//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Penn Engineering Events
X-ORIGINAL-URL:https://seasevents.nmsdev7.com
X-WR-CALDESC:Events for Penn Engineering Events
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20240310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20241103T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20250309T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20251102T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20260308T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20261101T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20251002T120000
DTEND;TZID=America/New_York:20251002T130000
DTSTAMP:20260602T001411
CREATED:20250828T184127Z
LAST-MODIFIED:20250828T184127Z
UID:20940-1759406400-1759410000@seasevents.nmsdev7.com
SUMMARY:FOLDS seminar: Theory and practice of LLM quantization
DESCRIPTION:Zoom link: https://upenn.zoom.us/j/98220304722\n\nModern LLMs process information by repeatedly applying one basic primitive: matrix multiplication. Estimates show that about 60-84% of the energy consumed by LLMs goes into memory load/store operations. How can we reduce this power consumption? Tokens start as roughly 16-bit integers but are mapped to float vectors with thousands of dimensions\, suggesting very low information density per dimension. Thus\, unsurprisingly\, there has been much success in reducing the precision of both weights and activations without much loss in LLM performance. In this talk we will present an information-theoretic analysis of quantized representations and show how it led us to create NestQuant\, a new SOTA algorithm for weight/KV-cache/activation quantization (ICML 2025).
URL:https://seasevents.nmsdev7.com/event/folds-seminar-tba-3/
LOCATION:Amy Gutmann Hall\, Room 306\, 3317 Chestnut Street\, Philadelphia\, PA\, 19104\, United States
CATEGORIES:Seminar,Colloquium
END:VEVENT
END:VCALENDAR