FOLDS seminar: Theory and practice of LLM quantization
Amy Gutmann Hall, Room 306, 3317 Chestnut Street, Philadelphia
Zoom link: https://upenn.zoom.us/j/98220304722

Modern LLMs process information by repeatedly applying a basic primitive, matrix multiplication. Estimates show that about 60-84% of the energy consumed by LLMs goes into […]
