ASSET Seminar: “The coverage principle in language models: From pre-training to test-time scaling”
November 5, 2025 at 12:00 PM - 1:15 PM
Test-time compute has emerged as a new axis for scaling language model capabilities, yet we lack a principled understanding of this paradigm. What are the right algorithms and trade-offs for test-time scaling? What properties of the pre-trained model enable it? And can we better align pre-training recipes for test-time success? This talk addresses these questions through a unified lens of coverage. We first show that test-time scaling strategies like best-of-N sampling succeed if and only if the pre-trained model has coverage over high-quality responses. We then demonstrate that coverage, and hence best-of-N performance, can be improved through deliberate exploration, either purely at test time or via RL-style post-training. Finally, we ask why pre-training via next-token prediction yields models with good coverage in the first place. We uncover a rich theoretical landscape driven by an implicit bias of the next-token prediction objective, while also identifying a fundamental misalignment between next-token prediction and coverage, raising the possibility of future algorithmic innovations.
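For readers unfamiliar with best-of-N sampling, the idea can be sketched in a few lines: draw N candidate responses from the model and keep the one a scorer ranks highest. The function names and the toy sampler/scorer below are hypothetical illustrations, not the speaker's implementation; the point is that the strategy can only succeed if the sampler has coverage over high-scoring responses.

```python
import random

def best_of_n(sample, score, n):
    """Draw n candidates from `sample` and return the one `score` ranks highest."""
    candidates = [sample() for _ in range(n)]
    return max(candidates, key=score)

# Toy illustration: the "model" samples integers and the scorer prefers larger
# ones. If the sampler never places mass on high-scoring responses (no
# coverage), no amount of test-time compute in best-of-N can recover them.
random.seed(0)
sample = lambda: random.randint(0, 9)
score = lambda x: x
print(best_of_n(sample, score, n=16))
```

Scaling N trades extra inference compute for a better chance that at least one sampled candidate is high quality, which is exactly where the coverage condition in the abstract enters.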
Zoom: https://upenn.zoom.us/j/95189835192
Passcode: 797599

