Name: ASSET Seminar: “Rethinking Test-Time Thinking: From Token-Level Rewards to Robust Generative Agents”
Start: 2025-09-10T12:00:00-04:00
End: 2025-09-10T13:15:00-04:00
Location: Amy Gutmann Hall, Room 414

ASSET Seminar: “Rethinking Test-Time Thinking: From Token-Level Rewards to Robust Generative Agents”

September 10, 2025 at 12:00 PM - 1:15 PM

Details

Date: September 10, 2025

Time: 12:00 PM - 1:15 PM

Event Category: Seminar

Event Tags: ASSET, CIS, AI, IDEAS

Organizer

AI-enabled Systems: Safe, Explainable, and Trustworthy (ASSET) Center

Email: asset-info@seas.upenn.edu

Venue

Amy Gutmann Hall, Room 414
3333 Chestnut Street
Philadelphia, PA 19104

We present a unified perspective on test-time thinking as a lens for improving generative AI agents through finer-grained reward modeling, data-centric reasoning, and robust alignment. Beginning with GenARM, we introduce an inductive bias for denser, token-level reward modeling that guides generation during decoding, enabling token-level alignment without retraining. While GenARM targets reward design, ThinkLite-VL focuses on the data side of reasoning. It proposes a self-improvement framework that selects the most informative samples via MCTS-guided search, yielding stronger visual reasoning with fewer labels. Taking this a step further, MORSE-500 moves beyond selection to creation: it programmatically generates targeted, controllable multimodal data to systematically probe and stress-test models’ reasoning abilities. We then interrogate a central assumption in inference-time alignment: “Does Thinking More Always Help?” Our findings reveal that increasing the number of reasoning steps can degrade performance, not because the reasoning itself gets better or worse, but because the variance of the outputs rises, which challenges the naive scaling paradigm. Finally, AegisLLM applies test-time thinking in the service of security, using an agentic, multi-perspective framework to defend against jailbreaks, prompt injections, and unlearning attacks, all at inference time. Together, these works chart a path toward generative agents that are not only more capable, but also more data-efficient, introspective, and robust in real-world deployment.

Seminar Recording: https://drive.google.com/file/d/13jOKuou0QzqkMo9QHEdoHA1nCIxOPsbm/view?usp=drive_link
