ASSET Seminar: “Efficient Sharing of AI Infrastructures with Specialized Serverless Computing”
January 29, 2025 at 12:00 PM - 1:15 PM
Details
Abstract:
The efficient sharing of AI infrastructures is becoming increasingly important in both public and private data centers. This demand is driven by two key factors: the proliferation of specialized AI models tailored to different users and applications, and the highly dynamic, often on-demand nature of requests. Dedicating GPUs to individual models in such scenarios results in prohibitively high costs and inefficient resource utilization.
In this talk, I will introduce serverless computing as a promising paradigm for addressing these challenges by enabling efficient, on-demand sharing of AI infrastructures. I will highlight its use cases and discuss key barriers to broader adoption. Following this, I will present ServerlessLLM, a state-of-the-art system designed to tackle key challenges in serverless large language model (LLM) inference, particularly cold-start latency. Specifically, I will cover ServerlessLLM’s novel contributions, including its checkpoint format design, locality-aware scheduling, and inference request live migration. Finally, I will outline open challenges beyond efficiency, such as fairness, privacy, and sustainability, which are critical for the future of serverless AI systems.
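To give a flavor of the locality-aware scheduling idea mentioned above, the sketch below shows a minimal, hypothetical scheduler that prefers a GPU server where the model checkpoint is already resident (avoiding a cold start) and otherwise falls back to the least-loaded server. The `Server` class, `schedule` function, and server names are illustrative assumptions, not ServerlessLLM's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Server:
    name: str
    cached_models: set = field(default_factory=set)
    load: int = 0  # active inference requests on this server

def schedule(model: str, servers: list) -> Server:
    """Pick a server for an inference request with locality awareness:
    prefer servers that already hold the model checkpoint; among those
    (or among all servers if none do), pick the least-loaded one."""
    warm = [s for s in servers if model in s.cached_models]
    pool = warm if warm else servers
    best = min(pool, key=lambda s: s.load)
    best.load += 1
    best.cached_models.add(model)  # checkpoint becomes resident after loading
    return best

servers = [
    Server("gpu-0", cached_models={"llama-7b"}, load=3),
    Server("gpu-1", cached_models=set(), load=0),
]
# "llama-7b" is cached on gpu-0, so locality wins despite its higher load
print(schedule("llama-7b", servers).name)  # gpu-0
```

A real system would weigh checkpoint-loading time against queueing delay rather than applying a strict warm-first rule, and could live-migrate an in-flight request when a better placement appears; this sketch only captures the warm-preference heuristic.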
Zoom Link (if unable to attend in-person): https://upenn.zoom.us/j/95090162762