Name: ESE Seminar: “Demystifying (Deep) Reinforcement Learning: The Optimist, The Pessimist, and Their Provable Efficiency”
Start: 2021-03-12T11:00:00-05:00
End: 2021-03-12T12:00:00-05:00
Location: Zoom – Email ESE for Link jbatter@seas.upenn.edu

ESE Seminar: “Demystifying (Deep) Reinforcement Learning: The Optimist, The Pessimist, and Their Provable Efficiency”

March 12, 2021 at 11:00 AM - 12:00 PM

Details

Date: March 12, 2021

Time: 11:00 AM - 12:00 PM

Event Category: Seminar, Faculty, Colloquium, Student

Event Tags: ESE

Organizer

Electrical and Systems Engineering

Phone: 215-898-6823

Email: eseevents@seas.upenn.edu

Venue

Zoom – Email ESE for Link jbatter@seas.upenn.edu

Coupled with powerful function approximators such as deep neural networks, reinforcement learning (RL) has achieved tremendous empirical success. However, its theoretical understanding lags behind. In particular, it remains unclear how to provably attain the optimal policy with a finite regret or sample complexity. In this talk, we will present two sides of the same coin, which demonstrate an intriguing duality between pessimism and optimism.

– In the online setting, we aim to learn the optimal policy by actively interacting with an environment. To strike a balance between exploration and exploitation, we propose an optimistic least-squares value iteration algorithm, which achieves a \sqrt{T} regret in the presence of linear, kernel, and neural function approximators.

– In the offline setting, we aim to learn the optimal policy based on a dataset collected a priori. Because we cannot actively interact with the environment, we suffer from insufficient coverage of the dataset. To maximally exploit the dataset, we propose a pessimistic least-squares value iteration algorithm, which achieves a minimax-optimal sample complexity.
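
To make the optimism/pessimism duality concrete, here is a minimal sketch of a single least-squares regression step with linear features. It is not the speaker's algorithm: the uncertainty term beta * sqrt(phi^T Lambda^{-1} phi), the parameters `lam` and `beta`, and the toy data are all assumptions chosen for illustration. Adding the uncertainty term gives an optimistic estimate (online setting), while subtracting it gives a pessimistic one (offline setting).

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def fit(features, targets, lam=1.0):
    """Ridge regression: Lambda = lam*I + sum(phi phi^T), w = Lambda^{-1} sum(phi*y)."""
    d = len(features[0])
    Lambda = [[lam if i == j else 0.0 for j in range(d)] for i in range(d)]
    rhs = [0.0] * d
    for phi, y in zip(features, targets):
        for i in range(d):
            rhs[i] += phi[i] * y
            for j in range(d):
                Lambda[i][j] += phi[i] * phi[j]
    return Lambda, solve(Lambda, rhs)

def q_estimate(phi, Lambda, w, beta, sign):
    """Least-squares value plus (sign=+1) or minus (sign=-1) an uncertainty term."""
    mean = sum(wi * pi for wi, pi in zip(w, phi))
    u = solve(Lambda, phi)  # u = Lambda^{-1} phi
    bonus = beta * math.sqrt(sum(pi * ui for pi, ui in zip(phi, u)))
    return mean + sign * bonus

# Hypothetical toy data: one-hot features for two actions.
# Action 0 is well covered (8 samples); action 1 is seen only once.
ACTIONS = {0: [1, 0], 1: [0, 1]}
features = [ACTIONS[0]] * 8 + [ACTIONS[1]]
targets = [0.5] * 8 + [1.0]

Lambda, w = fit(features, targets, lam=1.0)

def greedy(sign, beta=1.0):
    """Pick the action with the largest optimistic or pessimistic estimate."""
    return max(ACTIONS, key=lambda a: q_estimate(ACTIONS[a], Lambda, w, beta, sign))

optimistic_choice = greedy(+1)   # favors the poorly covered action
pessimistic_choice = greedy(-1)  # favors the well covered action
```

In this toy data the optimistic rule selects the rarely observed action, which is what drives exploration online, while the pessimistic rule falls back on the action the dataset covers well. A full value iteration would repeat this regression backward over the time steps of an episode, which the sketch omits.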
