BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Penn Engineering Events - ECPv6.15.18//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Penn Engineering Events
X-ORIGINAL-URL:https://seasevents.nmsdev7.com
X-WR-CALDESC:Events for Penn Engineering Events
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20200308T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20201101T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20210314T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20211107T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20220313T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20221106T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20210216T150000
DTEND;TZID=America/New_York:20210216T160000
DTSTAMP:20260407T035532Z
CREATED:20210208T214405Z
LAST-MODIFIED:20210208T214405Z
UID:4138-1613487600-1613491200@seasevents.nmsdev7.com
SUMMARY:CIS Seminar: "Structural Foundations of Efficient Reinforcement Learning"
DESCRIPTION:Abstract: \nThe design of learning agents which observe\, interact with\, and manipulate their environment to optimize desirable behaviors is a long-standing goal in machine learning\, with roots in artificial intelligence\, adaptive experimental design\, and adaptive feedback control. In machine learning\, these questions are typically studied in the area of reinforcement learning (RL)\, which has seen a recent surge of interest due both to potential applications\, from robotics and autonomous systems to healthcare and recommendations\, and to popular successes such as superhuman performance in games and dexterous manipulation of a Rubik’s cube. \nOn the theoretical front\, foundations for sample-efficient learning were laid in the early 2000s for problems where the agent perceives the environment through relatively simple observations. However\, these settings fail to capture the complex sensorimotor observation streams which most application domains naturally contain. In this talk\, I will describe a research program aimed at addressing this critical gap in our theoretical understanding. \nI will begin by highlighting some key challenges faced by an RL agent\, and the importance of understanding the structure of real-world applications to address these challenges. With this aim\, I will then introduce a complexity measure\, called Bellman rank\, for general RL problems. Crucially\, many application domains naturally exhibit a small Bellman rank\, and I will describe how low Bellman rank enables sample-efficient RL. Bellman rank remains one of the most general ways of measuring the complexity of RL problems\; however\, computationally practical algorithms for all problems with a small Bellman rank still elude us. \nThe second part of the talk will focus on algorithmic questions\, designing optimization-based methods for solving a subclass of problems with a small Bellman rank.
  I will present an algorithm\, Policy Cover Policy Gradient (PC-PG)\, which comes with strong theoretical guarantees when the problem dynamics obey a certain linear structure. The algorithm is highly practical and composes easily with modern deep learning libraries for an efficient implementation. I will confirm its effectiveness beyond the confines of the theoretical assumptions through empirical evaluations against popular baselines. \nFinally\, I will conclude with a brief synopsis of work I have done on contextual bandits\, a much smaller\, yet practically useful\, subclass of RL. This research has led to the design and creation of a general-purpose cloud service at Microsoft\, which powers many applications of contextual bandits both inside and outside the company. \nThe first two parts of the talk are based on the papers https://arxiv.org/abs/1610.09512 and https://arxiv.org/pdf/2007.08459.pdf respectively.
URL:https://seasevents.nmsdev7.com/event/cis-seminar-structural-foundations-of-efficient-reinforcement-learning/
LOCATION:Zoom – Email CIS for link\, cherylh@cis.upenn.edu
ORGANIZER;CN="Computer and Information Science":MAILTO:cherylh@cis.upenn.edu
END:VEVENT
END:VCALENDAR