BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Penn Engineering Events - ECPv6.15.20//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Penn Engineering Events
X-ORIGINAL-URL:https://seasevents.nmsdev7.com
X-WR-CALDESC:Events for Penn Engineering Events
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20180311T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20181104T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20190310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20191103T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20200308T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20201101T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190222T110000
DTEND;TZID=America/New_York:20190222T120000
DTSTAMP:20260409T023714Z
CREATED:20190206T211719Z
LAST-MODIFIED:20190206T211719Z
UID:1276-1550833200-1550836800@seasevents.nmsdev7.com
SUMMARY:CIS Seminar: "Towards Generalization and Efficiency in Reinforcement Learning"
DESCRIPTION:Abstract: \n\nIn classic supervised machine learning\, a learning agent behaves as a passive observer: it receives examples from an external environment over which it has no control\, and then makes predictions. Reinforcement Learning (RL)\, on the other hand\, is fundamentally interactive: an autonomous agent must learn how to behave in an unknown and possibly hostile environment by actively interacting with it to collect useful feedback. One central challenge in RL is how to explore an unknown environment and collect useful feedback efficiently. Most recent practical RL success stories rely on random exploration\, which requires a large number of interactions with the environment before the agent can learn anything useful. The theoretical RL literature has developed more sophisticated algorithms for efficient learning\; however\, the sample complexity of these algorithms can scale exponentially with key parameters of the underlying system\, such as the dimensionality of the state vector\, which prohibits a direct application of these theoretically elegant RL algorithms to large-scale problems. Without further assumptions\, RL is hard\, both in practice and in theory.\n\nIn this work\, we improve generalization and efficiency in RL by introducing extra sources of help and additional assumptions. The first contribution comes from improving RL sample efficiency via Imitation Learning (IL)\, which reduces policy improvement to classic supervised learning. We study\, both in theory and in practice\, how one can imitate experts to reduce sample complexity compared to pure RL approaches. The second contribution comes from exploiting the underlying structure of RL problems via model-based learning approaches. While efficient model-based RL approaches exist for specific problem classes (e.g.\, tabular MDPs\, Linear Quadratic Systems)\, we develop a unified model-based algorithm that generalizes across a large number of RL problems that were often studied independently in the literature. We also revisit the long-standing debate on whether model-based RL is more sample efficient than model-free RL from a theoretical perspective\, and demonstrate that model-based RL can be exponentially more sample efficient than model-free RL\; to the best of our knowledge\, this is the first result that separates general model-based and model-free approaches.
URL:https://seasevents.nmsdev7.com/event/cis-seminar-towards-generalization-and-efficiency-in-reinforcement-learning/
LOCATION:Wu and Chen Auditorium (Room 101)\, Levine Hall\, 3330 Walnut Street\, Philadelphia\, PA\, 19104\, United States
END:VEVENT
END:VCALENDAR