BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Penn Engineering Events - ECPv6.15.18//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Penn Engineering Events
X-ORIGINAL-URL:https://seasevents.nmsdev7.com
X-WR-CALDESC:Events for Penn Engineering Events
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20220313T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20221106T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20230312T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20231105T060000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20240310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20241103T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20231221T100000
DTEND;TZID=America/New_York:20231221T110000
DTSTAMP:20260403T223329Z
CREATED:20231129T212217Z
LAST-MODIFIED:20231129T212217Z
UID:10196-1703152800-1703156400@seasevents.nmsdev7.com
SUMMARY:Fall 2023 GRASP Seminar: Hadar Elor\, Tel Aviv University\, "Marr
 ying Vision and Language: A Mutually Beneficial Relationship?"
DESCRIPTION:This is a virtual event with virtual attendance on Zoom.\nAB
 STRACT\nFoundation models that connect vision and language have recentl
 y shown great promise for a wide array of tasks such as text-to-image g
 eneration. Significant attention has been devoted towards utilizing t
 he visual representations learned from these powerful vision and langua
 ge models. In this talk\, I will present an ongoing line of research th
 at focuses on the other direction\, aiming at understanding what knowle
 dge language models acquire through exposure to images during pretraini
 ng. We first consider in-distribution text and demonstrate how multimod
 ally trained text encoders\, such as that of CLIP\, outperform models t
 rained in a unimodal vacuum\, such as BERT\, over tasks that require im
 plicit visual reasoning. Expanding to out-of-distribution text\, we add
 ress a phenomenon known as sound symbolism\, which studies non-trivial co
 rrelations between particular sounds and meanings across languages\, an
 d demonstrate the presence of this phenomenon in vision and language mo
 dels such as CLIP and Stable Diffusion. Our work provides new angles fo
 r understanding what is learned by these vision and language foundatio
 n models\, offering principled guidelines for designing models for task
 s involving visual reasoning.
URL:https://seasevents.nmsdev7.com/event/fall-2023-grasp-seminar-hadar-
 elor-tel-aviv-university-marrying-vision-and-language-a-mutually-benefi
 cial-relationship/
LOCATION:Zoom
CATEGORIES:Seminar
ORGANIZER;CN="General Robotics, Automation, Sensing and Perception (GRA
 SP) Lab":MAILTO:grasplab@seas.upenn.edu
END:VEVENT
END:VCALENDAR