CIS Seminar: “Diffusion Models in Computer Vision”
November 30, 2023 at 3:30 PM - 4:30 PM
Denoising diffusion models are an emerging topic in computer vision, demonstrating impressive results in generative modeling. A diffusion model is a deep generative model based on two stages: a forward diffusion stage and a reverse diffusion stage. In the forward stage, the input data is gradually perturbed over several steps by adding Gaussian noise. In the reverse stage, a model is tasked with recovering the original input data by learning to gradually undo the diffusion. Diffusion models are widely appreciated for the quality and diversity of the images they generate.

In this talk I will present our recent work on employing diffusion models to solve computer vision problems. First, I will discuss temporal action segmentation for understanding human behaviors in complex videos, which aims to process a long video and produce a sequence that delineates the action category of each frame. I will present a framework based on the denoising diffusion model that iteratively produces action predictions starting from random noise, conditioned on the features of the input video. To effectively capture three key characteristics of human actions, namely the position prior, the boundary ambiguity, and the relational dependency, we propose a cohesive masking strategy for the conditioning features. Next, I will briefly discuss how diffusion models can be employed to solve person image synthesis, cloth-changing person re-identification, and limited field-of-view cross-view geo-localization, and present state-of-the-art results.
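The forward perturbation described above has a well-known closed form: the noisy sample at step t can be drawn directly from the clean input, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise. The following is a minimal NumPy sketch of that step, assuming a linear beta schedule as in DDPM; the function and variable names are illustrative, not from the talk.

```python
import numpy as np

def forward_diffusion(x0, t, betas, rng=None):
    """Sample x_t from q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    if rng is None:
        rng = np.random.default_rng(0)
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]  # cumulative product of alphas up to step t
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise
    return xt, noise

# Linear noise schedule over T steps (the schedule used in the DDPM paper)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
x0 = np.ones((4, 4))                         # toy "image"
xT, _ = forward_diffusion(x0, T - 1, betas)  # near-isotropic Gaussian at the final step
```

Because alpha_bar shrinks toward zero as t grows, late-step samples are dominated by noise; the reverse stage trains a network to invert this corruption one step at a time.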
Although the use of diffusion models has yielded positive results in text-to-image generation, research on understanding these models remains scarce. For example, there is a rising need to understand how to design effective prompts that produce the desired outcome. Next, I will briefly talk about our ongoing work, “Reverse Stable Diffusion: What prompt was used to generate this image?”. I will end the talk by briefly discussing our recent work on the significance of incorporating symmetries into diffusion models, achieved by enforcing equivariance to a general set of transformations within DDPM’s reverse denoising learning process.

