CIS Seminar: “Learning to Generate Data by Estimating Gradients of the Data Distribution”

February 24, 2022 at 3:30 PM - 4:30 PM
Details
Date: February 24, 2022
Time: 3:30 PM - 4:30 PM
Organizer: Computer and Information Science
Phone: 215-898-8560
Venue: Zoom (email CIS for the link: cherylh@cis.upenn.edu)

Generating data with complex patterns, such as images, audio, and molecular structures, requires fitting very flexible statistical models to the data distribution. Even in the age of deep neural networks, building such models is difficult because they typically require an intractable normalization procedure to represent a probability distribution. To address this challenge, I propose to model the vector field of gradients of the log data density (known as the score function), which does not require normalization and therefore can take full advantage of the flexibility of deep neural networks. I will show how to (1) estimate the score function from data with flexible deep neural networks and principled statistical methods, (2) generate new data using stochastic differential equations and Markov chain Monte Carlo, and even (3) evaluate probabilities as in a traditional statistical model. The resulting method, called score-based generative modeling, achieves record-breaking performance in applications including image synthesis, text-to-speech generation, time series prediction, and point cloud generation, challenging the longstanding dominance of generative adversarial networks (GANs) on many of these tasks. Furthermore, unlike GANs, score-based generative models are suitable for Bayesian reasoning tasks such as solving ill-posed inverse problems, and I have demonstrated their superior performance on examples like sparse-view computed tomography and accelerated magnetic resonance imaging. Finally, I will discuss how score-based generative modeling opens up new opportunities and research directions for building better machines to create and understand complex data in various disciplines of science and engineering.
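
As a rough illustration of step (2) in the abstract, the sketch below draws samples using only a score function, via unadjusted Langevin dynamics (one simple Markov chain Monte Carlo scheme). It is not the speaker's implementation. In the setting described above the score would be estimated from data by a neural network; here a closed-form Gaussian score stands in for that learned model, and the names `gaussian_score` and `langevin_sample`, along with the step size, step count, and target parameters, are assumptions chosen for illustration.

```python
import random
import statistics


def gaussian_score(x, mean=2.0, std=0.5):
    """Score (derivative of the log-density) of a 1-D Gaussian.

    Stand-in for a learned score model. The normalizing constant of the
    density does not appear anywhere in this expression.
    """
    return -(x - mean) / std ** 2


def langevin_sample(score_fn, n_samples=2000, n_steps=500, step_size=0.01, seed=0):
    """Run independent unadjusted Langevin chains and return their end points.

    Update rule: x <- x + step_size * score(x) + sqrt(2 * step_size) * noise
    """
    rng = random.Random(seed)
    noise_scale = (2.0 * step_size) ** 0.5
    samples = []
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)  # start each chain from a standard normal
        for _ in range(n_steps):
            x = x + step_size * score_fn(x) + noise_scale * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


samples = langevin_sample(gaussian_score)
sample_mean = statistics.fmean(samples)
sample_std = statistics.pstdev(samples)
print(f"mean ~ {sample_mean:.2f}, std ~ {sample_std:.2f}")
```

With these settings the sample mean and standard deviation should land close to the target values of 2.0 and 0.5. Because the chain uses a finite step size and no accept/reject correction, the spread comes out slightly biased; a smaller step size reduces that bias at the cost of more steps.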