CIS Seminar: “Probabilistic Experimental Design for Petascale DNA Synthesis”
March 18, 2025 at 3:30 PM - 4:30 PM
Generative modeling offers a powerful paradigm for designing novel functional DNA, RNA and protein sequences. In this talk, I introduce probabilistic experimental design methods to efficiently manufacture samples from generative models of biomolecules in the real world. These algorithms merge computational techniques for approximate sampling with physical randomness. I also develop tools to rigorously evaluate the quality of manufactured samples, including nonparametric two-sample tests with consistency guarantees and scalable algorithms. I demonstrate the synthesis of ~10^16 samples from a generative model of human antibodies, at a sample quality comparable to state-of-the-art protein language models and at a cost of ~$10^3. The library yields candidate therapeutics for “undruggable” cancer targets. Using previous methods, manufacturing a DNA library of the same size and quality would cost roughly $10^15.

