CBE Seminar: “Structure-Independent Peptide Binder Design via Generative Language Models” (Chatterjee, Duke University)
November 1, 2023 at 3:30 PM - 4:30 PM
The ability to modulate pathogenic proteins represents a powerful treatment strategy for diseases. Unfortunately, many proteins are considered “undruggable” by small molecules and are often intrinsically disordered, precluding the use of structure-based tools for binder design. To address these challenges, we have developed a suite of algorithms that enable the design of target-specific peptides via protein language model embeddings, without requiring 3D structures. First, we train a model, SaLT&PepPR, that leverages ESM-2 embeddings to efficiently select high-affinity peptides from natural protein interaction interfaces. Next, we develop a generator-discriminator model, PepPrCLIP, based on the CLIP architecture, to generate and screen de novo peptides with selectivity for a specified target protein. As input to the discriminator, we create a Gaussian diffusion generator to sample an ESM-2-based latent space, fine-tuned on experimentally validated peptide sequences. Finally, to enable target-conditioned de novo generation of binding peptides, we train a masked language model, PepMLM, to discontinuously unmask peptides given target sequences. Our final model demonstrates low perplexities across both existing and generated peptide sequences. We experimentally fuse model-derived peptides to E3 ubiquitin ligase domains and reliably identify candidates exhibiting functionally potent degradation of undruggable, disordered targets in cancer models. Overall, our work enables generation of programmable modulators to any target protein, without the requirement of conformationally stable three-dimensional structures.

