645 Publications

Combining machine learning with structure-based protein design to predict and engineer post-translational modifications of proteins

Moritz Ertelt, V. Mulligan, et al.

Post-translational modifications (PTMs) of proteins play a vital role in their function and stability. These modifications influence protein folding, signaling, protein-protein interactions, enzyme activity, binding affinity, aggregation, degradation, and much more. To date, over 400 types of PTMs have been described, representing chemical diversity well beyond the genetically encoded amino acids. Such modifications pose a challenge to the successful design of proteins, but also represent a major opportunity to diversify the protein engineering toolbox. To this end, we first trained artificial neural networks (ANNs) to predict eighteen of the most abundant PTMs, including protein glycosylation, phosphorylation, methylation, and deamidation. In a second step, these models were implemented inside the computational protein modeling suite Rosetta, which allows flexible combination with existing protocols to model the modified sites and understand their impact on protein stability as well as function. Lastly, we developed a new design protocol that either maximizes or minimizes the predicted probability of a particular site being modified. We find that this combination of ANN prediction and structure-based design can enable the modification of existing, as well as the introduction of novel, PTMs. The potential applications of our work include, but are not limited to, glycan masking of epitopes, strengthening protein-protein interactions through phosphorylation, as well as protecting proteins from deamidation liabilities. These applications are especially important for the design of new protein therapeutics where PTMs can drastically change the therapeutic properties of a protein. Our work adds novel tools to Rosetta’s protein engineering toolbox that allow for the rational design of PTMs.

ERK inhibits Cic repressor function via multisite phosphorylation

Sayantanee Paul, Khandan Ilkhani, S. Shvartsman, et al.

The receptor tyrosine kinase (RTK)/Extracellular Signal-Regulated Kinase (ERK) signaling pathway controls cell proliferation, differentiation, and survival. How ERK activation is relayed to its phosphorylation targets is not well understood. The transcriptional repressor Capicua (Cic) has emerged as a key target for ERK-mediated downregulation in Drosophila and mammals, and mutations in human CIC result in cancer and neurological diseases. Phosphorylation by ERK is critical for Cic downregulation, but the identities of phosphosites in Drosophila Cic are unknown. Here, we identify sites of phosphorylation in Cic that are directly targeted by ERK and validate their developmental functions in vivo using mutant Cic variants. Cic phosphosites are distributed throughout the length of the protein, and a group of centrally located sites appears to have a primary role in Cic downregulation. Cic mutated in 20 high-confidence sites behaves as a “super-repressor” in vivo that is largely insensitive to ERK-mediated downregulation, despite fully retaining the ability to bind to ERK. No single site is sufficient to turn off Cic activity; instead, we find that ERK must phosphorylate multiple sites in Cic simultaneously to achieve full downregulation. This multisite phosphorylation likely targets phosphodegrons that are recognized by ubiquitin ligases such as Ago/FBXW7 and contributes to Cic degradation. This study advances our understanding of the molecular mechanisms of signal interpretation downstream of the RTK/ERK signaling network.

March 14, 2024

Ensemble Detection of DNA Engineering Signatures

Aaron Adler, Joel S. Bader, A. Persikov

Synthetic biology is creating genetically engineered organisms at an increasing rate for many potentially valuable applications, but this potential comes with the risk of misuse or accidental release. To begin to address this issue, we have developed a system called GUARDIAN that can automatically detect signatures of engineering in DNA sequencing data, and we have conducted a blinded test of this system using a curated Test and Evaluation (T&E) data set. GUARDIAN uses an ensemble approach based on the guiding principle that no single approach is likely to be able to detect engineering with perfect accuracy. Critically, ensembling enables GUARDIAN to detect sequence inserts in 13 target organisms with a high degree of specificity that requires no subject matter expert (SME) review.

A cell autonomous regulator of neuronal excitability modulates tau in Alzheimer’s disease vulnerable neurons

Patricia Rodriguez-Rodriguez, Luis Enrique Arroyo-Garcia, O. Troyanskaya, et al.

Neurons from layer II of the entorhinal cortex (ECII) are the first to accumulate tau protein aggregates and degenerate during prodromal Alzheimer’s disease. Gaining insight into the molecular mechanisms underlying this vulnerability will help reveal genes and pathways at play during incipient stages of the disease. Here, we use a data-driven functional genomics approach to model ECII neurons in silico and identify the proto-oncogene DEK as a regulator of tau pathology.

We show that epigenetic changes caused by Dek silencing alter activity-induced transcription, with major effects on neuronal excitability. This is accompanied by the gradual accumulation of tau in the somatodendritic compartment of mouse ECII neurons in vivo, reactivity of surrounding microglia, and microglia-mediated neuron loss. These features are all characteristic of early Alzheimer’s disease.

The existence of a cell-autonomous mechanism linking Alzheimer’s disease pathogenic mechanisms in the precise neuron type where the disease starts provides unique evidence that synaptic homeostasis dysregulation is of central importance in the onset of tau pathology in Alzheimer’s disease.

Computational Design of Phosphotriesterase Improves V-Agent Degradation Efficiency

Jacob Kronenberg, Stanley Chu, D. Renfrew, et al.

Organophosphates (OPs) are a class of neurotoxic acetylcholinesterase inhibitors including widely used pesticides as well as nerve agents such as VX and VR. Current treatment of these toxins relies on reactivating acetylcholinesterase, which remains ineffective. Enzymatic scavengers are of interest for their ability to degrade OPs systemically before they reach their target. Here we describe a library of computationally designed variants of phosphotriesterase (PTE), an enzyme that is known to break down OPs. The mutations G208D, F104A, K77A, A80V, H254G, and I274N broadly improve catalytic efficiency of VX and VR hydrolysis without impacting the structure of the enzyme. The mutation I106A improves catalysis of VR and L271E abolishes activity, likely due to disruptions of PTE's structure. This study elucidates the importance of these residues and contributes to the design of enzymatic OP scavengers with improved efficiency.

Precision Medicine in Nephrology: An Integrative Framework of Multidimensional Data in the Kidney Precision Medicine Project

Tarek M. El-Achkar, Michael T. Eadon, R. Sealfon

Chronic kidney disease (CKD) and acute kidney injury (AKI) are heterogeneous syndromes defined clinically by serial measures of kidney function. Each condition possesses strong histopathologic associations, including glomerular obsolescence or acute tubular necrosis, respectively. Despite such characterization, there remains wide variation in patient outcomes and treatment responses. Precision medicine efforts, as exemplified by the Kidney Precision Medicine Project (KPMP), have begun to establish evolving, spatially anchored, cellular and molecular atlases of the cell types, states, and niches of the kidney in health and disease. The KPMP atlas provides molecular context for CKD and AKI disease drivers and will help define subtypes of disease that are not readily apparent from canonical functional or histopathologic characterization but instead are appreciable through advanced clinical phenotyping, pathomic, transcriptomic, proteomic, epigenomic, and metabolomic interrogation of kidney biopsy samples. This perspective outlines the structure of the KPMP, its approach to the integration of these diverse datasets, and its major outputs relevant to future patient care.

Specificity, synergy, and mechanisms of splice-modifying drugs

Yuma Ishigami, Mandy S. Wong, S. Hanson, et al.

Drugs that target pre-mRNA splicing hold great therapeutic potential, but the quantitative understanding of how these drugs work is limited. Here we introduce mechanistically interpretable quantitative models for the sequence-specific and concentration-dependent behavior of splice-modifying drugs. Using massively parallel splicing assays, RNA-seq experiments, and precision dose-response curves, we obtain quantitative models for two small-molecule drugs, risdiplam and branaplam, developed for treating spinal muscular atrophy. The results quantitatively characterize the specificities of risdiplam and branaplam for 5’ splice site sequences, suggest that branaplam recognizes 5’ splice sites via two distinct interaction modes, and contradict the prevailing two-site hypothesis for risdiplam activity at SMN2 exon 7. The results also show that anomalous single-drug cooperativity, as well as multi-drug synergy, are widespread among small-molecule drugs and antisense-oligonucleotide drugs that promote exon inclusion. Our quantitative models thus clarify the mechanisms of existing treatments and provide a basis for the rational development of new therapies.

NOVA1 acts as an oncogenic RNA-binding protein to regulate cholesterol homeostasis in human glioblastoma cells

Yuhki Saito, C. Park, et al.

NOVA1 is a neuronal RNA-binding protein identified as the target antigen of a rare autoimmune disorder associated with cancer and neurological symptoms, termed paraneoplastic opsoclonus-myoclonus ataxia. Despite the strong association between NOVA1 and cancer, it has been unclear how NOVA1 function might contribute to cancer biology. In this study, we find that NOVA1 acts as an oncogenic factor in a GBM (glioblastoma multiforme) cell line established from a patient. Interestingly, NOVA1 and Argonaute (AGO) CLIP identified common 3′ untranslated region (UTR) targets, which were down-regulated in NOVA1 knockdown GBM cells, indicating a transcriptome-wide intersection of NOVA1 and AGO–microRNA (miRNA) target regulation. NOVA1 binding to 3′UTR targets stabilized transcripts, including those encoding cholesterol homeostasis-related proteins. Selective inhibition of NOVA1–RNA interactions with antisense oligonucleotides disrupted GBM cancer cell fitness. The precision of our GBM CLIP studies points to both mechanism and precise RNA sequence sites to selectively inhibit oncogenic NOVA1–RNA interactions. Taken together, we find that NOVA1 is commonly overexpressed in GBM, where it can antagonize AGO2–miRNA actions and consequently up-regulate cholesterol synthesis, promoting cell viability.

Training self-learning circuits for power-efficient solutions

Menachem Stern, Douglas J. Durian, Andrea J. Liu, et al.

As the size and ubiquity of artificial intelligence and computational machine learning models grow, the energy required to train and use them is rapidly becoming economically and environmentally unsustainable. Recent laboratory prototypes of self-learning electronic circuits, such as “physical learning machines,” open the door to analog hardware that directly employs physics to learn desired functions from examples at a low energy cost. In this work, we show that this hardware platform allows for an even further reduction in energy consumption by using good initial conditions and a new learning algorithm. Using analytical calculations, simulations, and experiments, we show that a trade-off emerges when learning dynamics attempt to minimize both the error and the power consumption of the solution—greater power reductions can be achieved at the cost of decreasing solution accuracy. Finally, we demonstrate a practical procedure to weigh the relative importance of error and power minimization, improving the power efficiency given a specific tolerance to error.

Tapioca: a platform for predicting de novo protein–protein interactions in dynamic contexts

Tavis J. Reed, Matthew D. Tyl, O. Troyanskaya, et al.

Protein–protein interactions (PPIs) drive cellular processes and responses to environmental cues, reflecting the cellular state. Here we develop Tapioca, an ensemble machine learning framework for studying global PPIs in dynamic contexts. Tapioca predicts de novo interactions by integrating mass spectrometry interactome data from thermal/ion denaturation or cofractionation workflows with protein properties and tissue-specific functional networks. Focusing on the thermal proximity coaggregation method, we improved the experimental workflow. Finely tuned thermal denaturation afforded increased throughput, while cell lysis optimization enhanced protein detection from different subcellular compartments. The Tapioca workflow was next leveraged to investigate viral infection dynamics. Temporal PPIs were characterized during the reactivation from latency of the oncogenic Kaposi’s sarcoma-associated herpesvirus. Together with functional assays, NUCKS was identified as a proviral hub protein, and a broader role was uncovered by integrating PPI networks from alpha- and betaherpesvirus infections. Altogether, Tapioca provides a web-accessible platform for predicting PPIs in dynamic contexts.
