645 Publications

Designing Peptides on a Quantum Computer

V. Mulligan, H. Melo, H. Merritt, S. Slocum, B. Weitzner, A. Watkins, D. Renfrew, C. Pelissier, P. Arora, R. Bonneau

Although a wide variety of quantum computers are currently being developed, actual computational results have been largely restricted to contrived, artificial tasks. Finding ways to apply quantum computers to useful, real-world computational tasks remains an active research area. Here we describe our mapping of the protein design problem to the D-Wave quantum annealer. We present a system whereby Rosetta, a state-of-the-art protein design software suite, interfaces with the D-Wave quantum processing unit to find amino acid side chain identities and conformations to stabilize a fixed protein backbone. Our approach, which we call the QPacker, uses a large side-chain rotamer library and the full Rosetta energy function, and in no way reduces the design task to a simpler format. We demonstrate that quantum annealer-based design can be applied to complex real-world design tasks, producing designed molecules comparable to those produced by widely adopted classical design approaches. We also show through large-scale classical folding simulations that the results produced on the quantum annealer can inform wet-lab experiments. For design tasks that scale exponentially on classical computers, the QPacker achieves nearly constant runtime performance over the range of problem sizes that could be tested. We anticipate better than classical performance scaling as quantum computers mature.

March 11, 2020

Metabolome-Informed Microbiome Analysis Refines Metadata Classifications and Reveals Unexpected Medication Transfer in Captive Cheetahs

J. Gauglitz, J. Morton, A. Tripathi, S. Hansen, M. Gaffney, C. Carpenter, K. Weldon, R. Shah, A. Parampil, A. Fidgett, A. Swafford, R. Knight, P. Dorrenstein

Topological defects determine the structure and function of physical and biological matter over a wide range of scales, from the turbulent vortices in planetary atmospheres, oceans or quantum fluids to bioelectrical signalling in the heart [1, 2, 3] and brain [4], and cell death [5]. Many advances have been made in understanding and controlling the defect dynamics in active [6, 7, 8, 9] and passive [9, 10] non-equilibrium fluids. Yet, it remains unknown whether the statistical laws that govern the dynamics of defects in classical [11] or quantum fluids [12, 13, 14] extend to the active matter [7, 15, 16] and information flows [17, 18] in living systems. Here, we show that a defect-mediated turbulence underlies the complex wave propagation patterns of Rho-GTP signalling protein on the membrane of starfish egg cells, a process relevant to cytoskeletal remodelling and cell proliferation [19, 20]. Our experiments reveal that the phase velocity field extracted from Rho-GTP concentration waves exhibits vortical defect motions and annihilation dynamics reminiscent of those seen in quantum systems [12, 13], bacterial turbulence [15] and active nematics [7]. Several key statistics and scaling laws of the defect dynamics can be captured by a minimal Helmholtz–Onsager point vortex model [21] as well as a generic complex Ginzburg–Landau [22] continuum theory, suggesting a close correspondence between the biochemical signal propagation on the surface of a living cell and a widely studied class of two-dimensional turbulence [23] and wave [22] phenomena.

March 10, 2020

Inference of Multisite Phosphorylation Rate Constants and Their Modulation by Pathogenic Mutations

E. Yeung, S. McFann, L. Marsh, E. Dufresne, S. Filippi, H. Harrington, S. Shvartsman, M. Wühr

Multisite protein phosphorylation plays a critical role in cell regulation [1, 2, 3]. It is widely appreciated that the functional capabilities of multisite phosphorylation depend on the order and kinetics of phosphorylation steps, but kinetic aspects of multisite phosphorylation remain poorly understood [4, 5, 6]. Here, we focus on what appears to be the simplest scenario, when a protein is phosphorylated on only two sites in a strict, well-defined order. This scenario describes the activation of ERK, a highly conserved cell-signaling enzyme. We use Bayesian parameter inference in a structurally identifiable kinetic model to dissect dual phosphorylation of ERK by MEK, a kinase that is mutated in a large number of human diseases [7, 8, 9, 10, 11, 12]. Our results reveal how enzyme processivity and efficiencies of individual phosphorylation steps are altered by pathogenic mutations. The presented approach, which connects specific mutations to kinetic parameters of multisite phosphorylation mechanisms, provides a systematic framework for closing the gap between studies with purified enzymes and their effects in the living organism.

Optimal tuning of weighted kNN- and diffusion-based methods for denoising single cell genomics data

A. Tjärnberg, O. Mahmood, C. Jackson, G. Saldi, K. Cho, L. Christiaen, R. Bonneau

The analysis of single-cell genomics data presents several statistical challenges, and extensive efforts have been made to produce methods for the analysis of this data that impute missing values, address sampling issues and quantify and correct for noise. In spite of such efforts, no consensus on best practices has been established and all current approaches vary substantially based on the available data and empirical tests. The k-Nearest Neighbor Graph (kNN-G) is often used to infer the identities of, and relationships between, cells and is the basis of many widely used dimensionality-reduction and projection methods. The kNN-G has also been the basis for imputation methods using, e.g., neighbor averaging and graph diffusion. However, due to the lack of an agreed-upon optimal objective function for choosing hyperparameters, these methods tend to oversmooth data, thereby resulting in a loss of information with regard to cell identity and the specific gene-to-gene patterns underlying regulatory mechanisms. In this paper, we investigate the tuning of kNN- and diffusion-based denoising methods with a novel non-stochastic method for optimally preserving biologically relevant informative variance in single-cell data. The framework, Denoising Expression data with a Weighted Affinity Kernel and Self-Supervision (DEWÄKSS), uses a self-supervised technique to tune its parameters. We demonstrate that denoising with optimal parameters selected by our objective function (i) is robust to preprocessing methods using data from established benchmarks, (ii) disentangles cellular identity and maintains robust clusters over dimension-reduction methods, (iii) maintains variance along several expression dimensions, unlike previous heuristic-based methods that tend to oversmooth data variance, and (iv) rarely involves diffusion but rather uses a fixed weighted kNN graph for denoising. 
Together, these findings provide a new understanding of kNN- and diffusion-based denoising methods and serve as a foundation for future research. Code and example data for DEWÄKSS are available at https://gitlab.com/Xparx/dewakss/-/tree/Tjarnberg2020branch.

A Bayesian nonparametric approach to super-resolution single-molecule localization

M. Gabitto, H. Marie-Nelly, A. Pakman, A. Pataki, X. Darzacq, M. Jordan

We consider the problem of single-molecule identification in super-resolution microscopy. Super-resolution microscopy overcomes the diffraction limit by localizing individual fluorescing molecules in a field of view. This is particularly difficult since each individual molecule appears and disappears randomly across time and because the total number of molecules in the field of view is unknown. Additionally, data sets acquired with super-resolution microscopes can contain a large number of spurious fluorescent fluctuations caused by background noise.

To address these problems, we present a Bayesian nonparametric framework capable of identifying individual emitting molecules in super-resolved time series. We tackle the localization problem in the case in which each individual molecule is already localized in space. First, we collapse observations in time and develop a fast algorithm that builds upon the Dirichlet process. Next, we augment the model to account for the temporal aspect of fluorophore photo-physics. Finally, we assess the performance of our methods with ground-truth data sets having known biological structure.

February 25, 2020

Subtype-specific transcriptional regulators in breast tumors subjected to genetic and epigenetic alterations

Q. Zhu, X. Tekpli, O. Troyanskaya, V. Kristensen

Motivation
Breast cancer consists of multiple distinct tumor subtypes, and results from epigenetic and genetic aberrations that give rise to distinct transcriptional profiles. Despite previous efforts to understand transcriptional deregulation through transcription factor networks, the transcriptional mechanisms leading to subtypes of the disease remain poorly understood.

Results
We used a sophisticated computational search of thousands of expression datasets to define extended signatures of distinct breast cancer subtypes. Using ENCODE ChIP-seq data of surrogate cell lines and motif analysis, we observed that these subtypes are determined by a distinct repertoire of lineage-specific transcription factors. Furthermore, we observed specific patterns and abundances of copy number and DNA methylation changes at these TFs and their targets, compared to other genes and to normal cells. Overall, distinct transcriptional profiles are linked to genetic and epigenetic alterations at lineage-specific transcriptional regulators in breast cancer subtypes.

The design and logic of terminal patterning in Drosophila

C. Smits, S. Shvartsman

Terminal regions of the early Drosophila embryo are patterned by the highly conserved ERK cascade, giving rise to the nonsegmented terminal structures of the future larva. In less than an hour, this signaling event establishes several gene expression boundaries and sets in motion a sequence of elaborate morphogenetic events. Genetic studies of terminal patterning discovered signaling components and transcription factors that are involved in numerous developmental contexts and deregulated in human diseases. This review summarizes current understanding of signaling and morphogenesis during terminal patterning and discusses several open questions that can now be rigorously investigated using live imaging, omics, and optogenetic approaches. The anatomical simplicity of the terminal patterning system and its amenability to a broad range of increasingly sophisticated genetic perturbations will continue to make it a premier quantitative model for studying multiple aspects of tissue patterning by dynamically controlled cell signaling pathways.

Activation-induced substrate engagement in ERK signaling

S. Paul, L. Yang, H. Mattingly, Y. Goyal, S. Shvartsman, A. Veraksa

The extracellular signal-regulated kinase (ERK) pathway is an essential component of developmental signaling in metazoans. Previous models of pathway activation suggested that dissociation of activated dually phosphorylated ERK (dpERK) from MAPK/ERK kinase (MEK), a kinase that phosphorylates ERK, and other cytoplasmic anchors, is sufficient for allowing ERK interactions with its substrates. Here, we provide evidence for an additional step controlling ERK’s access to substrates. Specifically, we demonstrate that interaction of ERK with its substrate Capicua (Cic) is controlled at the level of ERK phosphorylation, whereby Cic binds to dpERK much more strongly than to unphosphorylated ERK, both in vitro and in vivo. Mathematical modeling suggests that the differential affinity of Cic for dpERK versus ERK is required for both down-regulation of Cic and stabilizing phosphorylated ERK. Preferential association of Cic with dpERK serves two functions: it prevents unproductive competition of Cic with unphosphorylated ERK and contributes to efficient signal propagation. We propose that high-affinity substrate binding increases the specificity and efficiency of signal transduction through the ERK pathway.

Characterizing chromatin landscape from aggregate and single-cell genomic assays using flexible duration modeling

M. Gabitto, A. Rasmussen, O. Wapinski, K. Allaway, N. Carriero, G. Fishell, R. Bonneau

ATAC-seq has become a leading technology for probing the chromatin landscape of single and aggregated cells. Distilling functional regions from ATAC-seq presents diverse analysis challenges. Methods commonly used to analyze chromatin accessibility datasets are adapted from algorithms designed to process different experimental technologies, disregarding the statistical and biological differences intrinsic to the ATAC-seq technology. Here, we present a Bayesian statistical approach that uses latent space models to better model accessible regions, termed ChromA. ChromA annotates chromatin landscape by integrating information from replicates, producing a consensus de-noised annotation of chromatin accessibility. ChromA can analyze single cell ATAC-seq data, correcting many biases generated by the sparse sampling inherent in single cell technologies. We validate ChromA on multiple technologies and biological systems, including mouse and human immune cells, establishing ChromA as a top performing general platform for mapping the chromatin landscape in different cellular populations from diverse experimental designs.
