500 Publications

Atlas of primary cell-type-specific sequence models of gene expression and variant effects

Ksenia Sokolova, Chandra L. Theesfeld, A. Wong, O. Troyanskaya, et al.

Human biology is rooted in highly specialized cell types programmed by a common genome, 98% of which is outside of genes. Genetic variation in the enormous noncoding space is linked to the majority of disease risk. To address the problem of linking these variants to expression changes in primary human cells, we introduce ExPectoSC, an atlas of modular deep-learning-based models for predicting cell-type-specific gene expression directly from sequence. We provide models for 105 primary human cell types covering 7 organ systems, demonstrate their accuracy, and then apply them to prioritize relevant cell types for complex human diseases. The resulting atlas of sequence-based gene expression and variant effects is publicly available in a user-friendly interface and readily extensible to any primary cell types. We demonstrate the accuracy of our approach through systematic evaluations and apply the models to prioritize ClinVar clinical variants of uncertain significance, verifying our top predictions experimentally.

Mitochondrial electron transport chain, ceramide and Coenzyme Q are linked in a pathway that drives insulin resistance in skeletal muscle

Alexis Diaz-Vegas, Soren Madsen, M. Astore, et al.

Insulin resistance (IR) is a complex metabolic disorder that underlies several human diseases, including type 2 diabetes and cardiovascular disease. Despite extensive research, the precise mechanisms underlying IR development remain poorly understood. Here, we provide new insights into the mechanistic connections between cellular alterations associated with IR, including increased ceramides, deficiency of coenzyme Q (CoQ), mitochondrial dysfunction, and oxidative stress. We demonstrate that elevated levels of ceramide in the mitochondria of skeletal muscle cells result in CoQ depletion and loss of mitochondrial respiratory chain components, leading to mitochondrial dysfunction and IR. Further, decreasing mitochondrial ceramide levels in vitro and in animal models (under chow and high-fat diet) increased CoQ levels and was protective against IR. CoQ supplementation also rescued ceramide-associated IR. Examination of the mitochondrial proteome from human muscle biopsies revealed a strong correlation between the respirasome system and mitochondrial ceramide as key determinants of insulin sensitivity. Our findings highlight the mitochondrial ceramide-CoQ-respiratory chain nexus as a potential foundation of an IR pathway that may also play a critical role in other conditions associated with ceramide accumulation and mitochondrial dysfunction, such as heart failure, cancer, and aging. These insights may have important clinical implications for the development of novel therapeutic strategies for the treatment of IR and related metabolic disorders.

September 19, 2023

Scaling behaviour and control of nuclear wrinkling

Jonathan A. Jackson, Nicolas Romeo, J. I. Alsous, et al.

The cell nucleus is enveloped by a complex membrane, whose wrinkling has been implicated in disease and cellular aging. The biophysical dynamics and spectral evolution of nuclear wrinkling during multicellular development remain poorly understood due to a lack of direct quantitative measurements. Here we characterize the onset and dynamics of nuclear wrinkling during egg development in the fruit fly when nurse cell nuclei increase in size and display stereotypical wrinkling behaviour. A spectral analysis of three-dimensional high-resolution live-imaging data from several hundred nuclei reveals a robust asymptotic power-law scaling of angular fluctuations consistent with renormalization and scaling predictions from a nonlinear elastic shell model. We further demonstrate that nuclear wrinkling can be reversed through osmotic shock and suppressed by microtubule disruption, providing tunable physical and biological control parameters for probing the mechanical properties of the nuclear envelope. Our findings advance the biophysical understanding of nuclear membrane fluctuations during early multicellular development.
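A power-law scaling like the one reported above is commonly estimated as a straight-line fit in log-log space. The sketch below is illustrative only: the function name `fit_power_law`, the `ell_min` cutoff, and the synthetic spectrum are assumptions, not the authors' analysis pipeline.

```python
import numpy as np

def fit_power_law(ell, power, ell_min=1):
    """Least-squares fit of power ~ A * ell**(-alpha) for ell >= ell_min.

    Hypothetical helper: returns the exponent alpha and the amplitude A.
    """
    ell = np.asarray(ell, dtype=float)
    power = np.asarray(power, dtype=float)
    # Keep only the asymptotic (large-ell) part of the spectrum.
    mask = ell >= ell_min
    # A power law is a straight line in log-log coordinates.
    slope, intercept = np.polyfit(np.log(ell[mask]), np.log(power[mask]), 1)
    return -slope, np.exp(intercept)

# Synthetic spectrum with a known exponent of 3 and amplitude of 2.
ell = np.arange(1, 51)
alpha, amplitude = fit_power_law(ell, 2.0 * ell ** -3.0, ell_min=5)
```

With real data, the choice of `ell_min` matters, since the scaling is only expected to hold asymptotically.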

Liquid Filled Elastomers: From Linearization to Elastic Enhancement

Juan Casado Díaz, G. Francfort, Oscar Lopez-Pamies, Maria Giovanna Mora

Surface tension at cavity walls can play havoc with the mechanical properties of perforated soft solids when the cavities are filled with a fluid. This study is an investigation of the macroscopic elastic properties of elastomers embedding spherical cavities filled with a pressurized liquid in the presence of surface tension, starting with the linearization of the fully nonlinear model and ending with the enhancement properties of the linearized model when many such liquid filled cavities are present.

September 7, 2023

Multi-Task Curriculum Learning for Partially Labeled Data

Won-Dong Jang, D. Needleman, et al.

Incomplete labels are common in multi-task learning for biomedical applications due to several practical difficulties, e.g., expensive annotation by experts, limited data collection, and differing data sources. A naive approach to enable joint learning for partially labeled data is adding self-supervised learning for tasks without ground truths by augmenting an input image and forcing the multi-task model to return the same outputs for both the input and augmented images. However, the partially labeled setting can result in imbalanced learning of tasks since not all tasks are trainable with ground truth supervision for each data sample. In this work, we propose a multi-task curriculum learning method tailored for partially labeled data. For balanced learning of tasks, our multi-task curriculum prioritizes lower-performing tasks during training by setting different supervised learning frequencies for each task. We demonstrate that our method outperforms standard approaches on one biomedical and two natural image datasets. Furthermore, our learning method with partially labeled data performs better than the standard multi-task learning methods with fully labeled data for the same number of annotations.
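As a rough illustration of the per-task frequency idea described above, the sketch below gives each task a supervised-update probability that grows as its validation score falls behind the others. The function names, the softmax-style weighting, and the `temperature` and `floor` parameters are assumptions made for illustration and are not taken from the paper.

```python
import math
import random

def supervised_frequencies(scores, temperature=0.1, floor=0.05):
    """Map per-task validation scores (higher is better) to update probabilities.

    Hypothetical weighting: tasks with lower scores receive larger probabilities.
    """
    # Softmax over negated scores so that weaker tasks get more weight.
    weights = {task: math.exp(-score / temperature) for task, score in scores.items()}
    total = sum(weights.values())
    # Mix with a uniform floor so that no task is starved of supervision.
    n = len(scores)
    return {task: floor / n + (1.0 - floor) * w / total for task, w in weights.items()}

def pick_task(freqs, rng=random):
    """Sample the task that receives the next supervised update."""
    tasks = list(freqs)
    return rng.choices(tasks, weights=[freqs[t] for t in tasks], k=1)[0]

# Example with three hypothetical tasks and their current validation scores.
freqs = supervised_frequencies({"segmentation": 0.90, "classification": 0.60, "depth": 0.75})
next_task = pick_task(freqs)
```

Recomputing the frequencies after each evaluation round lets the schedule shift as weaker tasks catch up.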

Learning Vector Quantized Shape Code for Amodal Blastomere Instance Segmentation

Won-Dong Jang, D. Needleman, et al.

Blastomere instance segmentation is important for analyzing embryo abnormalities. To accurately measure the shapes and sizes of blastomeres, their amodal segmentation is necessary. Amodal instance segmentation aims to recover an object’s complete silhouette even when the object is not fully visible. For each detected object, previous methods directly regress the target mask from input features. However, images of an object under different amounts of occlusion should have the same amodal mask output, making it harder to train the regression model. To alleviate the problem, we propose to classify input features into intermediate shape codes and recover complete object shapes. First, we pre-train the Vector Quantized Variational Autoencoder (VQ-VAE) model to learn these discrete shape codes from ground truth amodal masks. Then, we incorporate the VQ-VAE model into the amodal instance segmentation pipeline with an additional refinement module. We also detect an occlusion map to integrate occlusion information with a backbone feature. As such, our network faithfully detects bounding boxes of amodal objects. On an internal embryo cell image benchmark, the proposed method outperforms previous state-of-the-art methods. To show generalizability, we show segmentation results on the public KINS natural image benchmark. Our method would enable accurate measurement of blastomeres in In Vitro Fertilization (IVF) clinics, potentially increasing the IVF success rate.
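For readers unfamiliar with discrete codes, the sketch below shows the standard nearest-neighbour quantization step used in VQ-VAE models, in which each feature vector is replaced by its closest codebook entry. It is a generic illustration rather than the paper's pipeline (which classifies features into codes); the function name and the toy codebook are assumptions.

```python
import numpy as np

def quantize(features, codebook):
    """Return the nearest codebook index and entry for each feature row.

    features: array of shape (num_features, dim)
    codebook: array of shape (num_codes, dim)
    """
    # Squared Euclidean distances, shape (num_features, num_codes).
    diffs = features[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    # Each feature is assigned the index of its closest code.
    indices = np.argmin(dists, axis=1)
    return indices, codebook[indices]

# Toy codebook with three 2-D codes and three features to quantize.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]])
features = np.array([[0.1, -0.2], [0.9, 1.2], [3.5, 0.3]])
indices, quantized = quantize(features, codebook)
```

Because the output is a discrete index, features from differently occluded views of one object can map to the same code, which is the property the abstract relies on.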

Structure-function analysis suggests that the photoreceptor LITE-1 is a light-activated ion channel

S. Hanson, Jan Scholüke, Jana Liewald

Sensation of light is essential for all organisms. The eye-less nematode Caenorhabditis elegans detects UV and blue light to evoke escape behavior. The photosensor LITE-1 absorbs UV photons with an unusually high extinction coefficient, involving essential tryptophans. Here, we modeled the structure and dynamics of LITE-1 using AlphaFold2-multimer and molecular dynamics (MD) simulations and performed mutational and behavioral assays in C. elegans to characterize its function. LITE-1 resembles olfactory and gustatory receptors from insects, recently shown to be tetrameric ion channels. We identified residues required for channel gating, light absorption, and mechanisms of photo-oxidation, involving a likely binding site for the peroxiredoxin PRDX-2. Furthermore, we identified the binding pocket for a putative chromophore. Several residues lining this pocket have previously been established as essential for LITE-1 function. A newly identified critical cysteine pointing into the pocket represents a likely chromophore attachment site. We derived a model for how photon absorption, via a network of tryptophans and other aromatic amino acids, induces an excited state that is transferred to the chromophore. This evokes conformational changes in the protein, possibly leading to a state receptive to oxidation of cysteines and, jointly, to channel gating. Electrophysiological data support the idea that LITE-1 is a photon and H2O2-coincidence detector. Other proteins with similarity to LITE-1, specifically C. elegans GUR-3, likely use a similar mechanism for photon detection. Thus, a common protein fold and assembly, used for chemoreception in insects, possibly by binding of a particular compound, may have evolved into a light-activated ion channel.

Evolutionary history of MEK1 illuminates the nature of deleterious mutations

Ekaterina P. Andrianova, Robert A. Marmion, S. Shvartsman, Igor B. Zhulin

Mutations in signal transduction pathways lead to various diseases including cancers. MEK1 kinase, encoded by the human MAP2K1 gene, is one of the central components of the MAPK pathway, and more than a hundred somatic mutations in the MAP2K1 gene have been identified in various tumors. Germline mutations deregulating MEK1 also lead to congenital abnormalities, such as the cardiofaciocutaneous syndrome and arteriovenous malformation. Evaluating variants associated with a disease is a challenge, and computational genomic approaches aid in this process. Establishing the evolutionary history of a gene improves computational prediction of disease-causing mutations; however, the evolutionary history of MEK1 is not well understood. Here, by revealing a precise evolutionary history of MEK1, we construct a well-defined dataset of MEK1 metazoan orthologs, which provides sufficient depth to distinguish between conserved and variable amino acid positions. We matched known and predicted disease-causing and benign mutations to evolutionary changes observed in corresponding amino acid positions and found that all known and many suspected disease-causing mutations are evolutionarily intolerable. We selected several variants that cannot be unambiguously assessed by automated prediction tools but that are confidently identified as “damaging” by our approach, for experimental validation in Drosophila. In all cases, evolutionarily intolerant variants caused increased mortality and severe defects in fruit fly embryos, confirming their damaging nature. We anticipate that our analysis will serve as a blueprint to help evaluate known and novel missense variants in MEK1 and that our approach will contribute to improving automated tools for disease-associated variant interpretation.

August 14, 2023

Learning fast, accurate, and stable closures of a kinetic theory of an active fluid

Important classes of active matter systems can be modeled using kinetic theories. However, kinetic theories can be high dimensional and challenging to simulate. Reduced-order representations based on tracking only low-order moments of the kinetic model serve as an efficient alternative, but typically require closure assumptions to model unrepresented higher-order moments. In this study, we present a learning framework based on neural networks that exploit rotational symmetries in the closure terms to learn accurate closure models directly from kinetic simulations. The data-driven closures demonstrate excellent a-priori predictions comparable to the state-of-the-art Bingham closure. We provide a systematic comparison between different neural network architectures and demonstrate that nonlocal effects can be safely ignored to model the closure terms. We develop an active learning strategy that enables accurate prediction of the closure terms across the entire parameter space using a single neural network without the need for retraining. We also propose a data-efficient training procedure based on time-stepping constraints and a differentiable pseudo-spectral solver, which enables the learning of stable closures suitable for a-posteriori inference. The coarse-grained simulations equipped with data-driven closure models faithfully reproduce the mean velocity statistics, scalar order parameters, and velocity power spectra observed in simulations of the kinetic theory. Our differentiable framework also facilitates the estimation of parameters in coarse-grained descriptions conditioned on data.

August 13, 2023

Conformational heterogeneity and probability distributions from single-particle cryo-electron microscopy

W. S. Wai Shing, Ellen D. Zhong, S. Hanson, E. Thiede, P. Cossio

Single-particle cryo-electron microscopy (cryo-EM) is a technique that takes projection images of biomolecules frozen at cryogenic temperatures. A major advantage of this technique is its ability to image single biomolecules in heterogeneous conformations. While this poses a challenge for data analysis, recent algorithmic advances have enabled the recovery of heterogeneous conformations from the noisy imaging data. Here, we review methods for the reconstruction and heterogeneity analysis of cryo-EM images, ranging from linear-transformation-based methods to nonlinear deep generative models. We overview the dimensionality-reduction techniques used in heterogeneous 3D reconstruction methods and specify what information each method can infer from the data. Then, we review the methods that use cryo-EM images to estimate probability distributions over conformations in reduced subspaces or predefined by atomistic simulations. We conclude with the ongoing challenges for the cryo-EM community.
