661 Publications

Contrastive pre-training for sequence based genomics models

Ksenia Sokolova, Kathleen M. Chen, O. Troyanskaya

In recent years deep learning has become one of the central approaches in a number of applications, including many tasks in genomics. However, as models grow in depth and complexity, they either require more data or a strategic initialization technique to improve performance. In this project, we introduce cGen, a novel unsupervised, model-agnostic contrastive pretraining method for sequence-based models. cGen can be used before training to initialize weights, reducing the size of the dataset needed. It works through learning the intrinsic features of the reference genome and makes no assumptions on the underlying structure. We show that the embeddings produced by the unsupervised model are already informative for gene expression prediction and that the sequence features provide a meaningful clustering. We demonstrate that cGen improves model performance in various sequence-based deep learning applications, such as chromatin profiling prediction and gene expression. Our findings suggest that using cGen, particularly in areas constrained by data availability, could improve the performance of deep learning genomic models without the need to modify the model architecture.

June 12, 2024

Variational bounds and nonlinear stability of an active nematic suspension

We use the entropy method to analyse the nonlinear dynamics and stability of a continuum kinetic model of an active nematic suspension. From the time evolution of the relative entropy, an energy-like quantity in the kinetic model, we derive a variational bound on relative entropy fluctuations that can be expressed in terms of orientational order parameters. From this bound we show isotropic suspensions are nonlinearly stable for sufficiently low activity, and derive upper bounds on spatiotemporal averages in the unstable regime that are consistent with fully nonlinear simulations. This work highlights the self-organising role of activity in particle suspensions, and places limits on how organised such systems can be.

Martini without the twist: Unveiling a mechanically correct microtubule through bottom-up coarse-graining in Martini 3

Microtubules are essential cytoskeletal filaments involved in cell motility, division, and intracellular transport. These biomolecular assemblies can exhibit complex structural behaviors influenced by various biophysical factors. However, simulating microtubule systems at the atomistic scale is challenging due to their large spatial scales. Here, we present an approach utilizing the Martini 3 Coarse-Grained (CG) model coupled with an appropriate elastic network to simulate microtubule-based systems accurately. By iteratively optimizing the elastic network parameters, we matched the structural fluctuations of CG heterodimer building blocks to their atomistic counterparts. Our efforts culminated in a ∼200 nm microtubule built with ∼6 million interaction centers that could reproduce experimentally observed mechanical properties. Our aim is to employ these CG simulations to investigate specific biophysical phenomena at a microscopic level. These microscopic perspectives can provide valuable insights into the underlying mechanisms and contribute to our knowledge of microtubule-associated processes in cellular biology. With Martini 3 CG simulations, we can bridge the gap between computational efficiency and molecular detail, enabling investigations into these biophysical processes over longer spatio-temporal scales with amino acid-level insights.

June 1, 2024

Metabolic imaging of human cumulus cells reveals associations with pregnancy and live birth

M. Venturas, C. Racowsky, D. Needleman

Can fluorescence lifetime imaging microscopy (FLIM) detect associations between the metabolic state of cumulus cell (CC) samples and the clinical outcome of the corresponding embryos?

FLIM can detect significant variations in the metabolism of CC associated with the corresponding embryos that resulted in a clinical pregnancy versus those that did not.

MousiPLIER: A Mouse Pathway-Level Information Extractor Model

Shuo Zhang, Benjamin J. Heil, W. Mao, et al.

High-throughput gene expression profiling measures individual gene expression across conditions. However, genes are regulated in complex networks, not as individual entities, limiting the interpretability of gene expression data. Machine learning models that incorporate prior biological knowledge are a powerful tool to extract meaningful biology from gene expression data. Pathway-level information extractor (PLIER) is an unsupervised machine learning method that defines biological pathways by leveraging the vast amount of published transcriptomic data. PLIER converts gene expression data into known pathway gene sets, termed latent variables (LVs), to substantially reduce data dimensionality and improve interpretability. In the current study, we trained the first mouse PLIER model on 190,111 mouse brain RNA-sequencing samples, the greatest amount of training data ever used by PLIER. We then validated the mousiPLIER approach in a study of microglia and astrocyte gene expression across mouse brain aging. mousiPLIER identified biological pathways that are significantly associated with aging, including one latent variable (LV41) corresponding to striatal signal. To gain further insight into the genes contained in LV41, we performed k-means clustering on the training data to identify studies that respond strongly to LV41. We found that the variable was relevant to striatum and aging across the scientific literature. Finally, we built a web server (http://mousiplier.greenelab.com/) for users to easily explore the learned latent variables. Taken together, this study defines mousiPLIER as a method to uncover meaningful biological processes in mouse brain transcriptomic studies.

May 24, 2024

Temperature compensation through kinetic regulation in biochemical oscillators

Yuhai Tu, et al.

Nearly all circadian clocks maintain a period that is insensitive to temperature changes, a phenomenon known as temperature compensation (TC). Yet, it is unclear whether there is any common feature among different systems that exhibit TC. From a general timescale invariance, we show that TC relies on the existence of certain period-lengthening reactions wherein the period of the system increases strongly with the rates in these reactions. By studying several generic oscillator models, we show that this counterintuitive dependence is nonetheless a common feature of oscillators in the nonlinear (far-from-onset) regime where the oscillation can be separated into fast and slow phases. The increase of the period with the period-lengthening reaction rates occurs when the amplitude of the slow phase in the oscillation increases with these rates while the progression speed in the slow phase is controlled by other rates of the system. The positive dependence of the period on the period-lengthening rates balances its inverse dependence on other kinetic rates in the system, which gives rise to robust TC in a wide range of parameters. We demonstrate the existence of such period-lengthening reactions and their relevance for TC in all four model systems we considered. Theoretical results for a model of the Kai system are supported by experimental data. A study of the energy dissipation also shows that better TC performance requires higher energy consumption. Our study unveils a general mechanism by which a biochemical oscillator achieves TC by operating in parameter regimes far from the onset where period-lengthening reactions exist.

Molecular adaptations in response to exercise training are associated with tissue-specific transcriptomic and epigenomic signatures

Venugopalan D. Nair, Hanna Pincas, W. Mao, et al.

Regular exercise has many physical and brain health benefits, yet the molecular mechanisms mediating exercise effects across tissues remain poorly understood. Here we analyzed 400 high-quality DNA methylation, ATAC-seq, and RNA-seq datasets from eight tissues from control and endurance exercise-trained (EET) rats. Integration of baseline datasets mapped the gene location dependence of epigenetic control features and identified differing regulatory landscapes in each tissue. The transcriptional responses to 8 weeks of EET showed little overlap across tissues and predominantly comprised tissue-type enriched genes. We identified sex differences in the transcriptomic and epigenomic changes induced by EET. However, the sex-biased gene responses were linked to shared signaling pathways. We found that many G protein-coupled receptor-encoding genes are regulated by EET, suggesting a role for these receptors in mediating the molecular adaptations to training across tissues. Our findings provide new insights into the mechanisms underlying EET-induced health benefits across organs.

Learning fast, accurate, and stable closures of a kinetic theory of an active fluid

Important classes of active matter systems can be modeled using kinetic theories. However, kinetic theories can be high dimensional and challenging to simulate. Reduced-order representations based on tracking only low-order moments of the kinetic model serve as an efficient alternative, but typically require closure assumptions to model unrepresented higher-order moments. In this study, we present a learning framework based on neural networks that exploits rotational symmetries in the closure terms to learn accurate closure models directly from kinetic simulations. The data-driven closures demonstrate excellent a-priori predictions comparable to the state-of-the-art Bingham closure. We provide a systematic comparison between different neural network architectures and demonstrate that nonlocal effects can be safely ignored to model the closure terms. We develop an active learning strategy that enables accurate prediction of the closure terms across the entire parameter space using a single neural network without the need for retraining. We also propose a data-efficient training procedure based on time-stepping constraints and a differentiable pseudo-spectral solver, which enables the learning of stable closures suitable for a-posteriori inference. The coarse-grained simulations equipped with data-driven closure models faithfully reproduce the mean velocity statistics, scalar order parameters, and velocity power spectra observed in simulations of the kinetic theory. Our differentiable framework also facilitates the estimation of parameters in coarse-grained descriptions conditioned on data.

Microstructure-Based Modeling of Primary Cilia Mechanics

Nima Mostafazadeh, Y.-N. Young, et al.

A primary cilium, made of nine microtubule doublets enclosed in a cilium membrane, is a mechanosensing organelle that bends under an external mechanical load and sends an intracellular signal through transmembrane proteins activated by cilium bending. The nine microtubule doublets are the main load-bearing structural component, while the transmembrane proteins on the cilium membrane are the main sensing component. No distinction was made between these two components in any existing model: the stress calculated from the structural component (nine microtubule doublets) was used to explain the sensing location, which may be totally misleading. For the first time, we developed a microstructure-based primary cilium model by considering these two components separately. First, we refined the analytical solution of bending an orthotropic cylindrical shell for an individual microtubule, and obtained excellent agreement between finite element simulations and the theoretical predictions of a microtubule bending as a validation of the structural component in the model. Second, by integrating the cilium membrane with nine microtubule doublets and simulating the tip-anchored optical tweezer experiment on our computational model, we found that the microtubule doublets may twist significantly as the whole cilium bends. Third, we found that the mechanical properties of the cilium are not only cilium-length-dependent but also highly deformation-dependent. More importantly, we found that the cilium membrane near the base is not under pure in-plane tension or compression as previously thought, but has significant local bending stress. This challenges the traditional model of cilium mechanosensing, indicating that transmembrane proteins may be activated more by membrane curvature than by membrane stretching. Finally, we incorporated imaging data of primary cilia into our microstructure-based cilium model, and found that, compared to the ideal model with uniform microtubule length, the imaging-informed model shows the nine microtubule doublets interact more evenly with the cilium membrane, and their contact locations can cause even higher bending curvature in the cilium membrane than near the base.

April 27, 2024

Self-organized dynamics of a viscous drop with interfacial nematic activity

M. Firouznia, David Saintillan

We study emergent dynamics in a viscous drop subject to interfacial nematic activity. Using hydrodynamic simulations, we show how the interplay of nematodynamics, activity-driven flows and surface deformations gives rise to a sequence of self-organized behaviors of increasing complexity, from periodic braiding motions of topological defects to chaotic defect dynamics and active turbulence, along with spontaneous shape changes and translation. Our findings recapitulate qualitative features of experiments and shed light on the mechanisms underpinning morphological dynamics in active interfaces.

April 17, 2024