726 Publications

Microtubules in Martini: Parameterizing a heterogeneous elastic-network towards a mechanically accurate microtubule

Microtubules are essential cytoskeletal filaments involved in cell motility, division, and intracellular transport, exhibiting complex structural dynamics governed by diverse biophysical factors. Atomistic simulations of microtubule assemblies remain challenging due to their extensive spatiotemporal scales. To address this, we present a multiscale approach combining the primarily top-down Martini 3 coarse-grained (CG) model with an appropriately parameterized heterogeneous elastic network to capture microtubule mechanics and molecular detail efficiently. By iteratively tuning the elastic network, we matched the structural fluctuations of CG heterodimeric building blocks to atomistic reference data, reproducing experimentally consistent mechanical properties. This framework helped us identify stabilizing long-lived interactions between charged C-terminal tails and the folded domain of neighboring tubulin subunits, offering insight into sequence-specific contributions to lattice stability. Our efforts culminated in the construction of a 200 nm microtubule composed of million interaction centers, enabling exploration of large-scale microtubule-associated processes with amino acid-level resolution. This work bridges the gap between molecular specificity and computational scalability, offering a platform for simulating biophysical processes across cellular length and time scales.
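As a rough illustration of what a heterogeneous elastic network means, the sketch below evaluates a harmonic network energy in which every bead pair carries its own force constant. It is a minimal sketch under stated assumptions, not the authors' parameterization: the function name, the array layout, and the idea of storing one `k` per pair are hypothetical choices made for this example, and the abstract does not specify the functional form or the tuning procedure.

```python
import numpy as np

def elastic_network_energy(coords, ref_coords, pairs, k):
    """Harmonic elastic-network energy with one force constant per pair.

    coords, ref_coords : (n_beads, 3) current and reference positions
    pairs              : (n_pairs, 2) integer bead indices (hypothetical layout)
    k                  : (n_pairs,) per-pair force constants; a uniform
                         network would use a single shared value instead
    """
    i, j = pairs[:, 0], pairs[:, 1]
    # Current and reference pair distances
    d = np.linalg.norm(coords[i] - coords[j], axis=1)
    d0 = np.linalg.norm(ref_coords[i] - ref_coords[j], axis=1)
    # Sum of 0.5 * k_ij * (d_ij - d0_ij)^2 over all network pairs
    return 0.5 * np.sum(k * (d - d0) ** 2)
```

In a scheme like the one the abstract describes, the per-pair constants would be the quantities adjusted until coarse-grained fluctuations match the atomistic reference.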

RocketSHP: Ultra-fast Proteome-scale Prediction of Protein Dynamics

Proteins are dynamic molecules that depend on conformational flexibility to carry out functions in the cell, yet despite significant advances in the modeling of static protein structure, prediction of these dynamics remains challenging. We introduce RocketSHP, a machine learning model that predicts dynamic protein properties from sequence or static structure with unprecedented speed and accuracy. Trained on thousands of molecular dynamics trajectories spanning diverse protein families, RocketSHP simultaneously models multiple dynamics features: root-mean-square fluctuations (RMSF), generalized correlation coefficients (GCC-LMI), and a novel structural heterogeneity profile (SHP) based on recent structure quantization methods. RocketSHP significantly outperforms existing methods in predicting simulation-derived dynamics. We reduce RMSF prediction error by 57% compared to BioEmu and calibrated Dyna-1 predictions, including an up to 73% error reduction for long proteins. We validate these predictions with experimental hetNOE data, and we demonstrate the ability to adapt predictions to different physical temperatures. We highlight RocketSHP’s utility in constructing allosteric networks in the oncogene KRAS and identify structural sub-modules with correlated motions, and we validate RocketSHP by showing that changes in node centrality within predicted KRAS allosteric networks correlate with changes of folding free energy in experimental DMS data. Our approach makes predictions in seconds rather than hours or days, enabling us to perform the first comprehensive dynamics analysis of the entire human proteome. RocketSHP bridges the gap between static structural biology and dynamic functional understanding, enabling dynamics-aware structural analysis and variant effect prediction at scales previously unavailable. RocketSHP is available as free and open-source software at https://github.com/flatironinstitute/RocketSHP.

June 17, 2025
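For readers unfamiliar with the RMSF quantity named in the abstract above, the following minimal sketch computes per-atom root-mean-square fluctuations from a trajectory array. It assumes the frames have already been aligned to a common reference and uses a hypothetical `(n_frames, n_atoms, 3)` array layout; it illustrates the standard definition only and is not RocketSHP's code or API.

```python
import numpy as np

def rmsf(traj):
    """Per-atom RMSF for a pre-aligned trajectory.

    traj : (n_frames, n_atoms, 3) array of positions (assumed layout)
    Returns an (n_atoms,) array.
    """
    mean_pos = traj.mean(axis=0)             # time-averaged position per atom
    disp = traj - mean_pos                   # displacement from the mean, per frame
    sq_dist = (disp ** 2).sum(axis=2)        # squared displacement magnitude
    return np.sqrt(sq_dist.mean(axis=0))     # root of the time average
```

A predictor of this profile would be scored against values computed this way from the reference simulations.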

ExEnDiff: An Experiment-Guided Diffusion Model for Protein Conformational Ensemble Generation

Yikai Liu, A. Sahoo, S. Hanson, et al.

Understanding protein conformations is key to understanding protein function. Importantly, most proteins adopt multiple conformations with nontrivial ensemble distributions that change depending on their environment to perform functions like catalysis, signaling, and transport. Recently, machine learning techniques, especially deep generative models, have been employed to develop protein conformation generators. These models, known as unified protein ensemble samplers, are trained on the Protein Data Bank (PDB) dataset and can generate diverse protein conformation ensembles given a protein sequence. However, their reliance solely on structural data from the PDB, which primarily captures folded protein states, restricts the diversity of the generated ensembles and can result in physically unrealistic conformations. In this paper, we overcome these challenges by introducing ExEnDiff, an experiment-guided diffusion model for protein conformation generation. ExEnDiff integrates experimental measurements as a physical prior, enabling the generation of protein conformations with desired properties. Our experiments on a variety of fast-folding and intrinsically disordered proteins demonstrate that ExEnDiff significantly advances the capabilities of current unified protein ensemble samplers. With little computational cost, ExEnDiff can capture important configurational properties of proteins and the underlying Boltzmann distribution, paving the way for a next-generation molecular dynamics engine. We further demonstrate the effectiveness of ExEnDiff in capturing conformational changes in the presence of mutations and as an efficient tool for determining a reasonable collective variable space for protein ensembles. With these results, ExEnDiff is well poised to push the study of protein ensembles into a data-rich regime currently available to few problems in biology.

June 10, 2025

Nonlinear spontaneous flow instability in active nematics

I. Lavi, Ricard Alert, Jean-François Joanny, Jaume Casademunt

Active nematics exhibit spontaneous flows through a well-known linear instability of the uniformly aligned quiescent state. Here, we show that even a linearly stable uniform state can experience a nonlinear instability, resulting in a discontinuous transition to spontaneous flows. In this case, quiescent and flowing states may coexist. Through a weakly nonlinear analysis and a numerical study, we trace the bifurcation diagram of striped patterns and show that the underlying pitchfork bifurcation switches from supercritical (continuous) to subcritical (discontinuous) by varying the flow-alignment parameter. We predict that the discontinuous spontaneous flow transition occurs for a wide range of parameters, including systems of contractile flow-aligning rods. Our predictions are relevant to active nematic turbulence and can potentially be tested in experiments on either cell layers or active cytoskeletal suspensions.
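The switch from a supercritical to a subcritical pitchfork described above can be illustrated with a generic amplitude equation. This is the textbook normal form, not the equation derived in the paper: the symbols $A$ (pattern amplitude), $\mu$ (distance from the linear threshold), and $g$ (cubic coefficient) are assumptions introduced here, with $g$ standing in for whatever combination of parameters, including the flow-alignment parameter, sets the sign of the cubic term.

```latex
\begin{align}
  \frac{dA}{dt} &= \mu A + g A^{3} - A^{5}, \\
  A^{2}_{\pm} &= \frac{g \pm \sqrt{g^{2} + 4\mu}}{2}
  \qquad \text{(nonzero steady states)} .
\end{align}
% g < 0: supercritical; a single branch A_+ grows continuously from zero for mu > 0.
% g > 0: subcritical; both branches exist for -g^2/4 < mu < 0, so the quiescent
%        state A = 0 (linearly stable for mu < 0) coexists with a finite-amplitude
%        flowing state, and the transition is discontinuous.
```

In this picture, the coexistence of quiescent and flowing states corresponds to the window $-g^{2}/4 < \mu < 0$ when $g > 0$.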

Generation of fate patterns via intercellular forces

H. Nunley, Xufeng Xue, Jianping Fu, David K. Lubensky

Studies of fate patterning during development typically emphasize cell-cell communication via diffusible chemical signals. Recent experiments on stem cell colonies, however, suggest that in some cases mechanical stresses, rather than secreted chemicals, enable long-ranged cell-cell interactions that specify positional information and pattern cell fates. These findings inspire a model of mechanical patterning: fate affects cell contractility, and pressure in the cell layer biases fate. Cells at the colony edge, more contractile than cells at the center, seed a pattern that propagates via force transmission. Strikingly, our model implies that the width of the outer fate domain varies nonmonotonically with substrate stiffness, a prediction that we confirm experimentally; we argue that a similar dependence on substrate stiffness can be achieved by a chemical morphogen only if strong constraints on the signaling pathway's mechanobiology are met. Our findings thus support the idea that mechanical stress can mediate patterning in the complete absence of chemical morphogens, even in nonmotile cell layers, thus expanding the repertoire of possible roles for mechanical signals in development and morphogenesis. Future tests of additional model predictions, like the effect of anisotropic substrate rigidity, will further broaden the range of achievable fate patterns.

Amortized template matching of molecular conformations from cryoelectron microscopy images using simulation-based inference

Lars Dingeldein, P. Cossio, et al.

Characterizing the conformational ensemble of biomolecular systems is key to understanding their functions. Cryoelectron microscopy (cryo-EM) captures two-dimensional snapshots of biomolecular ensembles, giving in principle access to thermodynamics. However, these images are very noisy and show projections of the molecule in unknown orientations, making it very difficult to identify the biomolecule’s conformation in each individual image. Here, we introduce cryo-EM simulation-based inference (cryoSBI) to infer the conformations of biomolecules and the uncertainties associated with the inference from individual cryo-EM images. CryoSBI builds on simulation-based inference, a merger of physics-based simulations and probabilistic deep learning, allowing us to use Bayesian inference even when likelihoods are too expensive to calculate. We begin with an ensemble of conformations, templates from experiments and molecular modeling, serving as structural hypotheses. We train a neural network approximating the Bayesian posterior using simulated images from these templates and then use it to accurately infer the conformation of the biomolecule from each experimental image. Training is done only once on simulations, and after that, it takes just a few milliseconds to perform inference on an image, making cryoSBI suitable for arbitrarily large datasets and direct analysis on micrographs. CryoSBI eliminates the need to estimate particle pose and imaging parameters, significantly enhancing the computational speed compared to explicit likelihood methods. Importantly, we obtain interpretable machine learning models by integrating physics-based approaches with deep neural networks, ensuring that our results are transparent and reliable. We illustrate and benchmark cryoSBI on synthetic data and showcase its promise on experimental single-particle cryo-EM data.

Self-reorganization and Information Transfer in Massive Schools of Fish

Haotian Hang, Chenchen Huang, A. Barnett, Eva Kanso

The remarkable cohesion and coordination observed in moving animal groups and their collective responsiveness to threats are thought to be mediated by scale-free correlations, where changes in the behavior of one animal influence others in the group, regardless of the distance between them. But are these features independent of group size? Here, we investigate group cohesiveness and collective responsiveness in computational models of massive schools of fish of up to 50,000 individuals. We show that as the number of swimmers increases, flow interactions destabilize the school, creating clusters that constantly fragment, disperse, and regroup, similar to their biological counterparts. We calculate the spatial correlation and speed of information propagation in these dynamic clusters. Spatial correlations in cohesive and polarized clusters are indeed scale free, much like in natural animal groups, but fragmentation events are preceded by a decrease in correlation length, thus diminishing the group's collective responsiveness, leaving it more vulnerable to predation events. Importantly, in groups undergoing collective turns, the information about the change in direction propagates linearly in time among group members, thanks to the non-reciprocal nature of the visual interactions between individuals. Merging speeds up the transfer of information within each cluster by several fold, while fragmentation slows it down. Our findings suggest that flow interactions may have played an important role in group size regulation, behavioral adaptations, and dispersion in living animal groups.
