2795 Publications

seekrflow: Towards end-to-end automated simulation pipeline with machine-learned force fields for accelerated drug-target kinetic and thermodynamic predictions

A. A. Ojha, Lane W. Votapka, S. Hanson, et al.

Accurate prediction of drug-target binding and unbinding kinetics and thermodynamics is essential for guiding drug discovery and lead optimization. However, traditional atomistic simulations are often too computationally expensive to capture rare events that govern ligand (un)binding. Several enhanced sampling methods exist to overcome these limitations, but they require extensive manual intervention and introduce variability and artifacts in free energy and kinetic estimates that limit high-throughput scalability. The present work introduces seekrflow, an automated multiscale milestoning simulation pipeline that streamlines the entire workflow from a single receptor-ligand input structure to kinetic and thermodynamic predictions in a single step. This integrated approach minimizes manual intervention, reduces computational overhead, and enhances the reproducibility and accuracy of kinetic and thermodynamic predictions. The accuracy and efficiency of the pipeline are demonstrated on multiple receptor-ligand complexes, including inhibitors of heat shock protein 90, threonine-tyrosine kinase, and the trypsin protein, with predicted kinetic parameters closely matching experimental estimates. seekrflow establishes a new benchmark for automated and high-throughput physics-based predictions of kinetics and thermodynamics.

Study of Protein-Protein Interactions in Septin Assembly: Multiple amphipathic helix domains cooperate in binding to the lipid membrane

Septins are a conserved family of cytoskeletal proteins known for sensing micron-scale membrane curvature via amphipathic helix (AH) domains. While cooperative interactions in septin assembly have been suggested, the molecular mechanisms governing membrane binding and assembly remain unclear. Building on prior findings, we use all-atom molecular dynamics simulations to examine how single and paired extended AH domains, derived from Cdc12, interact with lipid bilayers. A single membrane-bound AH adopts a curved conformation. In solution, a second AH peptide preferentially interacts with the bound peptide through conserved salt bridges, favoring an antiparallel arrangement. Simulations of covalently linked AH tandems confirm this configuration. Dual membrane-bound peptides induce lipid packing defects, reduce tail order, and exhibit slight membrane displacement, suggesting curved membranes may better accommodate multiple AH domains. Our findings advance the mechanistic understanding of septin-membrane interactions and highlight the role of cooperative AH domain binding in stabilizing higher-order structures.

August 12, 2025

Progressive Optimal Path Sampling for Closed-Loop Optimal Control Design with Deep Neural Networks

Xuanxi Zhang, Jihao Long, Wei Hu, Weinan E, J. Han

Closed-loop optimal control design for high-dimensional nonlinear systems has been a long-standing challenge. Traditional methods, such as solving the associated Hamilton-Jacobi-Bellman equation, suffer from the curse of dimensionality. Recent literature proposed a new promising approach based on supervised learning, by leveraging powerful open-loop optimal control solvers to generate training data and neural networks as efficient high-dimensional function approximators to fit the closed-loop optimal control. This approach successfully handles certain high-dimensional optimal control problems but still performs poorly on more challenging problems. One of the crucial reasons for the failure is the so-called distribution mismatch phenomenon brought by the controlled dynamics. In this paper, we investigate this phenomenon and propose the progressive optimal path sampling method to mitigate this problem. We theoretically prove that this enhanced sampling strategy outperforms both the vanilla approach and the widely used dataset aggregation method on the classical linear-quadratic regulator by a factor proportional to the total time duration. We further numerically demonstrate that the proposed sampling strategy significantly improves the performance on tested control problems, including the optimal landing problem of a quadrotor and the optimal reaching problem of a 7-DoF manipulator.

Velocity optimization of self-equilibrated obstacles in a two-dimensional viscous flow

G. Francfort, Alessandro Giacomini, S. Weady

An obstacle is immersed in an externally driven 2D Stokes or Navier-Stokes fluid. We study the self-equilibration conditions for that obstacle under steady state assumptions on the flow. We then seek to optimize the translational and/or angular velocity of the obstacle by varying its shape. To allow general variations, we must consider a very large class of obstacles for which the notion of trace is meaningless. This forces us to revisit the notion of self-equilibration for both Stokes and Navier-Stokes in a measure theoretic environment.

August 7, 2025

Quantitative and Predictive Folding Models from Limited Single-Molecule Data Using Simulation-Based Inference

Lars Dingeldein, Aaron Lyons, P. Cossio

The study of biomolecular folding has been greatly advanced by single-molecule force spectroscopy (SMFS), which enables the observation of the dynamics of individual molecules. However, extracting quantitative models of fundamental properties such as folding landscapes from SMFS data is very challenging due to instrumental noise, linker artifacts, and the inherent stochasticity of the process, often requiring extensive datasets and complex calibration. Here, we introduce a framework based on simulation-based inference (SBI) that overcomes these limitations by integrating physics-based modeling with deep learning. We first apply this framework to analyze constant-force measurements of a DNA hairpin. From a single experimental trajectory of only two seconds, we successfully reconstruct the hairpin's free energy landscape and folding dynamics, obtaining results in close agreement with established deconvolution methods that require 10 to 100 times more data. Furthermore, we demonstrate the generality of our approach by applying it to a riboswitch aptamer featuring multiple states and tertiary contacts, resolving the profile of a landscape featuring four metastable states from a single trajectory. The Bayesian nature of this approach robustly quantifies uncertainties for all inferred parameters, including diffusion coefficients and linker stiffness, without needing independent measurements of instrument properties. The inferred models are predictive, generating simulated trajectories that quantitatively reproduce experimental thermodynamics and kinetics. The ability to derive statistically robust models from minimal datasets is crucial for investigating complex biomolecular systems where extensive data collection is impractical, paving the way for novel applications of SMFS.

August 4, 2025

August 4, 2025

Statistical mechanics of support vector regression

A key problem in deep learning and computational neuroscience is relating the geometrical properties of neural representations to task performance. Here, we consider this problem for continuous decoding tasks where neural variability may affect task precision. Using methods from statistical mechanics, we study the average-case learning curves for ε-insensitive support vector regression and discuss its capacity as a measure of linear decodability. Our analysis reveals a phase transition in training error at a critical load, capturing the interplay between the tolerance parameter ε and neural variability. We uncover a double-descent phenomenon in the generalization error, showing that ε acts as a regularizer, both suppressing and shifting these peaks. Theoretical predictions are validated both with toy models and deep neural networks, extending the theory of support vector machines to continuous tasks with inherent neural variability.

Methylation Data Analysis and Interpretation

Yuehua Zhu, W. Mao, et al.

DNA methylation, a covalent modification, fundamentally shapes mammalian gene regulation and cellular identity. This review examines methylation's biochemical underpinnings, genomic distribution patterns, and analytical approaches. We highlight three distinctive aspects that separate methylation from other epigenetic marks: its remarkable stability as a silencing mechanism, its capacity to maintain distinct states independently of DNA sequence, and its effectiveness as a quantitative trait linking genotype to disease risk. We also explore the phenomenon of methylation clocks and their biological significance. The review addresses technical considerations across major assay types—both array-based technologies and sequencing approaches—with emphasis on data normalization, quality control, cell proportion inference, and the specialized statistical models required for next-generation sequencing analysis.
