2795 Publications

Random batch sum-of-Gaussians algorithm for molecular dynamics simulations of Yukawa systems in three dimensions

Chen Chen, J. Liang, Zhenli Xu

Yukawa systems have drawn widespread interest across various applications, including plasma physics, colloidal science, and astrophysics, due to their critical role in modeling electrostatic interactions. In this paper, we introduce a novel random batch sum-of-Gaussians (RBSOG) algorithm for molecular dynamics simulations of three-dimensional Yukawa systems with periodic boundary conditions. We develop a sum-of-Gaussians (SOG) decomposition of the Yukawa kernel, dividing the interactions into near-field and far-field components. The near-field component, singular but compactly supported in a local domain, is calculated directly. The far-field component, represented as a sum of smooth Gaussians, is treated using the random batch approximation in Fourier space with an adaptive importance sampling strategy to reduce the variance of force calculations. Unlike the traditional Ewald decomposition, which introduces discontinuities and significant truncation error at the cutoff, the SOG decomposition achieves high-order smoothness and accuracy near the cutoff, allowing for efficient and energy-stable simulations. Additionally, by avoiding the use of the fast Fourier transform, our method achieves optimal $O(N)$ complexity while maintaining high parallel scalability. Finally, unlike previous random batch approaches, the proposed adaptive importance sampling strategy achieves nearly optimal variance reduction across the regime of the coupling parameters, which is essential for handling varying coupling strengths across weak and strong regimes of electrostatic interactions. Rigorous theoretical analyses are presented, including SOG decomposition construction, variance estimation, and simulation convergence. We validate the performance of the RBSOG method through numerical simulations of one-component plasma under weak and strong coupling conditions, using up to $10^6$ particles and 1024 CPU cores. As a practical application in fusion ignition, we simulate high-temperature, high-density deuterium-α mixtures to study the energy exchange between deuterium and high-energy α particles. Due to the flexibility of the Gaussian approximation, the RBSOG method can be readily extended to other dielectric response functions, offering a promising approach for large-scale simulations.
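
To give a feel for what approximating the Yukawa kernel by a sum of Gaussians can look like, here is a minimal sketch. It is not the construction used in the paper, which the abstract does not spell out. It assumes the standard integral identity $e^{-\kappa r}/r = \frac{2}{\sqrt{\pi}}\int_0^\infty e^{-r^2 t^2 - \kappa^2/(4t^2)}\,dt$ for $\kappa > 0$, discretized with the trapezoidal rule after the substitution $t = e^u$. The function names, the step `h`, and the truncation bounds `u_min` and `u_max` are illustrative choices, and no near-field/far-field split is performed.

```python
import math

def yukawa_sog(kappa, h=0.1, u_min=-6.0, u_max=8.0):
    """Return (weights, exponents) such that
    exp(-kappa*r)/r ~= sum_j weights[j] * exp(-exponents[j] * r**2), for kappa > 0.

    Assumption: trapezoidal rule in u = log(t) applied to the integral identity
    quoted in the lead-in; h, u_min and u_max are hypothetical defaults.
    """
    n = int(round((u_max - u_min) / h)) + 1
    weights, exponents = [], []
    for j in range(n):
        t = math.exp(u_min + j * h)
        # dt = t du, so each quadrature node contributes a factor h * t
        w = 2.0 / math.sqrt(math.pi) * h * t * math.exp(-kappa**2 / (4.0 * t**2))
        weights.append(w)
        exponents.append(t * t)
    return weights, exponents

def evaluate_sog(weights, exponents, r):
    """Evaluate the Gaussian sum at distance r."""
    return sum(w * math.exp(-s * r * r) for w, s in zip(weights, exponents))

def yukawa(kappa, r):
    """Exact Yukawa kernel exp(-kappa*r)/r."""
    return math.exp(-kappa * r) / r

weights, exponents = yukawa_sog(kappa=1.0)
for r in (0.5, 1.0, 2.0):
    approx = evaluate_sog(weights, exponents, r)
    exact = yukawa(1.0, r)
    print(f"r={r}: sog={approx:.10f} exact={exact:.10f}")
```

In this sketch the Gaussians with large exponents are narrow and carry the short-range behavior, while those with small exponents are wide and smooth, which is the kind of term that can be handled in Fourier space.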

Complex scaling for open waveguides

C. Epstein, Tristan Goodwill, Jeremy Hoskins, S. Quinn, M. Rachh

In this work we analyze the complex scaling method applied to the problem of time-harmonic scalar wave propagation in junctions between "leaky," or open, dielectric waveguides. In [arXiv:2302.04353, arXiv:2310.05816, arXiv:2401.04674, arXiv:2411.11204], it was shown that under suitable assumptions the problem can be reduced to a system of Fredholm second-kind integral equations on an infinite interface, transverse to the waveguides. Here, we show that the kernels appearing in the integral equation admit a rapidly decaying analytic continuation on certain natural totally real submanifolds of $\mathbb{C}^2$. We then show that for suitable, physically meaningful boundary data the resulting solutions to the integral equations themselves admit analytic continuation and satisfy related asymptotic estimates. By deforming the integral equation to a suitable contour, the decay in the kernels, density, and data enables straightforward discretization and truncation, with an error that decays exponentially in the truncation length. We illustrate our results with several representative numerical examples.

ExEnDiff: An Experiment-Guided Diffusion Model for Protein Conformational Ensemble Generation

Yikai Liu, A. Sahoo, S. Hanson, et al.

Understanding protein conformations is key to understanding protein function. Importantly, most proteins adopt multiple conformations with nontrivial ensemble distributions that change depending on their environment to perform functions like catalysis, signaling, and transport. Recently, machine learning techniques, especially deep generative models, have been employed to develop protein conformation generators. These models, known as unified protein ensemble samplers, are trained on the Protein Data Bank (PDB) dataset and can generate diverse protein conformation ensembles given a protein sequence. However, their reliance solely on structural data from the PDB, which primarily captures folded protein states, restricts the diversity of the generated ensembles and can result in physically unrealistic conformations. In this paper, we overcome these challenges by introducing ExEnDiff, an experiment-guided diffusion model for protein conformation generation. ExEnDiff integrates experimental measurements as a physical prior, enabling the generation of protein conformations with desired properties. Our experiments on a variety of fast-folding and intrinsically disordered proteins demonstrate that ExEnDiff significantly advances the capabilities of current unified protein ensemble samplers. With little computational cost, ExEnDiff can capture important properties of protein configurations and the underlying Boltzmann distribution, paving the way for a next-generation molecular dynamics engine. We further demonstrate the effectiveness of ExEnDiff in capturing conformational changes in the presence of mutations and as an efficient tool for determining a reasonable collective variable space for protein ensembles. With these results, ExEnDiff is well poised to push the study of protein ensembles into a data-rich regime currently available to few problems in biology.

June 10, 2025

Uncertainty Prioritized Experience Replay

Rodrigo Carrasco-Davis, S. Lee, Claudia Clopath, Will Dabney

Prioritized experience replay, which improves sample efficiency by selecting relevant transitions to update parameter estimates, is a crucial component of contemporary value-based deep reinforcement learning models. Typically, transitions are prioritized based on their temporal difference error. However, this approach is prone to favoring noisy transitions, even when the value estimation closely approximates the target mean. This phenomenon resembles the noisy TV problem postulated in the exploration literature, in which exploration-guided agents get stuck by mistaking noise for novelty. To mitigate the disruptive effects of noise in value estimation, we propose using epistemic uncertainty estimation to guide the prioritization of transitions from the replay buffer. Epistemic uncertainty quantifies the uncertainty that can be reduced by learning, hence reducing the sampling of transitions generated by unpredictable random processes. We first illustrate the benefits of epistemic uncertainty prioritized replay in two tabular toy models: a simple multi-arm bandit task and a noisy gridworld. Subsequently, we evaluate our prioritization scheme on the Atari suite, outperforming quantile regression deep Q-learning benchmarks, thus forging a path for the use of uncertainty prioritized replay in reinforcement learning agents.
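
The idea of prioritizing replay by epistemic uncertainty rather than by temporal difference error can be sketched as follows. This is only an illustration under stated assumptions, not the estimator or sampling rule from the paper: epistemic uncertainty is approximated here by the disagreement (population variance) among a hypothetical ensemble of value estimates, and transitions are drawn with the usual proportional rule $p_i^{\alpha} / \sum_k p_k^{\alpha}$. All function names, the exponent `alpha`, and the ensemble values are made up for the example.

```python
import random
from statistics import pvariance

def epistemic_uncertainty(ensemble_values):
    """Assumed proxy: variance of value estimates across ensemble members."""
    return pvariance(ensemble_values)

def sampling_probabilities(priorities, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = (p_i + eps)^alpha / sum_k (p_k + eps)^alpha."""
    scaled = [(p + eps) ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]

def sample_indices(priorities, batch_size, rng, alpha=0.6):
    """Draw transition indices (with replacement) according to their priority."""
    probs = sampling_probabilities(priorities, alpha)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)

# Hypothetical ensemble estimates for three stored transitions.
ensemble_estimates = [
    [1.0, 1.0, 1.0],  # members agree: nothing left to learn
    [0.9, 1.1, 1.0],  # mild disagreement
    [0.0, 2.0, 1.0],  # strong disagreement: most to learn
]
priorities = [epistemic_uncertainty(v) for v in ensemble_estimates]
probs = sampling_probabilities(priorities)
batch = sample_indices(priorities, batch_size=4, rng=random.Random(0))
print(probs, batch)
```

Under this proxy, a transition whose target is noisy but on which all ensemble members agree receives a low priority, whereas a priority based on temporal difference error would keep resampling it.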

June 10, 2025

Learning Free Terminal Time Optimal Closed-loop Control of Manipulators

Wei Hu, Yue Zhao, Weinan E, J. Han, Jihao Long

This paper presents a novel approach to learning free terminal time closed-loop control for robotic manipulation tasks, enabling dynamic adjustment of task duration and control inputs to enhance performance. We extend the supervised learning approach, namely solving selected optimal open-loop problems and utilizing them as training data for a policy network, to the free terminal time scenario. Three main challenges are addressed in this extension. First, we introduce a marching scheme that enhances the solution quality and increases the success rate of the open-loop solver by gradually refining time discretization. Second, we extend the QRnet in [1] to the free terminal time setting to address discontinuity and improve stability at the terminal state. Third, we present a more automated version of the initial value problem (IVP) enhanced sampling method from previous work [2] to adaptively update the training dataset, significantly improving its quality. By integrating these techniques, we develop a closed-loop policy that operates effectively over a broad domain with varying optimal time durations, achieving near globally optimal total costs. The appendix and videos are available at https://deepoptimalcontrol.github.io/FreeTimeManipulator.

Nonlinear spontaneous flow instability in active nematics

I. Lavi, Ricard Alert, Jean-François Joanny, Jaume Casademunt

Active nematics exhibit spontaneous flows through a well-known linear instability of the uniformly aligned quiescent state. Here, we show that even a linearly stable uniform state can experience a nonlinear instability, resulting in a discontinuous transition to spontaneous flows. In this case, quiescent and flowing states may coexist. Through a weakly nonlinear analysis and a numerical study, we trace the bifurcation diagram of striped patterns and show that the underlying pitchfork bifurcation switches from supercritical (continuous) to subcritical (discontinuous) by varying the flow-alignment parameter. We predict that the discontinuous spontaneous flow transition occurs for a wide range of parameters, including systems of contractile flow-aligning rods. Our predictions are relevant to active nematic turbulence and can potentially be tested in experiments on either cell layers or active cytoskeletal suspensions.

Amortized template matching of molecular conformations from cryoelectron microscopy images using simulation-based inference

Lars Dingeldein, P. Cossio, et al.

Characterizing the conformational ensemble of biomolecular systems is key to understanding their functions. Cryoelectron microscopy (cryo-EM) captures two-dimensional snapshots of biomolecular ensembles, giving in principle access to thermodynamics. However, these images are very noisy and show projections of the molecule in unknown orientations, making it very difficult to identify the biomolecule’s conformation in each individual image. Here, we introduce cryo-EM simulation-based inference (cryoSBI) to infer the conformations of biomolecules and the uncertainties associated with the inference from individual cryo-EM images. CryoSBI builds on simulation-based inference, a merger of physics-based simulations and probabilistic deep learning, allowing us to use Bayesian inference even when likelihoods are too expensive to calculate. We begin with an ensemble of conformations, templates from experiments, and molecular modeling, serving as structural hypotheses. We train a neural network approximating the Bayesian posterior using simulated images from these templates and then use it to accurately infer the conformation of the biomolecule from each experimental image. Training is only done once on simulations, and after that, it takes just a few milliseconds to perform inference on an image, making cryoSBI suitable for arbitrarily large datasets and direct analysis on micrographs. CryoSBI eliminates the need to estimate particle pose and imaging parameters, significantly enhancing the computational speed compared to explicit likelihood methods. Importantly, we obtain interpretable machine learning models by integrating physics-based approaches with deep neural networks, ensuring that our results are transparent and reliable. We illustrate and benchmark cryoSBI on synthetic data and showcase its promise on experimental single-particle cryo-EM data.

Generation of fate patterns via intercellular forces

H. Nunley, Xufeng Xue, Jianping Fu, David K. Lubensky

Studies of fate patterning during development typically emphasize cell-cell communication via diffusible chemical signals. Recent experiments on stem cell colonies, however, suggest that in some cases mechanical stresses, rather than secreted chemicals, enable long-ranged cell-cell interactions that specify positional information and pattern cell fates. These findings inspire a model of mechanical patterning: fate affects cell contractility, and pressure in the cell layer biases fate. Cells at the colony edge, more contractile than cells at the center, seed a pattern that propagates via force transmission. Strikingly, our model implies that the width of the outer fate domain varies nonmonotonically with substrate stiffness, a prediction that we confirm experimentally; we argue that a similar dependence on substrate stiffness can be achieved by a chemical morphogen only if strong constraints on the signaling pathway's mechanobiology are met. Our findings thus support the idea that mechanical stress can mediate patterning in the complete absence of chemical morphogens, even in nonmotile cell layers, thus expanding the repertoire of possible roles for mechanical signals in development and morphogenesis. Future tests of additional model predictions, like the effect of anisotropic substrate rigidity, will further broaden the range of achievable fate patterns.
