481 Publications

Progressive Optimal Path Sampling for Closed-Loop Optimal Control Design with Deep Neural Networks

Xuanxi Zhang, Jihao Long, Wei Hu, Weinan E, J. Han

Closed-loop optimal control design for high-dimensional nonlinear systems has been a long-standing challenge. Traditional methods, such as solving the associated Hamilton-Jacobi-Bellman equation, suffer from the curse of dimensionality. Recent literature proposed a new promising approach based on supervised learning, by leveraging powerful open-loop optimal control solvers to generate training data and neural networks as efficient high-dimensional function approximators to fit the closed-loop optimal control. This approach successfully handles certain high-dimensional optimal control problems but still performs poorly on more challenging problems. One of the crucial reasons for the failure is the so-called distribution mismatch phenomenon brought by the controlled dynamics. In this paper, we investigate this phenomenon and propose the progressive optimal path sampling method to mitigate this problem. We theoretically prove that this enhanced sampling strategy outperforms both the vanilla approach and the widely used dataset aggregation method on the classical linear-quadratic regulator by a factor proportional to the total time duration. We further numerically demonstrate that the proposed sampling strategy significantly improves the performance on tested control problems, including the optimal landing problem of a quadrotor and the optimal reaching problem of a 7-DoF manipulator.

Velocity optimization of self-equilibrated obstacles in a two-dimensional viscous flow

G. Francfort, Alessandro Giacomini, S. Weady

An obstacle is immersed in an externally driven 2D Stokes or Navier-Stokes fluid. We study the self-equilibration conditions for that obstacle under steady state assumptions on the flow. We then seek to optimize the translational and/or angular velocity of the obstacle by varying its shape. To allow general variations, we must consider a very large class of obstacles for which the notion of trace is meaningless. This forces us to revisit the notion of self-equilibration for both Stokes and Navier-Stokes in a measure theoretic environment.

Quantitative and Predictive Folding Models from Limited Single-Molecule Data Using Simulation-Based Inference

Lars Dingeldein, Aaron Lyons, P. Cossio

The study of biomolecular folding has been greatly advanced by single-molecule force spectroscopy (SMFS), which enables the observation of the dynamics of individual molecules. However, extracting quantitative models of fundamental properties such as folding landscapes from SMFS data is very challenging due to instrumental noise, linker artifacts, and the inherent stochasticity of the process, often requiring extensive datasets and complex calibration. Here, we introduce a framework based on simulation-based inference (SBI) that overcomes these limitations by integrating physics-based modeling with deep learning. We first apply this framework to analyze constant-force measurements of a DNA hairpin. From a single experimental trajectory of only two seconds, we successfully reconstruct the hairpin's free energy landscape and folding dynamics, obtaining results in close agreement with established deconvolution methods that require 10 to 100 times more data. Furthermore, we demonstrate the generality of our approach by applying it to a riboswitch aptamer featuring multiple states and tertiary contacts, resolving the profile of a landscape featuring four metastable states from a single trajectory. The Bayesian nature of this approach robustly quantifies uncertainties for all inferred parameters, including diffusion coefficients and linker stiffness, without needing independent measurements of instrument properties. The inferred models are predictive, generating simulated trajectories that quantitatively reproduce experimental thermodynamics and kinetics. The ability to derive statistically robust models from minimal datasets is crucial for investigating complex biomolecular systems where extensive data collection is impractical, paving the way for novel applications of SMFS.

August 4, 2025

Query Efficient Structured Matrix Learning

Noah Amsel, Pratyush Avi, Tyler Chen, Feyza Duman Keles, Chinmay Hegde, Cameron Musco, Christopher Musco, D. Persson

We study the problem of learning a structured approximation (low-rank, sparse, banded, etc.) to an unknown matrix $A$ given access to matrix-vector product (matvec) queries of the form $x \rightarrow Ax$ and $x \rightarrow A^Tx$. This problem is of central importance to algorithms across scientific computing and machine learning, with applications to fast multiplication and inversion for structured matrices, building preconditioners for first-order optimization, and as a model for differential operator learning. Prior work focuses on obtaining query complexity upper and lower bounds for learning specific structured matrix families that commonly arise in applications.
We initiate the study of the problem in greater generality, aiming to understand the query complexity of learning approximations from general matrix families. Our main result focuses on finding a near-optimal approximation to $A$ from any finite-sized family of matrices, $\mathcal{F}$. Standard results from matrix sketching show that $O(\log|\mathcal{F}|)$ matvec queries suffice in this setting. This bound can also be achieved, and is optimal, for vector-matrix-vector queries of the form $x,y\rightarrow x^TAy$, which have been widely studied in work on rank-$1$ matrix sensing.
Surprisingly, we show that, in the matvec model, it is possible to obtain a nearly quadratic improvement in complexity, to $\tilde{O}(\sqrt{\log|\mathcal{F}|})$. Further, we prove that this bound is tight up to log-log factors. Via covering number arguments, our result extends to well-studied infinite families. As an example, we establish that a near-optimal approximation from any linear matrix family of dimension $q$ can be learned with $\tilde{O}(\sqrt{q})$ matvec queries, improving on an $O(q)$ bound achievable via sketching techniques and vector-matrix-vector queries.
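
To make the matvec query model concrete, here is a minimal sketch of the standard sketching baseline that the abstract compares against, not the improved algorithm of the paper. The function name `select_from_family`, the Gaussian query vectors, and the toy problem sizes are illustrative assumptions. The hidden matrix is accessed only through products $x \rightarrow Ax$, and each candidate in the finite family is scored by a sketched Frobenius error.

```python
import numpy as np

def select_from_family(matvec, family, num_queries, rng):
    """Return the index of the family member closest to the hidden matrix.

    matvec      -- callable x -> A @ x (the only access to A)
    family      -- list of candidate matrices, all with the shape of A
    num_queries -- number of matvec queries to spend
    """
    n = family[0].shape[1]
    # Gaussian query vectors, stored as columns (an illustrative choice).
    G = rng.standard_normal((n, num_queries))
    # One matvec query per column of G.
    AG = np.column_stack([matvec(G[:, j]) for j in range(num_queries)])
    # ||(A - B) G||_F^2 / num_queries is an unbiased estimate of ||A - B||_F^2.
    errors = [np.linalg.norm(AG - B @ G) ** 2 / num_queries for B in family]
    return int(np.argmin(errors))
```

In this baseline the number of queries needed grows like $\log|\mathcal{F}|$, which is the bound the abstract reports improving to roughly its square root.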

On learning Gaussian multi-index models with gradient flow part I: General properties and two-timescale learning

A. Bietti, Joan Bruna, L. Pillaud-Vivien

We study gradient flow on the multi-index regression problem for high-dimensional Gaussian data. Multi-index functions consist of a composition of an unknown low-rank linear projection and an arbitrary unknown, low-dimensional link function. As such, they constitute a natural template for feature learning in neural networks. We consider a two-timescale algorithm, whereby the low-dimensional link function is learnt with a non-parametric model infinitely faster than the subspace parametrizing the low-rank projection. By appropriately exploiting the matrix semigroup structure arising over the subspace correlation matrices, we establish global convergence of the resulting Grassmannian gradient flow dynamics, and provide a quantitative description of its associated “saddle-to-saddle” dynamics. Notably, the timescales associated with each saddle can be explicitly characterized in terms of an appropriate Hermite decomposition of the target link function.

DISCO: learning to DISCover an evolution Operator for multi-physics-agnostic prediction

R. Morel, J. Han, E. Oyallon

We address the problem of predicting the next states of a dynamical system governed by unknown temporal partial differential equations (PDEs) using only a short trajectory. While standard transformers provide a natural blackbox solution to this task, the presence of a well-structured evolution operator in the data suggests a more tailored and efficient approach. Specifically, when the PDE is fully known, classical numerical solvers can evolve the state accurately with only a few parameters. Building on this observation, we introduce DISCO, a model that uses a large hypernetwork to process a short trajectory and generate the parameters of a much smaller operator network, which then predicts the next states through time integration. Our framework decouples dynamics estimation (i.e., DISCovering an evolution Operator from a short trajectory) from state prediction (i.e., evolving this operator). Experiments show that pretraining our model on diverse physics datasets achieves state-of-the-art performance while requiring significantly fewer epochs. Moreover, it generalizes well to unseen initial conditions and remains competitive when fine-tuned on downstream tasks. The code will be made publicly available upon publication at https://github.com/RudyMorel/DISCO.

Learning Compositional Functions with Transformers from Easy-to-Hard Data

Zixuan Wang, Eshaan Nichani, A. Bietti, Alex Damian, Daniel Hsu, Jason D. Lee, D. Wu

Transformer-based language models have demonstrated impressive capabilities across a range of complex reasoning tasks. Prior theoretical work exploring the expressive power of transformers has shown that they can efficiently perform multi-step reasoning tasks involving parallelizable computations. However, the learnability of such constructions, particularly the conditions on the data distribution that enable efficient learning via SGD, remains an open question. Towards answering this question, we study the learnability of a task called the $k$-fold composition, which requires computing an interleaved composition of $k$ input permutations and $k$ hidden permutations, and can be expressed by a transformer with $O(\log k)$ layers. On the negative front, we provide a Statistical Query lower bound showing that any learner which is trained on samples from the $k$-fold composition task and makes polynomially many queries must have sample size exponential in $k$, thus establishing a statistical-computational gap. On the other hand, we show that this function class can be efficiently learned, with runtime and sample complexity polynomial in $k$, by gradient descent on an $O(\log k)$-depth transformer via two different curriculum learning strategies: one in which data consists of $k'$-fold composition functions with $k' \le k$ presented in increasing order of difficulty, and another in which all data is presented simultaneously. Our work sheds light on the necessity and sufficiency of having both easy and hard examples in the data distribution for transformers to learn complex compositional tasks.
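
As a rough illustration of the task, the sketch below evaluates an interleaved composition of input and hidden permutations on a single element. The function name `k_fold_composition`, the list-based encoding of permutations, and the exact interleaving order (input permutation first, then hidden permutation, repeated $k$ times) are assumptions made for illustration; the paper's precise definition may differ.

```python
def k_fold_composition(input_perms, hidden_perms, x):
    """Apply sigma_1, pi_1, sigma_2, pi_2, ... in turn to the element x.

    input_perms  -- list of k permutations given as part of the input
    hidden_perms -- list of k fixed permutations the learner must discover
    Each permutation is a list p where p[i] is the image of i.
    The interleaving order used here is an assumption.
    """
    for sigma, pi in zip(input_perms, hidden_perms):
        x = pi[sigma[x]]
    return x


# Toy instance with k = 2 on three elements.
input_perms = [[1, 2, 0], [0, 2, 1]]
hidden_perms = [[2, 0, 1], [1, 0, 2]]
outputs = [k_fold_composition(input_perms, hidden_perms, x) for x in range(3)]
```

Under the same assumption, truncating both lists to their first $k'$ entries gives an easier $k'$-fold instance of the kind the curriculum strategies present before the full task.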

Microtubules in Martini: Parameterizing a heterogeneous elastic-network towards a mechanically accurate microtubule

Microtubules are essential cytoskeletal filaments involved in cell motility, division, and intracellular transport, exhibiting complex structural dynamics governed by diverse biophysical factors. Atomistic simulations of microtubule assemblies remain challenging due to their extensive spatiotemporal scales. To address this, we present a multiscale approach combining the primarily top-down Martini 3 coarse-grained (CG) model with an appropriately parameterized heterogeneous elastic network to capture microtubule mechanics and molecular detail efficiently. By iteratively tuning the elastic network, we matched the structural fluctuations of CG heterodimeric building blocks to atomistic reference data, reproducing experimentally consistent mechanical properties. This framework helped us identify stabilizing long-lived interactions between charged C-terminal tails and the folded domain of neighboring tubulin subunits, offering insight into sequence-specific contributions to lattice stability. Our efforts culminated in the construction of a 200 nm microtubule composed of million interaction centers, enabling exploration of large-scale microtubule-associated processes with amino acid-level resolution. This work bridges the gap between molecular specificity and computational scalability, offering a platform for simulating biophysical processes across cellular length and time scales.

RocketSHP: Ultra-fast Proteome-scale Prediction of Protein Dynamics

Proteins are dynamic molecules that depend on conformational flexibility to carry out functions in the cell, yet despite significant advances in the modeling of static protein structure, prediction of these dynamics remains challenging. We introduce RocketSHP, a machine learning model that predicts dynamic protein properties from sequence or static structure with unprecedented speed and accuracy. Trained on thousands of molecular dynamics trajectories spanning diverse protein families, RocketSHP simultaneously models multiple dynamics features: root-mean-square fluctuations (RMSF), generalized correlation coefficients (GCC-LMI), and a novel structural heterogeneity profile (SHP) based on recent structure quantization methods. RocketSHP significantly outperforms existing methods in predicting simulation-derived dynamics. We reduce RMSF prediction error by 57% compared to BioEmu and calibrated Dyna-1 predictions, including an up to 73% error reduction for long proteins. We validate these predictions with experimental hetNOE data, and we demonstrate the ability to adapt predictions to different physical temperatures. We highlight RocketSHP’s utility in constructing allosteric networks in the oncogene KRAS and identify structural sub-modules with correlated motions, and we validate RocketSHP by showing that changes in node centrality within predicted KRAS allosteric networks correlate with changes of folding free energy in experimental DMS data. Our approach makes predictions in seconds rather than hours or days, enabling us to perform the first comprehensive dynamics analysis of the entire human proteome. RocketSHP bridges the gap between static structural biology and dynamic functional understanding, enabling dynamics-aware structural analysis and variant effect prediction at scales previously unavailable. RocketSHP is available as free and open-source software at https://github.com/flatironinstitute/RocketSHP.

Random batch sum-of-Gaussians algorithm for molecular dynamics simulations of Yukawa systems in three dimensions

Chen Chen, J. Liang, Zhenli Xu

Yukawa systems have drawn widespread interest across various applications, including plasma physics, colloidal science, and astrophysics, due to their critical role in modeling electrostatic interactions. In this paper, we introduce a novel random batch sum-of-Gaussians (RBSOG) algorithm for molecular dynamics simulations of three-dimensional Yukawa systems with periodic boundary conditions. We develop a sum-of-Gaussians (SOG) decomposition of the Yukawa kernel, dividing the interactions into near-field and far-field components. The near-field component, singular but compactly supported in a local domain, is calculated directly. The far-field component, represented as a sum of smooth Gaussians, is treated using the random batch approximation in Fourier space with an adaptive importance sampling strategy to reduce the variance of force calculations. Unlike the traditional Ewald decomposition, which introduces discontinuities and significant truncation error at the cutoff, the SOG decomposition achieves high-order smoothness and accuracy near the cutoff, allowing for efficient and energy-stable simulations. Additionally, by avoiding the use of the fast Fourier transform, our method achieves optimal $O(N)$ complexity while maintaining high parallel scalability. Finally, unlike previous random batch approaches, the proposed adaptive importance sampling strategy achieves nearly optimal variance reduction across the regime of the coupling parameters, which is essential for handling varying coupling strengths across weak and strong regimes of electrostatic interactions. Rigorous theoretical analyses are presented, including SOG decomposition construction, variance estimation, and simulation convergence. We validate the performance of the RBSOG method through numerical simulations of one-component plasma under weak and strong coupling conditions, using up to $10^6$ particles and 1024 CPU cores.
As a practical application in fusion ignition, we simulate high-temperature, high-density deuterium-α mixtures to study the energy exchange between deuterium and high-energy α particles. Due to the flexibility of the Gaussian approximation, the RBSOG method can be readily extended to other dielectric response functions, offering a promising approach for large-scale simulations.
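
One generic way to obtain a sum-of-Gaussians approximation of the Yukawa kernel, which is not necessarily the construction used in the paper, starts from the identity $\frac{e^{-\kappa r}}{r} = \frac{2}{\sqrt{\pi}} \int_0^\infty e^{-r^2 t^2 - \kappa^2/(4t^2)}\,dt$ and applies the trapezoidal rule after the substitution $t = e^s$. The sketch below does this; the function names, the truncation interval, and the step size are illustrative assumptions, and the approximation is only intended for $r$ bounded away from zero (roughly $r \ge 0.1$ with these defaults).

```python
import numpy as np

def yukawa_sog(kappa, s_min=-5.0, s_max=7.0, h=0.15):
    """Weights w_j and exponents a_j with exp(-kappa*r)/r ~ sum_j w_j*exp(-a_j*r**2).

    Obtained by the trapezoidal rule in s = log(t); the interval and step
    are illustrative defaults, not values taken from the paper.
    """
    s = np.arange(s_min, s_max + h / 2, h)
    exponents = np.exp(2.0 * s)
    weights = (2.0 / np.sqrt(np.pi)) * h * np.exp(
        s - kappa**2 * np.exp(-2.0 * s) / 4.0
    )
    return weights, exponents

def eval_sog(weights, exponents, r):
    """Evaluate the Gaussian sum at an array of distances r."""
    r = np.asarray(r, dtype=float)
    return np.exp(-np.outer(r**2, exponents)) @ weights


weights, exponents = yukawa_sog(kappa=1.0)
r = np.linspace(0.1, 5.0, 50)
rel_err = np.abs(eval_sog(weights, exponents, r) - np.exp(-r) / r) / (np.exp(-r) / r)
```

Gaussians with large exponents are narrow and contribute only at short range, while those with small exponents are smooth and long-ranged, which is the kind of split a near-field/far-field decomposition relies on.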
