443 Publications

On learning Gaussian multi-index models with gradient flow part I: General properties and two-timescale learning

A. Bietti, Joan Bruna, L. Pillaud-Vivien

We study gradient flow on the multi-index regression problem for high-dimensional Gaussian data. Multi-index functions consist of a composition of an unknown low-rank linear projection and an arbitrary unknown, low-dimensional link function. As such, they constitute a natural template for feature learning in neural networks. We consider a two-timescale algorithm, whereby the low-dimensional link function is learnt with a non-parametric model infinitely faster than the subspace parametrizing the low-rank projection. By appropriately exploiting the matrix semigroup structure arising over the subspace correlation matrices, we establish global convergence of the resulting Grassmannian gradient flow dynamics, and provide a quantitative description of its associated “saddle-to-saddle” dynamics. Notably, the timescales associated with each saddle can be explicitly characterized in terms of an appropriate Hermite decomposition of the target link function.
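The saddle timescales above are governed by the Hermite coefficients of the target link function. As a minimal illustration (not the paper's code; the helper names and the Monte Carlo estimator are our own), here is a sketch of estimating the probabilists' Hermite coefficients of a one-dimensional link function, using $f(z) = z^3$, whose lowest nonzero coefficient sits at degree 1:

```python
import math
import random

def probabilists_hermite(j, x):
    """Evaluate the probabilists' Hermite polynomial He_j(x) by the recurrence
    He_0 = 1, He_1 = x, He_{m+1}(x) = x*He_m(x) - m*He_{m-1}(x)."""
    if j == 0:
        return 1.0
    h_prev, h = 1.0, x
    for m in range(1, j):
        h_prev, h = h, x * h - m * h_prev
    return h

def hermite_coeff(f, j, n_samples=200_000, seed=0):
    """Monte Carlo estimate of c_j = E[f(Z) He_j(Z)] / j! for Z ~ N(0, 1),
    so that f(z) = sum_j c_j He_j(z)."""
    rng = random.Random(seed)
    total = sum(f(z) * probabilists_hermite(j, z)
                for z in (rng.gauss(0.0, 1.0) for _ in range(n_samples)))
    return total / (n_samples * math.factorial(j))

# The link function f(z) = z^3 expands as He_3(z) + 3*He_1(z),
# so its lowest nonzero Hermite coefficient is c_1 = 3, at degree 1.
c1 = hermite_coeff(lambda z: z**3, 1)
```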

DISCO: learning to DISCover an evolution Operator for multi-physics-agnostic prediction

R. Morel, J. Han, E. Oyallon

We address the problem of predicting the next states of a dynamical system governed by unknown temporal partial differential equations (PDEs) using only a short trajectory. While standard transformers provide a natural black-box solution to this task, the presence of a well-structured evolution operator in the data suggests a more tailored and efficient approach. Specifically, when the PDE is fully known, classical numerical solvers can evolve the state accurately with only a few parameters. Building on this observation, we introduce DISCO, a model that uses a large hypernetwork to process a short trajectory and generate the parameters of a much smaller operator network, which then predicts the next states through time integration. Our framework decouples dynamics estimation — i.e., DISCovering an evolution Operator from a short trajectory — from state prediction — i.e., evolving this operator. Experiments show that pretraining our model on diverse physics datasets achieves state-of-the-art performance while requiring significantly fewer epochs. Moreover, it generalizes well to unseen initial conditions and remains competitive when fine-tuned on downstream tasks. The code will be made publicly available upon publication at https://github.com/RudyMorel/DISCO.

Learning Compositional Functions with Transformers from Easy-to-Hard Data

Zixuan Wang, Eshaan Nichani, A. Bietti, Alex Damian, Daniel Hsu, Jason D. Lee, D. Wu

Transformer-based language models have demonstrated impressive capabilities across a range of complex reasoning tasks. Prior theoretical work exploring the expressive power of transformers has shown that they can efficiently perform multi-step reasoning tasks involving parallelizable computations. However, the learnability of such constructions, particularly the conditions on the data distribution that enable efficient learning via SGD, remains an open question. Towards answering this question, we study the learnability of a task called the $k$-fold composition, which requires computing an interleaved composition of $k$ input permutations and $k$ hidden permutations, and can be expressed by a transformer with $O(\log k)$ layers. On the negative front, we provide a Statistical Query lower bound showing that any learner which is trained on samples from the $k$-fold composition task and makes polynomially many queries must have sample size exponential in $k$, thus establishing a statistical-computational gap. On the other hand, we show that this function class can be efficiently learned, with runtime and sample complexity polynomial in $k$, by gradient descent on an $O(\log k)$-depth transformer via two different curriculum learning strategies: one in which data consists of $k'$-fold composition functions with $k' \le k$ presented in increasing order of difficulty, and another in which all data is presented simultaneously. Our work sheds light on the necessity and sufficiency of having both easy and hard examples in the data distribution for transformers to learn complex compositional tasks.
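The $k$-fold composition task described above can be sketched concretely. The following toy implementation is our own illustration: the exact interleaving convention and the function name are assumptions, not the paper's code.

```python
import random

def k_fold_composition(input_perms, hidden_perms):
    """Interleaved composition of k input and k hidden permutations.
    Each permutation is a list p with p[i] the image of i; the function
    applies pi_1, h_1, pi_2, h_2, ..., pi_k, h_k in turn (one plausible
    interleaving; the paper's exact convention may differ)."""
    n = len(input_perms[0])
    out = list(range(n))  # start from the identity permutation
    for p, h in zip(input_perms, hidden_perms):
        out = [h[p[i]] for i in out]  # apply input perm, then hidden perm
    return out

# Toy instance on S_4 with k = 2: the learner observes the input
# permutations and the label, while the hidden permutations are fixed
# and unknown, playing the role of the target.
rng = random.Random(0)
n, k = 4, 2
hidden = [rng.sample(range(n), n) for _ in range(k)]
inputs = [rng.sample(range(n), n) for _ in range(k)]
label = k_fold_composition(inputs, hidden)
```

Since a composition of permutations is again a permutation, the label is always a relabeling of the $n$ symbols; the difficulty lies entirely in recovering the hidden factors from samples.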

Microtubules in Martini: Parameterizing a heterogeneous elastic-network towards a mechanically accurate microtubule

Microtubules are essential cytoskeletal filaments involved in cell motility, division, and intracellular transport, exhibiting complex structural dynamics governed by diverse biophysical factors. Atomistic simulations of microtubule assemblies remain challenging due to their extensive spatiotemporal scales. To address this, we present a multiscale approach combining the primarily top-down Martini 3 coarse-grained (CG) model with an appropriately parameterized heterogeneous elastic network to capture microtubule mechanics and molecular detail efficiently. By iteratively tuning the elastic network, we matched the structural fluctuations of CG heterodimeric building blocks to atomistic reference data, reproducing experimentally consistent mechanical properties. This framework helped us identify stabilizing long-lived interactions between charged C-terminal tails and the folded domain of neighboring tubulin subunits, offering insight into sequence-specific contributions to lattice stability. Our efforts culminated in the construction of a 200 nm microtubule composed of millions of interaction centers, enabling exploration of large-scale microtubule-associated processes with amino acid-level resolution. This work bridges the gap between molecular specificity and computational scalability, offering a platform for simulating biophysical processes across cellular length and time scales.

RocketSHP: Ultra-fast Proteome-scale Prediction of Protein Dynamics

Proteins are dynamic molecules that depend on conformational flexibility to carry out functions in the cell, yet despite significant advances in the modeling of static protein structure, prediction of these dynamics remains challenging. We introduce RocketSHP, a machine learning model that predicts dynamic protein properties from sequence or static structure with unprecedented speed and accuracy. Trained on thousands of molecular dynamics trajectories spanning diverse protein families, RocketSHP simultaneously models multiple dynamics features: root-mean-square fluctuations (RMSF), generalized correlation coefficients (GCC-LMI), and a novel structural heterogeneity profile (SHP) based on recent structure quantization methods. RocketSHP significantly outperforms existing methods in predicting simulation-derived dynamics. We reduce RMSF prediction error by 57% compared to BioEmu and calibrated Dyna-1 predictions, including an up to 73% error reduction for long proteins. We validate these predictions with experimental hetNOE data, and we demonstrate the ability to adapt predictions to different physical temperatures. We highlight RocketSHP’s utility in constructing allosteric networks in the oncogene KRAS and identify structural sub-modules with correlated motions, and we validate RocketSHP by showing that changes in node centrality within predicted KRAS allosteric networks correlate with changes of folding free energy in experimental DMS data. Our approach makes predictions in seconds rather than hours or days, enabling us to perform the first comprehensive dynamics analysis of the entire human proteome. RocketSHP bridges the gap between static structural biology and dynamic functional understanding, enabling dynamics-aware structural analysis and variant effect prediction at scales previously unavailable. RocketSHP is available as free and open-source software at https://github.com/flatironinstitute/RocketSHP.

Complex scaling for open waveguides

C. Epstein, Tristan Goodwill, Jeremy Hoskins, S. Quinn, M. Rachh

In this work we analyze the complex scaling method applied to the problem of time-harmonic scalar wave propagation in junctions between “leaky,” or open, dielectric waveguides. In [arXiv:2302.04353, arXiv:2310.05816, arXiv:2401.04674, arXiv:2411.11204], it was shown that under suitable assumptions the problem can be reduced to a system of Fredholm second-kind integral equations on an infinite interface, transverse to the waveguides. Here, we show that the kernels appearing in the integral equation admit a rapidly decaying analytic continuation on certain natural totally real submanifolds of $\mathbb{C}^2$. We then show that for suitable, physically meaningful boundary data the resulting solutions to the integral equations themselves admit analytic continuation and satisfy related asymptotic estimates. By deforming the integral equation to a suitable contour, the decay in the kernels, density, and data enables straightforward discretization and truncation, with an error that decays exponentially in the truncation length. We illustrate our results with several representative numerical examples.

ExEnDiff: An Experiment-Guided Diffusion Model for Protein Conformational Ensemble Generation

Yikai Liu, A. Sahoo, S. Hanson, et al.

Understanding protein conformations is key to understanding protein function. Importantly, most proteins adopt multiple conformations with nontrivial ensemble distributions that change depending on their environment to perform functions like catalysis, signaling, and transport. Recently, machine learning techniques, especially deep generative models, have been employed to develop protein conformation generators. These models, known as unified protein ensemble samplers, are trained on the Protein Data Bank (PDB) dataset and can generate diverse protein conformation ensembles given a protein sequence. However, their reliance solely on structural data from the PDB, which primarily captures folded protein states, restricts the diversity of the generated ensembles and can result in physically unrealistic conformations. In this paper, we overcome these challenges by introducing ExEnDiff, an experiment-guided diffusion model for protein conformation generation. ExEnDiff integrates experimental measurements as a physical prior, enabling the generation of protein conformations with desired properties. Our experiments on a variety of fast-folding and intrinsically disordered proteins demonstrate that ExEnDiff significantly advances the capabilities of current unified protein ensemble samplers. With little computational cost, ExEnDiff can capture important configurational properties of proteins and the underlying Boltzmann distribution, paving the way for a next-generation molecular dynamics engine. We further demonstrate the effectiveness of ExEnDiff in capturing conformational changes in the presence of mutations and as an efficient tool for determining a reasonable collective variable space for protein ensembles. With these results, ExEnDiff is well poised to push the study of protein ensembles into a data-rich regime currently available to few problems in biology.

Self-reorganization and Information Transfer in Massive Schools of Fish

Haotian Hang, Chenchen Huang, A. Barnett, Eva Kanso

The remarkable cohesion and coordination observed in moving animal groups and their collective responsiveness to threats are thought to be mediated by scale-free correlations, where changes in the behavior of one animal influence others in the group, regardless of the distance between them. But are these features independent of group size? Here, we investigate group cohesiveness and collective responsiveness in computational models of massive schools of fish of up to 50,000 individuals. We show that as the number of swimmers increases, flow interactions destabilize the school, creating clusters that constantly fragment, disperse, and regroup, similar to their biological counterparts. We calculate the spatial correlation and speed of information propagation in these dynamic clusters. Spatial correlations in cohesive and polarized clusters are indeed scale free, much like in natural animal groups, but fragmentation events are preceded by a decrease in correlation length, thus diminishing the group's collective responsiveness, leaving it more vulnerable to predation events. Importantly, in groups undergoing collective turns, the information about the change in direction propagates linearly in time among group members, thanks to the non-reciprocal nature of the visual interactions between individuals. Merging speeds up the transfer of information within each cluster by several fold, while fragmentation slows it down. Our findings suggest that flow interactions may have played an important role in group size regulation, behavioral adaptations, and dispersion in living animal groups.

Designing objects that are invisible to electromagnetic waves

Johan Helsing, S. Jiang, Anders Karlsson

This article shows that it is, in principle, possible to make a dielectric rod completely invisible to an incident electromagnetic plane wave of a given frequency. Students can derive the conditions that make the rod invisible if they understand the concept of plane waves, the boundary conditions for electric and magnetic fields, and the complex representation of electromagnetic fields. With access to appropriate software, students can determine the bandwidth of the invisibility and investigate whether it is possible to make an invisible rod out of real-world materials. A more advanced project proposed is to use electromagnetic software to find perfectly conducting hollow structures that are invisible to an incident plane wave.

High-order and adaptive optical conductivity calculations using Wannier interpolation

Lorenzo Van Muñoz, J. Kaye, A. Barnett, Sophie Beck

The optical conductivity provides a comprehensive view of the electronic response of materials to electromagnetic fields, offering insights into transport phenomena, optoelectronic properties, and other fundamental aspects of condensed matter physics. We present an automatic, high-order accurate, and adaptive Brillouin zone integration algorithm for the calculation of the optical conductivity using the Kubo formula, with a nonzero but small broadening factor $\eta$, focusing on the case in which a Hamiltonian in a downfolded model can be evaluated efficiently using Wannier interpolation. The algorithm uses iterated adaptive integration to exploit the localization of the transport distribution near energy and energy-difference isosurfaces, yielding polylogarithmic computational complexity with respect to $\eta$, rather than the algebraic complexity of uniform integration rules. To demonstrate the method, we compute the AC optical conductivity of a three-band tight-binding model, and are able to resolve the Drude and interband peaks with broadening in the sub-meV regime to several digits of accuracy. Our algorithm automates convergence testing to a user-specified error tolerance, providing an important tool in black-box first-principles calculations of electrical transport phenomena and other response functions.
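The small broadening factor makes the integrand sharply peaked near isosurfaces, which is exactly the regime where adaptivity pays off. As a one-dimensional illustration of the idea (our own sketch, not the paper's algorithm, which works in the full Brillouin zone), adaptive Simpson quadrature resolves a Lorentzian of half-width $\eta = 10^{-3}$ by refining only near the peak:

```python
import math

def adaptive_simpson(f, a, b, tol, fa=None, fm=None, fb=None, depth=30):
    """Recursive adaptive Simpson quadrature: subdivide only where the
    local error estimate exceeds the (halved) tolerance."""
    m = 0.5 * (a + b)
    if fa is None:
        fa, fm, fb = f(a), f(m), f(b)
    lm, rm = 0.5 * (a + m), 0.5 * (m + b)
    flm, frm = f(lm), f(rm)
    whole = (b - a) / 6.0 * (fa + 4.0 * fm + fb)
    left = (m - a) / 6.0 * (fa + 4.0 * flm + fm)
    right = (b - m) / 6.0 * (fm + 4.0 * frm + fb)
    if depth == 0 or abs(left + right - whole) < 15.0 * tol:
        # Richardson extrapolation of the two-panel estimate
        return left + right + (left + right - whole) / 15.0
    return (adaptive_simpson(f, a, m, 0.5 * tol, fa, flm, fm, depth - 1)
            + adaptive_simpson(f, m, b, 0.5 * tol, fm, frm, fb, depth - 1))

# A Lorentzian of half-width eta mimics the sharply peaked integrands a
# small broadening factor produces; the adaptive rule refines near the
# peak at w = 0.3 and stays coarse elsewhere.
eta = 1e-3
peak = lambda w: (eta / math.pi) / ((w - 0.3) ** 2 + eta ** 2)
val = adaptive_simpson(peak, -1.0, 1.0, 1e-10)
```

A uniform rule would need step sizes of order $\eta$ everywhere on the interval to resolve the peak, while the adaptive rule only needs that resolution locally, which is the source of the polylogarithmic scaling in $\eta$ mentioned above.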
