421 Publications

Walrus: A Cross-Domain Foundation Model for Continuum Dynamics

M. McCabe, Payel Mukhopadhyay, Tanya Marwah, B. Régaldo-Saint Blancard, Francois Rozet, Cristiana Diaconu, Lucas Meyer, Kaze W. K. Wong, Hadi Sotoudeh, A. Bietti, Irina Espejo, Tom Hehir, S. Golkar, Keiya Hirashima, G. Krawezik, F. Lanusse, R. Morel, R. Ohana, L. Parker, M. Pettee, Jeff Shen, K. Cho, M. Cranmer, S. Ho

Foundation models have transformed machine learning for language and vision, but achieving comparable impact in physical simulation remains a challenge. Data heterogeneity and unstable long-term dynamics inhibit learning from sufficiently diverse dynamics, while varying resolutions and dimensionalities challenge efficient training on modern hardware. Through empirical and theoretical analysis, we incorporate new approaches to mitigate these obstacles, including a harmonic-analysis-based stabilization method, load-balanced distributed 2D and 3D training strategies, and compute-adaptive tokenization. Using these tools, we develop Walrus, a transformer-based foundation model developed primarily for fluid-like continuum dynamics. Walrus is pretrained on nineteen diverse scenarios spanning astrophysics, geoscience, rheology, plasma physics, acoustics, and classical fluids. Experiments show that Walrus outperforms prior foundation models on both short and long term prediction horizons on downstream tasks and across the breadth of pretraining data, while ablation studies confirm the value of our contributions to forecast stability, training throughput, and transfer performance over conventional approaches. Code and weights are released for community use.


Higher-order continuum models for twisted bilayer graphene

S. Quinn, Tianyu Kong, M. Luskin, Alexander B. Watson

The first-order continuum partial differential equation (PDE) model proposed by Bistritzer and MacDonald [Proc. Natl. Acad. Sci. U. S. A. 108, 12233–12237 (2011)] accurately describes the single-particle electronic properties of twisted bilayer graphene at small twist angles. In this paper, we obtain higher-order corrections to the Bistritzer–MacDonald (BM) model via a systematic multiple-scales expansion. We prove that the solution of the resulting higher-order PDE model accurately approximates the corresponding tight-binding wave function under a natural choice of parameters and given initial conditions that are spectrally localized to the monolayer Dirac points. Numerical simulations of tight-binding and continuum dynamics demonstrate the validity of the higher-order continuum model. Symmetries of the higher-order models are also discussed. This work extends the analysis from Watson et al., J. Math. Phys. 64, 031502 (2023), which rigorously established the validity of the (first-order) BM model.


Neutral Gas Phase Distribution from H I Morphology: Phase Separation with Scattering Spectra and Variational Autoencoders

Minjie Lei, S. E. Clark, R. Morel, et al.

Unraveling the multiphase structure of the diffuse interstellar medium as traced by neutral hydrogen (H i) is essential to understanding the lifecycle of the Milky Way. However, H i phase separation is a challenging and underconstrained problem. The neutral gas phase distribution is often inferred from the spectral line structure of H i emission. In this work, we develop a data-driven phase-separation method that extracts H i phase structure solely from the spatial morphology of H i emission intensity structures. We combine scattering spectra (SS) statistics with a Gaussian-mixture variational autoencoder model to (1) derive an interpretable statistical model of different H i phases from their multiscale morphological structures, and (2) use this model to decompose the 2D channel maps of GALFA-H i emission in diffuse high-latitude (|b| > 30°) regions over narrow velocity channels (Δv = 3 km/s) into cold neutral medium (CNM), warm neutral medium (WNM), and noise components. We integrate our CNM map over velocity channels to compare it to an existing map produced by a spectrum-based method. We find that the two maps are highly correlated, but ours recovers more spatially coherent structures at small scales. Our work illustrates and quantifies a clear physical connection between the H i morphology and H i phase structure, and it unlocks a new avenue for improving future phase-separation techniques by making use of both H i spectral and spatial information to decompose H i in 3D position–position–velocity space. These results are consistent with a physical picture where processes that drive H i phase transitions also shape the morphology of H i gas, imprinting a sparse, filamentary CNM that forms out of a diffuse, extended WNM.


Universal Spectral Tokenization via Self-Supervised Panchromatic Representation Learning

Jeff Shen, F. Lanusse, Liam Holden Parker, L. Sarra, A. Bietti, R. Morel, et al.

Sequential scientific data span many resolutions and domains, and unifying them into a common representation is a key step toward developing foundation models for the sciences. Astronomical spectra exemplify this challenge: massive surveys have collected millions of spectra across a wide range of wavelengths and resolutions, yet analyses remain fragmented across spectral domains (e.g., optical vs. infrared) and object types (e.g., stars vs. galaxies), limiting the ability to pool information across datasets. We present a deep learning model that jointly learns from heterogeneous spectra in a self-supervised manner. Our universal spectral tokenizer processes spectra from a variety of object types and resolutions directly on their native wavelength grids, producing intrinsically aligned, homogeneous, and physically meaningful representations that can be efficiently adapted to achieve competitive performance across a range of downstream tasks. For the first time, we demonstrate that a single model can unify spectral data across resolutions and domains, suggesting that our model can serve as a powerful building block for foundation models in astronomy—and potentially extend to other scientific domains with heterogeneous sequential data, such as climate and healthcare.


The Helmholtz Dirichlet and Neumann problems on piecewise smooth open curves

Johan Helsing, S. Jiang

A numerical scheme is presented for solving the Helmholtz equation with Dirichlet or Neumann boundary conditions on piecewise smooth open curves, where the curves may have corners and multiple junctions. Existing integral equation methods for smooth open curves rely on analyzing the exact singularities of the density at endpoints for associated integral operators, explicitly extracting these singularities from the densities in the formulation, and using global quadrature to discretize the boundary integral equation. Extending these methods to handle curves with corners and multiple junctions is challenging because the singularity analysis becomes much more complex, and constructing high-order quadrature for discretizing layer potentials with singular and hypersingular kernels and singular densities is nontrivial. The proposed scheme is built upon the following two observations. First, the single-layer potential operator and the normal derivative of the double-layer potential operator serve as effective preconditioners for each other locally. Second, the recursively compressed inverse preconditioning (RCIP) method can be extended to address “implicit” second-kind integral equations. The scheme is high-order, adaptive, and capable of handling corners and multiple junctions without prior knowledge of the density singularity. It is also compatible with fast algorithms, such as the fast multipole method. The performance of the scheme is illustrated with several numerical examples.


Interpolative separable density fitting on adaptive real space grids

H. Zhu, C. Yeh, Miguel A. Morales, L. Greengard, S. Jiang, J. Kaye

We generalize the interpolative separable density fitting (ISDF) method, used for compressing the four-index electron repulsion integral (ERI) tensor, to incorporate adaptive real space grids for potentially highly localized single-particle basis functions. To do so, we employ a fast adaptive algorithm, the recently-introduced dual-space multilevel kernel-splitting method, to solve the Poisson equation for the ISDF auxiliary basis functions. The adaptive grids are generated using a high-order accurate, black-box procedure that satisfies a user-specified error tolerance. Our algorithm relies on the observation, which we prove, that an adaptive grid resolving the pair densities appearing in the ERI tensor can be straightforwardly constructed from one that resolves the single-particle basis functions, with the number of required grid points differing only by a constant factor. We find that the ISDF compression efficiency for the ERI tensor with highly localized basis sets is comparable to that for smoother basis sets compatible with uniform grids. To demonstrate the performance of our procedure, we consider several molecular systems with all-electron basis sets which are intractable using uniform grid-based methods. Our work establishes a pathway for scalable many-body electronic structure simulations with arbitrary smooth basis functions, making simulations of phenomena like core-level excitations feasible on a large scale.


A Method of Fundamental Solutions for Large-Scale 3D Elastance and Mobility Problems

Anna Broms, A. Barnett, Anna-Karin Tornberg

The method of fundamental solutions (MFS) is known to be effective for solving 3D Laplace and Stokes Dirichlet boundary value problems in the exterior of a large collection of simple smooth objects. Here, we present new scalable MFS formulations for the corresponding elastance and mobility problems. The elastance problem computes the potentials of conductors with given net charges, while the mobility problem (crucial to rheology and complex fluid applications) computes rigid body velocities given net forces and torques on the particles. The key idea is orthogonal projection of the net charge (or forces and torques) in a rectangular variant of a “completion flow.” The proposal is compatible with one-body preconditioning, resulting in well-conditioned square linear systems amenable to fast multipole accelerated iterative solution, thus a cost linear in the particle number. For large suspensions with moderate lubrication forces, MFS sources on inner proxy-surfaces give accuracy on par with a well-resolved boundary integral formulation. Our several numerical tests include a suspension of 10,000 nearby ellipsoids, using 2.6 × 10^7 total preconditioned degrees of freedom, where GMRES converges to five digits of accuracy in under two hours on one workstation.


GSM-Agent: Understanding Agentic Reasoning Using Controllable Environments

Hanlin Zhu, Tianyu Guo, Song Mei, Stuart Russell, N. Ghosh, A. Bietti, Jiantao Jiao

As LLMs are increasingly deployed as agents, agentic reasoning (the ability to combine tool use, especially search, with reasoning) becomes a critical skill. However, it is hard to disentangle agentic reasoning when it is evaluated in complex environments and tasks. Current agent benchmarks often mix agentic reasoning with challenging math reasoning, expert-level knowledge, and other advanced capabilities. To fill this gap, we build a novel benchmark, GSM-Agent, where an LLM agent is required to solve grade-school-level reasoning problems, but is only presented with the question in the prompt without the premises that contain the necessary information to solve the task, and needs to proactively collect that information using tools. Although the original tasks are grade-school math problems, we observe that even frontier models like GPT-5 only achieve 67% accuracy. To understand and analyze the agentic reasoning patterns, we propose the concept of an agentic reasoning graph: cluster the environment's document embeddings into nodes, and map each tool call to its nearest node to build a reasoning path. Surprisingly, we identify that the ability to revisit a previously visited node, widely taken as a crucial pattern in static reasoning, is often missing in agentic reasoning for many models. Based on this insight, we propose a tool-augmented test-time scaling method to improve LLMs' agentic reasoning performance by adding tools that encourage models to revisit. We expect our benchmark and the agentic reasoning framework to aid future studies of understanding and pushing the boundaries of agentic reasoning.
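
The reasoning-graph construction described in this abstract can be illustrated with a small sketch. This is not the authors' code: it assumes the document embeddings have already been clustered (for example with k-means) into centroids, and the names `nearest_node`, `reasoning_path`, and `count_revisits`, the toy 2D embeddings, and the exact definition of a "revisit" are all hypothetical choices made for illustration.

```python
import numpy as np

def nearest_node(centroids, query):
    """Index of the cluster centroid closest to a tool-call embedding."""
    distances = np.linalg.norm(centroids - query, axis=1)
    return int(np.argmin(distances))

def reasoning_path(centroids, tool_call_embeddings):
    """Map each tool call, in order, to its nearest node."""
    return [nearest_node(centroids, q) for q in tool_call_embeddings]

def count_revisits(path):
    """Count steps that return to an earlier node.

    Assumption: staying on the same node in consecutive steps
    does not count as a revisit.
    """
    seen = set()
    previous = None
    revisits = 0
    for node in path:
        if node != previous and node in seen:
            revisits += 1
        seen.add(node)
        previous = node
    return revisits

# Toy data: three nodes (assumed cluster centroids) and four tool calls.
centroids = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
calls = np.array([[0.2, 0.1], [4.8, 0.3], [0.1, 4.9], [0.3, -0.2]])

path = reasoning_path(centroids, calls)
print(path)                  # [0, 1, 2, 0]
print(count_revisits(path))  # 1
```

In this toy run the last tool call lands back on node 0, which is the kind of revisit pattern the abstract reports as often missing.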


A domain decomposition method for computing the scattering matrix of waveguide circuits

Tristan Goodwill, S. Jiang, M. Rachh, Kosuke Sugita

We analyze and develop numerical methods for time-harmonic wave scattering in metallic waveguide structures of infinite extent. We show that radiation boundary conditions formulated via projectors onto outgoing modes determine the coefficients of propagating modes uniquely, even when the structure supports trapped modes. Building on this, we introduce a fast divide-and-conquer solver that constructs solution operators on subdomains as impedance-to-impedance maps and couples them by enforcing continuity conditions across their interfaces. For Dirichlet waveguides, the computation of impedance-to-impedance maps requires the solution of mixed Dirichlet-Impedance boundary value problems. We construct a second-kind Fredholm integral equation that avoids near-hypersingular operators, requiring only integral operators whose kernels are at most weakly singular. Numerical experiments on large structures with many circuit elements demonstrate substantial efficiency gains: the proposed approach typically outperforms state-of-the-art fast iterative and fast direct solvers by one to two orders of magnitude.


Fast summation of Stokes potentials using a new kernel-splitting in the DMK framework

Ludvig af Klinteberg, L. Greengard, S. Jiang, Anna-Karin Tornberg

Classical Ewald methods for Coulomb and Stokes interactions rely on “kernel-splitting,” using decompositions based on Gaussians to divide the resulting potential into a near field and a far field component. Here, we show that a more efficient splitting for the scalar biharmonic Green's function can be derived using zeroth-order prolate spheroidal wave functions (PSWFs), which in turn yields new efficient splittings for the Stokeslet, stresslet, and elastic kernels, since these Green's tensors can all be derived from the biharmonic kernel. This benefits all fast summation methods based on kernel splitting, including FFT-based Ewald summation methods, which are suitable for uniform point distributions, and DMK-based methods, which allow for nonuniform point distributions. The DMK (dual-space multilevel kernel-splitting) algorithm we develop here is fast, adaptive, and linear-scaling, both in free space and in a periodic cube. We demonstrate its performance with numerical examples in two and three dimensions.
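
As background for the Gaussian-based splitting this abstract takes as its starting point, the sketch below shows the classical Ewald split of the Coulomb kernel, 1/r = erfc(αr)/r + erf(αr)/r. It does not implement the PSWF-based biharmonic splitting proposed in the paper; the function names and the value of the splitting parameter `alpha` are illustrative assumptions.

```python
import math

def near_field(r, alpha):
    """Short-range part: singular at r = 0 but decays like a Gaussian."""
    return math.erfc(alpha * r) / r

def far_field(r, alpha):
    """Long-range part: smooth everywhere, including the limit r -> 0."""
    return math.erf(alpha * r) / r

alpha = 2.0  # assumed splitting parameter; controls the near/far balance

for r in (0.1, 1.0, 3.0):
    near = near_field(r, alpha)
    far = far_field(r, alpha)
    # The two parts always sum back to the original 1/r kernel.
    print(f"r={r}: near={near:.3e}, far={far:.3e}, "
          f"error={abs(near + far - 1.0 / r):.1e}")
```

The near-field term can be truncated at a short cutoff radius, while the smooth far-field term is well suited to Fourier-based evaluation; a larger `alpha` shifts more of the work to the far field.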
