443 Publications

Towards Seamless Interoperability of MPI-OpenMP Applications

B. Smith, M. Berger, Junchao Zhang, Hui Zhou

A chasm exists between mathematical software libraries written for MPI-based applications and those written for OpenMP applications. Recently, however, PETSc enables the simple use of its MPI-based linear solvers from OpenMP applications. Separately, the MPICH MPI development team has started a new project to allow almost seamless MPI use in OpenMP applications. Both proposed approaches would result in a similar user experience. We discuss the reasons for these projects and their potential for providing more numerical library choices for OpenMP applications, including the unlimited assortment of linear solvers available in PETSc. In addition, we present the performance of an application using the first approach, demonstrating its efficacy.

Predicting partially observable dynamical systems via diffusion models with a multiscale inference scheme

R. Morel, Francesco Pio Ramunno, Jeff Shen, A. Bietti, K. Cho, M. Cranmer, S. Golkar, Olexandr Gugnin, G. Krawezik, et al.

Conditional diffusion models provide a natural framework for probabilistic prediction of dynamical systems and have been successfully applied to fluid dynamics and weather prediction. However, in many settings, the available information at a given time represents only a small fraction of what is needed to predict future states, either due to measurement uncertainty or because only a small fraction of the state can be observed. This is true for example in solar physics, where we can observe the Sun’s surface and atmosphere, but its evolution is driven by internal processes for which we lack direct measurements. In this paper, we tackle the probabilistic prediction of partially observable, long-memory dynamical systems, with applications to solar dynamics and the evolution of active regions. We show that standard inference schemes, such as autoregressive rollouts, fail to capture long-range dependencies in the data, largely because they do not integrate past information effectively. To overcome this, we propose a multiscale inference scheme for diffusion models, tailored to physical processes. Our method generates trajectories that are temporally fine-grained near the present and coarser as we move farther away, which enables capturing long-range temporal dependencies without increasing computational cost. When integrated into a diffusion model, we show that our inference scheme significantly reduces the bias of the predicted distributions and improves rollout stability.
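As a rough illustration of a time grid that is "fine-grained near the present and coarser as we move farther away", the sketch below builds forecast offsets whose step size doubles at each level. The function name `multiscale_offsets`, the doubling factor, and the per-level step count are assumptions made for illustration; the abstract does not specify the grid the authors actually use.

```python
def multiscale_offsets(n_levels, steps_per_level, base_dt):
    """Return increasing time offsets from the present.

    Level 0 uses step base_dt, level 1 uses 2 * base_dt, and so on,
    so the grid is dense near t = 0 and sparse far away.
    The doubling rule is an illustrative assumption.
    """
    offsets = []
    t = 0.0
    for level in range(n_levels):
        dt = base_dt * (2 ** level)  # coarser step at each level
        for _ in range(steps_per_level):
            t += dt
            offsets.append(t)
    return offsets


if __name__ == "__main__":
    # Three levels with two steps each, starting from a unit step.
    print(multiscale_offsets(3, 2, 1.0))
```

With this layout, the horizon covered grows geometrically with the number of levels while the number of generated frames grows only linearly, which is one way to reach long horizons at fixed cost.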

AION-1: Omnimodal Foundation Model for Astronomical Sciences

L. Parker, F. Lanusse, Jeff Shen, Ollie Liu, Tom Hehir, L. Sarra, Lucas Meyer, Micah Bowles, S. Wagner-Carena, H. Qu, S. Golkar, A. Bietti, R. Morel, et al.

While foundation models have shown promise across a variety of fields, astronomy lacks a unified framework for joint modeling across its highly diverse data modalities. In this paper, we present AION-1, the first large-scale multimodal foundation family of models for astronomy. AION-1 enables arbitrary transformations between heterogeneous data types using a two-stage architecture: modality-specific tokenization followed by transformer-based masked modeling of cross-modal token sequences. Trained on over 200M astronomical objects, AION-1 demonstrates strong performance across regression, classification, generation, and object retrieval tasks. Beyond astronomy, AION-1 provides a scalable blueprint for multimodal scientific foundation models that can seamlessly integrate heterogeneous combinations of real-world observations. Our model release is entirely open source, including the dataset, training script, and weights.

Quasi-optimal hierarchically semi-separable matrix approximation

Noah Amsel, Tyler Chen, Feyza Duman Keles, Diana Halikias, Cameron Musco, Christopher Musco, D. Persson

We present a randomized algorithm for producing a quasi-optimal hierarchically semi-separable (HSS) approximation to an $N \times N$ matrix $A$ using only matrix-vector products with $A$ and $A^T$. We prove that, using $O(k \log(N/k))$ matrix-vector products and $O(N k^2 \log(N/k))$ additional runtime, the algorithm returns an HSS matrix $B$ with rank-$k$ blocks whose expected Frobenius norm error $\mathbb{E}[\|A - B\|_F^2]$ is at most $O(\log(N/k))$ times worse than the best possible approximation error by an HSS rank-$k$ matrix. In fact, the algorithm we analyze is a simple modification of an empirically effective method proposed by [Levitt & Martinsson, SISC 2024]. As a stepping stone towards our main result, we prove two results that are of independent interest: a similar guarantee for a variant of the algorithm which accesses $A$'s entries directly, and explicit error bounds for near-optimal subspace approximation using projection-cost-preserving sketches. To the best of our knowledge, our analysis constitutes the first polynomial-time quasi-optimality result for HSS matrix approximation, both in the explicit access model and the matrix-vector product query model.

A fast spectral sum-of-Gaussians method for electrostatic summation in quasi-2D systems

X. Gao, S. Jiang, J. Liang, Zhenli Xu, Qi Zhou

Quasi-2D electrostatic systems, characterized by periodicity in two dimensions with a free third dimension, have garnered significant interest in many fields. We apply the sum-of-Gaussians (SOG) approximation to the Laplace kernel, dividing the interactions into near-field, mid-range, and long-range components. The near-field component, singular but compactly supported in a local domain, is directly calculated. The mid-range component is managed using a procedure similar to nonuniform fast Fourier transforms in three dimensions. The long-range component, which includes Gaussians of large variance, is treated with polynomial interpolation/anterpolation in the free dimension and a Fourier spectral solver in the other two dimensions on proxy points. Unlike the fast Ewald summation, which requires extensive zero padding in the case of high aspect ratios, the separability of Gaussians allows us to handle such cases without any zero padding in the free direction. Furthermore, while NUFFTs typically rely on certain upsampling in each dimension, and the truncated kernel method introduces an additional factor of upsampling due to kernel oscillation, our scheme eliminates the need for upsampling in any direction due to the smoothness of Gaussians, significantly reducing computational cost for large-scale problems. Finally, whereas all periodic fast multipole methods require dividing the periodic tiling into a smooth far part and a near part containing its nearest neighboring cells, our scheme operates directly on the fundamental cell, resulting in better performance with simpler implementation. We provide a rigorous error analysis showing that upsampling is not required in NUFFT-like steps, achieving $O(N \log N)$ complexity with a small prefactor. The performance of the scheme is demonstrated via extensive numerical experiments.
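To make the idea of a sum-of-Gaussians approximation concrete, the sketch below approximates the 3D Laplace kernel 1/r by discretizing the identity 1/r = (2/sqrt(pi)) * integral over t from 0 to infinity of exp(-r^2 t^2) with the substitution t = e^s and a uniform grid in s. This is a generic textbook construction, not necessarily the SOG used by the authors; the function names and the grid parameters `s_min`, `s_max`, and `h` are assumptions chosen so that the approximation is accurate for r roughly between 0.1 and 10.

```python
import math


def sog_nodes(s_min=-20.0, s_max=5.0, h=0.25):
    """Weights w_k and exponents a_k with 1/r ~ sum_k w_k * exp(-a_k * r**2).

    Obtained by applying the trapezoidal rule in s to
    (2/sqrt(pi)) * integral of exp(s) * exp(-r**2 * exp(2*s)) ds.
    The truncation range and spacing are illustrative assumptions.
    """
    n = int(round((s_max - s_min) / h))
    weights, exponents = [], []
    for k in range(n + 1):
        s = s_min + k * h
        weights.append(2.0 / math.sqrt(math.pi) * h * math.exp(s))
        exponents.append(math.exp(2.0 * s))
    return weights, exponents


def sog_eval(r, weights, exponents):
    """Evaluate the Gaussian sum at distance r."""
    return sum(w * math.exp(-a * r * r) for w, a in zip(weights, exponents))


if __name__ == "__main__":
    w, a = sog_nodes()
    for r in (0.1, 1.0, 10.0):
        approx = sog_eval(r, w, a)
        print(r, approx, abs(approx * r - 1.0))  # value and relative error
```

Gaussians with large exponents are sharply peaked and act at short range, while those with small exponents are smooth and long-ranged, which is what makes a split into near-field, mid-range, and long-range groups natural.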

seekrflow: Towards end-to-end automated simulation pipeline with machine-learned force fields for accelerated drug-target kinetic and thermodynamic predictions

A. A. Ojha, Lane W. Votapka, S. Hanson, et al.

Accurate prediction of drug-target binding and unbinding kinetics and thermodynamics is essential for guiding drug discovery and lead optimization. However, traditional atomistic simulations are often too computationally expensive to capture rare events that govern ligand (un)binding. Several enhanced sampling methods exist to overcome these limitations, but they require extensive manual intervention and introduce variability and artifacts in free energy and kinetic estimates that limit high-throughput scalability. The present work introduces seekrflow, an automated multiscale milestoning simulation pipeline that streamlines the entire workflow from a single receptor-ligand input structure to kinetic and thermodynamic predictions in a single step. This integrated approach minimizes manual intervention, reduces computational overhead, and enhances the reproducibility and accuracy of kinetic and thermodynamic predictions. The accuracy and efficiency of the pipeline are demonstrated on multiple receptor-ligand complexes, including inhibitors of heat shock protein 90, threonine-tyrosine kinase, and the trypsin protein, with predicted kinetic parameters closely matching experimental estimates. seekrflow establishes a new benchmark for automated and high-throughput physics-based predictions of kinetics and thermodynamics.

Study of Protein-Protein Interactions in Septin Assembly: Multiple amphipathic helix domains cooperate in binding to the lipid membrane

Septins are a conserved family of cytoskeletal proteins known for sensing micron-scale membrane curvature via amphipathic helix (AH) domains. While cooperative interactions in septin assembly have been suggested, the molecular mechanisms governing membrane binding and assembly remain unclear. Building on prior findings, we use all-atom molecular dynamics simulations to examine how single and paired extended AH domains, derived from Cdc12, interact with lipid bilayers. A single membrane-bound AH adopts a curved conformation. In solution, a second AH peptide preferentially interacts with the bound peptide through conserved salt bridges, favoring an antiparallel arrangement. Simulations of covalently linked AH tandems confirm this configuration. Dual membrane-bound peptides induce lipid packing defects, reduce tail order, and exhibit slight membrane displacement, suggesting curved membranes may better accommodate multiple AH domains. Our findings advance the mechanistic understanding of septin-membrane interactions and highlight the role of cooperative AH domain binding in stabilizing higher-order structures.

August 12, 2025

Velocity optimization of self-equilibrated obstacles in a two-dimensional viscous flow

G. Francfort, Alessandro Giacomini, S. Weady

An obstacle is immersed in an externally driven 2D Stokes or Navier-Stokes fluid. We study the self-equilibration conditions for that obstacle under steady state assumptions on the flow. We then seek to optimize the translational and/or angular velocity of the obstacle by varying its shape. To allow general variations, we must consider a very large class of obstacles for which the notion of trace is meaningless. This forces us to revisit the notion of self-equilibration for both Stokes and Navier-Stokes in a measure theoretic environment.

Query Efficient Structured Matrix Learning

Noah Amsel, Pratyush Avi, Tyler Chen, Feyza Duman Keles, Chinmay Hegde, Cameron Musco, Christopher Musco, D. Persson

We study the problem of learning a structured approximation (low-rank, sparse, banded, etc.) to an unknown matrix $A$ given access to matrix-vector product (matvec) queries of the form $x \rightarrow Ax$ and $x \rightarrow A^Tx$. This problem is of central importance to algorithms across scientific computing and machine learning, with applications to fast multiplication and inversion for structured matrices, building preconditioners for first-order optimization, and as a model for differential operator learning. Prior work focuses on obtaining query complexity upper and lower bounds for learning specific structured matrix families that commonly arise in applications.
We initiate the study of the problem in greater generality, aiming to understand the query complexity of learning approximations from general matrix families. Our main result focuses on finding a near-optimal approximation to $A$ from any finite-sized family of matrices, $\mathcal{F}$. Standard results from matrix sketching show that $O(\log|\mathcal{F}|)$ matvec queries suffice in this setting. This bound can also be achieved, and is optimal, for vector-matrix-vector queries of the form $x,y\rightarrow x^TAy$, which have been widely studied in work on rank-$1$ matrix sensing.
Surprisingly, we show that, in the matvec model, it is possible to obtain a nearly quadratic improvement in complexity, to $\tilde{O}(\sqrt{\log|\mathcal{F}|})$. Further, we prove that this bound is tight up to log-log factors. Via covering number arguments, our result extends to well-studied infinite families. As an example, we establish that a near-optimal approximation from any linear matrix family of dimension $q$ can be learned with $\tilde{O}(\sqrt{q})$ matvec queries, improving on an $O(q)$ bound achievable via sketching techniques and vector-matrix-vector queries.
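The matvec query model itself can be illustrated with a much simpler classical example: an exactly tridiagonal $n \times n$ matrix can be recovered from three queries $x \rightarrow Ax$ using probing vectors that select every third column. The sketch below is only an illustration of the access model, not the algorithm or the bounds from the paper; `learn_tridiagonal` and `make_matvec` are hypothetical helper names, and exact recovery assumes $A$ really is tridiagonal.

```python
def make_matvec(A):
    """Wrap a dense matrix (list of rows) as a black-box matvec with a query counter."""
    counter = {"queries": 0}

    def matvec(x):
        counter["queries"] += 1
        return [sum(a * xj for a, xj in zip(row, x)) for row in A]

    return matvec, counter


def learn_tridiagonal(matvec, n):
    """Recover an n x n tridiagonal matrix using three matvec queries.

    Query c uses the indicator vector of columns j with j % 3 == c.
    Each row i has exactly one candidate column among i-1, i, i+1 in
    that residue class, so entry y[i] of the answer equals A[i][j].
    """
    B = [[0.0] * n for _ in range(n)]
    for c in range(3):
        x = [1.0 if j % 3 == c else 0.0 for j in range(n)]
        y = matvec(x)
        for i in range(n):
            for j in (i - 1, i, i + 1):
                if 0 <= j < n and j % 3 == c:
                    B[i][j] = y[i]
    return B


if __name__ == "__main__":
    n = 5
    # Example tridiagonal matrix: 2 on the diagonal, -1 on the off-diagonals.
    A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
         for i in range(n)]
    matvec, counter = make_matvec(A)
    B = learn_tridiagonal(matvec, n)
    print(B == A, counter["queries"])
```

If $A$ is not exactly tridiagonal, this procedure still returns a tridiagonal matrix, but with no guarantee that it is close to the best one; obtaining near-optimal approximations with few queries is the harder question the abstract addresses.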
