481 Publications

Accelerating Fast Ewald Summation with Prolates for Molecular Dynamics Simulations

Fast Ewald summation is the most widely used approach for computing long-range Coulomb interactions in molecular dynamics (MD) simulations. While the asymptotic scaling is nearly optimal, its performance on parallel architectures is dominated by the global communication required for the underlying fast Fourier transform (FFT). Here, we develop a novel method, ESP (Ewald summation with prolate spheroidal wave functions, or PSWFs), that, for a fixed precision, sharply reduces the size of this transform by performing the Ewald split via a PSWF. In addition, PSWFs minimize the cost of the spreading and interpolation steps that move information between the particles and the underlying uniform grid. We have integrated the ESP method into two widely used open-source MD packages: LAMMPS and GROMACS. Detailed benchmarks show that this reduces the cost of computing far-field electrostatic interactions by an order of magnitude, leading to better strong scaling with respect to the number of cores. The total execution time is reduced by a factor of 2 to 3 when using more than one thousand cores, even after optimally tuning the existing internal parameters in the native codes. We validate the accelerated codes in realistic long-time biological simulations.
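For readers unfamiliar with the Ewald split mentioned in this abstract, the sketch below shows the classical Gaussian version of it, in which the Coulomb kernel 1/r is written as a rapidly decaying short-range part plus a smooth long-range part. This is not the ESP method: ESP replaces the Gaussian with a PSWF, which is not shown here. The function name `ewald_split` and the parameter `alpha` are illustrative assumptions.

```python
import math

def ewald_split(r, alpha):
    """Classical Gaussian Ewald split of 1/r (illustration only, not the PSWF split).

    short_range decays like erfc and can be summed over near neighbors;
    long_range is smooth and is the part handled on a grid with an FFT.
    """
    short_range = math.erfc(alpha * r) / r
    long_range = math.erf(alpha * r) / r
    return short_range, long_range

alpha = 1.0  # hypothetical splitting parameter
for r in (0.5, 1.0, 2.0, 5.0):
    short_range, long_range = ewald_split(r, alpha)
    total = short_range + long_range  # always equals 1/r
    print(f"r={r:4.1f}  short={short_range:.3e}  long={long_range:.3e}  sum={total:.6f}")
```

The two parts always sum to 1/r, and the short-range part becomes negligible a few multiples of 1/alpha away, which is what lets it be truncated at a cutoff radius.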

BAnG: Bidirectional Anchored Generation for Conditional RNA Design

Roman Klypa, A. Bietti, Sergei Grudinin

Designing RNA molecules that interact with specific proteins is a critical challenge in experimental and computational biology. Existing computational approaches require a substantial amount of experimentally determined RNA sequences for each specific protein or a detailed knowledge of RNA structure, restricting their utility in practice. To address this limitation, we develop RNA-BAnG, a deep learning-based model designed to generate RNA sequences for protein interactions without these requirements. Central to our approach is a novel generative method, Bidirectional Anchored Generation (BAnG), which leverages the observation that protein-binding RNA sequences often contain functional binding motifs embedded within broader sequence contexts. We first validate our method on generic synthetic tasks involving similar localized motifs to those appearing in RNAs, demonstrating its benefits over existing generative approaches. We then evaluate our model on biological sequences, showing its effectiveness for conditional RNA sequence design given a binding protein.

In-Context Denoising with One-Layer Transformers: Connections between Attention and Associative Memory Retrieval

We introduce in-context denoising, a task that refines the connection between attention-based architectures and dense associative memory (DAM) networks, also known as modern Hopfield networks. Using a Bayesian framework, we show theoretically and empirically that certain restricted denoising problems can be solved optimally even by a single-layer transformer. We demonstrate that a trained attention layer processes each denoising prompt by performing a single gradient descent update on a context-aware DAM energy landscape, where context tokens serve as associative memories and the query token acts as an initial state. This one-step update yields better solutions than exact retrieval of either a context token or a spurious local minimum, providing a concrete example of DAM networks extending beyond the standard retrieval paradigm. Overall, this work solidifies the link between associative memory and attention mechanisms first identified by Ramsauer et al., and demonstrates the relevance of associative memory models in the study of in-context learning.
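The link between one attention step and one gradient step on an associative-memory energy can be checked numerically. The sketch below uses a generic log-sum-exp energy of the kind associated with modern Hopfield networks; it is an assumption for illustration and not necessarily the context-aware energy or the trained weights studied in the paper. The names `dam_energy`, `dam_energy_grad`, `attention_readout`, and the inverse temperature `beta` are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def dam_energy(s, X, beta):
    """Assumed energy: E(s) = 0.5*||s||^2 - (1/beta) * logsumexp(beta * X s)."""
    z = beta * (X @ s)
    m = np.max(z)
    logsumexp = m + np.log(np.sum(np.exp(z - m)))
    return 0.5 * (s @ s) - logsumexp / beta

def dam_energy_grad(s, X, beta):
    """Gradient of the energy above: s - X^T softmax(beta * X s)."""
    return s - X.T @ softmax(beta * (X @ s))

def attention_readout(query, X, beta):
    """Softmax attention with context tokens X serving as both keys and values."""
    return X.T @ softmax(beta * (X @ query))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))                 # 8 context tokens (stored memories)
query = X[2] + 0.3 * rng.normal(size=4)     # noisy version of one token
beta = 2.0

# One gradient-descent step with unit step size, starting from the query.
one_step = query - 1.0 * dam_energy_grad(query, X, beta)
attn = attention_readout(query, X, beta)
print(np.allclose(one_step, attn))
```

With unit step size the quadratic term cancels the initial state, so the update coincides with the attention output; other step sizes interpolate between the query and that output.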

Understanding Input Selectivity in Mamba: Impact on Approximation Power, Memorization, and Associative Recall Capacity

T. Huang, Miguel Sarabia, et al.

State-Space Models (SSMs), and particularly Mamba, have recently emerged as a promising alternative to Transformers. Mamba introduces input selectivity to its SSM layer (S6) and incorporates convolution and gating into its block definition. While these modifications do improve Mamba's performance over its SSM predecessors, it remains largely unclear how Mamba leverages the additional functionalities provided by input selectivity, and how these interact with the other operations in the Mamba architecture. In this work, we demystify the role of input selectivity in Mamba, investigating its impact on function approximation power, long-term memorization, and associative recall capabilities. In particular: (i) we prove that the S6 layer of Mamba can represent projections onto Haar wavelets, providing an edge over its Diagonal SSM (S4D) predecessor in approximating discontinuous functions commonly arising in practice; (ii) we show how the S6 layer can dynamically counteract memory decay; (iii) we provide analytical solutions to the MQAR associative recall task using the Mamba architecture with different mixers: Mamba, Mamba-2, and S4D. We demonstrate the tightness of our theoretical constructions with empirical results on concrete tasks. Our findings offer a mechanistic understanding of Mamba and reveal opportunities for improvement.

Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers

Lei Chen, J. Bruna, A. Bietti

Large language models have been successful at tasks involving basic forms of in-context reasoning, such as generating coherent language, as well as storing vast amounts of knowledge. At the core of the Transformer architecture behind such models are feed-forward and attention layers, which are often associated with knowledge and reasoning, respectively. In this paper, we study this distinction empirically and theoretically in a controlled synthetic setting where certain next-token predictions involve both distributional and in-context information. We find that feed-forward layers tend to learn simple distributional associations such as bigrams, while attention layers focus on in-context reasoning. Our theoretical analysis identifies the noise in the gradients as a key factor behind this discrepancy. Finally, we illustrate how similar disparities emerge in pre-trained models through ablations on the Pythia model family on simple reasoning tasks.

Superfast Direct Inversion of the Nonuniform Discrete Fourier Transform via Hierarchically Semiseparable Least Squares

Heather Wilber, Ethan N. Epperly, A. Barnett

A direct solver is introduced for solving overdetermined linear systems involving nonuniform discrete Fourier transform matrices. Such matrices can be transformed into a Cauchy-like form that has hierarchical low rank structure. The rank structure of this matrix is explained, and it is shown that the ranks of the relevant submatrices grow only logarithmically with the number of columns of the matrix. A fast rank-structured hierarchical approximation method based on this analysis is developed, along with a hierarchical least-squares solver for these and related systems. This result is a direct method for inverting nonuniform discrete transforms with a complexity that is usually nearly linear with respect to the degrees of freedom in the problem. This solver is benchmarked against various iterative and direct solvers in the setting of inverting the one-dimensional type-II (or forward) transform, for a range of condition numbers and problem sizes (up to (4 10
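To make the problem setting concrete, the sketch below builds a small one-dimensional type-II nonuniform discrete Fourier transform matrix and inverts it with a dense least-squares solve. This is only a brute-force baseline of the kind such a solver would be compared against; it does not implement the hierarchically semiseparable method. The sign convention, the mode ordering, the jittered sample points, and the helper name `nudft2_matrix` are all assumptions made for illustration.

```python
import numpy as np

def nudft2_matrix(x, n_modes):
    """Dense type-II NUDFT matrix: A[j, k] = exp(i * k * x[j]).

    Modes k run over -n_modes//2, ..., (n_modes+1)//2 - 1 (assumed convention).
    """
    k = np.arange(-(n_modes // 2), (n_modes + 1) // 2)
    return np.exp(1j * np.outer(x, k))

rng = np.random.default_rng(0)
n_modes, n_samples = 16, 48            # overdetermined: more samples than modes

# Jittered (nonuniform) sample points on [0, 2*pi).
x = 2 * np.pi * (np.arange(n_samples) + rng.uniform(0, 1, n_samples)) / n_samples

c_true = rng.normal(size=n_modes) + 1j * rng.normal(size=n_modes)
A = nudft2_matrix(x, n_modes)
f = A @ c_true                          # forward (type-II) transform

# Dense least-squares inversion; cost grows like n_samples * n_modes**2.
c_rec, *_ = np.linalg.lstsq(A, f, rcond=None)
rel_err = np.linalg.norm(c_rec - c_true) / np.linalg.norm(c_true)
print(f"relative error: {rel_err:.2e}")
```

The dense solve is practical only for small problems, which is the motivation for direct methods that exploit the low-rank structure described in the abstract.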

Charge distribution and helicity tune the binding of septin’s amphipathic helix domain to membranes

C. Edelmaier, Stephen J. Klawa, M. Mofidi, S. Hanson, et al.

Amphipathic helices (AHs) are secondary structures that can facilitate binding of proteins to the membrane by folding into a helix with hydrophobic and hydrophilic faces that interact with the same surfaces in the lipid membrane. Septins are cytoskeletal proteins that preferentially bind to domains of micron-scale curvature on the cell membrane. Studies have shown that AH domains in septin are essential for curvature sensing. We present the first computational study of septin AH interactions with lipid bilayers. Using all-atom simulations and metadynamics-enhanced sampling, we study the effect of charge distribution at the flanking ends of septin AH on the energy for helical folding and its consequences on the binding configuration and affinity to the membrane. This is relevant to septins, since the net positive charge on the flanking C-terminal amino acids is a conserved property across several organisms. Simulations revealed that the energy barrier for folding in the neutral-capped AH is much larger than in the charge-capped AH, leading to a small fraction of AH folding and integration into the membrane compared to a significantly folded configuration in the bound charge-capped AH. These observations are consistent with the binding measurements of synthetic AH constructs with variable helicity to lipid vesicles. Additionally, we examined an extended AH sequence including eight amino acids upstream and downstream of the AH to mimic the native protein. Again, simulations and experiments show that the extended peptide, with a net positive charge at the C-terminus, adopts a strong helical configuration in solution, giving rise to a higher membrane affinity. Altogether, these results identify the energy cost for folding of AHs as a regulator of AH binding configuration and affinity and provide a basic template for parameterizing AH-membrane interactions as a starting point for future multiscale simulations of septin-membrane interactions.

InstaMap: instant-NGP for cryo-EM density maps

Geoffrey Woollard, P. Cossio, S. Hanson, et al.

Despite the parallels between problems in computer vision and cryo-electron microscopy (cryo-EM), many state-of-the-art approaches from computer vision have yet to be adapted for cryo-EM. Within the computer-vision research community, implicits such as neural radiance fields (NeRFs) have enabled the detailed reconstruction of 3D objects from few images at different camera-viewing angles. While other neural implicits, specifically density fields, have been used to map conformational heterogeneity from noisy cryo-EM projection images, most approaches represent volume with an implicit function in Fourier space, which has disadvantages compared with solving the problem in real space, complicating, for instance, masking, constraining physics or geometry, and assessing local resolution. In this work, we build on a recent development in neural implicits, a multi-resolution hash-encoding framework called instant-NGP, that we use to represent the scalar volume directly in real space and apply it to the cryo-EM density-map reconstruction problem (InstaMap). We demonstrate that for both synthetic and real data, InstaMap for homogeneous reconstruction achieves higher resolution at shorter training stages than five other real-space representations. We propose a solution to noise overfitting, demonstrate that InstaMap is both lightweight and fast to train, implement masking from a user-provided input mask and extend it to molecular-shape heterogeneity via bending space using a per-image vector field.

Simulation-based inference of single-molecule experiments

Lars Dingeldein, P. Cossio, Roberto Covino

Single-molecule experiments are a unique tool to characterize the structural dynamics of biomolecules. However, reconstructing molecular details from noisy single-molecule data is challenging. Simulation-based inference (SBI) integrates statistical inference, physics-based simulators, and machine learning and is emerging as a powerful framework for analysing complex experimental data. Recent advances in deep learning have accelerated the development of new SBI methods, enabling the application of Bayesian inference to an ever-increasing number of scientific problems. Here, we review the nascent application of SBI to the analysis of single-molecule experiments. We introduce parametric Bayesian inference and discuss its limitations. We then overview emerging deep-learning-based SBI methods to perform Bayesian inference for complex models encoded in computer simulators. We illustrate the first applications of SBI to single-molecule force-spectroscopy and cryo-electron microscopy experiments. SBI allows us to leverage powerful computer algorithms modeling complex biomolecular phenomena to connect scientific models and experiments in a principled way.

Solving optimal control problems of rigid-body dynamics with collisions using the hybrid minimum principle

Wei Hu, Jihao Long, Yaohua Zang, Weinan E, J. Han

Collisions are common in many dynamical systems with real applications. They can be formulated as hybrid dynamical systems with discontinuities automatically triggered when states cross certain manifolds. We present an algorithm for the optimal control problem of such hybrid dynamical systems based on solving the equations derived from the hybrid minimum principle (HMP). The algorithm is an iterative scheme following the spirit of the method of successive approximations (MSA), and it is robust to undesired collisions observed in the initial guesses. We propose several techniques to address the additional numerical challenges introduced by the presence of discontinuities. The algorithm is tested on disc collision problems whose optimal solutions exhibit one or multiple collisions. Linear convergence in terms of iteration steps and asymptotic first-order accuracy in terms of time discretization are observed when the algorithm is implemented with the forward-Euler scheme. The numerical results demonstrate that the proposed algorithm has better accuracy and convergence than direct methods based on gradient descent. Furthermore, the algorithm is also simpler, more accurate, and more stable than a deep reinforcement learning method.
