349 Publications

Simulation-based inference of single-molecule experiments

Lars Dingeldein, P. Cossio, Roberto Covino

Single-molecule experiments are a unique tool to characterize the structural dynamics of biomolecules. However, reconstructing molecular details from noisy single-molecule data is challenging. Simulation-based inference (SBI) integrates statistical inference, physics-based simulators, and machine learning and is emerging as a powerful framework for analysing complex experimental data. Recent advances in deep learning have accelerated the development of new SBI methods, enabling the application of Bayesian inference to an ever-increasing number of scientific problems. Here, we review the nascent application of SBI to the analysis of single-molecule experiments. We introduce parametric Bayesian inference and discuss its limitations. We then overview emerging deep-learning-based SBI methods to perform Bayesian inference for complex models encoded in computer simulators. We illustrate the first applications of SBI to single-molecule force-spectroscopy and cryo-electron microscopy experiments. SBI allows us to leverage powerful computer algorithms modeling complex biomolecular phenomena to connect scientific models and experiments in a principled way.
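To make the core idea of simulation-based inference concrete, here is a minimal sketch using rejection ABC (approximate Bayesian computation), the simplest SBI scheme: draw parameters from the prior, run the simulator, and keep draws whose simulated data resemble the observation. The toy simulator, prior, and tolerance below are assumptions of this example, not the paper's methods, which center on far more powerful deep-learning estimators.

```python
# Minimal simulation-based inference via rejection ABC.
# The simulator, prior range, and tolerance eps are toy assumptions.
import numpy as np

rng = np.random.default_rng(0)

def simulator(theta, n=100):
    """Toy 'experiment': noisy observations around an unknown parameter theta."""
    return rng.normal(theta, 1.0, size=n)

# Observed data generated with a ground-truth parameter we try to recover.
theta_true = 2.0
x_obs = simulator(theta_true)

def summary(x):
    return x.mean()  # a simple summary statistic

# Rejection ABC: sample the prior, simulate, keep draws whose simulated
# summary lies within eps of the observed summary.
prior_draws = rng.uniform(-5, 5, size=20000)
eps = 0.1
accepted = [t for t in prior_draws
            if abs(summary(simulator(t)) - summary(x_obs)) < eps]

posterior_mean = np.mean(accepted)
```

The accepted draws approximate the posterior; neural SBI methods replace this wasteful accept/reject step with learned density or ratio estimators.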


AutoBZ.jl: Automatic, adaptive Brillouin zone integration using Wannier interpolation

Lorenzo Van Munoz, Sophie Beck, J. Kaye

We introduce cppdlr, a C++ library implementing the discrete Lehmann representation (DLR) of functions in imaginary time and Matsubara frequency, such as Green's functions and self-energies. The DLR is based on a low-rank approximation of the analytic continuation kernel, and yields a compact and explicit basis consisting of exponentials in imaginary time and simple poles in Matsubara frequency. cppdlr constructs the DLR basis and associated interpolation grids, and implements standard operations. It provides a flexible yet high-level interface, facilitating the incorporation of the DLR into both small-scale applications and existing large-scale software projects.


Variational Inference in Location-Scale Families: Exact Recovery of the Mean and Correlation Matrix

Given an intractable target density p, variational inference (VI) attempts to find the best approximation q from a tractable family Q. This is typically done by minimizing the exclusive Kullback-Leibler divergence, KL(q||p). In practice, Q is not rich enough to contain p, and the approximation is misspecified even when it is a unique global minimizer of KL(q||p). In this paper, we analyze the robustness of VI to these misspecifications when p exhibits certain symmetries and Q is a location-scale family that shares these symmetries. We prove strong guarantees for VI not only under mild regularity conditions but also in the face of severe misspecifications. Namely, we show that (i) VI recovers the mean of p when p exhibits an even symmetry, and (ii) it recovers the correlation matrix of p when in addition p exhibits an elliptical symmetry. These guarantees hold for the mean even when q is factorized and p is not, and for the correlation matrix even when q and p behave differently in their tails. We analyze various regimes of Bayesian inference where these symmetries are useful idealizations, and we also investigate experimentally how VI behaves in their absence.
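A one-dimensional sketch of guarantee (i): fit a Gaussian location-scale family q = N(m, s²) to an even-symmetric but non-Gaussian target p (here a Laplace density centered at mu = 1) by stochastic gradient descent on the exclusive KL. The target, step sizes, and reparameterized gradient estimator below are assumptions of this illustration, not the paper's analysis; the point is that the fitted location m lands on the mean of p despite the mismatched tails.

```python
# Exclusive-KL variational inference with a Gaussian location-scale family.
# Target p: Laplace(mu=1, scale=1), which is even-symmetric about its mean.
import numpy as np

rng = np.random.default_rng(1)
mu = 1.0  # mean of the target p

# Reparameterization: x = m + s*z with fixed base noise z ~ N(0,1).
z = rng.standard_normal(5000)
m, log_s = 0.0, 0.0  # variational parameters (log_s keeps s positive)

for _ in range(2000):
    s = np.exp(log_s)
    x = m + s * z
    # KL(q||p) = E_q[-log p(x)] - log s + const, with -log p(x) = |x - mu| + log 2.
    grad_m = np.mean(np.sign(x - mu))            # d/dm of E[|x - mu|]
    grad_log_s = np.mean(np.sign(x - mu) * s * z) - 1.0
    m -= 0.05 * grad_m
    log_s -= 0.05 * grad_log_s
```

At convergence m sits at mu even though q is Gaussian and p is Laplace, exactly the even-symmetry robustness the abstract describes.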


Active learning of Boltzmann samplers and potential energies with quantum mechanical accuracy

Ana Molina-Taborda, P. Cossio, et al.

Extracting consistent statistics between relevant free energy minima of a molecular system is essential for physics, chemistry, and biology. Molecular dynamics (MD) simulations can aid in this task but are computationally expensive, especially for systems that require quantum accuracy. To overcome this challenge, we developed an approach combining enhanced sampling with deep generative models and active learning of a machine learning potential (MLP). We introduce an adaptive Markov chain Monte Carlo framework that enables the training of one normalizing flow (NF) and one MLP per state, achieving rapid convergence toward the Boltzmann distribution. Leveraging the trained NF and MLP models, we compute thermodynamic observables such as free energy differences and optical spectra. We apply this method to study the isomerization of an ultrasmall silver nanocluster belonging to a set of systems with diverse applications in the fields of medicine and catalysis.
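The Markov chain Monte Carlo backbone of such workflows can be illustrated with a bare Metropolis sampler targeting a Boltzmann distribution exp(-U(x)/kT) on a double-well potential, a stand-in for a system with two relevant free energy minima. The potential, temperature, and step size are toy assumptions; the paper's adaptive normalizing-flow and machine-learning-potential machinery is not reproduced here.

```python
# Minimal Metropolis Monte Carlo sampling of a Boltzmann distribution
# for a 1D double-well potential (toy stand-in for a two-state system).
import numpy as np

rng = np.random.default_rng(2)

def U(x):
    return (x**2 - 1.0)**2  # double well with minima at x = -1 and x = +1

kT, step = 0.3, 0.5
x = -1.0
samples = []
for _ in range(50000):
    x_new = x + step * rng.standard_normal()
    dU = U(x_new) - U(x)
    # Metropolis rule: always accept downhill moves, otherwise accept
    # with probability exp(-dU / kT).
    if dU <= 0 or rng.random() < np.exp(-dU / kT):
        x = x_new
    samples.append(x)
samples = np.array(samples)
```

Plain MCMC like this mixes slowly between wells at low temperature, which is precisely the bottleneck that flow-assisted proposals in the paper's adaptive framework are designed to relieve.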


Integral formulation of Klein-Gordon singular waveguides

Guillaume Bal, Jeremy Hoskins, S. Quinn, M. Rachh

We consider the analysis of singular waveguides separating insulating phases in two-space dimensions. The insulating domains are modeled by a massive Schrödinger equation and the singular waveguide by appropriate jump conditions along the one-dimensional interface separating the insulators. We present an integral formulation of the problem and analyze its mathematical properties. We also implement a fast multipole and sweeping-accelerated iterative algorithm for solving the integral equations, and demonstrate numerically the fast convergence of this method. Several numerical examples of solutions and scattering effects illustrate our theory.


New Statistical Metric for Robust Target Detection in Cryo-EM Using 2DTM

Kexin Zhang, P. Cossio, A. Rangan, Bronwyn Lucas, Nikolaus Grigorieff

2D template matching (2DTM) can be used to detect molecules and their assemblies in cellular cryo-EM images with high positional and orientational accuracy. While 2DTM successfully detects spherical targets such as large ribosomal subunits, challenges remain in detecting smaller and more aspherical targets in various environments. In this work, a novel 2DTM metric, referred to as the 2DTM p-value, is developed to extend the 2DTM framework to more complex applications. The 2DTM p-value combines information from two previously used 2DTM metrics, namely the 2DTM signal-to-noise ratio (SNR) and z-score, which are derived from the cross-correlation coefficient between the target and the template. The 2DTM p-value demonstrates robust detection accuracies under various imaging and sample conditions and outperforms the 2DTM SNR and z-score alone. Specifically, the 2DTM p-value improves the detection of aspherical targets such as a modified artificial tubulin patch particle (500 kDa) and a much smaller clathrin monomer (193 kDa) in simulated data. It also accurately recovers mature 60S ribosomes in yeast lamellae samples, even under conditions of increased Gaussian noise. The new metric will enable the detection of a wider variety of targets in both purified and cellular samples through 2DTM.
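The cross-correlation coefficient that the 2DTM metrics are derived from can be sketched in a few lines: scan a template across a noisy image and compute the normalized cross-correlation at every offset. The square "particle", image size, and noise level below are toy assumptions; the SNR, z-score, and p-value statistics of the paper are not reproduced.

```python
# Normalized cross-correlation template matching (the core operation
# underlying 2DTM metrics). Image, template, and noise are toy assumptions.
import numpy as np

rng = np.random.default_rng(3)

template = np.zeros((8, 8))
template[2:6, 2:6] = 1.0                      # a simple square "particle"

image = rng.standard_normal((64, 64)) * 0.3   # background noise
true_pos = (20, 30)
image[20:28, 30:38] += template               # embed the target

t = template - template.mean()
t_norm = np.sqrt((t**2).sum())

scores = np.full((57, 57), -np.inf)           # 64 - 8 + 1 valid offsets
for i in range(57):
    for j in range(57):
        w = image[i:i+8, j:j+8]
        w = w - w.mean()
        denom = np.sqrt((w**2).sum()) * t_norm
        if denom > 0:
            scores[i, j] = (w * t).sum() / denom  # correlation coefficient

peak = np.unravel_index(np.argmax(scores), scores.shape)
```

The peak of the score map recovers the embedded position; 2DTM additionally searches over orientations and converts the correlation peak into calibrated detection statistics.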

2024

A comprehensive exploration of quasisymmetric stellarators and their coil sets

A. Giuliani, Eduardo Rodríguez, M. Spivak

We augment the `QUAsi-symmetric Stellarator Repository' (QUASR) to include vacuum field stellarators with quasihelical symmetry using a globalized optimization workflow. The database now has almost 370,000 quasi-axisymmetric and quasihelically symmetric devices along with coil sets, optimized for a variety of aspect ratios, rotational transforms, and discrete rotational symmetries. This paper outlines a couple of ways to explore and characterize the data set. We plot devices on a near-axis quasisymmetry landscape, revealing close correspondence to this predicted landscape. We also use principal component analysis to reduce the dimensionality of the data so that it can easily be visualized in two or three dimensions. Principal component analysis also gives a mechanism to compare the new devices here to previously published ones in the literature. We are able to characterize the structure of the data, observe clusters, and visualize the progression of devices in these clusters. These techniques reveal that the data has structure, and that typically one, two or three principal components are sufficient to characterize it. QUASR is archived at this https URL and can be explored online at this http URL.
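The dimensionality-reduction step described above reduces to a standard PCA computation: center a feature matrix of device descriptors and project onto its leading principal components. The synthetic data below stands in for the QUASR device features (which are not reproduced here) and is an assumption of this sketch.

```python
# PCA via SVD: embed high-dimensional device descriptors in 2D.
# The synthetic feature matrix is an assumption of this example.
import numpy as np

rng = np.random.default_rng(4)

# Synthetic "device descriptors": 500 devices, 10 features, with most
# variance along two latent directions (mimicking the observation that
# two or three components suffice).
latent = rng.standard_normal((500, 2))
mixing = rng.standard_normal((2, 10))
X = latent @ mixing + 0.05 * rng.standard_normal((500, 10))

Xc = X - X.mean(axis=0)                  # center each feature
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

explained = S**2 / (S**2).sum()          # variance ratio per component
coords = Xc @ Vt[:2].T                   # 2-D embedding for plotting
```

Inspecting `explained` shows how many components carry the variance; when the first two or three dominate, a 2-D or 3-D scatter of `coords` faithfully reveals clusters and progressions as described in the abstract.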


Classical variational phase-field models cannot predict fracture nucleation

Oscar Lopez-Pamies, John E. Dolbow, G. Francfort, Christopher J. Larsen

Notwithstanding the evidence against them, classical variational phase-field models continue to be used and pursued in an attempt to describe fracture nucleation in elastic brittle materials. In this context, the main objective of this paper is to provide a comprehensive review of the existing evidence against such a class of models as descriptors of fracture nucleation. To that end, a review is first given of the plethora of experimental observations of fracture nucleation in nominally elastic brittle materials under quasi-static loading conditions, as well as of classical variational phase-field models, without and with energy splits. These models are then confronted with the experimental observations. The conclusion is that they cannot possibly describe fracture nucleation in general. This is because classical variational phase-field models cannot account for material strength as an independent macroscopic material property. The last part of the paper includes a brief summary of a class of phase-field models that can describe fracture nucleation. It also provides a discussion of how pervasively material strength has been overlooked in the analysis of fracture at large, as well as an outlook into the modeling of fracture nucleation beyond the basic setting of elastic brittle materials.


Decomposing imaginary time Feynman diagrams using separable basis functions: Anderson impurity model strong coupling expansion

J. Kaye, Hugo Strand, Denis Golez

We present a deterministic algorithm for the efficient evaluation of imaginary time diagrams based on the recently introduced discrete Lehmann representation (DLR) of imaginary time Green's functions. In addition to the efficient discretization of diagrammatic integrals afforded by its approximation properties, the DLR basis is separable in imaginary time, allowing us to decompose diagrams into linear combinations of nested sequences of one-dimensional products and convolutions. Focusing on the strong coupling bold-line expansion of generalized Anderson impurity models, we show that our strategy reduces the computational complexity of evaluating an $M$th-order diagram at inverse temperature $\beta$ and spectral width $\omega_{\max}$ from $\mathcal{O}((\beta \omega_{\max})^{2M-1})$ for a direct quadrature to $\mathcal{O}(M (\log (\beta \omega_{\max}))^{M+1})$, with controllable high-order accuracy. We benchmark our algorithm using third-order expansions for multi-band impurity problems with off-diagonal hybridization and spin-orbit coupling, presenting comparisons with exact diagonalization and quantum Monte Carlo approaches. In particular, we perform a self-consistent dynamical mean-field theory calculation for a three-band Hubbard model with strong spin-orbit coupling representing a minimal model of Ca$_2$RuO$_4$, demonstrating the promise of the method for modeling realistic strongly correlated multi-band materials. For both strong and weak coupling expansions of low and intermediate order, in which diagrams can be enumerated, our method provides an efficient, straightforward, and robust black-box evaluation procedure. In this sense, it fills a gap between diagrammatic approximations of the lowest order, which are simple and inexpensive but inaccurate, and those based on Monte Carlo sampling of high-order diagrams.
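The quoted complexity reduction can be made tangible with back-of-the-envelope numbers: direct quadrature scales as (βω_max)^(2M-1) while the DLR-based scheme scales as M·(log βω_max)^(M+1). The comparison below ignores constant prefactors, so only the growth trend, not absolute timings, is meaningful.

```python
# Compare the two asymptotic cost scalings quoted in the abstract for an
# M-th order diagram at inverse temperature beta and spectral width w_max.
# Prefactors are dropped; only relative growth is illustrated.
import math

def direct_cost(M, bw):
    """Direct quadrature: (beta * w_max)^(2M - 1)."""
    return bw ** (2 * M - 1)

def dlr_cost(M, bw):
    """DLR-based evaluation: M * (log(beta * w_max))^(M + 1)."""
    return M * math.log(bw) ** (M + 1)

for bw in (10.0, 100.0, 1000.0):
    for M in (2, 3):
        ratio = direct_cost(M, bw) / dlr_cost(M, bw)
        print(f"beta*w_max = {bw:>6}, M = {M}: speedup factor ~ {ratio:.3g}")
```

Even at modest βω_max the gap is many orders of magnitude, which is why low-temperature, wide-bandwidth regimes benefit most from the DLR decomposition.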


Cosmological constraints from non-Gaussian and nonlinear galaxy clustering using the SimBIG inference framework

ChangHoon Hahn, Pablo Lemos, Liam Parker, B. Régaldo-Saint Blancard, M. Eickenberg, Shirley Ho, Jiamin Hou, Elena Massara, Chirag Modi, Azadeh Moradinezhad Dizgah, David Spergel

The standard ΛCDM cosmological model predicts the presence of cold dark matter, with the current accelerated expansion of the Universe driven by dark energy. This model has recently come under scrutiny because of tensions in measurements of the expansion and growth histories of the Universe, parameterized using H0 and S8. The three-dimensional clustering of galaxies encodes key cosmological information that addresses these tensions. Here we present a set of cosmological constraints using simulation-based inference that exploits additional non-Gaussian information on nonlinear scales from galaxy clustering, inaccessible with current analyses. We analyse a subset of the Baryon Oscillation Spectroscopic Survey (BOSS) galaxy survey using SimBIG, a new framework for cosmological inference that leverages high-fidelity simulations and deep generative models. We use two clustering statistics beyond the standard power spectrum: the bispectrum and a summary of the galaxy field based on a convolutional neural network. We constrain H0 and S8 1.5 and 1.9 times more tightly than power spectrum analyses. With this increased precision, our constraints are competitive with those of other cosmological probes, even with only 10% of the full BOSS volume. Future work extending SimBIG to upcoming spectroscopic galaxy surveys (DESI, PFS, Euclid) will produce improved cosmological constraints that will develop understanding of cosmic tensions.
