443 Publications

Variational Inference with Gaussian Score Matching

C. Modi, C. Margossian, Y. Yao, R. M. Gower, D. Blei, L. Saul

Variational inference (VI) is a method to approximate the computationally intractable posterior distributions that arise in Bayesian statistics. Typically, VI fits a simple parametric distribution to be close to the target posterior, optimizing an appropriate objective such as the evidence lower bound (ELBO). In this work, we present a new approach to VI. Our method is based on the principle of score matching: namely, that if two distributions are equal then their score functions (i.e., gradients of the log density) are equal at every point on their support. With this principle, we develop score-matching VI, an iterative algorithm that seeks to match the scores between the variational approximation and the exact posterior. At each iteration, score-matching VI solves an inner optimization, one that minimally adjusts the current variational estimate to match the scores at a newly sampled value of the latent variables. We show that when the variational family is a Gaussian, this inner optimization enjoys a closed-form solution, which we call Gaussian score matching VI (GSM-VI). GSM-VI is a "black box" variational algorithm in that it only requires a differentiable joint distribution, and as such it can be applied to a wide class of models. We compare GSM-VI to black box variational inference (BBVI), which has similar requirements but instead optimizes the ELBO. We first study how GSM-VI behaves as a function of the problem dimensionality, the condition number of the target covariance matrix (when the target is Gaussian), and the degree of mismatch between the approximating and exact posterior distribution. We then study GSM-VI on a collection of real-world Bayesian inference problems from the posteriorDB database of datasets and models. We find that GSM-VI is faster than BBVI and equally or more accurate. Specifically, over a wide range of target posteriors, GSM-VI requires 10-100x fewer gradient evaluations than BBVI to obtain a comparable quality of approximation.
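
The score-matching principle in this abstract can be illustrated with a small sketch. It is not the GSM-VI update itself, whose closed form is not given here. It only uses the fact that a Gaussian's score is affine in z, so a batch least-squares fit to scores recovers the mean and covariance when the target is itself Gaussian. The function names and the batch regression are assumptions made for illustration.

```python
import numpy as np

def gaussian_score(z, mu, cov):
    """Score (gradient of the log density) of N(mu, cov) at each row of z."""
    return -np.linalg.solve(cov, (z - mu).T).T

def fit_gaussian_from_scores(z, scores):
    """Hypothetical helper: recover (mu, cov) from scores at the points z.

    For a Gaussian, score(z) = A z + b with A = -inv(cov) and b = inv(cov) mu,
    so A and b follow from an ordinary least-squares fit.
    """
    design = np.hstack([z, np.ones((z.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    a_mat = coef[:-1].T   # rows of coef[:-1] hold A transposed
    b_vec = coef[-1]
    cov = -np.linalg.inv(a_mat)
    mu = cov @ b_vec
    return mu, cov

# Usage: a Gaussian "posterior" whose scores we can evaluate exactly.
rng = np.random.default_rng(0)
mu_true = np.array([1.0, -2.0])
cov_true = np.array([[2.0, 0.5], [0.5, 1.0]])
z = rng.normal(size=(10, 2))
mu_fit, cov_fit = fit_gaussian_from_scores(z, gaussian_score(z, mu_true, cov_true))
print(mu_fit, cov_fit)
```

For a non-Gaussian target the scores are no longer affine, and a fit like this would only give an approximation that depends on where the points are sampled.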

FMM-accelerated solvers for the Laplace-Beltrami problem on complex surfaces in three dimensions

Dhwanit Agarwal, Michael O'Neil, M. Rachh

The Laplace–Beltrami problem on closed surfaces embedded in three dimensions arises in many areas of physics, including molecular dynamics (surface diffusion), electromagnetics (harmonic vector fields), and fluid dynamics (vesicle deformation). Using classical potential theory, the Laplace–Beltrami operator can be pre-/post-conditioned with an integral operator whose kernel is translation invariant, resulting in well-conditioned Fredholm integral equations of the second kind. These equations have the standard 1/r kernel from potential theory, and therefore the equations can be solved rapidly and accurately using a combination of fast multipole methods (FMMs) and high-order quadrature corrections. In this work we detail such a scheme, presenting two alternative integral formulations of the Laplace–Beltrami problem, each of whose solutions can be obtained via FMM acceleration. We then present several applications of the solvers, focusing on the computation of what are known as harmonic vector fields, relevant for many applications in electromagnetics. A battery of numerical results is presented for each application, detailing the performance of the solver in various geometries.

A Gentle Introduction to Gradient-Based Optimization and Variational Inequalities for Machine Learning

N. Wadia, Yatin Dandi, Michael I. Jordan

The rapid progress in machine learning in recent years has been based on a highly productive connection to gradient-based optimization. Further progress hinges in part on a shift in focus from pattern recognition to decision-making and multi-agent problems. In these broader settings, new mathematical challenges emerge that involve equilibria and game theory instead of optima. Gradient-based methods remain essential, given the high dimensionality and large scale of machine-learning problems, but simple gradient descent is no longer the point of departure for algorithm design. We provide a gentle introduction to a broader framework for gradient-based algorithms in machine learning, beginning with saddle points and monotone games, and proceeding to general variational inequalities. While we provide convergence proofs for several of the algorithms that we present, our main focus is that of providing motivation and intuition.
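
A standard toy example shows why simple gradient descent stops being the natural starting point for saddle-point problems. The sketch below is not taken from the paper. It assumes the bilinear game min over x, max over y of f(x, y) = x*y, and compares simultaneous gradient descent-ascent with the extragradient method. The step size and iteration count are arbitrary choices.

```python
import math

def gda_step(x, y, lr):
    """Simultaneous gradient descent (in x) and ascent (in y) on f(x, y) = x * y."""
    return x - lr * y, y + lr * x

def extragradient_step(x, y, lr):
    """Extragradient: take a look-ahead step, then update using its gradients."""
    x_half, y_half = x - lr * y, y + lr * x
    return x - lr * y_half, y + lr * x_half

def distance_to_saddle(step, x, y, lr, iters):
    """Run `step` for `iters` iterations; return the distance to the saddle (0, 0)."""
    for _ in range(iters):
        x, y = step(x, y, lr)
    return math.hypot(x, y)

# Usage: same start, step size, and budget for both methods.
print("GDA:          ", distance_to_saddle(gda_step, 1.0, 1.0, 0.1, 2000))
print("extragradient:", distance_to_saddle(extragradient_step, 1.0, 1.0, 0.1, 2000))
```

On this game each descent-ascent step multiplies the squared distance to the saddle by 1 + lr^2, so the iterates spiral outward. Each extragradient step multiplies it by 1 - lr^2 + lr^4, which is below 1 for 0 < lr < 1.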

Reinforcement learning with function approximation: From linear to nonlinear

Jihao Long, J. Han

Function approximation has been an indispensable component in modern reinforcement learning algorithms designed to tackle problems with large state spaces in high dimensions. This paper reviews recent results on error analysis for these reinforcement learning algorithms in linear or nonlinear approximation settings, emphasizing approximation error and estimation error/sample complexity. We discuss various properties related to approximation error and present concrete conditions on transition probability and reward function under which these properties hold true. Sample complexity analysis in reinforcement learning is more complicated than in supervised learning, primarily due to the distribution mismatch phenomenon. With assumptions on the linear structure of the problem, numerous algorithms in the literature achieve polynomial sample complexity with respect to the number of features, episode length, and accuracy, although the minimax rate has not been achieved yet. These results rely on the $L^\infty$ and UCB estimation of estimation error, which can handle the distribution mismatch phenomenon. The problem and analysis become substantially more challenging in the setting of nonlinear function approximation, as both $L^\infty$ and UCB estimation are inadequate for bounding the error with a favorable rate in high dimensions. We discuss additional assumptions necessary to address the distribution mismatch and derive meaningful results for nonlinear RL problems.

An equivariant neural operator for developing nonlocal tensorial constitutive models

J. Han, Xu-Hui Zhou, Heng Xiao

Developing robust constitutive models is a fundamental and longstanding problem for accelerating the simulation of complicated physics. Machine learning provides promising tools to construct constitutive models based on various calibration data. In this work, we propose a neural operator to develop nonlocal constitutive models for tensorial quantities through a vector-cloud neural network with equivariance (VCNN-e). The VCNN-e respects all the invariance properties desired by constitutive models, faithfully reflects the region of influence in physics, and is applicable to different spatial resolutions. By design, the model guarantees that the predicted tensor is invariant to the frame translation and ordering (permutation) of the neighboring points. Furthermore, it is equivariant to the frame rotation, i.e., the output tensor co-rotates with the coordinate frame. We evaluate the VCNN-e by using it to emulate the Reynolds stress transport model for turbulent flows, which directly computes the Reynolds stress tensor to close the Reynolds-averaged Navier–Stokes (RANS) equations. The evaluation is performed in two situations: (1) emulating the Reynolds stress model through synthetic data generated from the Reynolds stress transport equations with closure models, and (2) predicting the Reynolds stress by learning from data generated from direct numerical simulations. Such a priori evaluations of the proposed network pave the way for developing and calibrating robust and nonlocal, non-equilibrium closure models for the RANS equations.
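
The permutation invariance and rotation equivariance described in this abstract can be checked numerically on a toy map from a point cloud to a rank-2 tensor. The sketch below is not the VCNN-e architecture. It assumes a hand-written map T = sum_i w(|x_i|) x_i x_i^T with a radial weight w(r) = exp(-r^2), where the rows of `points` are taken to be coordinates relative to the query location (which is what would make such a map translation invariant).

```python
import numpy as np

def toy_tensor(points):
    """Toy tensor-valued map of a point cloud: sum_i exp(-|x_i|^2) x_i x_i^T."""
    weights = np.exp(-np.sum(points**2, axis=1))
    return np.einsum("i,ij,ik->jk", weights, points, points)

def random_rotation(rng):
    """Random proper 3x3 rotation from a QR factorization (determinant +1)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

# Usage: rotate the frame and shuffle the neighbors, then compare outputs.
rng = np.random.default_rng(0)
points = rng.normal(size=(8, 3))
rot = random_rotation(rng)
tensor = toy_tensor(points)

# Equivariance: the output co-rotates, T(R x) = R T(x) R^T.
print(np.allclose(toy_tensor(points @ rot.T), rot @ tensor @ rot.T))
# Invariance to the ordering of neighboring points.
print(np.allclose(toy_tensor(points[rng.permutation(8)]), tensor))
```

Both properties hold here by construction, because the weights depend only on distances and the sum does not depend on ordering. A learned model has to build the same structure into its architecture.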

Automatic, high-order, and adaptive algorithms for Brillouin zone integration

J. Kaye, Sophie Beck, A. Barnett, Lorenzo Van Muñoz, Olivier Parcollet

We present efficient methods for Brillouin zone integration with a non-zero but possibly very small broadening factor η, focusing on cases in which downfolded Hamiltonians can be evaluated efficiently using Wannier interpolation. We describe robust, high-order accurate algorithms automating convergence to a user-specified error tolerance ϵ, emphasizing an efficient computational scaling with respect to η. After analyzing the standard equispaced integration method, applicable in the case of large broadening, we describe a simple iterated adaptive integration algorithm effective in the small η regime. Its computational cost scales as $O(\log^3(\eta^{-1}))$ as $\eta \to 0^+$ in three dimensions, as opposed to $O(\eta^{-3})$ for equispaced integration. We argue that, by contrast, tree-based adaptive integration methods scale only as $O(\log(\eta^{-1})/\eta^2)$ for typical Brillouin zone integrals. In addition to its favorable scaling, the iterated adaptive algorithm is straightforward to implement, particularly for integration on the irreducible Brillouin zone, for which it avoids the tetrahedral meshes required for tree-based schemes. We illustrate the algorithms by calculating the spectral function of SrVO$_3$ with broadening on the meV scale.
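
The contrast between equispaced and adaptive integration at small broadening can be seen in one dimension. The sketch below is not the authors' algorithm. It assumes a toy one-band model with dispersion cos(k), for which the integral of 1/(ω - cos k + iη) over the zone has a closed form, and it uses a plain adaptive Simpson rule as a stand-in for a high-order adaptive scheme. All function names are illustrative.

```python
import cmath
import math

def integrand(k, omega, eta):
    """Toy Green's function integrand for the dispersion cos(k), normalized by 2*pi."""
    return 1.0 / (omega - math.cos(k) + 1j * eta) / (2.0 * math.pi)

def exact(omega, eta):
    """Closed form of the integral over [-pi, pi] for eta > 0."""
    z = complex(omega, eta)
    return 1.0 / (z * cmath.sqrt(1.0 - 1.0 / z**2))

def equispaced(omega, eta, n):
    """Periodic trapezoid rule with n equispaced points."""
    h = 2.0 * math.pi / n
    return h * sum(integrand(-math.pi + j * h, omega, eta) for j in range(n))

def adaptive_simpson(f, a, b, tol, max_depth=50):
    """Adaptive Simpson quadrature with an explicit stack (no recursion)."""
    fa, fm, fb = f(a), f(0.5 * (a + b)), f(b)
    whole = (b - a) / 6.0 * (fa + 4.0 * fm + fb)
    stack = [(a, b, fa, fm, fb, whole, tol, 0)]
    total = 0.0
    while stack:
        a, b, fa, fm, fb, whole, tol, depth = stack.pop()
        m = 0.5 * (a + b)
        flm, frm = f(0.5 * (a + m)), f(0.5 * (m + b))
        left = (m - a) / 6.0 * (fa + 4.0 * flm + fm)
        right = (b - m) / 6.0 * (fm + 4.0 * frm + fb)
        err = left + right - whole
        if abs(err) <= 15.0 * tol or depth >= max_depth:
            total += left + right + err / 15.0  # accept with Richardson correction
        else:  # refine only where the local error estimate is too large
            stack.append((a, m, fa, flm, fm, left, 0.5 * tol, depth + 1))
            stack.append((m, b, fm, frm, fb, right, 0.5 * tol, depth + 1))
    return total

# Usage: compare both rules against the closed form at small broadening.
omega, eta = 0.5, 1e-2
reference = exact(omega, eta)
adaptive = adaptive_simpson(lambda k: integrand(k, omega, eta), -math.pi, math.pi, 1e-8)
print("adaptive error:     ", abs(adaptive - reference))
print("equispaced, n=64:   ", abs(equispaced(omega, eta, 64) - reference))
print("equispaced, n=4096: ", abs(equispaced(omega, eta, 4096) - reference))
```

In this toy problem the equispaced rule converges geometrically, but only once the grid spacing resolves the peak width, which is of order η. The adaptive rule concentrates its points near the two peaks instead.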

DeePMD-kit v2: A software package for deep potential models

Jinzhe Zeng, Duo Zhang, J. Han

DeePMD-kit is a powerful open-source software package that facilitates molecular dynamics simulations using machine learning potentials known as Deep Potential (DP) models. This package, which was released in 2017, has been widely used in the fields of physics, chemistry, biology, and material science for studying atomistic systems. The current version of DeePMD-kit offers numerous advanced features, such as DeepPot-SE, attention-based and hybrid descriptors, the ability to fit tensile properties, type embedding, model deviation, DP-range correction, DP long range, graphics processing unit support for customized operators, model compression, non-von Neumann molecular dynamics, and improved usability, including documentation, compiled binary packages, graphical user interfaces, and application programming interfaces. This article presents an overview of the current major version of the DeePMD-kit package, highlighting its features and technical details. Additionally, this article presents a comprehensive procedure for conducting molecular dynamics as a representative application, benchmarks the accuracy and efficiency of different models, and discusses ongoing developments.

Normative framework for deriving neural networks with multi-compartmental neurons and non-Hebbian plasticity

D. Lipshutz, Y. Bahroun, S. Golkar, A. Sengupta, D. Chklovskii

An established normative approach for understanding the algorithmic basis of neural computation is to derive online algorithms from principled computational objectives and evaluate their compatibility with anatomical and physiological observations. Similarity matching objectives have served as successful starting points for deriving online algorithms that map onto neural networks (NNs) with point neurons and Hebbian/anti-Hebbian plasticity. These NN models account for many anatomical and physiological observations; however, the objectives have limited computational power and the derived NNs do not explain multi-compartmental neuronal structures and non-Hebbian forms of plasticity that are prevalent throughout the brain. In this article, we unify and generalize recent extensions of the similarity matching approach to address more complex objectives, including a large class of unsupervised and self-supervised learning tasks that can be formulated as symmetric generalized eigenvalue problems or nonnegative matrix factorization problems. Interestingly, the online algorithms derived from these objectives naturally map onto NNs with multi-compartmental neurons and local, non-Hebbian learning rules. Therefore, this unified extension of the similarity matching approach provides a normative framework that facilitates understanding multi-compartmental neuronal structures and non-Hebbian plasticity found throughout the brain.

A fast time domain solver for the equilibrium Dyson equation

J. Kaye, Hugo U. R. Strand

We consider the numerical solution of the real time equilibrium Dyson equation, which is used in calculations of the dynamical properties of quantum many-body systems. We show that this equation can be written as a system of coupled, nonlinear, convolutional Volterra integro-differential equations, for which the kernel depends self-consistently on the solution. As is typical in the numerical solution of Volterra-type equations, the computational bottleneck is the quadratic-scaling cost of history integration. However, the structure of the nonlinear Volterra integral operator precludes the use of standard fast algorithms. We propose a quasilinear-scaling FFT-based algorithm which respects the structure of the nonlinear integral operator. The resulting method can reach large propagation times, and is thus well-suited to explore quantum many-body phenomena at low energy scales. We demonstrate the solver with two standard model systems: the Bethe graph, and the Sachdev-Ye-Kitaev model.
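
The quadratic cost of history integration mentioned in this abstract, and the role of the FFT in reducing it, can be sketched for the simplest case. The code below is not the paper's algorithm. It assumes a fixed, already known kernel on a uniform time grid, whereas in the Dyson equation the kernel depends self-consistently on the solution, which is the difficulty the paper addresses. It compares the direct sum c[n] = sum over m <= n of kernel[n-m] * y[m] with a zero-padded FFT evaluation of the same quantity.

```python
import numpy as np

def history_sum_direct(kernel, y):
    """Causal convolution c[n] = sum_{m<=n} kernel[n-m] * y[m], at O(N^2) cost."""
    n = len(y)
    return np.array([np.dot(kernel[: i + 1][::-1], y[: i + 1]) for i in range(n)])

def history_sum_fft(kernel, y):
    """Same quantity in O(N log N), assuming the whole of y is known in advance."""
    n = len(y)
    size = 2 * n  # zero-padding turns the circular convolution into a linear one
    product = np.fft.fft(kernel, size) * np.fft.fft(y, size)
    return np.fft.ifft(product)[:n]

# Usage: complex-valued samples on a uniform grid of 256 time points.
rng = np.random.default_rng(0)
kernel = rng.normal(size=256) + 1j * rng.normal(size=256)
y = rng.normal(size=256) + 1j * rng.normal(size=256)
print(np.allclose(history_sum_direct(kernel, y), history_sum_fft(kernel, y)))
```

In a time-stepping solver y[n] is not available until step n has been computed, so a single global FFT like this cannot be applied directly. Fast Volterra solvers have to restructure the sum for that reason.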

Conformational heterogeneity and probability distributions from single-particle cryo-electron microscopy

W. S. Wai Shing, Ellen D. Zhong, S. Hanson, E. Thiede, P. Cossio

Single-particle cryo-electron microscopy (cryo-EM) is a technique that takes projection images of biomolecules frozen at cryogenic temperatures. A major advantage of this technique is its ability to image single biomolecules in heterogeneous conformations. While this poses a challenge for data analysis, recent algorithmic advances have enabled the recovery of heterogeneous conformations from the noisy imaging data. Here, we review methods for the reconstruction and heterogeneity analysis of cryo-EM images, ranging from linear-transformation-based methods to nonlinear deep generative models. We overview the dimensionality-reduction techniques used in heterogeneous 3D reconstruction methods and specify what information each method can infer from the data. Then, we review the methods that use cryo-EM images to estimate probability distributions over conformations in reduced subspaces or predefined by atomistic simulations. We conclude with the ongoing challenges for the cryo-EM community.
