178 Publications

Algorithms for solving high dimensional PDEs: from nonlinear Monte Carlo to machine learning

Weinan E, J. Han, Arnulf Jentzen

In recent years, tremendous progress has been made on numerical algorithms for solving partial differential equations (PDEs) in very high dimensions, using ideas from either nonlinear (multilevel) Monte Carlo or deep learning. These algorithms are potentially free of the curse of dimensionality for many different applications and have been proven to be so in the case of some nonlinear Monte Carlo methods for nonlinear parabolic PDEs. In this paper, we review these numerical and theoretical advances. In addition to algorithms based on stochastic reformulations of the original problem, such as the multilevel Picard iteration and the deep backward stochastic differential equation (BSDE) method, we also discuss algorithms based on the more traditional Ritz, Galerkin, and least-squares formulations. We hope to demonstrate to the reader that studying PDEs, as well as control and variational problems, in very high dimensions might very well be among the most promising new directions in mathematics and scientific computing in the near future.
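As context for the deep BSDE method named above, the stochastic reformulation it builds on can be stated in a standard textbook form (this is a paraphrase of the nonlinear Feynman-Kac connection, not text from the paper):

```latex
% Semilinear parabolic terminal-value problem
\partial_t u + \tfrac{1}{2}\,\mathrm{Tr}\!\big(\sigma\sigma^{\top}\,\mathrm{Hess}_x u\big)
  + \mu\cdot\nabla_x u + f\big(t,x,u,\sigma^{\top}\nabla_x u\big) = 0,
\qquad u(T,x) = g(x).

% Equivalent backward SDE along the forward diffusion
%   dX_t = \mu(t,X_t)\,dt + \sigma(t,X_t)\,dW_t :
% setting Y_t = u(t,X_t) and Z_t = \sigma^{\top}\nabla_x u(t,X_t) gives
dY_t = -f(t,X_t,Y_t,Z_t)\,dt + Z_t\cdot dW_t,
\qquad Y_T = g(X_T).
```

The deep BSDE method parameterizes the process Z_t (and the scalar Y_0) with neural networks and trains them by penalizing the mismatch with the terminal condition g(X_T), replacing grid-based discretization with Monte Carlo sampling of paths.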

Neural optimal feedback control with local learning rules

A major problem in motor control is understanding how the brain plans and executes proper movements in the face of delayed and noisy stimuli. A prominent framework for addressing such control problems is Optimal Feedback Control (OFC). OFC generates control actions that optimize behaviorally relevant criteria by integrating noisy sensory stimuli with the predictions of an internal model using the Kalman filter or its extensions. However, a satisfactory neural model of Kalman filtering and control is lacking, because existing proposals have the following limitations: they do not consider the delay of sensory feedback, they require training in alternating phases, and they demand knowledge of the noise covariance matrices as well as of the system dynamics. Moreover, the majority of these studies considered Kalman filtering in isolation rather than jointly with control. To address these shortcomings, we introduce a novel online algorithm that combines adaptive Kalman filtering with a model-free control approach (i.e., a policy gradient algorithm). We implement this algorithm in a biologically plausible neural network with local synaptic plasticity rules. This network performs system identification and Kalman filtering without the need for multiple phases with distinct update rules or knowledge of the noise covariances. With the help of an internal model, it can perform state estimation with delayed sensory feedback. It learns the control policy without requiring any knowledge of the dynamics, thus avoiding the need for weight transport. In this way, our implementation of OFC solves the credit assignment problem needed to produce appropriate sensorimotor control in the presence of stimulus delay.
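For reference, the classical Kalman filter recursion that such circuits are meant to implement fits in a minimal numpy sketch. This is the textbook algorithm with known matrices A, C, Q, R — exactly the knowledge the paper's adaptive, local-learning network does not assume:

```python
import numpy as np

def kalman_step(x_hat, P, y, A, C, Q, R):
    """One predict/update cycle of the classical Kalman filter.

    x_hat: state estimate, P: its covariance, y: new noisy observation,
    A: state transition, C: observation matrix, Q/R: noise covariances.
    """
    # Predict: propagate the estimate through the internal model.
    x_pred = A @ x_hat
    P_pred = A @ P @ A.T + Q
    # Update: blend the prediction with the observation via the Kalman gain.
    S = C @ P_pred @ C.T + R               # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(len(x_hat)) - K @ C) @ P_pred
    return x_new, P_new
```

The paper's contribution is to carry out this estimation, jointly with control, online and with local synaptic updates, without being handed A, C, Q, or R.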

Approximating the Gaussian as a Sum of Exponentials and Its Applications to the Fast Gauss Transform

We develop efficient and accurate sum-of-exponentials (SOE) approximations for the Gaussian using rational approximation of the exponential function on the negative real axis. Six-digit accuracy can be obtained with eight terms, and ten-digit accuracy with twelve terms. This representation is of potential interest in approximation theory, but we focus here on its use in accelerating the fast Gauss transform (FGT) in one and two dimensions. The one-dimensional scheme is particularly straightforward and easy to implement, requiring only twenty-four lines of MATLAB code. The two-dimensional version requires some care with data structures, but is significantly more efficient than existing FGTs. Following a detailed presentation of the theoretical foundations, we demonstrate the performance of the fast transforms with several numerical experiments.
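To make the exponential-sum idea concrete, here is a self-contained Python sketch of a one-dimensional fast Gauss transform built on an exponential-sum representation. Note the hedge: for lack of the paper's rational SOE coefficients, this uses a Fourier/trapezoid surrogate with roughly 65 terms (the paper's rational SOE reaches comparable accuracy with 8-12 terms); it still illustrates how replacing the Gaussian by a sum of exponentials makes the kernel separable and collapses the O(NM) sum to O((N+M)K) work:

```python
import numpy as np

rng = np.random.default_rng(0)
N = M = 2000                                   # sources and targets
delta = 0.01                                   # Gaussian bandwidth
y = rng.random(N); q = rng.standard_normal(N)  # source points / strengths
x = rng.random(M)                              # target points

# Exponential-sum surrogate for exp(-t^2) on |t| <= L, from the trapezoid
# rule applied to its Fourier integral:
#   exp(-t^2) = (1/(2*sqrt(pi))) * int exp(-s^2/4) exp(i s t) ds
L = 1.0 / np.sqrt(delta)          # max |x_i - y_j| / sqrt(delta)
dS = 2 * np.pi / (L + 6.0)        # node spacing (controls aliasing error)
K = int(np.ceil(12.2 / dS))       # truncation (controls tail error)
s = dS * np.arange(-K, K + 1)
c = dS / (2 * np.sqrt(np.pi)) * np.exp(-s**2 / 4)

# Fast transform: exp(-(x-y)^2/delta) ~ sum_k c_k e^{i s_k x/sd} e^{-i s_k y/sd},
# so per-frequency source sums are computed once, then evaluated at all targets.
sd = np.sqrt(delta)
moments = np.exp(-1j * np.outer(s, y / sd)) @ q
G_fast = (np.exp(1j * np.outer(x / sd, s)) @ (c * moments)).real

# Direct O(N*M) reference; agreement is near machine precision (~1e-12 here).
G_direct = np.exp(-(x[:, None] - y[None, :])**2 / delta) @ q
print("max abs error:", np.abs(G_fast - G_direct).max())
```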

Gauge equivariant neural networks for quantum lattice gauge theories

Di Luo, G. Carleo, Bryan K. Clark, J. Stokes

Gauge symmetries play a key role in physics, appearing in areas such as quantum field theories of the fundamental particles and emergent degrees of freedom in quantum materials. Motivated by the desire to efficiently simulate many-body quantum systems with exact local gauge invariance, gauge equivariant neural-network quantum states are introduced, which exactly satisfy the local Hilbert space constraints necessary for the description of quantum lattice gauge theory with Z_d gauge group on different geometries. Focusing on the special case of the Z_2 gauge group on a periodically identified square lattice, the equivariant architecture is analytically shown to contain the loop-gas solution as a special case. Gauge equivariant neural-network quantum states are used in combination with variational quantum Monte Carlo to obtain compact descriptions of the ground-state wavefunction for the Z_2 theory away from the exactly solvable limit, and to demonstrate the confining/deconfining phase transition via the Wilson loop order parameter.
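For orientation, a standard form of the quantum Z_2 lattice gauge theory that such states describe (conventions vary; this is the common transverse-field form, not necessarily the paper's normalization). Spins sit on links ℓ, p runs over plaquettes, and v over vertices:

```latex
H \;=\; -\sum_{\ell} \sigma^{x}_{\ell}
      \;-\; \lambda \sum_{p} \prod_{\ell \in \partial p} \sigma^{z}_{\ell},
\qquad
G_{v} \;=\; \prod_{\ell \ni v} \sigma^{x}_{\ell},
\qquad [G_{v}, H] = 0 .
```

The Gauss-law operators G_v generate the local gauge transformations; restricting the wavefunction to the sector G_v|ψ⟩ = |ψ⟩ is the local Hilbert space constraint that the equivariant architecture enforces exactly.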

A Normative and Biologically Plausible Algorithm for Independent Component Analysis

The brain effortlessly solves blind source separation (BSS) problems, but the algorithm it uses remains elusive. In signal processing, linear BSS problems are often solved by Independent Component Analysis (ICA). To serve as a model of a biological circuit, an ICA neural network (NN) must satisfy at least the following requirements: (1) the algorithm must operate in the online setting, where data samples are streamed one at a time and the NN computes the sources on the fly without storing any significant fraction of the data in memory; (2) the synaptic weight updates must be local, i.e., depend only on the biophysical variables present in the vicinity of a synapse. Here, we propose a novel objective function for ICA from which we derive a biologically plausible NN, including both the neural architecture and the synaptic learning rules. Interestingly, our algorithm relies on modulating synaptic plasticity by the total activity of the output neurons. In the brain, this could be accomplished by neuromodulators, extracellular calcium, local field potential, or nitric oxide.
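For a sense of what an online ICA update with a local flavor looks like, here is the classical EASI rule of Cardoso and Laheld in numpy. This is a well-known reference algorithm, not the network derived in this paper (whose plasticity is modulated by total output activity); the sources and learning rate below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two independent sources, streamed one sample at a time.
T = 20000
S = np.vstack([np.sign(np.sin(2 * np.pi * 0.013 * np.arange(T))),  # square wave
               rng.uniform(-1, 1, T)])                             # uniform noise
A = rng.standard_normal((2, 2))      # unknown mixing matrix
X = A @ S                            # observed mixtures

W = np.eye(2)                        # unmixing estimate
mu = 1e-3                            # learning rate
for t in range(T):
    y = W @ X[:, t]
    g = y**3                         # cubic nonlinearity, suits sub-Gaussian sources
    # EASI update: relative-gradient step enforcing decorrelation + independence.
    W -= mu * ((np.outer(y, y) - np.eye(2)
                + np.outer(g, y) - np.outer(y, g)) @ W)

Y = W @ X
# Up to permutation and sign, recovered components should match the sources.
corr = np.corrcoef(np.vstack([S, Y]))[:2, 2:]
print("|corr| between true and recovered sources:\n", np.abs(corr).round(2))
```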

Shadow tomography based on informationally complete positive operator-valued measure

Atithi Acharya, Siddhartha Saha, A. Sengupta

Recently introduced shadow tomography protocols use “classical shadows” of quantum states to predict many target functions of an unknown quantum state. Unlike full quantum state tomography, shadow tomography does not insist on accurate recovery of the density matrix for high-rank mixed states. Yet such a protocol makes multiple accurate predictions with high confidence, based on a moderate number of quantum measurements. One particularly influential algorithm, proposed by Huang et al. [Huang, Kueng, and Preskill, Nat. Phys. 16, 1050 (2020)], requires additional circuits for performing certain random unitary transformations. In this paper, we avoid these transformations and instead employ an arbitrary informationally complete positive operator-valued measure (IC-POVM), and show that such a procedure can compute k-bit correlation functions for quantum states reliably. We also show that, for this application, we do not need the median-of-means procedure of Huang et al. Finally, we discuss the contrast between the computation of correlation functions and the fidelity of reconstruction of low-rank density matrices.
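To make the POVM-based shadow idea concrete, here is a minimal single-qubit sketch of my own using the tetrahedral SIC-POVM (one particular informationally complete POVM): each measurement outcome yields an unbiased single-shot estimator of ρ, with no random unitary circuits required.

```python
import numpy as np

rng = np.random.default_rng(2)

# Tetrahedral SIC-POVM for one qubit: effects E_i = |psi_i><psi_i| / 2.
paulis = [np.array([[0, 1], [1, 0]], complex),
          np.array([[0, -1j], [1j, 0]], complex),
          np.array([[1, 0], [0, -1]], complex)]
bloch = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
proj = [(np.eye(2) + sum(n[k] * paulis[k] for k in range(3))) / 2 for n in bloch]

# Unknown state to learn: a slightly mixed state polarized along +z.
r = np.array([0.3, 0.1, 0.8])
rho = (np.eye(2) + sum(r[k] * paulis[k] for k in range(3))) / 2

# Simulate measurements: outcome i occurs with probability Tr(rho E_i).
p = np.array([np.trace(rho @ P).real / 2 for P in proj])
p /= p.sum()                       # guard against rounding
shots = rng.choice(4, size=100_000, p=p)

# Single-shot classical shadow for this POVM: rho_hat = 3|psi_i><psi_i| - I
# (unbiased: averaging over outcome probabilities reproduces rho exactly).
shadows = np.stack([3 * proj[i] - np.eye(2) for i in shots])
rho_est = shadows.mean(axis=0)

O = paulis[2]                      # estimate <sigma_z>
print("true   <Z>:", np.trace(rho @ O).real)
print("shadow <Z>:", np.trace(rho_est @ O).real)
```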

Perturbational Complexity by Distribution Mismatch: A Systematic Analysis of Reinforcement Learning in Reproducing Kernel Hilbert Space

Jihao Long, J. Han

Most existing theoretical analysis of reinforcement learning (RL) is limited to the tabular setting or to linear models, due to the difficulty of dealing with function approximation in high-dimensional spaces with an uncertain environment. This work offers a fresh perspective on this challenge by analyzing RL in a general reproducing kernel Hilbert space (RKHS). We consider a family of Markov decision processes ℳ whose reward functions lie in the unit ball of an RKHS and whose transition probabilities lie in a given arbitrary set. We define a quantity called the perturbational complexity by distribution mismatch, Δ(ϵ), to characterize the complexity of the admissible state-action distribution space in response to a perturbation in the RKHS of scale ϵ. We show that Δ(ϵ) gives both a lower bound on the error of all possible algorithms and an upper bound on the error of two specific algorithms (fitted reward and fitted Q-iteration) for the RL problem. Hence, the decay of Δ(ϵ) with respect to ϵ measures the difficulty of the RL problem on ℳ. We further provide some concrete examples and discuss whether Δ(ϵ) decays fast or not in these examples. As a byproduct, we show that when the reward functions lie in a high-dimensional RKHS, even if the transition probability is known and the action space is finite, it is still possible for RL problems to suffer from the curse of dimensionality.
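Fitted Q-iteration, one of the two algorithms whose error the bound controls, is easy to sketch with a kernel ridge regressor. The toy MDP, kernel, and hyperparameters below are illustrative choices of mine, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(3)
gamma, n, n_iters, lam, ell = 0.9, 500, 50, 1e-3, 0.2
actions = np.array([-1.0, 1.0])

def step(s, a):
    """Toy 1-d MDP: drift by 0.1*a plus noise; reward peaks at s = 0.7."""
    s2 = np.clip(s + 0.1 * a + 0.02 * rng.standard_normal(s.shape), 0.0, 1.0)
    return s2, -(s - 0.7) ** 2

def K(X, Y):
    """Gaussian (RBF) kernel matrix -- the RKHS in which we regress."""
    return np.exp(-(X[:, None] - Y[None, :]) ** 2 / (2 * ell ** 2))

# Offline dataset of random transitions, one batch per action.
S = rng.random(n)
data = {a: step(S, a) for a in actions}          # a -> (next states, rewards)

# Fitted Q-iteration: regress Bellman targets in the RKHS, per action.
alpha = {a: np.zeros(n) for a in actions}        # kernel ridge coefficients
Kss = K(S, S)
for _ in range(n_iters):
    for a in actions:
        S2, R = data[a]
        # Evaluate Q_k(s', a') at sampled next states, then max over a'.
        Qnext = np.max([K(S2, S) @ alpha[b] for b in actions], axis=0)
        target = R + gamma * Qnext
        alpha[a] = np.linalg.solve(Kss + lam * np.eye(n), target)

# Greedy policy at a few states; it should steer toward s = 0.7.
test = np.array([0.1, 0.5, 0.9])
Qtest = np.array([K(test, S) @ alpha[a] for a in actions])
print("greedy actions:", actions[np.argmax(Qtest, axis=0)])
```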

Encoding Priors with Group Sparsity for Model Learning From Limited and Noisy Biological Data

Suryanarayana Maddu, Dominik Sturm, Bevan L Cheeseman, C. Müller, Ivo F Sbalzarini

Numerical methods for approximately solving partial differential equations (PDEs) are at the core of scientific computing. Often, this requires high-resolution or adaptive discretization grids to capture relevant spatio-temporal features in the PDE solution, e.g., in applications like turbulence, combustion, and shock propagation. Numerical approximation also requires knowing the PDE in order to construct problem-specific discretizations. Systematically deriving such solution-adaptive discrete operators, however, is a current challenge. Here we present an artificial neural network architecture for data-driven learning of problem- and resolution-specific local discretizations of nonlinear PDEs. Our proposed method achieves numerically stable discretization of the operators in an unknown nonlinear PDE by spatially and temporally adaptive parametric pooling on regular Cartesian grids, and by incorporating knowledge about discrete time integration. Knowing the actual PDE is not necessary, as solution data suffice to train the network to learn the discrete operators. A once-trained neural architecture model can be used to predict solutions of the PDE on larger spatial domains and for longer times than it was trained for, hence addressing the problem of PDE-constrained extrapolation from data. We present demonstrative examples on long-term forecasting of hard numerical problems, including equation-free forecasting of the nonlinear dynamics of the forced Burgers problem on coarse spatio-temporal grids.
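For contrast, here is the kind of hand-derived, problem-specific discretization that such learned operators stand in for: a minimal explicit finite-difference step for the forced viscous Burgers equation u_t + u u_x = ν u_xx + f on a periodic grid (a textbook scheme of my own, not the paper's architecture):

```python
import numpy as np

def burgers_step(u, f, dt, dx, nu):
    """One forward-Euler step of u_t + u u_x = nu * u_xx + f,
    using central differences on a periodic 1-d grid."""
    ux = (np.roll(u, -1) - np.roll(u, 1)) / (2 * dx)        # first derivative
    uxx = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2  # second derivative
    return u + dt * (-u * ux + nu * uxx + f)

# Example: evolve a sine profile under weak sinusoidal forcing.
nx, nu = 256, 0.05
x = np.linspace(0.0, 2 * np.pi, nx, endpoint=False)
dx = x[1] - x[0]
dt = 0.2 * min(dx, dx**2 / nu)   # crude stability margin for the explicit step
u = np.sin(x)
for _ in range(1000):
    u = burgers_step(u, 0.01 * np.sin(x), dt, dx, nu)
print("solution range after 1000 steps:", u.min(), u.max())
```

A learned discretization keeps this explicit time-stepping structure but replaces the fixed central-difference stencils with spatially and temporally adaptive, data-fitted ones.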
