381 Publications

Comment on “Stepped pressure profile equilibria in cylindrical plasmas via partial Taylor relaxation” [J. Plasma Physics (2006), vol. 72, part 6, pp. 1167–1171]

Yuanfan Wang, D. Malhotra, Antoine J. Cerfon

In an early study of the properties and capabilities of the multiregion, relaxed magnetohydrodynamic model, Hole, Hudson & Dewar claim that they are able to construct a multiregion stepped pressure cylindrical equilibrium which does not require the existence of surface currents. We present a brief argument showing that this claim is incorrect, and clarify the meaning of their statement. Furthermore, even with the statement clarified, we demonstrate that it is not possible to find solutions to reproduce the equilibrium corresponding to the parameters given in the article. We invite the authors to provide a corrigendum with the correct values of the equilibrium they constructed.


Fast computation of latent correlations

Grace Yoon, C. Müller, Irina Gaynanova

Latent Gaussian copula models provide a powerful means to perform multi-view data integration since these models can seamlessly express dependencies between mixed variable types (binary, continuous, zero-inflated) via latent Gaussian correlations. The estimation of these latent correlations, however, comes at considerable computational cost, having prevented the routine use of these models on high-dimensional data. Here, we propose a new computational approach for estimating latent correlations via a hybrid multi-linear interpolation and optimization scheme. Our approach speeds up the current state-of-the-art computation by several orders of magnitude, thus allowing fast computation of latent Gaussian copula models even when the number of variables p is large. We provide theoretical guarantees for the approximation error of our numerical scheme and support its excellent performance on simulated and real-world data. We illustrate the practical advantages of our method on high-dimensional sparse quantitative and relative abundance microbiome data as well as multi-view data from The Cancer Genome Atlas Project. Our method is implemented in the R package mixedCCA, available at https://github.com/irinagain/mixedCCA.
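
As a point of orientation, the simplest latent-correlation case can be sketched in a few lines. For a pair of continuous variables under a Gaussian copula, Kendall's tau and the latent correlation are linked by the classical relation tau = (2/pi) arcsin(rho), which inverts in closed form. The Python sketch below covers only that case; it is not the interpolation scheme of the abstract and not the mixedCCA implementation, and the function names are hypothetical.

```python
import math


def kendall_tau(x, y):
    """Naive O(n^2) Kendall's tau (tau-a) for two equal-length sequences."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                score += 1   # concordant pair
            elif prod < 0:
                score -= 1   # discordant pair
    return 2.0 * score / (n * (n - 1))


def latent_corr_from_tau(tau):
    """Invert tau = (2/pi) * arcsin(rho) for the continuous/continuous case."""
    return math.sin(math.pi * tau / 2.0)


if __name__ == "__main__":
    x = [0.1, 0.4, 0.35, 0.8, 0.9]
    y = [1.0, 1.3, 1.7, 2.1, 2.0]
    tau = kendall_tau(x, y)
    print(tau, latent_corr_from_tau(tau))
```

Pairs involving binary or zero-inflated variables do not fit this closed-form inversion, so the sketch should not be read as covering the mixed-type setting the abstract targets.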


More data or more parameters? Investigating the effect of data structure on generalization

Stéphane d'Ascoli, M. Gabrié, Levent Sagun, G. Biroli

One of the central features of deep learning is the generalization abilities of neural networks, which seem to improve relentlessly with over-parametrization. In this work, we investigate how properties of data impact the test error as a function of the number of training examples and number of training parameters; in other words, how the structure of data shapes the "generalization phase space". We first focus on the random features model trained in the teacher-student scenario. The synthetic input data is composed of independent blocks, which allow us to tune the saliency of low-dimensional structures and their relevance with respect to the target function. Using methods from statistical physics, we obtain an analytical expression for the train and test errors for both regression and classification tasks in the high-dimensional limit. The derivation allows us to show that noise in the labels and strong anisotropy of the input data play similar roles on the test error. Both promote an asymmetry of the phase space where increasing the number of training examples improves generalization further than increasing the number of training parameters. Our analytical insights are confirmed by numerical experiments involving fully-connected networks trained on MNIST and CIFAR10.

March 9, 2021

Aliasing error of the $\exp(\beta\sqrt{1-z^2})$ kernel in the nonuniform fast Fourier transform

The most popular algorithm for the nonuniform fast Fourier transform (NUFFT) uses the dilation of a kernel $\phi$ to spread (or interpolate) between given nonuniform points and a uniform upsampled grid, combined with an FFT and diagonal scaling (deconvolution) in frequency space. The high performance of the recent FINUFFT library is in part due to its use of a new ``exponential of semicircle'' kernel $\phi(z)=e^{\beta \sqrt{1-z^2}}$, for $z\in[-1,1]$, zero otherwise, whose Fourier transform $\hat\phi$ is unknown analytically. We place this kernel on a rigorous footing by proving an aliasing error estimate which bounds the error of the one-dimensional NUFFT of types 1 and 2 in exact arithmetic. Asymptotically in the kernel width measured in upsampled grid points, the error is shown to decrease with an exponential rate arbitrarily close to that of the popular Kaiser--Bessel kernel. This requires controlling a conditionally-convergent sum over the tails of $\hat\phi$, using steepest descent, other classical estimates on contour integrals, and a phased sinc sum. We also draw new connections between the above kernel, Kaiser--Bessel, and prolate spheroidal wavefunctions of order zero, which all appear to share an optimal exponential convergence rate.
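
The kernel formula quoted above is simple to evaluate directly. The following Python sketch implements $\phi(z)=e^{\beta\sqrt{1-z^2}}$ on $[-1,1]$ (zero otherwise) together with a dilation to a support of `width` grid points. The function names and the dilation convention are assumptions for illustration; this is not the FINUFFT API, which may normalize the kernel differently.

```python
import math


def es_kernel(z, beta):
    """phi(z) = exp(beta * sqrt(1 - z^2)) for |z| <= 1, zero otherwise."""
    if abs(z) > 1.0:
        return 0.0
    return math.exp(beta * math.sqrt(1.0 - z * z))


def dilated_kernel(x, width, beta):
    """Kernel stretched to the support [-width/2, width/2] in grid units.

    Hypothetical helper: the dilation convention is an assumption.
    """
    return es_kernel(2.0 * x / width, beta)


if __name__ == "__main__":
    beta, width = 10.0, 6
    # Sample the dilated kernel at integer grid offsets across its support.
    for k in range(-3, 4):
        print(k, dilated_kernel(float(k), width, beta))
```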


A randomization-based causal inference framework for uncovering environmental exposure effects on human gut microbiota

Alice J Sommer, Annette Peters, Martina Rommel, Josef Cyrys, Harald Grallert, Dirk Haller, C. Müller, Marie-Abèle C Bind

Statistical analysis of microbial genomic data within epidemiological cohort studies holds the promise to assess the influence of environmental exposures on both the host and the host-associated microbiome. The observational character of prospective cohort data and the intricate characteristics of microbiome data make it, however, challenging to discover causal associations between environment and microbiome. Here, we introduce a causal inference framework based on the Rubin Causal Model that can help scientists to investigate such environment-host microbiome relationships, to capitalize on existing, possibly powerful, test statistics, and test plausible sharp null hypotheses. Using data from the German KORA cohort study, we illustrate our framework by designing two hypothetical randomized experiments with interventions of (i) air pollution reduction and (ii) smoking prevention. We study the effects of these interventions on the human gut microbiome by testing shifts in microbial diversity, changes in individual microbial abundances, and microbial network wiring between groups of matched subjects via randomization-based inference. In the smoking prevention scenario, we identify a small interconnected group of taxa worth further scrutiny, including Christensenellaceae and Ruminococcaceae genera, that have been previously associated with blood metabolite changes. These findings demonstrate that our framework may uncover potentially causal links between environmental exposure and the gut microbiome from observational data. We anticipate the present statistical framework to be a good starting point for further discoveries on the role of the gut microbiome in environmental health.

February 24, 2021

A fast spectral method for electrostatics in doubly-periodic slit channels

Ondrej Maxian, Raul P. Peláez, L. Greengard, Aleksandar Donev

We develop a fast method for computing the electrostatic energy and forces for a collection of charges in doubly-periodic slabs with jumps in the dielectric permittivity at the slab boundaries. Our method achieves spectral accuracy by using Ewald splitting to replace the original Poisson equation for nearly-singular sources with a smooth far-field Poisson equation, combined with a localized near-field correction. Unlike existing spectral Ewald methods, which make use of the Fourier transform in the aperiodic direction, we recast the problem as a two-point boundary value problem in the aperiodic direction for each transverse Fourier mode, for which exact analytic boundary conditions are available. We solve each of these boundary value problems using a fast, well-conditioned Chebyshev method. In the presence of dielectric jumps, combining Ewald splitting with the classical method of images results in smoothed charge distributions which overlap the dielectric boundaries themselves. We show how to preserve high order accuracy in this case through the use of a harmonic correction which involves solving a simple Laplace equation with smooth boundary data. We implement our method on Graphical Processing Units, and combine our doubly-periodic Poisson solver with Brownian Dynamics to study the equilibrium structure of double layers in binary electrolytes confined by dielectric boundaries. Consistent with prior studies, we find strong charge depletion near the interfaces due to repulsive interactions with image charges, which points to the need for incorporating polarization effects in understanding confined electrolytes, both theoretically and computationally.
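
The idea behind Ewald splitting can be illustrated on the free-space Coulomb kernel. The Python sketch below uses the classical Gaussian-screened identity $1/r = \mathrm{erfc}(\xi r)/r + \mathrm{erf}(\xi r)/r$, where the first term is short-ranged and the second is smooth at the origin. This choice is an assumption for illustration only: the abstract does not state the exact splitting used in the doubly-periodic solver, and the names `ewald_split` and `xi` are hypothetical.

```python
import math


def ewald_split(r, xi):
    """Split 1/r into (near, far) parts with splitting parameter xi > 0.

    near = erfc(xi*r)/r decays rapidly with distance;
    far  = erf(xi*r)/r  is smooth and tends to 2*xi/sqrt(pi) as r -> 0.
    """
    near = math.erfc(xi * r) / r
    far = math.erf(xi * r) / r
    return near, far


if __name__ == "__main__":
    xi = 2.0
    for r in (0.1, 0.5, 1.0, 3.0):
        near, far = ewald_split(r, xi)
        # The two parts always recombine to the original 1/r kernel.
        print(r, near, far, near + far - 1.0 / r)
```

A larger `xi` shortens the range of the near-field part at the price of a less smooth far-field part, so the parameter trades work between the two pieces.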


STENCIL-NET: Data-driven solution-adaptive discretization of partial differential equations

Suryanarayana Maddu, Dominik Sturm, Bevan L Cheeseman, C. Müller, Ivo F Sbalzarini

Numerical methods for approximately solving partial differential equations (PDE) are at the core of scientific computing. Often, this requires high-resolution or adaptive discretization grids to capture relevant spatio-temporal features in the PDE solution, e.g., in applications like turbulence, combustion, and shock propagation. Numerical approximation also requires knowing the PDE in order to construct problem-specific discretizations. Systematically deriving such solution-adaptive discrete operators, however, is a current challenge. Here we present STENCIL-NET, an artificial neural network architecture for data-driven learning of problem- and resolution-specific local discretizations of nonlinear PDEs. STENCIL-NET achieves numerically stable discretization of the operators in an unknown nonlinear PDE by spatially and temporally adaptive parametric pooling on regular Cartesian grids, and by incorporating knowledge about discrete time integration. Knowing the actual PDE is not necessary, as solution data is sufficient to train the network to learn the discrete operators. A once-trained STENCIL-NET model can be used to predict solutions of the PDE on larger spatial domains and for longer times than it was trained for, hence addressing the problem of PDE-constrained extrapolation from data. To support this claim, we present numerical experiments on long-term forecasting of chaotic PDE solutions on coarse spatio-temporal grids. We also quantify the speed-up achieved by substituting base-line numerical methods with equation-free STENCIL-NET predictions on coarser grids with little compromise on accuracy.

January 18, 2021

Efficient high-order accurate Fresnel diffraction via areal quadrature and the nonuniform FFT

We present a fast algorithm for computing the diffracted field from arbitrary binary (sharp-edged) planar apertures and occulters in the scalar Fresnel approximation, for up to moderately high Fresnel numbers ($\lesssim 10^3$). It uses a high-order areal quadrature over the aperture, then exploits a single 2D nonuniform fast Fourier transform (NUFFT) to evaluate rapidly at target points (of order $10^7$ such points per second, independent of aperture complexity). It thus combines the high accuracy of edge integral methods with the high speed of Fourier methods. Its cost is ${\mathcal O}(n^2 \log n)$, where $n$ is the linear resolution required in source and target planes, to be compared with ${\mathcal O}(n^3)$ for edge integral methods. In tests with several aperture shapes, this translates to between 2 and 5 orders of magnitude acceleration. In starshade modeling for exoplanet astronomy, we find that it is roughly $10^4 \times$ faster than the state of the art in accurately computing the set of telescope pupil wavefronts. We provide a documented, tested MATLAB/Octave implementation.
An appendix shows the mathematical equivalence of the boundary diffraction wave, angular integration, and line integral formulae, then analyzes a new non-singular reformulation that eliminates their common difficulties near the geometric shadow edge. This supplies a robust edge integral reference against which to validate the main proposal.


An integral equation method for the simulation of doubly-periodic suspensions of rigid bodies in a shearing viscous flow

J. Wang, Ehssan Nazockdast, A. Barnett

With rheology applications in mind, we present a fast solver for the time-dependent effective viscosity of an infinite lattice containing one or more neutrally buoyant smooth rigid particles per unit cell, in a two-dimensional Stokes fluid with given shear rate. At each time, the mobility problem is reformulated as a 2nd-kind boundary integral equation, then discretized to spectral accuracy by the Nyström method and solved iteratively, giving typically 10 digits of accuracy. Its solution controls the evolution of particle locations and angles in a first-order system of ordinary differential equations. The formulation is placed on a rigorous footing by defining a generalized periodic Green's function for the skew lattice. Numerically, the periodized integral operator is split into a near image sum—applied in linear time via the fast multipole method—plus a correction field solved cheaply via proxy Stokeslets. We use barycentric quadratures to evaluate particle interactions and velocity fields accurately, even at distances much closer than the node spacing. Using first-order time-stepping we simulate, for example, 25 ellipses per unit cell to 3-digit accuracy on a desktop in 1 hour per shear time. Our examples show equilibration at long times, force chains, and two types of blow-ups (jamming) whose power laws match lubrication theory asymptotics.
