443 Publications

Learning physically consistent differential equation models from data using group sparsity

Suryanarayana Maddu, Bevan L Cheeseman, Ivo F Sbalzarini, C. Müller

We propose a statistical learning framework based on group-sparse regression that can be used to 1) enforce conservation laws, 2) ensure model equivalence, and 3) guarantee symmetries when learning or inferring differential-equation models from measurement data. Directly learning interpretable mathematical models from data has emerged as a valuable modeling approach. However, in areas like biology, high noise levels, sensor-induced correlations, and strong inter-system variability can render data-driven models nonsensical or physically inconsistent without additional constraints on the model structure. Hence, it is important to leverage prior knowledge from physical principles to learn "biologically plausible and physically consistent" models rather than models that simply fit the data best. We present a novel group Iterative Hard Thresholding (gIHT) algorithm and use stability selection to infer physically consistent models with minimal parameter tuning. We show several applications from systems biology that demonstrate the benefits of enforcing priors in data-driven modeling.
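
To make the group-sparsity idea concrete, here is a minimal sketch of a generic group hard-thresholding step: coefficients are partitioned into groups, the k groups with the largest Euclidean norm are kept, and all others are set to zero. This is an illustration under stated assumptions, not the authors' gIHT algorithm; the function name `group_hard_threshold`, the group layout, and the value of `k` are hypothetical.

```python
import numpy as np

def group_hard_threshold(w, groups, k):
    """Keep the k groups of w with the largest L2 norm; zero all others.

    w      : 1-D coefficient vector
    groups : list of index lists, one per group (hypothetical layout)
    k      : number of groups to keep
    """
    w = np.asarray(w, dtype=float)
    # Euclidean norm of each coefficient group
    norms = np.array([np.linalg.norm(w[idx]) for idx in groups])
    # Indices of the k groups with the largest norms
    keep = np.argsort(norms)[::-1][:k]
    out = np.zeros_like(w)
    for g in keep:
        out[groups[g]] = w[groups[g]]
    return out

# Toy example: three groups of two coefficients each
w = np.array([0.1, -0.2, 3.0, 4.0, 0.5, 0.0])
groups = [[0, 1], [2, 3], [4, 5]]
result = group_hard_threshold(w, groups, k=1)
```

In an iterative scheme, a step like this would typically follow a gradient update on the regression loss, so that whole groups of terms enter or leave the model together.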

deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression

David Rügamer, Ruolin Shen, Christina Bukas, Dominik Thalmeier, Nadja Klein, Chris Kolb, Florian Pfisterer, Philipp Kopper, Bernd Bischl, others, C. Müller

This paper describes the implementation of semi-structured deep distributional regression, a flexible framework to learn distributions based on a combination of additive regression models and deep neural networks. deepregression is implemented in both R and Python, using the deep learning libraries TensorFlow and PyTorch, respectively. The implementation consists of (1) a modular neural network building system for the combination of various statistical and deep learning approaches, (2) an orthogonalization cell to allow for an interpretable combination of different subnetworks, and (3) pre-processing steps necessary to initialize such models. The software package allows users to define models in a user-friendly manner using distribution definitions via a formula environment that is inspired by classical statistical model frameworks such as mgcv. The packages' modular design and functionality provide a unique resource for rapid and reproducible prototyping of complex statistical and deep learning models while simultaneously retaining the indispensable interpretability of classical statistical models.

April 6, 2021

Comment on “Stepped pressure profile equilibria in cylindrical plasmas via partial Taylor relaxation” [J. Plasma Physics (2006), vol. 72, part 6, pp. 1167–1171]

Yuanfan Wang, D. Malhotra, Antoine J. Cerfon

In an early study of the properties and capabilities of the multiregion, relaxed magnetohydrodynamic model, Hole, Hudson & Dewar claim that they are able to construct a multiregion stepped pressure cylindrical equilibrium which does not require the existence of surface currents. We present a brief argument showing that this claim is incorrect, and clarify the meaning of their statement. Furthermore, even with the statement clarified, we demonstrate that it is not possible to find solutions to reproduce the equilibrium corresponding to the parameters given in the article. We invite the authors to provide a corrigendum with the correct values of the equilibrium they constructed.

Fast computation of latent correlations

Grace Yoon, C. Müller, Irina Gaynanova

Latent Gaussian copula models provide a powerful means to perform multi-view data integration since these models can seamlessly express dependencies between mixed variable types (binary, continuous, zero-inflated) via latent Gaussian correlations. The estimation of these latent correlations, however, comes at considerable computational cost, having prevented the routine use of these models on high-dimensional data. Here, we propose a new computational approach for estimating latent correlations via a hybrid multi-linear interpolation and optimization scheme. Our approach speeds up the current state-of-the-art computation by several orders of magnitude, thus allowing fast computation of latent Gaussian copula models even when the number of variables p is large. We provide theoretical guarantees for the approximation error of our numerical scheme and support its excellent performance on simulated and real-world data. We illustrate the practical advantages of our method on high-dimensional sparse quantitative and relative abundance microbiome data as well as multi-view data from The Cancer Genome Atlas Project. Our method is implemented in the R package mixedCCA, available at https://github.com/irinagain/mixedCCA.
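
For orientation, the simplest case of a latent correlation can be sketched in a few lines. For two continuous variables under a Gaussian copula, Kendall's τ and the latent correlation r are linked by the standard closed-form bridge r = sin(πτ/2). The sketch below covers only that continuous-continuous case; it does not implement the hybrid interpolation scheme described above, and it is not the mixedCCA API. The function names are hypothetical.

```python
import numpy as np

def kendall_tau_a(x, y):
    """Kendall's tau-a via an O(n^2) comparison of all pairs (no tie correction)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    i, j = np.triu_indices(n, k=1)  # all index pairs with i < j
    # +1 for concordant pairs, -1 for discordant pairs, 0 for ties
    concordance = np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    return concordance.sum() / (n * (n - 1) / 2)

def latent_correlation_continuous(x, y):
    """Bridge from Kendall's tau to the latent Gaussian correlation
    (continuous-continuous case only)."""
    return np.sin(0.5 * np.pi * kendall_tau_a(x, y))

# Toy data: y is a monotone increasing transform of x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 7.0, 11.0])
r_up = latent_correlation_continuous(x, y)
r_down = latent_correlation_continuous(x, -y)
```

Because the estimate depends on the data only through ranks, any monotone transformation of either variable leaves it unchanged.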

More data or more parameters? Investigating the effect of data structure on generalization

Stéphane d'Ascoli, M. Gabrié, Levent Sagun, G. Biroli

One of the central features of deep learning is the generalization abilities of neural networks, which seem to improve relentlessly with over-parametrization. In this work, we investigate how properties of data impact the test error as a function of the number of training examples and number of training parameters; in other words, how the structure of data shapes the "generalization phase space". We first focus on the random features model trained in the teacher-student scenario. The synthetic input data is composed of independent blocks, which allow us to tune the saliency of low-dimensional structures and their relevance with respect to the target function. Using methods from statistical physics, we obtain an analytical expression for the train and test errors for both regression and classification tasks in the high-dimensional limit. The derivation allows us to show that noise in the labels and strong anisotropy of the input data play similar roles on the test error. Both promote an asymmetry of the phase space where increasing the number of training examples improves generalization further than increasing the number of training parameters. Our analytical insights are confirmed by numerical experiments involving fully-connected networks trained on MNIST and CIFAR10.

March 9, 2021

Aliasing error of the $\exp(\beta\sqrt{1-z^2})$ kernel in the nonuniform fast Fourier transform

The most popular algorithm for the nonuniform fast Fourier transform (NUFFT) uses the dilation of a kernel $\phi$ to spread (or interpolate) between given nonuniform points and a uniform upsampled grid, combined with an FFT and diagonal scaling (deconvolution) in frequency space. The high performance of the recent FINUFFT library is in part due to its use of a new "exponential of semicircle" kernel $\phi(z)=e^{\beta \sqrt{1-z^2}}$, for $z\in[-1,1]$, zero otherwise, whose Fourier transform $\hat\phi$ is not known analytically. We place this kernel on a rigorous footing by proving an aliasing error estimate which bounds the error of the one-dimensional NUFFT of types 1 and 2 in exact arithmetic. Asymptotically in the kernel width measured in upsampled grid points, the error is shown to decrease with an exponential rate arbitrarily close to that of the popular Kaiser-Bessel kernel. This requires controlling a conditionally-convergent sum over the tails of $\hat\phi$, using steepest descent, other classical estimates on contour integrals, and a phased sinc sum. We also draw new connections between the above kernel, Kaiser-Bessel, and prolate spheroidal wavefunctions of order zero, which all appear to share an optimal exponential convergence rate.
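
The kernel formula above can be evaluated directly. The following sketch computes $\phi(z)=e^{\beta\sqrt{1-z^2}}$ on $[-1,1]$ and returns zero outside that interval. It is only a direct transcription of the stated definition, not FINUFFT's implementation; the function name `es_kernel` and the value of `beta` are hypothetical choices for illustration.

```python
import numpy as np

def es_kernel(z, beta):
    """Evaluate phi(z) = exp(beta * sqrt(1 - z^2)) for |z| <= 1, zero otherwise."""
    z = np.asarray(z, dtype=float)
    inside = np.abs(z) <= 1.0
    # Clip the radicand at zero so the square root stays real outside the support
    root = np.sqrt(np.clip(1.0 - z**2, 0.0, None))
    return np.where(inside, np.exp(beta * root), 0.0)

# Sample points inside, on the edge of, and outside the support
z = np.array([-1.5, -1.0, 0.0, 1.0, 2.0])
vals = es_kernel(z, beta=4.0)
```

The kernel peaks at $e^{\beta}$ for $z=0$ and drops to 1 at $z=\pm 1$, so it has a small jump to zero at the edge of its support.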

A randomization-based causal inference framework for uncovering environmental exposure effects on human gut microbiota

Alice J Sommer, Annette Peters, Martina Rommel, Josef Cyrys, Harald Grallert, Dirk Haller, C. Müller, Marie-Abèle C Bind

Statistical analysis of microbial genomic data within epidemiological cohort studies holds the promise to assess the influence of environmental exposures on both the host and the host-associated microbiome. The observational character of prospective cohort data and the intricate characteristics of microbiome data make it, however, challenging to discover causal associations between environment and microbiome. Here, we introduce a causal inference framework based on the Rubin Causal Model that can help scientists to investigate such environment-host microbiome relationships, to capitalize on existing, possibly powerful, test statistics, and test plausible sharp null hypotheses. Using data from the German KORA cohort study, we illustrate our framework by designing two hypothetical randomized experiments with interventions of (i) air pollution reduction and (ii) smoking prevention. We study the effects of these interventions on the human gut microbiome by testing shifts in microbial diversity, changes in individual microbial abundances, and microbial network wiring between groups of matched subjects via randomization-based inference. In the smoking prevention scenario, we identify a small interconnected group of taxa worth further scrutiny, including Christensenellaceae and Ruminococcaceae genera, that have been previously associated with blood metabolite changes. These findings demonstrate that our framework may uncover potentially causal links between environmental exposure and the gut microbiome from observational data. We anticipate the present statistical framework to be a good starting point for further discoveries on the role of the gut microbiome in environmental health.
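
As a rough illustration of randomization-based inference under a sharp null hypothesis, the sketch below runs a Monte Carlo randomization test on the absolute difference in group means. It assumes a simple completely randomized two-group design with made-up numbers; it does not reproduce the matched design, the KORA data, or the test statistics used in the study, and the function name is hypothetical.

```python
import numpy as np

def randomization_p_value(treated, control, n_draws=2000, seed=0):
    """Monte Carlo randomization test for the sharp null of no effect on any unit.

    Under the sharp null, outcomes are fixed and only the group labels are
    random, so the statistic is recomputed under re-drawn label assignments.
    """
    rng = np.random.default_rng(seed)
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    observed = abs(treated.mean() - control.mean())
    pooled = np.concatenate([treated, control])
    n_t = treated.size
    count = 0
    for _ in range(n_draws):
        perm = rng.permutation(pooled)  # re-draw the assignment
        stat = abs(perm[:n_t].mean() - perm[n_t:].mean())
        if stat >= observed:
            count += 1
    # Add-one correction keeps the Monte Carlo p-value strictly positive
    return (count + 1) / (n_draws + 1)

# Hypothetical outcome values (e.g., a diversity index) for two groups
p_separated = randomization_p_value([10, 11, 12, 13, 14], [0, 1, 2, 3, 4])
p_identical = randomization_p_value([1, 2, 3], [1, 2, 3])
```

Any test statistic can be substituted for the difference in means, since the reference distribution comes from the assignment mechanism rather than from a parametric model.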

February 24, 2021

A fast spectral method for electrostatics in doubly-periodic slit channels

Ondrej Maxian, Raul P. Peláez, L. Greengard, Aleksandar Donev

We develop a fast method for computing the electrostatic energy and forces for a collection of charges in doubly-periodic slabs with jumps in the dielectric permittivity at the slab boundaries. Our method achieves spectral accuracy by using Ewald splitting to replace the original Poisson equation for nearly-singular sources with a smooth far-field Poisson equation, combined with a localized near-field correction. Unlike existing spectral Ewald methods, which make use of the Fourier transform in the aperiodic direction, we recast the problem as a two-point boundary value problem in the aperiodic direction for each transverse Fourier mode, for which exact analytic boundary conditions are available. We solve each of these boundary value problems using a fast, well-conditioned Chebyshev method. In the presence of dielectric jumps, combining Ewald splitting with the classical method of images results in smoothed charge distributions which overlap the dielectric boundaries themselves. We show how to preserve high order accuracy in this case through the use of a harmonic correction which involves solving a simple Laplace equation with smooth boundary data. We implement our method on Graphical Processing Units, and combine our doubly-periodic Poisson solver with Brownian Dynamics to study the equilibrium structure of double layers in binary electrolytes confined by dielectric boundaries. Consistent with prior studies, we find strong charge depletion near the interfaces due to repulsive interactions with image charges, which points to the need for incorporating polarization effects in understanding confined electrolytes, both theoretically and computationally.

STENCIL-NET: Data-driven solution-adaptive discretization of partial differential equations

Suryanarayana Maddu, Dominik Sturm, Bevan L Cheeseman, C. Müller, Ivo F Sbalzarini

Numerical methods for approximately solving partial differential equations (PDEs) are at the core of scientific computing. Often, this requires high-resolution or adaptive discretization grids to capture relevant spatio-temporal features in the PDE solution, e.g., in applications like turbulence, combustion, and shock propagation. Numerical approximation also requires knowing the PDE in order to construct problem-specific discretizations. Systematically deriving such solution-adaptive discrete operators, however, is a current challenge. Here we present STENCIL-NET, an artificial neural network architecture for data-driven learning of problem- and resolution-specific local discretizations of nonlinear PDEs. STENCIL-NET achieves numerically stable discretization of the operators in an unknown nonlinear PDE by spatially and temporally adaptive parametric pooling on regular Cartesian grids, and by incorporating knowledge about discrete time integration. Knowing the actual PDE is not necessary, as solution data is sufficient to train the network to learn the discrete operators. A once-trained STENCIL-NET model can be used to predict solutions of the PDE on larger spatial domains and for longer times than it was trained for, hence addressing the problem of PDE-constrained extrapolation from data. To support this claim, we present numerical experiments on long-term forecasting of chaotic PDE solutions on coarse spatio-temporal grids. We also quantify the speed-up achieved by substituting baseline numerical methods with equation-free STENCIL-NET predictions on coarser grids with little compromise on accuracy.

January 18, 2021