298 Publications

Shedding a PAC-Bayesian Light on Adaptive Sliced-Wasserstein Distances

R. Ohana, Kimia Nadjahi, Alain Rakotomamonjy, Liva Ralaivola

The Sliced-Wasserstein distance (SW) is a computationally efficient and theoretically grounded alternative to the Wasserstein distance. Yet, the literature on its statistical properties -- or, more accurately, its generalization properties -- with respect to the distribution of slices, beyond the uniform measure, is scarce. To bring new contributions to this line of research, we leverage the PAC-Bayesian theory and a central observation that SW may be interpreted as an average risk, the quantity PAC-Bayesian bounds have been designed to characterize. We provide three types of results: i) PAC-Bayesian generalization bounds that hold on what we refer to as adaptive Sliced-Wasserstein distances, i.e. SW defined with respect to arbitrary distributions of slices (among which data-dependent distributions), ii) a principled procedure to learn the distribution of slices that yields maximally discriminative SW, by optimizing our theoretical bounds, and iii) empirical illustrations of our theoretical findings.
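For context on the "average risk" view, SW with a given slice distribution is the average over random directions of the 1D Wasserstein distance between the projected samples. A minimal NumPy sketch with uniform slices follows; the adaptive, learned slice distributions are the paper's contribution and are not reproduced here.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_slices=200, p=2, rng=None):
    """Monte Carlo estimate of SW_p between equal-size empirical samples.

    X, Y: (n, d) arrays. Slices are drawn uniformly from the unit sphere;
    the adaptive case would replace this with a learned slice distribution.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # Uniform random directions on the unit sphere in R^d.
    theta = rng.standard_normal((n_slices, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both samples onto each slice.
    Xp = X @ theta.T   # (n, n_slices)
    Yp = Y @ theta.T
    # 1D p-Wasserstein between empirical measures = match sorted samples.
    Xp.sort(axis=0)
    Yp.sort(axis=0)
    w_p = np.mean(np.abs(Xp - Yp) ** p, axis=0)  # per-slice W_p^p
    return np.mean(w_p) ** (1.0 / p)             # average over slices

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
Y = rng.standard_normal((500, 3)) + np.array([2.0, 0.0, 0.0])
print(sliced_wasserstein(X, Y))  # positive; 0 for identical samples
```

Averaging over slices is exactly the structure the PAC-Bayesian bounds exploit: each slice plays the role of a hypothesis, and the slice distribution plays the role of a posterior.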


A geometrical connection between sparse and low-rank matrices and its application to manifold learning

We consider when a sparse nonnegative matrix \(\mathbf{S}\) can be recovered, via an elementwise nonlinearity, from a real-valued matrix \(\mathbf{L}\) of significantly lower rank. Of particular interest is the setting where the positive elements of \(\mathbf{S}\) encode the similarities of nearby points on a low dimensional manifold. The recovery can then be posed as a problem in manifold learning---in this case, how to learn a norm-preserving and neighborhood-preserving mapping of high dimensional inputs into a lower dimensional space. We describe an algorithm for this problem based on a generalized low-rank decomposition of sparse matrices. This decomposition has the interesting property that it can be encoded by a neural network with one layer of rectified linear units; since the algorithm discovers this encoding, it can also be viewed as a layerwise primitive for deep learning. The algorithm regards the inputs \(\mathbf{x}_i\) and \(\mathbf{x}_j\) as similar whenever the cosine of the angle between them exceeds some threshold \(\tau\in(0,1)\). Given this threshold, the algorithm attempts to discover a mapping \(\mathbf{x}_i\mapsto\mathbf{y}_i\) by matching the elements of two sparse matrices; in particular, it seeks a mapping for which \(\mathbf{S}=\max(0,\mathbf{L})\), where \(S_{ij} = \max(0,\mathbf{x}_i\cdot\mathbf{x}_j - \tau\|\mathbf{x}_i\|\|\mathbf{x}_j\|)\) and \(L_{ij} = \mathbf{y}_i\cdot\mathbf{y}_j - \tau\|\mathbf{y}_i\|\|\mathbf{y}_j\|\). We apply the algorithm to data sets where vector magnitudes and small cosine distances have interpretable meanings (e.g., the brightness of an image, the similarity to other words). On these data sets, the algorithm is able to discover much lower dimensional representations that preserve these meanings.
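The target matrix \(\mathbf{S}\) is fully specified by the abstract; a minimal NumPy sketch constructs it from the inputs (the decomposition that finds the \(\mathbf{y}_i\) and \(\mathbf{L}\) is the paper's algorithm and is not shown):

```python
import numpy as np

def similarity_matrix(X, tau=0.5):
    """Build S with S_ij = max(0, x_i.x_j - tau * ||x_i|| * ||x_j||).

    S_ij > 0 exactly when the cosine of the angle between x_i and x_j
    exceeds tau, so S is sparse when few pairs are that similar.
    """
    G = X @ X.T                          # Gram matrix of dot products
    norms = np.linalg.norm(X, axis=1)
    return np.maximum(0.0, G - tau * np.outer(norms, norms))

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
S = similarity_matrix(X, tau=0.5)
# Random high-dimensional vectors are nearly orthogonal, so off-diagonal
# entries are mostly zero, while the diagonal (cosine = 1 > tau) is not.
print(np.mean(S > 0))
```

The recovery problem then asks for low-dimensional \(\mathbf{y}_i\) such that the analogous matrix \(\mathbf{L}\), built from the \(\mathbf{y}_i\), satisfies \(\mathbf{S}=\max(0,\mathbf{L})\), i.e. the rectifier of a low-rank-structured matrix reproduces the sparse one.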


Integral formulation of Klein-Gordon singular waveguides

Guillaume Bal, Jeremy Hoskins, M. Rachh, Solomon Quinn

We consider the analysis of singular waveguides separating insulating phases in two-space dimensions. The insulating domains are modeled by a massive Schrödinger equation and the singular waveguide by appropriate jump conditions along the one-dimensional interface separating the insulators. We present an integral formulation of the problem and analyze its mathematical properties. We also implement a fast multipole and sweeping-accelerated iterative algorithm for solving the integral equations, and demonstrate numerically the fast convergence of this method. Several numerical examples of solutions and scattering effects illustrate our theory.

December 23, 2022

PELICAN: Permutation Equivariant and Lorentz Invariant or Covariant Aggregator Network for Particle Physics

A. Bogatskii, Timothy Hoffman, David W. Miller, Jan T. Offermann

Many current approaches to machine learning in particle physics use generic architectures that require large numbers of parameters and disregard underlying physics principles, limiting their applicability as scientific modeling tools. In this work, we present a machine learning architecture that uses a set of inputs maximally reduced with respect to the full 6-dimensional Lorentz symmetry, and is fully permutation-equivariant throughout. We study the application of this network architecture to the standard task of top quark tagging and show that the resulting network outperforms all existing competitors despite much lower model complexity. In addition, we present a Lorentz-covariant variant of the same network applied to a 4-momentum regression task.
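A natural Lorentz-invariant, permutation-equivariant input for such networks is the matrix of pairwise Minkowski inner products of the particle 4-momenta. A small sketch computing these invariants, with the metric convention diag(1, -1, -1, -1), is shown below; this is illustrative only and is not the PELICAN architecture.

```python
import numpy as np

def minkowski_invariants(p):
    """Pairwise Minkowski inner products d_ij = p_i . p_j.

    p: (n, 4) array of 4-momenta (E, px, py, pz), metric diag(1,-1,-1,-1).
    The matrix d is invariant under Lorentz transformations and is
    permutation-equivariant: relabeling particles permutes rows and columns.
    """
    eta = np.diag([1.0, -1.0, -1.0, -1.0])
    return p @ eta @ p.T

# Two massless particles: E = |p|, so each diagonal entry p_i . p_i is 0.
p = np.array([[5.0, 3.0, 4.0, 0.0],
              [2.0, 0.0, 0.0, 2.0]])
d = minkowski_invariants(p)
print(d[0, 0])  # invariant mass squared of particle 0 -> 0.0
```

Feeding a network only such invariants, rather than raw 4-momenta, builds the symmetry into the inputs instead of hoping the model learns it from data.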


Replacement dynamics and the pathogenesis of the Alpha, Delta and Omicron variants of SARS-CoV-2

Thomas Ward, Alex Glaser, Christopher E Overton, B. Carpenter, Nick Gent, Anna C Seale

New SARS-CoV-2 variants causing COVID-19 are a major risk to public health worldwide due to the potential for phenotypic change and increases in pathogenicity, transmissibility and/or vaccine escape. Recognising signatures of new variants, in terms of replacing growth and severity, is key to informing the public health response. To assess this, we aimed to investigate key time periods in the course of infection, hospitalisation and death, by variant. We linked datasets on contact tracing (Contact Tracing Advisory Service), testing (the Second-Generation Surveillance System) and hospitalisation (the Admitted Patient Care dataset) for the entire length of contact tracing in England, from March 2020 to March 2022. We modelled, for England, time delay distributions using a Bayesian doubly interval-censored modelling approach for the SARS-CoV-2 variants Alpha, Delta, Delta Plus (AY.4.2), Omicron BA.1 and Omicron BA.2. This was conducted for the incubation period, the time from infection to hospitalisation and the time from hospitalisation to death. We further modelled the growth of novel variant replacement using a generalised additive model with a negative binomial error structure, and the relationship between incubation period length and the risk of a fatality using a Bernoulli generalised linear model with a logit link. The mean incubation periods for each variant were: Alpha 4.19 (95% credible interval (CrI) 4.13-4.26) days; Delta 3.87 (95% CrI 3.82-3.93) days; Delta Plus 3.92 (95% CrI 3.87-3.98) days; Omicron BA.1 3.67 (95% CrI 3.61-3.72) days and Omicron BA.2 3.48 (95% CrI 3.43-3.53) days. The mean times from infection to hospitalisation were: Alpha 11.31 (95% CrI 11.20-11.41) days; Delta 10.36 (95% CrI 10.26-10.45) days and Omicron BA.1 11.54 (95% CrI 11.38-11.70) days. The mean times from hospitalisation to death were: Alpha 14.31 (95% CrI 14.00-14.62) days; Delta 12.81 (95% CrI 12.62-13.00) days and Omicron BA.2 16.02 (95% CrI 15.46-16.60) days.
The 95th percentiles of the incubation periods were: Alpha 11.19 (95% CrI 10.92-11.48) days; Delta 9.97 (95% CrI 9.73-10.21) days; Delta Plus 9.99 (95% CrI 9.78-10.24) days; Omicron BA.1 9.45 (95% CrI 9.23-9.67) days and Omicron BA.2 8.83 (95% CrI 8.62-9.05) days. Shorter incubation periods were associated with greater fatality risk when adjusted for age, sex, variant, vaccination status, vaccine manufacturer and time since last dose, with an odds ratio of 0.83 (95% confidence interval 0.82-0.83) (P < 0.05). Variants of SARS-CoV-2 that have replaced previously dominant variants have had shorter incubation periods. Conversely, co-existing variants have had very similar and non-distinct incubation period distributions. Shorter incubation periods reflect a generation time advantage, with a reduction in the time to the peak infectious period, and may be a significant factor in novel variant replacing growth. Shorter times to hospital admission and death were associated with variant severity: the most severe variant, Delta, led to significantly earlier hospitalisation and death. These measures are likely important for future risk assessment of new variants and their potential impact on population health.
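The fatality model described above is a Bernoulli GLM with a logit link. As a generic illustration of that model class, here is a self-contained sketch fitting such a GLM by Newton's method (IRLS) on synthetic data; the covariates and numbers below are invented and unrelated to the study's results.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Fit a Bernoulli GLM with logit link by Newton's method (IRLS).

    X: (n, k) design matrix (include a column of ones for the intercept).
    Returns coefficients beta with P(y = 1) = 1 / (1 + exp(-X @ beta)).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))   # fitted probabilities
        W = mu * (1.0 - mu)                    # IRLS weights
        H = X.T @ (W[:, None] * X)             # observed information matrix
        beta += np.linalg.solve(H, X.T @ (y - mu))  # Newton step
    return beta

rng = np.random.default_rng(0)
n = 5000
incubation = rng.gamma(4.0, 1.0, size=n)       # synthetic incubation periods
true_beta = np.array([1.0, -0.3])              # shorter period -> higher risk
X = np.column_stack([np.ones(n), incubation])
prob = 1.0 / (1.0 + np.exp(-X @ true_beta))
y = (rng.random(n) < prob).astype(float)
beta = fit_logit(X, y)
print(np.exp(beta[1]))  # odds ratio per extra day of incubation (< 1 here)
```

An odds ratio below 1 per day of incubation, as in the study's 0.83 estimate, means each additional day of incubation is associated with lower fatality odds.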


Approximating the Gaussian as a Sum of Exponentials and Its Applications to the Fast Gauss Transform

We develop efficient and accurate sum-of-exponential (SOE) approximations for the Gaussian using rational approximation of the exponential function on the negative real axis. Six-digit accuracy can be obtained with eight terms, and ten-digit accuracy with twelve terms. This representation is of potential interest in approximation theory, but we focus here on its use in accelerating the fast Gauss transform (FGT) in one and two dimensions. The one-dimensional scheme is particularly straightforward and easy to implement, requiring only twenty-four lines of MATLAB code. The two-dimensional version requires some care with data structures, but is significantly more efficient than existing FGTs. Following a detailed presentation of the theoretical foundations, we demonstrate the performance of the fast transforms with several numerical experiments.
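The sums being accelerated are discrete Gauss transforms \(G(y_j) = \sum_i q_i e^{-(y_j-x_i)^2/\delta}\). A direct \(O(NM)\) reference implementation in one dimension follows; the SOE-based fast algorithm itself is the paper's contribution and is not reproduced.

```python
import numpy as np

def gauss_transform(sources, targets, charges, delta):
    """Direct discrete Gauss transform in 1D: O(N * M).

    G(y_j) = sum_i q_i * exp(-(y_j - x_i)^2 / delta).
    The fast Gauss transform evaluates the same sums in roughly O(N + M)
    work by replacing the Gaussian kernel with a short sum of exponentials.
    """
    diff = targets[:, None] - sources[None, :]   # (M, N) pairwise gaps
    return np.exp(-diff**2 / delta) @ charges

rng = np.random.default_rng(0)
x = rng.random(200)            # source locations in [0, 1]
y = rng.random(300)            # target locations
q = rng.standard_normal(200)   # source charges
G = gauss_transform(x, y, q, delta=0.01)
print(G.shape)  # (300,)
```

Because the Gaussian decays so quickly, replacing it with eight to twelve exponential terms, as the paper does, keeps the kernel separable in each dimension while meeting six- to ten-digit accuracy.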


Quantum Initial Conditions for Curved Inflating Universes

Mary I Letey, Zakhar Shumaylov, F. Agocs, Will J Handley, Michael P Hobson, Anthony N Lasenby

We discuss the challenges of motivating, constructing, and quantising a canonically normalised inflationary perturbation in spatially curved universes. We show that this has historically proved challenging due to the interaction of non-adiabaticity with spatial curvature. We propose a novel curvature perturbation which is canonically normalised and unique up to a single scalar parameter. This corrected quantisation has potentially observational consequences via modifications to the primordial power spectrum at large angular scales, as well as theoretical implications for quantisation procedures in curved cosmologies filled with a scalar field.

November 30, 2022

Corner Cases of the Generalized Tau Method

Keaton J. Burns, D. Fortunato, Keith Julien, Geoffrey M. Vasil

Polynomial spectral methods provide fast, accurate, and flexible solvers for broad ranges of PDEs with one bounded dimension, where the incorporation of general boundary conditions is well understood. However, automating extensions to domains with multiple bounded dimensions is challenging because of difficulties in implementing boundary conditions and imposing compatibility conditions at shared edges and corners. Past work has included various workarounds, such as the anisotropic inclusion of partial boundary data at shared edges or approaches that only work for specific boundary conditions. Here we present a general system for imposing boundary and compatibility conditions for elliptic equations on hypercubes. We take an approach based on the generalized tau method, which allows for a wide range of boundary conditions for many types of spectral methods. The generalized tau method has the distinct advantage that the specified polynomial residual determines the exact algebraic solution; afterwards, any stable numerical scheme will find the same result. We can, therefore, provide one-to-one comparisons to traditional collocation and Galerkin methods within the tau framework. As an essential requirement, we add specific tau corrections to the boundary conditions in addition to the bulk PDE. We then impose additional mutual compatibility conditions to ensure boundary conditions match at shared subsurfaces. Our approach works with general boundary conditions that commute on intersecting subsurfaces, including Dirichlet, Neumann, Robin, and any combination of these on all boundaries. The tau corrections and compatibility conditions can be fully isotropic and easily incorporated into existing solvers. We present the method explicitly for the Poisson equation in two and three dimensions and describe its extension to arbitrary elliptic equations (e.g. biharmonic) in any dimension.
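In a single bounded dimension, the classical tau idea is to replace the highest-degree rows of the coefficient-space operator with the boundary conditions, so the residual lies in the span of the two highest modes. A minimal Chebyshev tau sketch for \(u'' = f\) on \([-1,1]\) with homogeneous Dirichlet conditions is given below; it is illustrative only, not the generalized, multidimensional scheme of the paper.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def cheb_tau_poisson_1d(f_coef, n):
    """Solve u'' = f on [-1, 1], u(-1) = u(1) = 0, by the Chebyshev tau method.

    f_coef: Chebyshev coefficients of f. Returns n Chebyshev coefficients of u.
    The last two rows of the coefficient-space operator are replaced by the
    boundary conditions (the 'tau' rows); the residual is then a combination
    of the two highest-degree modes T_{n-2} and T_{n-1}.
    """
    # Second-derivative operator in Chebyshev coefficient space.
    D2 = np.zeros((n, n))
    for k in range(n):
        e = np.zeros(n); e[k] = 1.0
        d = C.chebder(e, 2)
        D2[:len(d), k] = d
    A = D2.copy()
    k = np.arange(n)
    A[-2, :] = 1.0            # u(1)  = sum_k a_k T_k(1)  = sum_k a_k
    A[-1, :] = (-1.0) ** k    # u(-1) = sum_k a_k T_k(-1) = sum_k a_k (-1)^k
    b = np.zeros(n)
    m = min(len(f_coef), n - 2)
    b[:m] = f_coef[:m]        # forcing, truncated; tau rows carry the BC data
    return np.linalg.solve(A, b)

# f = 12 x^2 = 6 T_0 + 6 T_2; the solution with u(+-1) = 0 is u = x^4 - 1.
u = cheb_tau_poisson_1d(np.array([6.0, 0.0, 6.0]), n=8)
x = np.linspace(-1, 1, 5)
print(np.max(np.abs(C.chebval(x, u) - (x**4 - 1))))  # near machine precision
```

The paper's point is what happens when several such bounded dimensions meet: the tau corrections on each face must be chosen compatibly where faces intersect, which this 1D sketch never has to confront.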


An FMM Accelerated Poisson Solver for Complicated Geometries in the Plane using Function Extension

Fredrik Fryklund, L. Greengard

We describe a new, adaptive solver for the two-dimensional Poisson equation in complicated geometries. Using classical potential theory, we represent the solution as the sum of a volume potential and a double layer potential. Rather than evaluating the volume potential over the given domain, we first extend the source data to a geometrically simpler region with high order accuracy. This allows us to accelerate the evaluation of the volume potential using an efficient, geometry-unaware fast multipole-based algorithm. To impose the desired boundary condition, it remains only to solve the Laplace equation with suitably modified boundary data. This is accomplished with existing fast and accurate boundary integral methods. The novelty of our solver is the scheme used for creating the source extension, assuming it is provided on an adaptive quad-tree. For leaf boxes intersected by the boundary, we construct a universal "stencil" and require that the data be provided at the subset of those points that lie within the domain interior. This universality permits us to precompute and store an interpolation matrix which is used to extrapolate the source data to an extended set of leaf nodes with full tensor-product grids on each. We demonstrate the method's speed, robustness and high-order convergence with several examples, including domains with piecewise smooth boundaries.


Differentiable Cosmological Simulation with Adjoint Method

Y. Li, C. Modi, Drew Jamieson, Yucheng Zhang, L. Lu, Yu Feng, François Lanusse, L. Greengard

Rapid advances in deep learning have brought not only myriad powerful neural networks, but also breakthroughs that benefit established scientific research. In particular, automatic differentiation (AD) tools and computational accelerators like GPUs have facilitated forward modeling of the Universe with differentiable simulations. Current differentiable cosmological simulations are limited by memory, and thus are subject to a trade-off between time and space/mass resolution. They typically integrate for only tens of time steps, unlike the standard non-differentiable simulations. We present a new approach free of such constraints, using the adjoint method and reverse time integration. It enables larger and more accurate forward modeling, and will improve gradient-based optimization and inference. We implement it in a particle-mesh (PM) N-body library pmwd (particle-mesh with derivatives). Based on the powerful AD system JAX, pmwd is fully differentiable and highly performant on GPUs.
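The adjoint method can be illustrated on a toy problem: integrate forward keeping only the final state, then integrate the state backward in time together with the adjoint variable, so memory stays O(1) in the number of steps. A hedged NumPy sketch for Euler steps of dx/dt = -theta * x with loss L = 0.5 * x(T)^2 follows; it is a toy analogue, not the pmwd implementation.

```python
import numpy as np

def loss_and_grad_adjoint(x0, theta, h=0.01, n_steps=1000):
    """Gradient of L = 0.5 * x(T)^2 for Euler steps of dx/dt = -theta * x,
    computed with the adjoint method and reverse time integration.

    Memory is O(1): instead of storing the trajectory, the state is
    re-integrated backward alongside the adjoint variable.
    """
    # Forward pass: x_{k+1} = F(x_k) = x_k * (1 - h*theta). Keep final state only.
    a = 1.0 - h * theta
    x = x0
    for _ in range(n_steps):
        x = x * a
    loss = 0.5 * x * x
    # Backward pass: adjoint lam_N = dL/dx_N = x_N; step the state backward
    # (x_k = x_{k+1} / a) and accumulate dL/dtheta = sum_k lam_{k+1} * (-h * x_k).
    lam = x
    grad = 0.0
    for _ in range(n_steps):
        x_prev = x / a                 # reverse time integration of the state
        grad += lam * (-h * x_prev)    # lam_{k+1} * dF/dtheta at step k
        lam = lam * a                  # lam_k = lam_{k+1} * dF/dx
        x = x_prev
    return loss, grad

loss, grad = loss_and_grad_adjoint(x0=1.0, theta=0.5)
# Check the adjoint gradient against a central finite difference.
eps = 1e-6
lp, _ = loss_and_grad_adjoint(1.0, 0.5 + eps)
lm, _ = loss_and_grad_adjoint(1.0, 0.5 - eps)
print(abs(grad - (lp - lm) / (2 * eps)))  # small
```

The same structure scales to N-body dynamics: the forward leapfrog steps are reversible, so the trajectory can be regenerated backward while the adjoint equations are integrated, rather than checkpointed in memory.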
