298 Publications

Spatial Transformer K-Means

Romain Cosentino, Randall Balestriero, Y. Bahroun, A. Sengupta, Richard Baraniuk, Behnaam Aazhang

K-means defines one of the most employed centroid-based clustering algorithms with performances tied to the data's embedding. Intricate data embeddings have been designed to push K-means performances at the cost of reduced theoretical guarantees and interpretability of the results. Instead, we propose preserving the intrinsic data space and augment K-means with a similarity measure invariant to non-rigid transformations. This enables (i) the reduction of intrinsic nuisances associated with the data, reducing the complexity of the clustering task and increasing performances and producing state-of-the-art results, (ii) clustering in the input space of the data, leading to a fully interpretable clustering algorithm, and (iii) the benefit of convergence guarantees.

Show Abstract

Towards practical holographic coherent diffraction imaging via maximum likelihood estimation

D. Barmherzig, Ju Sun

A new algorithmic framework is developed for holographic coherent diffraction imaging (HCDI) based on maximum likelihood estimation (MLE). This method provides superior image reconstruction results for various practical HCDI settings, such as when data is highly corrupted by Poisson shot noise and when low-frequency data is missing due to occlusion from a beamstop apparatus. This method is also highly robust in that it can be implemented using a variety of standard numerical optimization algorithms, and requires fewer constraints on the physical HCDI setup compared to current algorithms. The mathematical framework developed using MLE is also applicable beyond HCDI to any holographic imaging setup where data is corrupted by Poisson shot noise.

Show Abstract

Prolonged and Pervasive Perturbations in the Composition of the Southern Hemisphere Midlatitude Lower Stratosphere From the Australian New Year’s Fires

M. L. Santee, A. Lambert, G. L. Manney, N. J. Livesey, L. Froidevaux, J. L. Neu, M. J. Schwartz, L. F. Millán, F. Werner, W. G. Read, M. Park, R. A. Fuller, B. Ward

The 2019/2020 Australian New Year’s wildfires injected record amounts of smoke and biomass burning products into the lower stratosphere. The Aura Microwave Limb Sounder (MLS) tracked the evolution of distinct plumes of fire-influenced air as they rapidly spiraled up to the mid-stratosphere. In the months following the fires, smoke spread throughout the Southern Hemisphere (SH) stratosphere. We contrast the evolution of the SH midlatitude lower stratosphere in 2020 with the 17-year MLS record. Long after the coherent plumes dispersed, data from MLS and other satellite instruments show unprecedented persistent and pervasive depletion in HCl (50–60% below climatology) and enhancements in ClO and ClONO2 that were not transport related; peak anomalies occurred in mid-2020. We conclude that the observed perturbations likely arose from heterogeneous chlorine activation on widespread smoke particles. The sustained chlorine activation was far weaker than in typical winter polar vortices, inducing at most minor changes in ozone.

Show Abstract

FMM-LU: A fast direct solver for multiscale boundary integral equations in three dimensions

Daria Sushnikova, L. Greengard, Michael O'Neil, M. Rachh

We present a fast direct solver for boundary integral equations on complex surfaces in three dimensions using an extension of the recently introduced recursive strong skeletonization scheme. For problems that are not highly oscillatory, our algorithm computes an LU-like hierarchical factorization of the dense system matrix, permitting application of the inverse in (n) time, where n is the number of unknowns on the surface. The factorization itself also scales linearly with the system size, albeit with a somewhat larger constant. The scheme is built on a level-restricted adaptive octree data structure, and therefore it is compatible with highly nonuniform discretizations. Furthermore, the scheme is coupled with high-order accurate locally-corrected Nyström quadrature methods to integrate the singular and weakly-singular Green's functions used in the integral representations. Our method has immediate applications to a variety of problems in computational physics. We concentrate here on studying its performance in acoustic scattering (governed by the Helmholtz equation) at low to moderate frequencies, and provide rigorous justification for compression of submatrices via proxy surfaces.

Show Abstract
January 18, 2022

Neural Circuits for Dynamics-Based Segmentation of Time Series

Samaneh Nasiri, Dmitri B. Chklovskii, A. Sengupta, Tiberiu Tesileanu , Siavash Golkar

The brain must extract behaviorally relevant latent variables from the signals streamed by the sensory organs. Such latent variables are often encoded in the dynamics that generated the signal rather than in the specific realization of the waveform. Therefore, one problem faced by the brain is to segment time series based on underlying dynamics. We present two algorithms for performing this segmentation task that are biologically plausible, which we define as acting in a streaming setting and all learning rules being local. One algorithm is model based and can be derived from an optimization problem involving a mixture of autoregressive processes. This algorithm relies on feedback in the form of a prediction error and can also be used for forecasting future samples. In some brain regions, such as the retina, the feedback connections necessary to use the prediction error for learning are absent. For this case, we propose a second, model-free algorithm that uses a running estimate of the autocorrelation structure of the signal to perform the segmentation. We show that both algorithms do well when tasked with segmenting signals drawn from autoregressive models with piecewise-constant parameters. In particular, the segmentation accuracy is similar to that obtained from oracle-like methods in which the ground-truth parameters of the autoregressive models are known. We also test our methods on data sets generated by alternating snippets of voice recordings. We provide implementations of our algorithms at https://github.com/ttesileanu/bio-time-series.

Show Abstract

The PETSc Community Is the Infrastructure

Mark Adams, Satish Balay, Oana Marin, Lois Curfman McInnes, Richard Tran Mills, Todd Munson, Hong Zhang, Junchao Zhang, Jed Brown, Victor Eijkhout, Jacob Faibussowitsch, Matthew Knepley, Fande Kong, Scott Kruger, Patrick Sanan, B. Smith, Hong Zhang

The communities who develop and support open source scientific software packages are crucial to the utility and success of such packages. Moreover, these communities form an important part of the human infrastructure that enables scientific progress. This paper discusses aspects of the PETSc (Portable Extensible Toolkit for Scientific Computation) community, its organization, and technical approaches that enable community members to help each other efficiently

Show Abstract
January 4, 2022

How exponentially ill-conditioned are contiguous submatrices of the Fourier matrix?

Linear systems involving contiguous submatrices of the discrete Fourier transform (DFT)matrix arise in many applications, such as Fourier extension, superresolution, and coherent diffraction imaging. We show that the condition number of any such p\times q submatrix of the N\times NDFT matrix is at least exp\bigl( \pi 2\bigl[ min(p,q) - pqN\bigr] \bigr) , up to algebraic prefactors.That is, fixing the shape parameters (\alpha ,\beta ) := (p/N,q/N)\in (0,1)2, the growth ise\rho NasN\rightarrow \infty , the exponential rate being\rho =\pi 2[min(\alpha ,\beta ) - \alpha \beta ]. Our proof uses theKaiser--Bessel transform pair (of which we give a self-contained proof), plus estimates on sums over distorted sinc functions, to construct a localized trial vector whose DFT is also localized. We warm up with an elementary proof of the above but with half the rate, via a periodized Gaussian trial vector. Using low-rank approximation of the kerneleixt, we also prove another lower bound (4/e\pi \alpha )q, up to algebraic prefactors, which is stronger than the above for small\alpha and\beta . When combined, the bounds are within a factor of two ofthe empirical asymptotic rate, uniformly over (0,1)2, and become sharp in certain regions.However, the results are not asymptotic: they apply to essentially allN,p, andq, and with all constants explicit.

Show Abstract

Robust regression with compositional covariates

A. Mishra, C. Müller

Many high-throughput sequencing data sets in biology are compositional in nature. A prominent example is microbiome profiling data, including targeted amplicon-based and metagenomic sequencing data. These profiling data comprises surveys of microbial communities in their natural habitat and sparse proportional (or compositional) read counts that represent operational taxonomic units or genes. When paired measurements of other covariates, including physicochemical properties of the habitat or phenotypic variables of the host, are available, inference of parsimonious and robust statistical relationships between the microbial abundance data and the covariate measurements is often an important first step in exploratory data analysis. To this end, we propose a sparse robust statistical regression framework that considers compositional and non-compositional measurements as predictors and identifies outliers in continuous response variables. Our model extends the seminal log-contrast model of Aitchison and Bacon-Shone (1984) by a mean shift formulation for capturing outliers, sparsity-promoting convex and non-convex penalties for parsimonious model selection, and data-driven robust initialization procedures adapted to the compositional setting. We show, in theory and simulations, the ability of our approach to jointly select a sparse set of predictive microbial features and identify outliers in the response. We illustrate the viability of our method by robustly predicting human body mass indices from American Gut Project amplicon data and non-compositional covariate data. We believe that the robust estimators introduced here and available in the R package RobRegCC can serve as a practical tool for reliable statistical regression analysis of compositional data, including microbiome survey data.

Show Abstract

Solving Fredholm second-kind integral equations with singular right-hand sides on non-smooth boundaries

Johan Helsing, S. Jiang

A numerical scheme is presented for the solution of Fredholm second-kind boundary integral equations with right-hand sides that are singular at a finite set of boundary points. The boundaries themselves may be non-smooth. The scheme, which builds on recursively compressed inverse preconditioning (RCIP), is universal as it is independent of the nature of the singularities. Strong right-hand-side singularities, such as $1/|r|^\alpha$ with $\alpha$ close to $1$, can be treated in full machine precision. Adaptive refinement is used only in the recursive construction of the preconditioner, leading to an optimal number of discretization points and superior stability in the solve phase. The performance of the scheme is illustrated via several numerical examples, including an application to an integral equation derived from the linearized BGKW kinetic equation for the steady Couette flow.

Show Abstract

Economic growth and happiness in China: A Bayesian multilevel age-period-cohort analysis based on the CGSS data 2005–2015

Yu-Sung Su, Donald Lien, Y. Yao

This paper introduces a Bayesian multilevel model based on the age-period-cohort framework to examine Chinese happiness. Using 8 waves of the Chinese General Social Survey (CGSS) data between 2005–2015, the model not only solves the co-linearity problem with weakly informative priors and explicit assumptions, it also produces more computationally stable results. Our estimation results show how Chinese happiness changes in an individual’s life circle and how one’s life experience is accumulated to her/his happiness with cognitive development. We identify some different generation patterns and explain generation differences in happiness across the various birth years with narratives of historical events. This paper contributes to existing studies both theoretically and methodologically. The novel modeling strategy and the analytical framework which assisted with historical narratives altogether explain better the age, period, and cohort effects on Chinese happiness.

Show Abstract
  • Previous Page
  • Viewing
  • Next Page
Advancing Research in Basic Science and MathematicsSubscribe to Flatiron Institute announcements and other foundation updates