Publications

Bursting Bubbles: Feedback from Clustered Supernovae and the Trade-off Between Turbulence and Outflows

M. Orr, D. Fielding, C. Hayward, B. Burkart

We present an analytic model for clustered supernovae (SNe) feedback in galaxy disks, incorporating the dynamical evolution of superbubbles formed from spatially overlapping SNe remnants. We propose two realistic outcomes for the evolution of superbubbles in galactic disks: (1) the expansion velocity of the shock front falls below the turbulent velocity dispersion of the ISM in the galaxy disk, whereupon the superbubble stalls and fragments, depositing its momentum entirely within the galaxy disk, or (2) the superbubble grows in size to reach the gas scale height, breaking out of the galaxy disk and driving galactic outflows/fountains. In either case, we find that superbubble breakup/breakout almost always occurs before the last Type-II SN (≲40 Myr) in the recently formed star cluster, assuming a standard high-end IMF slope, and scalings between stellar lifetimes and masses. The threshold between these two cases implies a break in the effective strength of feedback in driving turbulence within galaxies, and a resulting change in the scalings of, e.g., star formation rates with gas surface density (the Kennicutt-Schmidt relation) and the star formation efficiency in galaxy disks.

Show Abstract

Stability selection enables robust learning of differential equations from limited noisy data

S. Maddu, Bevan L. Cheeseman , Ivo F. Sbalzarini, C. Müller

We present a statistical learning framework for robust identification of differential equations from noisy spatio-temporal data. We address two issues that have so far limited the application of such methods, namely their robustness against noise and the need for manual parameter tuning, by proposing stability-based model selection to determine the level of regularization required for reproducible inference. This avoids manual parameter tuning and improves robustness against noise in the data. Our stability selection approach, termed PDE-STRIDE, can be combined with any sparsity-promoting regression method and provides an interpretable criterion for model component importance. We show that the particular combination of stability selection with the iterative hard-thresholding algorithm from compressed sensing provides a fast and robust framework for equation inference that outperforms previous approaches with respect to accuracy, amount of data required, and robustness. We illustrate the performance of PDE-STRIDE on a range of simulated benchmark problems, and we demonstrate the applicability of PDE-STRIDE on real-world data by considering purely data-driven inference of the protein interaction network for embryonic polarization in Caenorhabditis elegans. Using fluorescence microscopy images of C. elegans zygotes as input data, PDE-STRIDE is able to learn the molecular interactions of the proteins.

Show Abstract

A Bayesian Population Model for the Observed Dust Attenuation in Galaxies

Gautam Nagaraj, J. Forbes, Joel Leja, D. Foreman-Mackey, C. Hayward

Dust plays a pivotal role in determining the observed spectral energy distribution (SED) of galaxies. Yet our understanding of dust attenuation is limited and our observations suffer from the dust-metallicity-age degeneracy in SED fitting (single galaxies), large individual variances (ensemble measurements), and the difficulty in properly dealing with uncertainties (statistical considerations). In this study, we create a population Bayesian model to rigorously account for correlated variables and non-Gaussian error distributions and demonstrate the improvement over a simple Bayesian model. We employ a flexible 5-D linear interpolation model for the parameters that control dust attenuation curves as a function of stellar mass, star formation rate (SFR), metallicity, redshift, and inclination. Our setup allows us to determine the complex relationships between dust attenuation and these galaxy properties simultaneously. Using Prospector fits of nearly 30,000 3D-HST galaxies, we find that the attenuation slope (n) flattens with increasing optical depth (τ), though less so than in previous studies. τ increases strongly with SFR, though when log SFR≲0, τ remains roughly constant over a wide range of stellar masses. Edge-on galaxies tend to have larger τ than face-on galaxies, but only for log M∗≳10, reflecting the lack of triaxiality for low-mass galaxies. Redshift evolution of dust attenuation is strongest for low-mass, low-SFR galaxies, with higher optical depths but flatter curves at high redshift. Finally, n has a complex relationship with stellar mass, highlighting the intricacies of the star-dust geometry. We have publicly released software (this https URL) for users to access our population model.

Show Abstract

Simple lessons from complex learning: what a neural network model learns about cosmic structure formation

D. Jamieson, Y. Li, S. He, F. Villaescusa-Navarro, S. Ho, R. Alves de Oliveira, D. Spergel

We train a neural network model to predict the full phase space evolution of cosmological N-body simulations. Its success implies that the neural network model is accurately approximating the Green's function expansion that relates the initial conditions of the simulations to its outcome at later times in the deeply nonlinear regime. We test the accuracy of this approximation by assessing its performance on well understood simple cases that have either known exact solutions or well understood expansions. These scenarios include spherical configurations, isolated plane waves, and two interacting plane waves: initial conditions that are very different from the Gaussian random fields used for training. We find our model generalizes well to these well understood scenarios, demonstrating that the networks have inferred general physical principles and learned the nonlinear mode couplings from the complex, random Gaussian training data. These tests also provide a useful diagnostic for finding the model's strengths and weaknesses, and identifying strategies for model improvement. We also test the model on initial conditions that contain only transverse modes, a family of modes that differ not only in their phases but also in their evolution from the longitudinal growing modes used in the training set. When the network encounters these initial conditions that are orthogonal to the training set, the model fails completely. In addition to these simple configurations, we evaluate the model's predictions for the density, displacement, and momentum power spectra with standard initial conditions for N-body simulations. We compare these summary statistics against N-body results and an approximate, fast simulation method called COLA. Our model achieves percent level accuracy at nonlinear scales of $$k ∼ 1 Mpc −1 h,$$ representing a significant improvement over COLA.

Show Abstract

Parallel Discrete Convolutions on Adaptive Particle Representations of Images

Joel Jonsson, S. Maddu, et al.

We present data structures and algorithms for native implementations of discrete convolution operators over Adaptive Particle Representations (APR) of images on parallel computer architectures. The APR is a content-adaptive image representation that locally adapts the sampling resolution to the image signal. It has been developed as an alternative to pixel representations for large, sparse images as they typically occur in fluorescence microscopy. It has been shown to reduce the memory and runtime costs of storing, visualizing, and processing such images. This, however, requires that image processing natively operates on APRs, without intermediately reverting to pixels. Designing efficient and scalable APR-native image processing primitives, however, is complicated by the APR’s irregular memory structure. Here, we provide the algorithmic building blocks required to efficiently and natively process APR images using a wide range of algorithms that can be formulated in terms of discrete convolutions. We show that APR convolution naturally leads to scale-adaptive algorithms that efficiently parallelize on multi-core CPU and GPU architectures. We quantify the speedups in comparison to pixel-based algorithms and convolutions on evenly sampled data. We achieve pixel-equivalent throughputs of up to 1TB/s on a single Nvidia GeForce RTX 2080 gaming GPU, requiring up to two orders of magnitude less memory than a pixel-based implementation.

Show Abstract

Discrete Lehmann representation of imaginary time Green’s functions

J. Kaye, K. Chen, O. Parcollet

We present an efficient basis for imaginary time Green's functions based on a low rank decomposition of the spectral Lehmann representation. The basis functions are simply a set of well-chosen exponentials, so the corresponding expansion may be thought of as a discrete form of the Lehmann representation using an effective spectral density which is a sum of δ functions. The basis is determined only by an upper bound on the product βωmax, with β the inverse temperature and ωmax an energy cutoff, and a user-defined error tolerance ϵ. The number r of basis functions scales as (log(βωmax)log(1/ϵ)). The discrete Lehmann representation of a particular imaginary time Green's function can be recovered by interpolation at a set of r imaginary time nodes. Both the basis functions and the interpolation nodes can be obtained rapidly using standard numerical linear algebra routines. Due to the simple form of the basis, the discrete Lehmann representation of a Green's function can be explicitly transformed to the Matsubara frequency domain, or obtained directly by interpolation on a Matsubara frequency grid. We benchmark the efficiency of the representation on simple cases, and with a high precision solution of the Sachdev-Ye-Kitaev equation at low temperature. We compare our approach with the related intermediate representation method, and introduce an improved algorithm to build the intermediate representation basis and a corresponding sampling grid.

Show Abstract

HoloML in Stan: Low-photon Image Reconstruction

B. Ward, B. Carpenter, D. Barmherzig

This case study is a reimplementation of the algorithm described in Barmherzig and Sun (2022) [1] as a Stan model. This requires the new features available in Stan 2.30

Show Abstract

Field Level Neural Network Emulator for Cosmological N-body Simulations

D. Jamieson, Y. Li, R. Alves de Oliveira, F. Villaescusa-Navarro, S. Ho, D. Spergel

We build a field level emulator for cosmic structure formation that is accurate in the nonlinear regime. Our emulator consists of two convolutional neural networks trained to output the nonlinear displacements and velocities of N-body simulation particles based on their linear inputs. Cosmology dependence is encoded in the form of style parameters at each layer of the neural network, enabling the emulator to effectively interpolate the outcomes of structure formation between different flat ΛCDM cosmologies over a wide range of background matter densities. The neural network architecture makes the model differentiable by construction, providing a powerful tool for fast field level inference. We test the accuracy of our method by considering several summary statistics, including the density power spectrum with and without redshift space distortions, the displacement power spectrum, the momentum power spectrum, the density bispectrum, halo abundances, and halo profiles with and without redshift space distortions. We compare these statistics from our emulator with the full N-body results, the COLA method, and a fiducial neural network with no cosmological dependence. We find our emulator gives accurate results down to scales of $$k ∼ 1 Mpc −1 h,$$ representing a considerable improvement over both COLA and the fiducial neural network. We also demonstrate that our emulator generalizes well to initial conditions containing primordial non-Gaussianity, without the need for any additional style parameters or retraining.

Show Abstract

A reference tissue atlas for the human kidney

Jens Hansen, R. Sealfon, O. Troyanskaya, et al.

Kidney Precision Medicine Project (KPMP) is building a spatially specified human kidney tissue atlas in health and disease with single-cell resolution. Here, we describe the construction of an integrated reference map of cells, pathways, and genes using unaffected regions of nephrectomy tissues and undiseased human biopsies from 56 adult subjects. We use single-cell/nucleus transcriptomics, subsegmental laser microdissection transcriptomics and proteomics, near-single-cell proteomics, 3D and CODEX imaging, and spatial metabolomics to hierarchically identify genes, pathways, and cells. Integrated data from these different technologies coherently identify cell types/subtypes within different nephron segments and the interstitium. These profiles describe cell-level functional organization of the kidney following its physiological functions and link cell subtypes to genes, proteins, metabolites, and pathways. They further show that messenger RNA levels along the nephron are congruent with the subsegmental physiological activity. This reference atlas provides a framework for the classification of kidney disease when multiple molecular mechanisms underlie convergent clinical phenotypes.

Show Abstract

Randomized Iterative Methods for Low-Complexity Large-Scale MIMO Detection

Zheng Wang, R. M. Gower, Yili Xia, Lanxin He, Yongming Huang

In this paper, we introduce a randomized iterative method for signal detection in uplink large-scale multiple-input multiple-output (MIMO) systems, which not only achieves a low computational complexity but also enjoys a global and exponentially fast convergence. First of all, by adopting the random sampling into the iterations, the randomized iterative detection algorithm (RIDA) is proposed for large-scale MIMO systems. We show that RIDA converges exponentially fast in terms of mean squared error (MSE). Furthermore, this global convergence always holds, and does not depend on the standard requirements such as N≫K , where N and K denote the numbers of antennas at the sides of base station and users. This broadly extends the applications of low-complexity detection in uplink large-scale MIMO systems. Then, based on a new conditional sampling, optimization and enhancements are given to further improve both the convergence and efficiency of RIDA, resulting in the modified randomized iterative detection algorithm (MRIDA). Meanwhile, with respect to MRIDA, further complexity reduction by exploiting the matrix structure is given while its implementation by deep neural networks (DNN) is also presented for a better detection performance.

Show Abstract