2697 Publications

latentcor: An R Package for estimating latent correlations from mixed data types

Mingze Huang, C. Müller, Irina Gaynanova

We present `latentcor`, an R package for correlation estimation from data with mixed variable types. Mixed variable types, including continuous, binary, ordinal, zero-inflated, and truncated data, are routinely collected in many areas of science. Accurate estimation of correlations among such variables is often the first critical step in statistical analysis workflows. Pearson correlation, the default choice, is not well suited for mixed data types because the underlying normality assumption is violated. Semi-parametric latent Gaussian copula models, on the other hand, provide a unifying way to estimate correlations between mixed data types. The R package `latentcor` comprises a comprehensive list of these models, enabling the estimation of correlations between any of continuous/binary/ternary/zero-inflated (truncated) variable types. The underlying implementation takes advantage of a fast multi-linear interpolation scheme with an efficient choice of interpolation grid points, thus giving the package a small memory footprint without compromising estimation accuracy. This makes latent correlation estimation readily available for modern high-throughput data analysis.
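
As a rough illustration of the latent Gaussian copula idea, the continuous/continuous case has a well-known closed form: the latent correlation equals sin(π τ / 2), where τ is Kendall's tau. The Python sketch below is not the `latentcor` implementation (which is an R package and also handles binary, ternary, and truncated types); the function names `kendall_tau`, `latent_correlation_continuous`, and `simulate` are hypothetical and chosen only for this example.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences (O(n^2), no tie correction)."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tied
    return 2.0 * s / (n * (n - 1))


def latent_correlation_continuous(x, y):
    """Sine bridge for a continuous/continuous pair: r = sin(pi/2 * tau)."""
    return math.sin(math.pi / 2.0 * kendall_tau(x, y))


def simulate(rho, n, seed=0):
    """Draw correlated Gaussians, then hide them behind monotone transforms."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        xs.append(math.exp(z1))  # monotone transform of z1
        ys.append(z2 ** 3)       # monotone transform of z2
    return xs, ys


x, y = simulate(rho=0.6, n=400)
r_hat = latent_correlation_continuous(x, y)
print(f"estimated latent correlation: {r_hat:.3f}")
```

Because Kendall's tau depends only on ranks, the estimate is unaffected by the monotone transforms applied in `simulate`, which is the property that makes rank-based estimation attractive when normality is violated.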

August 20, 2021

Evaluating the Arrhenius equation for developmental processes

Joseph Crapse, Nishant Pappireddi, S. Shvartsman, et al.

The famous Arrhenius equation is well suited to describing the temperature dependence of chemical reactions but has also been used for complicated biological processes. Here, we evaluate how well the simple Arrhenius equation predicts complex multi-step biological processes, using frog and fruit fly embryogenesis as two canonical models. We find that the Arrhenius equation provides a good approximation for the temperature dependence of embryogenesis, even though individual developmental intervals scale differently with temperature. At low and high temperatures, however, we observed significant departures from idealized Arrhenius law behavior. When we model multi-step reactions of idealized chemical networks, we are unable to generate comparable deviations from linearity. In contrast, we find the two enzymes GAPDH and β-galactosidase show non-linearity in the Arrhenius plot similar to our observations of embryonic development. Thus, we find that complex embryonic development can be well approximated by the simple Arrhenius equation regardless of non-uniform developmental scaling, and we propose that the observed departure from this law likely results more from non-idealized individual steps than from the complexity of the system.
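
For readers unfamiliar with the Arrhenius plot mentioned above, the following Python sketch shows the standard construction: the rate k = A·exp(−Ea/(R·T)) becomes a straight line when ln k is plotted against 1/T, with slope −Ea/R. The data are synthetic and the helper names (`arrhenius_rate`, `fit_arrhenius`) are hypothetical; this is not the authors' analysis code.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J/(mol K)


def arrhenius_rate(prefactor, activation_energy, temp_kelvin):
    """Arrhenius rate k = A * exp(-Ea / (R * T))."""
    return prefactor * math.exp(-activation_energy / (R_GAS * temp_kelvin))


def fit_arrhenius(temps_kelvin, rates):
    """Least-squares line through (1/T, ln k); returns (A, Ea)."""
    xs = [1.0 / t for t in temps_kelvin]
    ys = [math.log(k) for k in rates]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals -Ea / R
    intercept = y_mean - slope * x_mean    # equals ln A
    return math.exp(intercept), -slope * R_GAS


# Synthetic rates between 10 and 30 degrees Celsius (illustrative values only).
temps = [283.15 + 2.0 * i for i in range(11)]
true_A, true_Ea = 1.0e10, 60_000.0
rates = [arrhenius_rate(true_A, true_Ea, t) for t in temps]

fit_A, fit_Ea = fit_arrhenius(temps, rates)
print(f"fitted Ea = {fit_Ea:.1f} J/mol")
```

Curvature in such a plot, i.e. systematic residuals from the fitted line at the temperature extremes, is the kind of departure from linearity the abstract describes.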


From complex datasets to predictive models of embryonic development

Sayantan Dutta, Aleena L. Patel, Shannon E. Keenan, S. Shvartsman

Modern studies of embryogenesis are increasingly quantitative, powered by rapid advances in imaging, sequencing and genome manipulation technologies. Deriving mechanistic insights from the complex datasets generated by these new tools requires systematic approaches for data-driven analysis of the underlying developmental processes. Here, we use data from our work on signal-dependent gene repression in the Drosophila embryo to illustrate how computational models can compactly summarize quantitative results of live imaging, chromatin immunoprecipitation and optogenetic perturbation experiments. The presented computational approach is ideally suited for integrating rapidly accumulating quantitative data and for guiding future studies of embryogenesis.


A Holistic Review of a Galactic Interaction

Douglas Grion Filho, K. Johnston, Eloisa Poggio, Chervin F. P. Laporte, Ronald Drimmel, Elena D'Onghia

Our situation as occupants of the Milky Way (MW) Galaxy, bombarded by the Sagittarius dwarf galaxy, provides an intimate view of physical processes that can lead to the dynamical heating of a galactic disc. While this evolution is instigated by Sagittarius, it is also driven by the intertwined influences of the dark matter halo and the disc itself. We analyse an N-body simulation following a Sagittarius-like galaxy interacting with a MW-like host to disentangle these different influences during the stages of a minor merger. The accelerations in the disc plane from each component are calculated for each snapshot in the simulation, and then decomposed into Fourier series on annuli. The analysis maps quantify and compare the scales of the individual contributions over space and through time: (i) accelerations due to the satellite are only important around disc passages; (ii) the influence around these passages is enhanced and extended by the distortion of the dark matter halo; (iii) the interaction drives disc asymmetries within and perpendicular to the plane, and the self-gravity of these distortions increases in importance with time, eventually leading to the formation of a bar. These results have interesting implications for identifying different influences within our own Galaxy. Currently, Sagittarius is close enough to a plane crossing to search for localized signatures of its effect at intermediate radii; the distortion of the MW's dark matter halo should leave its imprint in the outer disc; and the disc's own self-consistent response is sculpting the intermediate and inner disc.
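
The decomposition "into Fourier series on annuli" can be pictured with a minimal Python sketch. It assumes a quantity (for example, one acceleration component) sampled at equally spaced azimuths on a single annulus and returns the complex amplitude of each azimuthal mode m. The sampling scheme and the function name `azimuthal_modes` are assumptions made for illustration; the abstract does not specify how the authors bin or normalize their data.

```python
import cmath
import math


def azimuthal_modes(values, max_m):
    """Complex Fourier coefficients c_m (m = 0..max_m) of samples on one annulus.

    `values[k]` is assumed to be taken at azimuth phi_k = 2*pi*k/n.
    For a real signal, the amplitude of the cos/sin pattern at m >= 1 is 2*|c_m|.
    """
    n = len(values)
    coeffs = []
    for m in range(max_m + 1):
        total = sum(
            v * cmath.exp(-1j * m * 2.0 * math.pi * k / n)
            for k, v in enumerate(values)
        )
        coeffs.append(total / n)
    return coeffs


# Synthetic annulus: a uniform offset plus a two-armed (m = 2) pattern.
n_samples = 64
samples = [
    2.0 + 3.0 * math.cos(2.0 * (2.0 * math.pi * k / n_samples))
    for k in range(n_samples)
]
modes = azimuthal_modes(samples, max_m=4)
for m, c in enumerate(modes):
    print(m, round(abs(c), 6))
```

Repeating this on a set of annuli and for every snapshot would give maps of mode strength over radius and time, which is the kind of comparison the abstract refers to; an m = 2 signal is the mode typically associated with a bar.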


A Biologically Plausible Neural Network for Multichannel Canonical Correlation Analysis

David Lipshutz, Y. Bahroun, Siavash Golkar, A. Sengupta, Dmitri B. Chklovskii

Cortical pyramidal neurons receive inputs from multiple distinct neural populations and integrate these inputs in separate dendritic compartments. We explore the possibility that cortical microcircuits implement canonical correlation analysis (CCA), an unsupervised learning method that projects the inputs onto a common subspace so as to maximize the correlations between the projections. To this end, we seek a multichannel CCA algorithm that can be implemented in a biologically plausible neural network. For biological plausibility, we require that the network operates in the online setting and its synaptic update rules are local. Starting from a novel CCA objective function, we derive an online optimization algorithm whose optimization steps can be implemented in a single-layer neural network with multicompartmental neurons and local non-Hebbian learning rules. We also derive an extension of our online CCA algorithm with adaptive output rank and output whitening. Interestingly, the extension maps onto a neural network whose neural architecture and synaptic updates resemble neural circuitry and non-Hebbian plasticity observed in the cortex.


An Improved and Physically Motivated Scheme for Matching Galaxies with Dark Matter Halos

S. Tonnesen, Jeremiah P. Ostriker

The simplest scheme for predicting real galaxy properties after performing a dark matter simulation is to rank order the real systems by stellar mass and the simulated systems by halo mass and then simply assume monotonicity, i.e., that the more massive halos host the more massive galaxies. This has had some success, but we study here if a better motivated and more accurate matching scheme is easily constructed by looking carefully at how well one could predict the simulated IllustrisTNG galaxy sample from its dark matter computations. We find that using the dark matter rotation curve peak velocity, v_max, for normal galaxies reduces the error of the prediction by 30% (18% for central galaxies and 60% for satellite systems), following expectations from the physics of monolithic collapse. For massive systems with halo mass > 10^12.5 M⊙, hierarchical merger driven formation is the better model and dark matter halo mass remains the best single metric. Using a new single variable that combines these effects, ϕ = v_max/v_max,12.7 + M_peak/(10^12.7 M⊙), allows further improvement and reduces the error, as compared to ranking by dark matter mass at z=0, by another 6% from v_max ranking. Two-parameter fits, including environmental effects, produce only minimal further impact.
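
The rank-ordering scheme described above can be sketched in a few lines of Python. The sketch assigns the largest stellar mass to the halo with the largest value of a chosen proxy, the second largest to the second, and so on. The function names, the toy numbers, and in particular the reference velocity `vmax_ref` (standing in for v_max,12.7, which the abstract does not define numerically) are assumptions for illustration only.

```python
def combined_proxy(vmax, mpeak, vmax_ref, mass_ref=10 ** 12.7):
    """Single ranking variable of the form vmax/vmax_ref + mpeak/mass_ref.

    `vmax_ref` is a placeholder normalization supplied by the caller;
    `mass_ref` defaults to 10^12.7 in solar masses.
    """
    return vmax / vmax_ref + mpeak / mass_ref


def rank_order_match(halo_proxy, stellar_masses):
    """Give the i-th largest stellar mass to the halo with the i-th largest proxy."""
    if len(halo_proxy) != len(stellar_masses):
        raise ValueError("need one stellar mass per halo")
    order = sorted(range(len(halo_proxy)), key=lambda i: halo_proxy[i], reverse=True)
    ranked_masses = sorted(stellar_masses, reverse=True)
    assigned = [None] * len(halo_proxy)
    for rank, halo_index in enumerate(order):
        assigned[halo_index] = ranked_masses[rank]
    return assigned


# Toy example: three halos with (vmax in km/s, peak mass in solar masses).
halos = [(150.0, 1.0e12), (320.0, 8.0e12), (90.0, 2.0e11)]
proxies = [combined_proxy(v, m, vmax_ref=250.0) for v, m in halos]
stellar = [3.0e9, 5.0e10, 2.0e11]  # unordered observed stellar masses
matched = rank_order_match(proxies, stellar)
print(matched)
```

Swapping the proxy passed to `rank_order_match` (halo mass, v_max, or a combined variable) is the only change needed to compare ranking choices in this toy setting.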


A Solar System formation analogue in the Ophiuchus star-forming complex

J. Forbes, João Alves, Douglas N. C. Lin

Anomalies among the daughter nuclei of the extinct short-lived radionuclides (SLRs) in the calcium-aluminum-rich inclusions (CAIs) indicate that the Solar System must have been born near a source of the SLRs so that they could be incorporated before they decayed away. γ-rays from one such living SLR, ²⁶Al, are detected in only a few nearby star-forming regions. Here we employ multi-wavelength observations to demonstrate that one such region, Ophiuchus, containing many pre-stellar cores that may serve as analogs for the emerging Solar System, is inundated with ²⁶Al from the neighboring Upper-Scorpius association, and so may provide concrete guidance for how SLR enrichment proceeded in the Solar System complementary to the meteoritics. We demonstrate via Bayesian forward modeling drawing on a wide range of observational and theoretical results that this ²⁶Al likely 1) arises from supernova explosions, 2) arises from multiple stars, 3) has enriched the gas prior to the formation of the cores, and 4) gives rise to a broad distribution of core enrichment spanning about two orders of magnitude. This means that if the spread in CAI ages is small, as it is in the Solar System, protoplanetary disks must suffer a global heating event.


Phase Retrieval with Holography and Untrained Priors: Tackling the Challenges of Low-Photon Nanoscale Imaging

H. Lawrence, D. Barmherzig, Henry Li, M. Eickenberg, M. Gabrié

Phase retrieval is the inverse problem of recovering a signal from magnitude-only Fourier measurements, and underlies numerous imaging modalities, such as Coherent Diffraction Imaging (CDI). A variant of this setup, known as holography, includes a reference object that is placed adjacent to the specimen of interest before measurements are collected. The resulting inverse problem, known as holographic phase retrieval, is well-known to have improved problem conditioning relative to the original. This innovation, i.e. Holographic CDI, becomes crucial at the nanoscale, where imaging specimens such as viruses, proteins, and crystals require low-photon measurements. This data is highly corrupted by Poisson shot noise, and often lacks low-frequency content as well. In this work, we introduce a dataset-free deep learning framework for holographic phase retrieval adapted to these challenges. The key ingredients of our approach are the explicit and flexible incorporation of the physical forward model into an automatic differentiation procedure, the Poisson log-likelihood objective function, and an optional untrained deep image prior. We perform extensive evaluation under realistic conditions. Compared to competing classical methods, our method recovers signal from higher noise levels and is more resilient to suboptimal reference design, as well as to large missing regions of low frequencies in the observations. To the best of our knowledge, this is the first work to consider a dataset-free machine learning approach for holographic phase retrieval.


A characteristic optical variability time scale in astrophysical accretion disks

Colin J. Burke, Yue Shen, Omer Blaes, ..., Y. Jiang, et al.

Accretion disks around supermassive black holes in active galactic nuclei produce continuum radiation at ultraviolet and optical wavelengths. Physical processes in the accretion flow lead to stochastic variability of this emission on a wide range of time scales. We measured the optical continuum variability observed in 67 active galactic nuclei and the characteristic time scale at which the variability power spectrum flattens. We found a correlation between this time scale and the black hole mass extending over the entire mass range of supermassive black holes. This time scale is consistent with the expected thermal time scale at the ultraviolet-emitting radius in standard accretion disk theory. Accreting white dwarfs lie close to this correlation, suggesting a common process for all accretion disks.


Lévy Walks and Path Chaos in the Dispersal of Elongated Structures Moving across Cellular Vortical Flows

Shi-Yuan Hu, Jun-Jun Chu, M. Shelley, Jun Zhang

In cellular vortical flows, namely arrays of counterrotating vortices, short but flexible filaments can show simple random walks through their stretch-coil interactions with flow stagnation points. Here, we study the dynamics of semirigid filaments long enough to broadly sample the vortical field. Using simulation, we find a surprising variety of long-time transport behavior—random walks, ballistic transport, and trapping—depending upon the filament’s relative length and effective flexibility. Moreover, we find that filaments execute Lévy walks whose diffusion exponents generally decrease with increasing filament length, until transitioning to Brownian walks. Lyapunov exponents likewise increase with length. Even completely rigid filaments, whose dynamics is finite dimensional, show a surprising variety of transport states and chaos. Fast filament dispersal is related to an underlying geometry of “conveyor belts.” Evidence for these various transport states is found in experiments using arrays of counterrotating rollers, immersed in a fluid and transporting a flexible ribbon.
