2573 Publications

CosmoFlow: Using Deep Learning to Learn the Universe at Scale

A. Mathuriya, D. Bard, P. Mendygral, L. Meadows, J. Arnemann, L. Shao, S. He, T. Karna, D. Moise, S. Pennycook, K. Maschoff, J. Sewall, N. Kumar, S. Ho, M. Ringenburg, Prabhat, V. Lee

Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many element-wise operations, to improve training performance on Intel® Xeon Phi™ processors. We also utilize the Cray PE Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate fully synchronous data-parallel training on 8192 nodes of Cori with 77% parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our knowledge, this is the first large-scale science application of the TensorFlow framework at supercomputer scale with fully synchronous training. These enhancements enable us to process large 3D dark matter distributions and predict the cosmological parameters $$\Omega_M$$, $$\sigma_8$$ and $$n_s$$ with unprecedented accuracy.

August 14, 2018
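
The scaling figures quoted in the CosmoFlow abstract above can be put in perspective with a little arithmetic. The sketch below is illustrative only: it assumes parallel efficiency is defined as measured throughput divided by ideal linear-scaling throughput (the abstract does not state the definition), and the function names are hypothetical.

```python
def parallel_efficiency(throughput_n: float, throughput_1: float, n_nodes: int) -> float:
    """Measured throughput on n_nodes relative to ideal linear scaling.

    Assumed definition: efficiency = throughput_n / (n_nodes * throughput_1).
    """
    return throughput_n / (n_nodes * throughput_1)


def per_node_rate(total_rate: float, n_nodes: int) -> float:
    """Average sustained rate per node, in the same units as total_rate."""
    return total_rate / n_nodes


if __name__ == "__main__":
    nodes = 8192          # from the abstract
    sustained = 3.5e15    # 3.5 Pflop/s, from the abstract
    efficiency = 0.77     # from the abstract

    # Average sustained rate per node, in Gflop/s.
    print(per_node_rate(sustained, nodes) / 1e9)
    # Number of ideally scaling nodes that would deliver the same throughput.
    print(efficiency * nodes)
```

Under this assumed definition, 77% efficiency on 8192 nodes corresponds to the throughput of roughly 6300 perfectly scaling nodes.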

Transcriptome analysis of adult Caenorhabditis elegans cells reveals tissue-specific gene and isoform expression.

R. Kaletsky, V. Yao, A. Williams, A. Runnels, A. Tadych, S. Zhou, O. Troyanskaya, C. Murphy

The biology and behavior of adults differ substantially from those of developing animals, and cell-specific information is critical for deciphering the biology of multicellular animals. Thus, adult tissue-specific transcriptomic data are critical for understanding molecular mechanisms that control their phenotypes. We used adult cell-specific isolation to identify the transcriptomes of C. elegans' four major tissues (or "tissue-ome"), identifying ubiquitously expressed and tissue-specific "enriched" genes. These data newly reveal the hypodermis' metabolic character, suggest potential worm-human tissue orthologies, and identify tissue-specific changes in the Insulin/IGF-1 signaling pathway. Tissue-specific alternative splicing analysis identified a large set of collagen isoforms. Finally, we developed a machine learning-based prediction tool for 76 sub-tissue cell types, which we used to predict cellular expression differences in IIS/FOXO signaling, stage-specific TGF-β activity, and basal vs. memory-induced CREB transcription. Together, these data provide a rich resource for understanding the biology governing multicellular adult animals.

August 10, 2018

Bridging Star-forming Galaxy and AGN Ultraviolet Luminosity Functions at z = 4 with the SHELA Wide-field Survey

Matthew L. Stevans, Steven L. Finkelstein, Isak Wold, ..., R. Somerville, et al.

We present a joint analysis of the rest-frame ultraviolet (UV) luminosity functions of continuum-selected star-forming galaxies and galaxies dominated by active galactic nuclei (AGNs) at z ∼ 4. These 3,740 z ∼ 4 galaxies are selected from broad-band imaging in nine photometric bands over 18 deg$$^2$$ in the Spitzer/HETDEX Exploratory Large Area Survey (SHELA) field. The large area and moderate depth of our survey provide a unique view of the intersection between the bright end of the galaxy UV luminosity function ($$M_{\rm AB} < -22$$) and the faint end of the AGN UV luminosity function. We do not separate AGN-dominated galaxies from star-formation-dominated galaxies, but rather fit both luminosity functions simultaneously. These functions are best fit with a double power-law (DPL) for both the galaxy and AGN components, where the galaxy bright-end slope has a power-law index of $$-3.80 \pm 0.10$$, and the corresponding AGN faint-end slope is $$\alpha_{\rm AGN} = -1.49^{+0.30}_{-0.21}$$. We cannot rule out a Schechter-like exponential decline for the galaxy UV luminosity function, and in this scenario the AGN luminosity function has a steeper faint-end slope of $$-2.08^{+0.18}_{-0.11}$$. Comparison of our galaxy luminosity function results with a representative cosmological model of galaxy formation suggests that the molecular gas depletion time must be shorter, implying that star formation is more efficient in bright galaxies at z = 4 than at the present day. If the galaxy luminosity function does indeed have a power-law shape at the bright end, the implied ionizing emissivity from AGNs is not inconsistent with previous observations. However, if the underlying galaxy distribution is Schechter, it implies a significantly higher ionizing emissivity from AGNs at this epoch.
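
For readers unfamiliar with the double power-law (DPL) form mentioned in the abstract above, here is a minimal sketch of the parameterization commonly used for luminosity functions in absolute magnitude. Only the bright-end index of −3.80 comes from the abstract. The normalization, characteristic magnitude and faint-end slope below are hypothetical placeholders, and the paper's exact parameterization may differ.

```python
import math


def double_power_law(M: float, phi_star: float, M_star: float,
                     alpha: float, beta: float) -> float:
    """Commonly used DPL luminosity function in absolute magnitude M.

    phi(M) = phi_star / (10**(0.4*(alpha+1)*(M-M_star))
                         + 10**(0.4*(beta+1)*(M-M_star)))

    alpha is the faint-end slope, beta the bright-end slope.
    """
    dM = M - M_star
    faint_term = 10.0 ** (0.4 * (alpha + 1.0) * dM)
    bright_term = 10.0 ** (0.4 * (beta + 1.0) * dM)
    return phi_star / (faint_term + bright_term)


if __name__ == "__main__":
    # Hypothetical illustration values, except beta (quoted in the abstract).
    phi_star, M_star, alpha, beta = 1e-3, -21.0, -1.6, -3.80
    for M in (-24.0, -23.0, -22.0, -21.0, -20.0):
        print(M, double_power_law(M, phi_star, M_star, alpha, beta))
```

Well brightward of the characteristic magnitude, the logarithmic slope d log10(phi)/dM tends to −0.4(beta + 1), which is how the bright-end index controls how quickly bright galaxies become rare.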


Detection of the Milky Way spiral arms in dust from 3D mapping

Sara Rezaei Kh., Coryn A.L. Bailer-Jones, D. Hogg, Mathias Schultheis

Large stellar surveys are sensitive to interstellar dust through the effects of reddening. Using extinctions measured from photometry and spectroscopy, together with three-dimensional (3D) positions of individual stars, it is possible to construct a three-dimensional dust map. We present the first continuous map of the dust distribution in the Galactic disk out to 7 kpc within 100 pc of the Galactic midplane, using red clump and giant stars from SDSS APOGEE DR14. We use a non-parametric method based on Gaussian Processes to map the dust density, which is the local property of the ISM rather than an integrated quantity. This method models the dust correlation between points in 3D space and can capture arbitrary variations, unconstrained by a pre-specified functional form. This produces a continuous map without line-of-sight artefacts. Our resulting map traces some features of the local Galactic spiral arms, even though the model contains no prior suggestion of spiral arms, nor any underlying model for the Galactic structure. This is the first time that such evident arm structures have been captured by a dust density map in the Milky Way. Our resulting map also traces some of the known giant molecular clouds in the Galaxy and puts some constraints on their distances, some of which were hitherto relatively uncertain.

July 31, 2018

Single-atom-resolved probing of lattice gases in momentum space

Hugo Cayla, Cécile Carcy, Quentin Bouton, Rockson Chang, G. Carleo, Marco Mancini, David Clément

Measuring the full distribution of individual particles is of fundamental importance to characterize many-body quantum systems through correlation functions at any order. Here, we demonstrate the possibility to reconstruct the momentum-space distribution of three-dimensional interacting lattice gases atom by atom. This is achieved by detecting individual metastable $$^4\mathrm{He}^*$$ atoms in the far-field regime of expansion, when released from an optical lattice. We benchmark our technique with quantum Monte Carlo calculations, demonstrating the ability to resolve momentum distributions of superfluids occupying $$10^5$$ lattice sites. It permits a direct measure of the condensed fraction across phase transitions, as we illustrate on the superfluid-to-normal transition. Our single-atom-resolved approach opens a route to investigate interacting lattice gases through momentum correlations.


Voltage induced metal-insulator transition in 1D charge density wave

Giuliano Chiriacò, A. Millis

We present a theoretical investigation of the voltage-driven metal-insulator transition based on solving coupled Boltzmann and Hartree-Fock equations to determine the insulating gap and the electron distribution in a model system -- a one-dimensional charge density wave. Electric fields that are parametrically small relative to energy gaps can shift the electron distribution away from the momentum-space region where interband relaxation is efficient, leading to a highly non-equilibrium quasiparticle distribution even in the absence of Zener tunnelling. The gap equation is found to have regions of multistability; a non-equilibrium analogue of the free energy is constructed and used to determine which phase is preferred.


Foreground Biases on Primordial Non-Gaussianity Measurements from the CMB Temperature Bispectrum: Implications for Planck and Beyond

The cosmic microwave background (CMB) temperature bispectrum is currently the most precise tool for constraining primordial non-Gaussianity (NG). The Planck temperature data tightly constrain the amplitude of local-type NG: $$f_{\rm NL}^{\rm loc} = 2.5 \pm 5.7$$. Here, we compute previously-neglected foreground biases in temperature-based $$f_{\rm NL}^{\rm loc}$$ measurements. We consider the integrated Sachs-Wolfe (ISW) effect, gravitational lensing, the thermal (tSZ) and kinematic Sunyaev-Zel'dovich (kSZ) effects, and the cosmic infrared background (CIB). In standard analyses, a significant foreground bias arising from the ISW-lensing bispectrum is subtracted from the $$f_{\rm NL}^{\rm loc}$$ measurement. However, many other terms sourced by the ISW, lensing, tSZ, kSZ, and CIB fields are also present in the temperature bispectrum. We compute the dominant biases on $$f_{\rm NL}^{\rm loc}$$ arising from these signals. Most of the biases are non-blackbody, and are thus reduced by multifrequency component separation methods; however, recent analyses have found that extragalactic foregrounds are present at non-negligible levels in the Planck component-separated maps. Moreover, the Planck FFP8 simulations do not include the correlations amongst components that are responsible for these biases. We compute the biases for individual frequencies, finding that some are comparable to the statistical error bar on $$f_{\rm NL}^{\rm loc}$$, even for the main CMB channels (100, 143, and 217 GHz). For future experiments, they can greatly exceed the statistical error bar (considering temperature only). A full assessment will require calculations in tandem with component separation, ideally using simulations. Similar biases will also afflict measurements of equilateral and orthogonal NG, as well as trispectrum NG. We conclude that the search for primordial NG using Planck data may not yet be over.

July 19, 2018

Nature of the metal-insulator transition in few-unit-cell-thick LaNiO3 films

Maryam Golalikhani, Qun-li Lei, Ravini U. Chandrasena, Leila Kasaei, Hyunggyu Park, Jianming Bai, Pasquale Origiani, Jim Ciston, George E. Sterbinsky, Dario A. Arena, Padraic Shafer, Elke Arenholz, Bruce A. Davidson, A. Millis, Alexander X. Gray, Xiaoxing Xi

The nature of the metal-insulator transition in thin films and superlattices of LaNiO3 only a few unit cells thick remains elusive despite tremendous effort. Quantum confinement and epitaxial strain have been invoked as the mechanisms, although other factors such as growth-induced disorder, cation non-stoichiometry, oxygen vacancies, and substrate-film interface quality may also affect the observable properties of the ultrathin films. Here we report results obtained for near-ideal LaNiO3 films with different thicknesses and terminations grown by atomic layer-by-layer laser molecular beam epitaxy on LaAlO3 substrates. We find that the room-temperature metallic behavior persists until the film thickness is reduced to an unprecedentedly small 1.5 unit cells (NiO2 termination). Electronic structure measurements using x-ray absorption spectroscopy and first-principles calculations suggest that oxygen vacancies existing in the films also contribute to the metal-insulator transition.


A unified integral equation scheme for doubly-periodic Laplace and Stokes boundary value problems in two dimensions

A. Barnett, G. Marple, S. Veerapaneni, L. Zhao

We present a spectrally-accurate scheme to turn a boundary integral formulation for an elliptic PDE on a single unit cell geometry into one for the fully periodic problem. Applications include computing the effective permeability of composite media (homogenization), and microfluidic chip design. Our basic idea is to exploit a small least squares solve to apply periodicity without ever handling periodic Green's functions. We exhibit fast solvers for the two-dimensional (2D) doubly-periodic Neumann Laplace problem (flow around insulators), and Stokes non-slip fluid flow problem, that for inclusions with smooth boundaries achieve 12-digit accuracy, and can handle thousands of inclusions per unit cell. We split the infinite sum over the lattice of images into a directly-summed "near" part plus a small number of auxiliary sources which represent the (smooth) remaining "far" contribution. Applying physical boundary conditions on the unit cell walls gives an expanded linear system, which, after a rank-1 or rank-3 correction and a Schur complement, leaves a well-conditioned square system which can be solved iteratively using fast multipole acceleration plus a low-rank term. We are rather explicit about the consistency and nullspaces of both the continuous and discretized problems. The scheme is simple (no lattice sums, Ewald methods, nor particle meshes are required), allows adaptivity, and is essentially dimension- and PDE-independent, so would generalize without fuss to 3D and to other non-oscillatory elliptic problems such as elastostatics. We incorporate recently developed spectral quadratures that accurately handle close-to-touching geometries. We include many numerical examples, and provide a software implementation.


Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk

J. Zhou, Chandra L. Theesfeld, K. Yao, K. Chen, A. Wong, O. Troyanskaya

Key challenges for human genetics, precision medicine and evolutionary biology include deciphering the regulatory code of gene expression and understanding the transcriptional effects of genome variation. However, this is extremely difficult because of the enormous scale of the noncoding mutation space. We developed a deep learning–based framework, ExPecto, that can accurately predict, ab initio from a DNA sequence, the tissue-specific transcriptional effects of mutations, including those that are rare or that have not been observed. We prioritized causal variants within disease- or trait-associated loci from all publicly available genome-wide association studies and experimentally validated predictions for four immune-related diseases. By exploiting the scalability of ExPecto, we characterized the regulatory mutation space for human RNA polymerase II–transcribed genes by in silico saturation mutagenesis and profiled > 140 million promoter-proximal mutations. This enables probing of evolutionary constraints on gene expression and ab initio prediction of mutation disease effects, making ExPecto an end-to-end computational framework for the in silico prediction of expression and disease risk.

July 16, 2018
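
The in silico saturation mutagenesis mentioned in the ExPecto abstract above amounts to enumerating every single-nucleotide substitution of a sequence and scoring each mutant against the reference. The sketch below illustrates only that enumeration step. The scoring function `toy_score` (GC fraction) is a hypothetical stand-in and is not ExPecto's deep learning model, and all names here are assumptions made for illustration.

```python
from typing import Callable, Dict, Iterator, Tuple

BASES = "ACGT"


def saturation_mutants(seq: str) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (position, ref_base, alt_base, mutated_sequence) for every
    possible single-nucleotide substitution of seq."""
    seq = seq.upper()
    for i, ref in enumerate(seq):
        if ref not in BASES:
            continue  # skip ambiguous bases such as N
        for alt in BASES:
            if alt != ref:
                yield i, ref, alt, seq[:i] + alt + seq[i + 1:]


def toy_score(seq: str) -> float:
    """Hypothetical stand-in for a trained expression model: GC fraction."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def mutation_effects(
    seq: str, score: Callable[[str], float] = toy_score
) -> Dict[Tuple[int, str, str], float]:
    """Predicted effect of each substitution: score(mutant) - score(reference)."""
    reference = score(seq.upper())
    return {
        (i, ref, alt): score(mutant) - reference
        for i, ref, alt, mutant in saturation_mutants(seq)
    }


if __name__ == "__main__":
    for key, effect in mutation_effects("ACGT").items():
        print(key, effect)
```

A sequence of length L with no ambiguous bases yields 3L mutants, which is why the mutation space grows so quickly at genome scale and why a fast, scalable predictor matters.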