2005 Publications

A Multimodal and Integrated Approach to Interrogate Human Kidney Biopsies with Rigor and Reproducibility: Guidelines from the Kidney Precision Medicine Project

T. El-Achkar, C. Park, R. Sealfon, O. Troyanskaya, et al.

Comprehensive and spatially mapped molecular atlases of organs at a cellular level are a critical resource to gain insights into pathogenic mechanisms and personalized therapies for diseases. The Kidney Precision Medicine Project (KPMP) is an endeavor to generate 3-dimensional (3D) molecular atlases of healthy and diseased kidney biopsies using multiple state-of-the-art OMICS and imaging technologies across several institutions. Obtaining rigorous and reproducible results from disparate methods and at different sites to interrogate biomolecules at a single cell level or in 3D space is a significant challenge that can be a futile exercise if not well controlled. We describe a "follow the tissue" pipeline for generating a reliable and authentic single cell/region 3D molecular atlas of human adult kidney. Our approach emphasizes quality assurance, quality control, validation and harmonization across different OMICS and imaging technologies, from sample procurement, processing, storage, and shipping to data generation, analysis and sharing. We established benchmarks for quality control, rigor, reproducibility and feasibility across multiple technologies through a pilot experiment using common source tissue that was processed and analyzed at different institutions and with different technologies. A peer review system was established to critically review quality control measures and the reproducibility of data generated by each technology before being approved to interrogate clinical biopsy specimens. The process established economizes the use of valuable biopsy tissue for multi-OMICS and imaging analysis with stringent quality control to ensure rigor and reproducibility of results and serves as a model for precision medicine projects across laboratories, institutions and consortia.

Efficient high-order accurate Fresnel diffraction via areal quadrature and the nonuniform FFT

We present a fast algorithm for computing the diffracted field from arbitrary binary (sharp-edged) planar apertures and occulters in the scalar Fresnel approximation, for up to moderately high Fresnel numbers ($\lesssim 10^3$). It uses a high-order areal quadrature over the aperture, then exploits a single 2D nonuniform fast Fourier transform (NUFFT) to evaluate rapidly at target points (of order $10^7$ such points per second, independent of aperture complexity). It thus combines the high accuracy of edge integral methods with the high speed of Fourier methods. Its cost is ${\mathcal O}(n^2 \log n)$, where $n$ is the linear resolution required in source and target planes, to be compared with ${\mathcal O}(n^3)$ for edge integral methods. In tests with several aperture shapes, this translates to between 2 and 5 orders of magnitude acceleration. In starshade modeling for exoplanet astronomy, we find that it is roughly $10^4 \times$ faster than the state of the art in accurately computing the set of telescope pupil wavefronts. We provide a documented, tested MATLAB/Octave implementation.
An appendix shows the mathematical equivalence of the boundary diffraction wave, angular integration, and line integral formulae, then analyzes a new non-singular reformulation that eliminates their common difficulties near the geometric shadow edge. This supplies a robust edge integral reference against which to validate the main proposal.
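As a rough illustration of the areal quadrature idea described in this abstract, the sketch below approximates the scalar Fresnel integral $u(x,y) = \frac{1}{i\lambda z}\iint_\Omega e^{i\pi((x-x')^2+(y-y')^2)/(\lambda z)}\,dx'\,dy'$ for a disc aperture by a weighted sum over quadrature nodes. It is not the authors' MATLAB/Octave implementation: the function names `disc_quadrature` and `fresnel_direct` are hypothetical, the quadrature rule (Gauss-Legendre in radius, periodic trapezoid in angle) is an assumed choice for this simple shape, and the sum is evaluated directly at cost proportional to (nodes × targets) rather than with a NUFFT. The on-axis value is compared with the closed form $1 - e^{i\pi F}$ for Fresnel number $F = a^2/(\lambda z)$.

```python
import numpy as np

def disc_quadrature(a, n_r=40, n_theta=80):
    """Areal quadrature nodes and weights for a disc of radius a.

    Assumed rule: Gauss-Legendre in r on [0, a], periodic trapezoid in theta.
    """
    t, wt = np.polynomial.legendre.leggauss(n_r)
    r = 0.5 * a * (t + 1.0)           # map nodes from [-1, 1] to [0, a]
    wr = 0.5 * a * wt
    theta = 2.0 * np.pi * np.arange(n_theta) / n_theta
    R, T = np.meshgrid(r, theta, indexing="ij")
    # Area element r dr dtheta gives the weight of each node
    W = np.broadcast_to((wr * r)[:, None] * (2.0 * np.pi / n_theta), R.shape)
    return (R * np.cos(T)).ravel(), (R * np.sin(T)).ravel(), W.ravel()

def fresnel_direct(xq, yq, wq, xt, yt, lambda_z):
    """Direct (non-fast) evaluation of the Fresnel integral at target points.

    lambda_z is the product of wavelength and propagation distance.
    A fast method would replace this dense sum with a nonuniform FFT.
    """
    xt = np.asarray(xt, dtype=float)
    yt = np.asarray(yt, dtype=float)
    dx = xt[:, None] - xq[None, :]
    dy = yt[:, None] - yq[None, :]
    kernel = np.exp(1j * np.pi * (dx**2 + dy**2) / lambda_z)
    return (kernel @ wq) / (1j * lambda_z)

# Disc of radius 1 with lambda*z = 0.2, i.e. Fresnel number F = 5
a, lambda_z = 1.0, 0.2
xq, yq, wq = disc_quadrature(a)
u_axis = fresnel_direct(xq, yq, wq, [0.0], [0.0], lambda_z)[0]
exact = 1.0 - np.exp(1j * np.pi * a**2 / lambda_z)
print("on-axis error:", abs(u_axis - exact))
```

Expanding the squared distance in the exponent separates a target-only phase, a source-only phase, and a cross term $e^{-2\pi i (x x' + y y')/(\lambda z)}$, which is the form a nonuniform Fourier transform can evaluate quickly.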

Identifying intracellular signaling modules and exploring pathways associated with breast cancer recurrence

X. Chen, J. Gu, A. Neuwald, L. Hilakivi-Clarke, R. Clarke, J. Xuan

Exploring complex modularization of intracellular signal transduction pathways is critical to understanding aberrant cellular responses during disease development and drug treatment. IMPALA (Inferred Modularization of PAthway LAndscapes) integrates information from high-throughput gene expression experiments and genome-scale knowledge databases to identify aberrant pathway modules, thereby providing a powerful sampling strategy to reconstruct and explore pathway landscapes. Here IMPALA identifies pathway modules associated with breast cancer recurrence and Tamoxifen resistance. Focusing on estrogen-receptor (ER) signaling, IMPALA identifies alternative pathways from gene expression data of Tamoxifen-treated ER-positive breast cancer patient samples. These pathways were often interconnected through cytoplasmic genes such as IRS1/2, JAK1, YWHAZ, CSNK2A1, MAPK1 and HSP90AA1 and significantly enriched with ErbB, MAPK, and JAK-STAT signaling components. Characterization of the pathway landscape revealed key modules associated with ER signaling and with cell cycle and apoptosis signaling. We validated IMPALA-identified pathway modules using data from four different breast cancer cell lines, including Tamoxifen-sensitive and Tamoxifen-resistant models. Results showed that a majority of genes in cell cycle/apoptosis modules that were up-regulated in breast cancer patients with short survival (< 5 years) were also over-expressed in drug-resistant cell lines, whereas the transcription factors JUN, FOS, and STAT3 were down-regulated in both patient and drug-resistant cell lines. Hence, IMPALA-identified pathways were associated with Tamoxifen resistance and an increased risk of breast cancer recurrence. The IMPALA package is available at https://dlrl.ece.vt.edu/software/.

Scientific Reports, 11(1): 385
January 11, 2021

An “individualist” model of an active genome in a developing embryo

S. Huang, S. Dutta, P. Whitney, S. Shvartsman, C. Rushlow

The early Drosophila embryo provides unique experimental advantages for addressing fundamental questions of gene regulation at multiple levels of organization, from individual gene loci to the whole genome. Using Drosophila embryos undergoing the first wave of genome activation, we detected discrete “speckles” of RNA Polymerase II (Pol II), and showed that they overlap with transcribing loci. We characterized the spatial distribution of Pol II speckles and quantified how this distribution changes in the absence of the primary driver of Drosophila genome activation, the pioneer factor Zelda. Although the number and size of Pol II speckles were reduced, indicating that Zelda promotes Pol II speckle formation, we observed a uniform distribution of distances between active genes in the nuclei of both wildtype and zelda mutant embryos. This suggests that the topologically associated domains identified by Hi-C studies do little to spatially constrain groups of transcribed genes at this time. We provide evidence that linear genomic distance between transcribed genes is the primary determinant of measured physical distance between the active loci. Furthermore, we show active genes can have distinct Pol II pools even if the active loci are in close proximity. In contrast to the emerging model whereby active genes are clustered to facilitate co-regulation and sharing of transcriptional resources, our data support an “individualist” model of gene control at early genome activation in Drosophila. This model is in contrast to a “collectivist” model where active genes are spatially clustered and share transcriptional resources, motivating rigorous tests of both models in other experimental systems.

January 9, 2021

Quantum generative model for sampling many-body spectral functions

Quantum phase estimation is at the heart of most quantum algorithms with exponential speedup. In this letter we demonstrate how to utilize it to compute the dynamical response functions of many-body quantum systems. Specifically, we design a circuit that acts as an efficient quantum generative model, providing samples out of the spectral function of high rank observables in polynomial time. This includes many experimentally relevant spectra such as the dynamic structure factor, the optical conductivity or the NMR spectrum. Experimental realization of the algorithm, apart from logarithmic overhead, requires doubling the number of qubits as compared to a simple analog simulator.
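To make concrete what "sampling from a spectral function" means, the classical sketch below uses the standard Lehmann representation $S(\omega) = \sum_n |\langle n|O|0\rangle|^2\,\delta(\omega - (E_n - E_0))$, which, once normalized, is a probability distribution over excitation energies. This is only a small exact-diagonalization illustration and not the quantum circuit of the letter: the function name `spectral_samples`, the random 8-dimensional Hamiltonian, and the random observable are all assumptions chosen for brevity, and the cost of this approach grows exponentially with system size, which is the bottleneck a quantum algorithm aims to avoid.

```python
import numpy as np

def spectral_samples(H, O, n_samples, rng):
    """Draw excitation energies distributed according to the spectral function.

    H: Hermitian Hamiltonian matrix; O: observable matrix.
    Returns the samples, the excitation energies, and their probabilities.
    """
    E, V = np.linalg.eigh(H)              # eigenvalues in ascending order
    ground = V[:, 0]                      # ground state |0>
    amps = V.conj().T @ (O @ ground)      # matrix elements <n|O|0>
    weights = np.abs(amps) ** 2
    probs = weights / weights.sum()       # normalize to a distribution
    omegas = E - E[0]                     # excitation energies E_n - E_0
    idx = rng.choice(len(E), size=n_samples, p=probs)
    return omegas[idx], omegas, probs

# Hypothetical toy system: random Hermitian H and O of dimension 8
rng = np.random.default_rng(0)
dim = 8
A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
B = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = (A + A.conj().T) / 2
O = (B + B.conj().T) / 2

samples, omegas, probs = spectral_samples(H, O, 1000, rng)
# A histogram of the samples approximates the spectral function
hist, edges = np.histogram(samples, bins=20)
```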

Wave functions, electronic localization, and bonding properties for correlated materials beyond the Kohn-Sham formalism

A. D. N. James, E. I. Harris-Lee, A. Hampel, M. Aichhorn, S. B. Dugdale

Many-body theories such as dynamical mean field theory (DMFT) have enabled the description of the electron exchange-correlation interactions that are missing in current density functional theory (DFT) calculations. However, there has been relatively little focus on the wavefunctions from these theories. We present the methodology of the newly developed Elk-TRIQS interface and how to calculate the DFT with DMFT (DFT+DMFT) wavefunctions, which can be used to calculate DFT+DMFT wavefunction dependent quantities. We illustrate this by calculating the electron localized function (ELF) in monolayer SrVO

Capturing the complexity of topologically associating domains through multi-feature optimization

N. Sauerwald, C. Kingsford

The three-dimensional structure of human chromosomes is tied to gene regulation and replication timing, but there is still a lack of consensus on the computational and biological definitions for chromosomal substructures such as topologically associating domains (TADs). TADs are described and identified by various computational properties leading to different TAD sets with varying compatibility with biological properties such as boundary occupancy of structural proteins. We unify many of these computational and biological targets into one algorithmic framework that jointly maximizes several computational TAD definitions and optimizes TAD selection for a quantifiable biological property. Using this framework, we explore the variability of TAD sets optimized for six different desirable properties of TAD sets: high occupancy of CTCF, RAD21, and H3K36me3 at boundaries, reproducibility between replicates, high intra- vs inter-TAD difference in contact frequencies, and many CTCF binding sites at boundaries. The compatibility of these biological targets varies by cell type, and our results suggest that these properties are better reflected as subpopulations or families of TADs rather than a singular TAD set fitting all TAD definitions and properties. We explore the properties that produce similar TAD sets (reproducibility and inter- vs intra-TAD difference, for example) and those that lead to very different TADs (such as CTCF binding sites and inter- vs intra-TAD contact frequency difference).

January 5, 2021

A design framework for actively crosslinked filament networks

S. Fürthauer, D. Needleman, M. Shelley

Living matter moves, deforms, and organizes itself. In cells this is made possible by networks of polymer filaments and crosslinking molecules that connect filaments to each other and that act as motors to do mechanical work on the network. For the case of highly crosslinked filament networks, we discuss how the material properties of assemblies emerge from the forces exerted by microscopic agents. First, we introduce a phenomenological model that characterizes the forces that crosslink populations exert between filaments. Second, we derive a theory that predicts the material properties of highly crosslinked filament networks, given the crosslinks present. Third, we discuss which properties of crosslinks set the material properties and behavior of highly crosslinked cytoskeletal networks. The work presented here will enable a better understanding of cytoskeletal mechanics and its molecular underpinnings. This theory is also a first step toward a theory of how molecular perturbations impact cytoskeletal organization, and provides a framework for designing cytoskeletal networks with desirable properties in the lab.
