726 Publications

Genome-wide landscape of RNA-binding protein target site dysregulation reveals a major impact on psychiatric disorder risk

C. Park, J. Zhou, A. Wong, K. Chen, C. Theesfeld, R. Darnell, O. Troyanskaya

Despite the strong genetic basis of psychiatric disorders, the underlying molecular mechanisms are largely unmapped. RNA-binding proteins (RBPs) are responsible for most post-transcriptional regulation, from splicing to translation to localization. RBPs thus act as key gatekeepers of cellular homeostasis, especially in the brain. However, quantifying the pathogenic contribution of noncoding variants impacting RBP target sites is challenging. Here, we leverage a deep learning approach that can accurately predict the RBP target site dysregulation effects of mutations and discover that RBP dysregulation is a principal contributor to psychiatric disorder risk. RBP dysregulation explains a substantial amount of heritability not captured by large-scale molecular quantitative trait loci studies and has a stronger impact than common coding region variants. We share the genome-wide profiles of RBP dysregulation, which we use to identify DDHD2 as a candidate schizophrenia risk gene. This resource provides a new analytical framework to connect the full range of RNA regulation to complex disease.

Nature Genetics, 53(2): 166-173
January 18, 2021

Swirling Instability of the Microtubule Cytoskeleton

D. Stein, G. De Canio, E. Lauga, M. Shelley, R. Goldstein

In the cellular phenomenon of cytoplasmic streaming, molecular motors carrying cargo along a network of microtubules entrain the surrounding fluid. The piconewton forces produced by individual motors are sufficient to deform long microtubules, as are the collective fluid flows generated by many moving motors. Studies of streaming during oocyte development in the fruit fly Drosophila melanogaster have shown a transition from a spatially disordered cytoskeleton, supporting flows with only short-ranged correlations, to an ordered state with a cell-spanning vortical flow. To test the hypothesis that this transition is driven by fluid-structure interactions, we study a discrete-filament model and a coarse-grained continuum theory for motors moving on a deformable cytoskeleton, both of which are shown to exhibit a swirling instability to spontaneous large-scale rotational motion, as observed.

From heterogeneous datasets to predictive models of embryonic development

S. Dutta, A. Patel, S. Keenan, S. Shvartsman

Modern studies of embryogenesis are increasingly quantitative, powered by rapid advances in imaging, sequencing, and genome manipulation technologies. Deriving mechanistic insights from the complex datasets generated by these new tools requires systematic approaches for data-driven analysis of the underlying developmental processes. Here we use data from our work on signal-dependent gene repression in the fruit fly, Drosophila melanogaster, to illustrate how computational models can compactly summarize quantitative results of live imaging, chromatin immunoprecipitation, and optogenetic perturbation experiments. The presented computational approach is ideally suited for integrating rapidly accumulating quantitative data and for guiding future studies of embryogenesis.

January 13, 2021

A Micromachined Picocalorimeter Sensor for Liquid Samples with Application to Chemical Reactions and Biochemistry

Jinhye Bae, Juanjuan Zheng, D. Needleman

Calorimetry has long been used to probe the physical state of a system by measuring the heat exchanged with the environment as a result of chemical reactions or phase transitions. Application of calorimetry to microscale biological samples, however, is hampered by insufficient sensitivity and the difficulty of handling liquid samples at this scale. Here, a micromachined calorimeter sensor that is capable of resolving picowatt levels of power is described. The sensor consists of low-noise thermopiles on a thin silicon nitride membrane that allow direct differential temperature measurements between a sample and four coplanar references, which significantly reduces thermal drift. The partial pressure of water in the ambient around the sample is maintained at saturation level using a small hydrogel-lined enclosure. The materials used in the sensor and its geometry are optimized to minimize the noise equivalent power generated by the sensor in response to the temperature field that develops around a typical sample. The experimental response of the sensor is characterized as a function of thermopile dimensions and sample volume, and its capability is demonstrated by measuring the heat dissipated during an enzymatically catalyzed biochemical reaction in a microliter-sized liquid droplet. The sensor offers particular promise for quantitative measurements on biological systems.

January 12, 2021

Identifying intracellular signaling modules and exploring pathways associated with breast cancer recurrence

X. Chen, J. Gu, A. Neuwald, L. Hilakivi-Clarke, R. Clarke, J. Xuan

Exploring complex modularization of intracellular signal transduction pathways is critical to understanding aberrant cellular responses during disease development and drug treatment. IMPALA (Inferred Modularization of PAthway LAndscapes) integrates information from high-throughput gene expression experiments and genome-scale knowledge databases to identify aberrant pathway modules, thereby providing a powerful sampling strategy to reconstruct and explore pathway landscapes. Here IMPALA identifies pathway modules associated with breast cancer recurrence and Tamoxifen resistance. Focusing on estrogen-receptor (ER) signaling, IMPALA identifies alternative pathways from gene expression data of Tamoxifen-treated ER-positive breast cancer patient samples. These pathways were often interconnected through cytoplasmic genes such as IRS1/2, JAK1, YWHAZ, CSNK2A1, MAPK1 and HSP90AA1 and significantly enriched with ErbB, MAPK, and JAK-STAT signaling components. Characterization of the pathway landscape revealed key modules associated with ER signaling and with cell cycle and apoptosis signaling. We validated IMPALA-identified pathway modules using data from four different breast cancer cell lines, including models sensitive and resistant to Tamoxifen. Results showed that a majority of genes in cell cycle/apoptosis modules that were up-regulated in breast cancer patients with short survival (< 5 years) were also over-expressed in drug-resistant cell lines, whereas the transcription factors JUN, FOS, and STAT3 were down-regulated in both patient and drug-resistant cell lines. Hence, IMPALA-identified pathways were associated with Tamoxifen resistance and an increased risk of breast cancer recurrence. The IMPALA package is available at https://dlrl.ece.vt.edu/software/.

Scientific Reports, 11(1): 385
January 11, 2021

A Multimodal and Integrated Approach to Interrogate Human Kidney Biopsies with Rigor and Reproducibility: Guidelines from the Kidney Precision Medicine Project

T. El-Achkar, C. Park, R. Sealfon, O. Troyanskaya, et al.

Comprehensive and spatially mapped molecular atlases of organs at a cellular level are a critical resource to gain insights into pathogenic mechanisms and personalized therapies for diseases. The Kidney Precision Medicine Project (KPMP) is an endeavor to generate 3-dimensional (3D) molecular atlases of healthy and diseased kidney biopsies using multiple state-of-the-art OMICS and imaging technologies across several institutions. Obtaining rigorous and reproducible results from disparate methods and at different sites to interrogate biomolecules at a single cell level or in 3D space is a significant challenge that can be a futile exercise if not well controlled. We describe a "follow the tissue" pipeline for generating a reliable and authentic single cell/region 3D molecular atlas of human adult kidney. Our approach emphasizes quality assurance, quality control, validation and harmonization across different OMICS and imaging technologies from sample procurement, processing, storage, shipping to data generation, analysis and sharing. We established benchmarks for quality control, rigor, reproducibility and feasibility across multiple technologies through a pilot experiment using common source tissue that was processed and analyzed at different institutions and different technologies. A peer review system was established to critically review quality control measures and the reproducibility of data generated by each technology before being approved to interrogate clinical biopsy specimens. The process established economizes the use of valuable biopsy tissue for multi-OMICS and imaging analysis with stringent quality control to ensure rigor and reproducibility of results and serves as a model for precision medicine projects across laboratories, institutions and consortia.

An “individualist” model of an active genome in a developing embryo

S. Huang, S. Dutta, P. Whitney, S. Shvartsman, C. Rushlow

The early Drosophila embryo provides unique experimental advantages for addressing fundamental questions of gene regulation at multiple levels of organization, from individual gene loci to the whole genome. Using Drosophila embryos undergoing the first wave of genome activation, we detected discrete “speckles” of RNA Polymerase II (Pol II), and showed that they overlap with transcribing loci. We characterized the spatial distribution of Pol II speckles and quantified how this distribution changes in the absence of the primary driver of Drosophila genome activation, the pioneer factor Zelda. Although the number and size of Pol II speckles were reduced, indicating that Zelda promotes Pol II speckle formation, we observed a uniform distribution of distances between active genes in the nuclei of both wildtype and zelda mutant embryos. This suggests that the topologically associated domains identified by Hi-C studies do little to spatially constrain groups of transcribed genes at this time. We provide evidence that linear genomic distance between transcribed genes is the primary determinant of measured physical distance between the active loci. Furthermore, we show active genes can have distinct Pol II pools even if the active loci are in close proximity. In contrast to the emerging model whereby active genes are clustered to facilitate co-regulation and sharing of transcriptional resources, our data support an “individualist” model of gene control at early genome activation in Drosophila. This model is in contrast to a “collectivist” model where active genes are spatially clustered and share transcriptional resources, motivating rigorous tests of both models in other experimental systems.

January 9, 2021

Capturing the complexity of topologically associating domains through multi-feature optimization

N. Sauerwald, C. Kingsford

The three-dimensional structure of human chromosomes is tied to gene regulation and replication timing, but there is still a lack of consensus on the computational and biological definitions for chromosomal substructures such as topologically associating domains (TADs). TADs are described and identified by various computational properties leading to different TAD sets with varying compatibility with biological properties such as boundary occupancy of structural proteins. We unify many of these computational and biological targets into one algorithmic framework that jointly maximizes several computational TAD definitions and optimizes TAD selection for a quantifiable biological property. Using this framework, we explore the variability of TAD sets optimized for six different desirable properties of TAD sets: high occupancy of CTCF, RAD21, and H3K36me3 at boundaries, reproducibility between replicates, high intra- vs inter-TAD difference in contact frequencies, and many CTCF binding sites at boundaries. The compatibility of these biological targets varies by cell type, and our results suggest that these properties are better reflected as subpopulations or families of TADs rather than a singular TAD set fitting all TAD definitions and properties. We explore the properties that produce similar TAD sets (reproducibility and inter- vs intra-TAD difference, for example) and those that lead to very different TADs (such as CTCF binding sites and inter- vs intra-TAD contact frequency difference).

January 5, 2021