2697 Publications

Understanding multicellular function and disease with human tissue-specific networks

C.S. Greene, et al.

Tissue and cell-type identity lie at the core of human physiology and disease. Understanding the genetic underpinnings of complex tissues and individual cell lineages is crucial for developing improved diagnostics and therapeutics. We present genome-wide functional interaction networks for 144 human tissues and cell types developed using a data-driven Bayesian methodology that integrates thousands of diverse experiments spanning tissue and disease states. Tissue-specific networks predict lineage-specific responses to perturbation, identify the changing functional roles of genes across tissues and illuminate relationships among diseases. We introduce NetWAS, which combines genes with nominally significant genome-wide association study (GWAS) P values and tissue-specific networks to identify disease-gene associations more accurately than GWAS alone. Our webserver, GIANT, provides an interface to human tissue networks through multi-gene queries, network visualization, analysis tools including NetWAS and downloadable networks. GIANT enables systematic exploration of the landscape of interacting genes that shape specialized cellular functions across more than a hundred human tissues and cell types.

April 27, 2015

Interpreting 4C-Seq data: how far can we go?

R. Raviram, P.P. Rocha, R. Bonneau, J.A. Skok

The linear sequence of the genome has been extremely valuable in mapping regulatory elements relative to the genes they control. However, it has become increasingly evident that characterizing the three-dimensional organization of the genome is critical to get a better understanding of long-range regulation. Early studies using fluorescent in-situ hybridization (FISH) revealed that individual chromosomes occupy distinct spaces in the nucleus with minimal intermingling between territories[1]. Recent advances using chromosome conformation capture (3C) techniques have confirmed these findings and further improved the depth at which we can determine the organization of chromosomes and the physical interactions that occur within and between them[2, 3]. Variations of the 3C technique include (i) Hi-C, to capture all pairwise interactions, (ii) 5C, to capture interactions within and between loci of interest and (iii) 4C-Seq, to capture all interactions with a single locus of interest. The choice of technique depends on the biological question being asked and the scale at which this needs to be examined. While Hi-C has been instrumental in characterizing higher-order organization of chromosomes in the nucleus, it lacks the resolution that is required for analysis of specific interactions, such as between enhancers and promoters. This can be achieved with 4C-Seq, which allows interrogation of interactions from a single viewpoint or bait, to the rest of the genome. Several studies have used 4C-Seq to better understand phenomena such as X chromosome inactivation[4], enhancer-promoter interactions[5, 6], organization of antigen receptor loci[7], choice of translocation partners[8, 9] and collinear transcriptional regulation[10]. Here we aim to focus on the current state of the 4C-Seq method and the limitations and challenges of the associated computational analysis.

Tissue-Aware Data Integration Approach for the Inference of Pathway Interactions in Metazoan Organisms

C. Park, A. Krishnan, Q. Zhu, A. Wong, Y. Lee, O. Troyanskaya

MOTIVATION:
Leveraging the large compendium of genomic data to predict biomedical pathways and specific mechanisms of protein interactions genome-wide in metazoan organisms has been challenging. In contrast to unicellular organisms, biological and technical variation originating from diverse tissues and cell-lineages is often the largest source of variation in metazoan data compendia. Therefore, a new computational strategy accounting for the tissue heterogeneity in the functional genomic data is needed to accurately translate the vast amount of human genomic data into specific interaction-level hypotheses.

RESULTS:
We developed an integrated, scalable strategy for inferring multiple human gene interaction types that takes advantage of data from diverse tissue and cell-lineage origins. Our approach predicts both the presence of a functional association and the most likely interaction type among human genes or their protein products on a whole-genome scale. We demonstrate that directly incorporating tissue contextual information improves the accuracy of our predictions, and further, that such genome-wide results can be used to significantly refine regulatory interactions from primary experimental datasets (e.g. ChIP-Seq, mass spectrometry).

AVAILABILITY AND IMPLEMENTATION:
An interactive website hosting all of our interaction predictions is publicly available at http://pathwaynet.princeton.edu. Software was implemented using the open-source Sleipnir library, which is available for download at https://bitbucket.org/libsleipnir/libsleipnir.bitbucket.org.

A Hebbian/Anti-Hebbian Network Derived from Online Non-Negative Matrix Factorization Can Cluster and Discover Sparse Features

C. Pehlevan, D. Chklovskii

Despite our extensive knowledge of biophysical properties of neurons, there is no commonly accepted algorithmic theory of neuronal function. Here we explore the hypothesis that single-layer neuronal networks perform online symmetric nonnegative matrix factorization (SNMF) of the similarity matrix of the streamed data. By starting with the SNMF cost function we derive an online algorithm, which can be implemented by a biologically plausible network with local learning rules. We demonstrate that such a network performs soft clustering of the data as well as sparse feature discovery. The derived algorithm replicates many known aspects of sensory anatomy and biophysical properties of neurons, including the unipolar nature of neuronal activity and synaptic weights, local synaptic plasticity rules and the dependence of learning rate on cumulative neuronal activity. Thus, we make a step towards an algorithmic theory of neuronal function, which should facilitate large-scale neural circuit simulations and biologically inspired artificial intelligence.

March 2, 2015

A Hebbian/Anti-Hebbian Neural Network for Linear Subspace Learning: A Derivation from Multidimensional Scaling of Streaming Data

C. Pehlevan, T. Hu, D. Chklovskii

Neural network models of early sensory processing typically reduce the dimensionality of streaming input data. Such networks learn the principal subspace, in the sense of principal component analysis (PCA), by adjusting synaptic weights according to activity-dependent learning rules. When derived from a principled cost function these rules are nonlocal and hence biologically implausible. At the same time, biologically plausible local rules have been postulated rather than derived from a principled cost function. Here, to bridge this gap, we derive a biologically plausible network for subspace learning on streaming data by minimizing a principled cost function. In a departure from previous work, where cost was quantified by the representation, or reconstruction, error, we adopt a multidimensional scaling (MDS) cost function for streaming data. The resulting algorithm relies only on biologically plausible Hebbian and anti-Hebbian local learning rules. In a stochastic setting, synaptic weights converge to a stationary state which projects the input data onto the principal subspace. If the data are generated by a nonstationary distribution, the network can track the principal subspace. Thus, our result makes a step towards an algorithmic theory of neural computation.
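The link between the MDS cost and the principal subspace mentioned in this abstract can be checked offline. The sketch below is not the online Hebbian/anti-Hebbian network from the paper; it is a batch illustration that assumes the cost takes the classical strain form ||X^T X - Y^T Y||_F^2 and that samples are stored as columns of X. The function names `mds_embedding` and `strain` are hypothetical.

```python
import numpy as np

def mds_embedding(X, k):
    """Batch minimizer of ||X^T X - Y^T Y||_F^2 over k x T matrices Y.

    X is n x T with one sample per column (an assumed layout).
    """
    gram = X.T @ X                        # T x T input similarity matrix
    evals, evecs = np.linalg.eigh(gram)   # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:k]     # indices of the k largest eigenvalues
    lam = np.clip(evals[top], 0.0, None)  # guard against tiny negative round-off
    return np.sqrt(lam)[:, None] * evecs[:, top].T  # k x T output

def strain(X, Y):
    """Squared Frobenius mismatch between input and output similarities."""
    return np.linalg.norm(X.T @ X - Y.T @ Y, "fro") ** 2

# Synthetic data with two dominant directions (illustrative values only).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 200)) * np.array([5, 4, 1, 0.5, 0.1])[:, None]
Y = mds_embedding(X, 2)
```

Under these assumptions, the output similarities Y^T Y coincide with those of the data projected onto its top two principal directions, which is the sense in which minimizing this cost recovers the principal subspace.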

March 2, 2015

Lymphocyte Invasion in IC10/Basal-Like Breast Tumors Is Associated with Wild-Type TP53

D. Quigley, L. Silwal-Pandit, R. Dannenfelser, A. Langerød, H. Vollan, C. Vaske, J. Siegel, O. Troyanskaya, S. Chin, C. Caldas, A. Balmain, A. Børresen-Dale, V. Kristensen

Lymphocytic infiltration is associated with better prognosis in several epithelial malignancies including breast cancer. The tumor suppressor TP53 is mutated in approximately 30% of breast adenocarcinomas, with varying frequency across molecular subtypes. In this study of 1,420 breast tumors, we tested for interaction between TP53 mutation status and tumor subtype determined by PAM50 and integrative cluster analysis. In integrative cluster 10 (IC10)/basal-like breast cancer, we identify an association between lymphocytic infiltration, determined by an expression score, and retention of wild-type TP53. The expression-derived score agreed with the degree of lymphocytic infiltration assessed by pathologic review, and application of the Nanodissect algorithm suggested that this infiltration consists primarily of cytotoxic T lymphocytes (CTL). Elevated expression of this CTL signature was associated with longer survival in IC10/basal-like tumors. These findings identify a new link between the TP53 pathway and the adaptive immune response in estrogen receptor (ER)-negative breast tumors, suggesting a connection between TP53 inactivation and failure of tumor immunosurveillance.

Inter-species pathway perturbation prediction via data-driven detection of functional homology

C. Hafemeister, R. Romero, E. Bilal, P. Meyer, R. Norel, K. Rhrissorrakrai, R. Bonneau, A.L. Tarca

Experiments in animal models are often conducted to infer how humans will respond to stimuli by assuming that the same biological pathways will be affected in both organisms. The limitations of this assumption were tested in the IMPROVER Species Translation Challenge, where 52 stimuli were applied to both human and rat cells and perturbed pathways were identified. In the Inter-species Pathway Perturbation Prediction sub-challenge, multiple teams proposed methods to use rat transcription data from 26 stimuli to predict human gene set and pathway activity under the same perturbations. Submissions were evaluated using three performance metrics on data from the remaining 26 stimuli.

Inverse Obstacle Scattering in Two Dimensions with Multiple Frequency Data and Multiple Angles of Incidence

Carlos Borges, L. Greengard

We consider the problem of reconstructing the shape of an impenetrable sound-soft obstacle from scattering measurements. The input data is assumed to be the far-field pattern generated when a plane wave impinges on an unknown obstacle from one or more directions and at one or more frequencies. It is well known that this inverse scattering problem is both ill posed and nonlinear. It is common practice to overcome the ill posedness through the use of a penalty method or Tikhonov regularization. Here, we present a more physical regularization, based simply on restricting the unknown boundary to be band-limited in a suitable sense. To overcome the nonlinearity of the problem, we use a variant of Newton's method. When multiple frequency data is available, we supplement Newton's method with the recursive linearization approach due to Chen. During the course of solving the inverse problem, we need to compute the solution to a large number of forward scattering problems. For this, we use high-order accurate integral equation discretizations, coupled with fast direct solvers when the problem is sufficiently large.
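One simple way to read "band-limited" here is as a truncated Fourier series for the boundary. The sketch below assumes a star-shaped obstacle described by a radius function r(theta) sampled on a uniform grid, and it keeps only the lowest Fourier modes. This is an illustrative assumption, not the parametrization used in the paper, and the name `band_limit` is hypothetical.

```python
import numpy as np

def band_limit(r, n_modes):
    """Keep Fourier modes 0..n_modes of a periodic radius function r(theta)."""
    coeffs = np.fft.rfft(r)       # one-sided spectrum of the real samples
    coeffs[n_modes + 1:] = 0.0    # discard everything above the cutoff
    return np.fft.irfft(coeffs, n=len(r))

# A smooth star-shaped boundary plus a high-frequency perturbation.
theta = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
smooth = 1.0 + 0.3 * np.cos(3 * theta)
noisy = smooth + 0.05 * np.cos(40 * theta)

# Restricting to 10 modes removes the mode-40 ripple and keeps the shape.
filtered = band_limit(noisy, n_modes=10)
```

Restricting the unknown to a few modes limits how much fine-scale oscillation a reconstruction can contain, which is the role a penalty term plays in Tikhonov regularization.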

Targeted exploration and analysis of large cross-platform human transcriptomic compendia

Q. Zhu, A. Wong, A. Krishnan, M. Aure, A. Tadych, R. Zhang, D. Corney, C. Greene, L. Bongo, V. Kristensen, M. Charikar, K. Li, O. Troyanskaya

We present SEEK (search-based exploration of expression compendia; http://seek.princeton.edu/), a query-based search engine for very large transcriptomic data collections, including thousands of human data sets from many different microarray and high-throughput sequencing platforms. SEEK uses a query-level cross-validation–based algorithm to automatically prioritize data sets relevant to the query and a robust search approach to identify genes, pathways and processes co-regulated with the query. SEEK provides multigene query searching with iterative metadata-based search refinement and extensive visualization-based analysis options.

January 12, 2015