CCB: Publications

Accurate de novo design of hyperstable constrained peptides

G Bhardwaj, V Mulligan, C Bahl, J Gilmore, P Harvey, O Cheneval, G Buchko, S Pulavarti, Q Kaas, A Eletsky, P Huang, W Johnsen, P Greisen, G Rocklin, Y Song, T Linsky, A Watkins, S Rettie, X Xu, L Carter, R. Bonneau, J Olson, E Coutsias, C Correnti, T Szyperski, D Craik, D Baker

Naturally occurring, pharmacologically active peptides constrained with covalent crosslinks generally have shapes evolved to fit precisely into binding pockets on their targets. Such peptides can have excellent pharmaceutical properties, combining the stability and tissue penetration of small molecule drugs with the specificity of much larger protein therapeutics. The ability to design constrained peptides with precisely specified tertiary structures would enable the design of shape-complementary inhibitors of arbitrary targets. Here we describe the development of computational methods for de novo design of conformationally-restricted peptides, and the use of these methods to design 15–50 residue disulfide-crosslinked and heterochiral N-C backbone-cyclized peptides. These peptides are exceptionally stable to thermal and chemical denaturation, and twelve experimentally-determined X-ray and NMR structures are nearly identical to the computational models. The computational design methods and stable scaffolds presented here provide the basis for development of a new generation of peptide-based drugs.

A Global Genetic Interaction Network Maps a Wiring Diagram of Cellular Function

M Costanzo, B VanderSluis, E Koch, A Baryshnikova, C Pons, G Tan, W Wang, M Usaj, J Hanchard, S Lee, O. Troyanskaya, I Stagljar, T Xia, Y Ohya, A Gingras, B Raught, M Boutros, L Steinmetz, C Moore, A Rosebrock, A Caudy, C Myers, B Andrews, C Boone

We generated a global genetic interaction network for Saccharomyces cerevisiae, constructing more than 23 million double mutants, identifying about 550,000 negative and about 350,000 positive genetic interactions. This comprehensive network maps genetic interactions for essential gene pairs, highlighting essential genes as densely connected hubs. Genetic interaction profiles enabled assembly of a hierarchical model of cell function, including modules corresponding to protein complexes and pathways, biological processes, and cellular compartments. Negative interactions connected functionally related genes, mapped core bioprocesses, and identified pleiotropic genes, whereas positive interactions often mapped general regulatory connections among gene pairs, rather than shared functionality. The global network illustrates how coherent sets of genetic interactions connect protein complex and pathway modules to map a functional wiring diagram of the cell.
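
As a rough illustration of what "negative" and "positive" genetic interactions mean, the sketch below scores a double mutant against the commonly used multiplicative expectation: the interaction score is the observed double-mutant fitness minus the product of the two single-mutant fitnesses. The abstract does not state the scoring model or any cutoff, so the function names, the `threshold` value, and the example fitness values are illustrative assumptions, not the authors' pipeline.

```python
def interaction_score(fitness_a, fitness_b, fitness_ab):
    """Observed double-mutant fitness minus the multiplicative expectation.

    Assumes fitness values are relative to wild type (wild type = 1.0).
    """
    expected = fitness_a * fitness_b
    return fitness_ab - expected


def classify_interaction(score, threshold=0.08):
    """Label a score as negative, positive, or neutral.

    The threshold is a hypothetical cutoff chosen for illustration only.
    """
    if score < -threshold:
        return "negative"
    if score > threshold:
        return "positive"
    return "neutral"


# Hypothetical example: the double mutant is much sicker than expected.
score = interaction_score(0.8, 0.5, 0.1)
label = classify_interaction(score)
```

Under this convention, a double mutant that grows worse than the product of the single-mutant fitnesses yields a negative score, and one that grows better than expected yields a positive score.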

EGRINs (Environmental Gene Regulatory Influence Networks) in rice that function in the response to water deficit, high temperature, and agricultural environments

O Wilkins, C Hafemeister, A Plessis, M Holloway-Phillips, G Pham, A Nicotra, G Gregorio, K Jagadish, E Septiningsih, R. Bonneau, M Purugganan

Environmental Gene Regulatory Influence Networks (EGRINs) coordinate the timing and rate of gene expression in response to environmental signals. EGRINs encompass many layers of regulation, which culminate in changes in accumulated transcript levels. Here, we inferred EGRINs for the response of five tropical Asian rice (Oryza sativa) cultivars to high temperatures, water deficit, and agricultural field conditions by systematically integrating time series transcriptome data, patterns of nucleosome-free chromatin, and the occurrence of known cis-regulatory elements. First, we identified 5,447 putative target genes for 445 transcription factors (TFs) by connecting TFs with genes harboring known cis-regulatory motifs in nucleosome-free regions proximal to their transcriptional start sites. We then used network component analysis to estimate the regulatory activity of each TF (TFA) based on the expression of its putative target genes. Finally, we inferred an EGRIN using the estimated TFA as the regulator. The EGRINs include regulatory interactions between 113 TFs and 4,052 target genes. We resolved distinct regulatory roles for members of the heat shock factor family, including a putative regulatory connection between abiotic stress and the circadian clock. TFA estimation using network component analysis is an effective way of incorporating multiple genome-scale measurements into network inference.
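
The idea of estimating a TF's activity from the expression of its putative targets can be sketched very crudely as follows. This is not network component analysis: it simply averages the expression profiles of each TF's targets per sample, as a stand-in to show the shape of the inputs and outputs. The function name, the data layout, and the gene and TF identifiers are all hypothetical.

```python
def estimate_tf_activity(targets, expression):
    """Crude per-sample activity proxy: mean expression of each TF's targets.

    targets: dict mapping TF id -> list of putative target gene ids.
    expression: dict mapping gene id -> list of expression values, one per sample.
    Returns a dict mapping TF id -> list of per-sample activity estimates.
    TFs with no measured targets are omitted.
    """
    activity = {}
    for tf, genes in targets.items():
        profiles = [expression[g] for g in genes if g in expression]
        if not profiles:
            continue
        n_samples = len(profiles[0])
        activity[tf] = [
            sum(profile[s] for profile in profiles) / len(profiles)
            for s in range(n_samples)
        ]
    return activity


# Hypothetical toy data: two samples, one TF with two measured targets.
toy_targets = {"TF_A": ["gene1", "gene2"], "TF_B": ["gene9"]}
toy_expression = {"gene1": [1.0, 3.0], "gene2": [3.0, 5.0]}
toy_activity = estimate_tf_activity(toy_targets, toy_expression)
```

A real analysis would instead fit a decomposition of the expression matrix constrained by the TF-target connectivity, which is what network component analysis does; the averaging above only conveys the inputs and the per-TF, per-sample output.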

An expanded evaluation of protein function prediction methods shows an improvement in accuracy

Y Jiang, R. Bonneau, et al.

Background
A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging.

Results
We conducted the second critical assessment of functional annotation (CAFA2), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2.

Conclusions
The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.

PPII Helical Peptidomimetics Templated by Cation–π Interactions

T Craven, R. Bonneau, K Kirshenbaum

Poly-proline type II (PPII) helical PXXP motifs are the recognition elements for a variety of protein–protein interactions that are critical for cellular signaling. Despite development of protocols for locking peptides into α-helical and β-strand conformations, there remains a lack of analogous methods for generating mimics of PPII helical structures. We describe herein a strategy to enforce PPII helical secondary structure in the 19-residue TrpPlexus miniature protein. Through sequence variation, we showed that a network of cation–π interactions could drive the formation of PPII helical conformations for both peptide and N-substituted glycine peptoid residues. The achievement of chemically diverse PPII helical scaffolds provides a new route towards discovering peptidomimetic inhibitors of protein–protein interactions mediated by PXXP motifs.

Side-Chain Conformational Preferences Govern Protein–Protein Interactions

A Watkins, R. Bonneau, P Arora

Protein secondary structures serve as geometrically constrained scaffolds for the display of key interacting residues at protein interfaces. Given the critical role of secondary structures in protein folding and the dependence of folding propensities on backbone dihedrals, secondary structure is expected to influence the identity of residues that are important for complex formation. Counter to this expectation, we find that a narrow set of residues dominates the binding energy in protein–protein complexes independent of backbone conformation. This finding suggests that the binding epitope may instead be substantially influenced by the side-chain conformations adopted. We analyzed side-chain conformational preferences in residues that contribute significantly to binding. This analysis suggests that preferred rotamers contribute directly to specificity in protein complex formation and provides guidelines for peptidomimetic inhibitor design.

Racemization barriers of atropisomeric 3,3′-bipyrroles: an experimental study with theoretical verification

S. Chatterjee, G.L. Butterfoss, M. Mandal, B. Paul, S. Gupta, R. Bonneau, P. Jaisankar

The significant rotational energy barrier about the stereogenic carbon–carbon bond of axially chiral 3,3′-bipyrroles has been investigated by electronic circular dichroism (ECD) spectroscopy, time dependent HPLC analysis, and computational modeling. The results elucidate pathways and transition states involved in configurational inversion, thereby confirming that 3,3′-bipyrrole derivatives can exist in stable and isolable atropisomeric forms.

GIANT API: An Application Programming Interface for Functional Genomics

A Roberts, A. Wong, I. Fisk, O. Troyanskaya

GIANT API provides biomedical researchers programmatic access to tissue-specific and global networks in humans and model organisms, and to associated tools, including functional re-prioritization of existing genome-wide association study (GWAS) data. Using tissue-specific interaction networks, researchers are able to predict relationships between genes specific to a tissue or cell lineage, identify the changing roles of genes across tissues, and uncover disease-gene associations. Additionally, GIANT API enables computational tools like NetWAS, which leverages tissue-specific networks for re-prioritization of GWAS results. The web services covered by the API include 144 tissue-specific functional gene networks in human, global functional networks for human and six common model organisms, and the NetWAS method. GIANT API conforms to the REST architecture, which makes it stateless, cacheable, and highly scalable. It can be used by a diverse range of clients including web browsers, command terminals, programming languages, and standalone apps for data analysis and visualization. The API is freely available for use at http://giant-api.princeton.edu.

Metabolic Network Rewiring of Propionate Flux Compensates Vitamin B12 Deficiency in C. elegans

E Watson, V Olin-Sandoval, M Hoy, C Li, T Louisse, V Yao, A Mori, A Holdorf, O. Troyanskaya, M Ralser, A Walhout

Metabolic network rewiring is the rerouting of metabolism through the use of alternate enzymes to adjust pathway flux and accomplish specific anabolic or catabolic objectives. Here, we report the first characterization of two parallel pathways for the breakdown of the short chain fatty acid propionate in Caenorhabditis elegans. Using genetic interaction mapping, gene co-expression analysis, pathway intermediate quantification and carbon tracing, we uncover a vitamin B12-independent propionate breakdown shunt that is transcriptionally activated on vitamin B12 deficient diets, or under genetic conditions mimicking the human diseases propionic- and methylmalonic acidemia, in which the canonical B12-dependent propionate breakdown pathway is blocked. Our study presents the first example of transcriptional vitamin-directed metabolic network rewiring to promote survival under vitamin deficiency. The ability to reroute propionate breakdown according to B12 availability may provide C. elegans with metabolic plasticity and thus a selective advantage on different diets in the wild.

A damage-independent role for 53BP1 that impacts break order and Igh architecture during class switch recombination

P Rocha, R Raviram, Y Fu, J Kim, V Luo, A Aljoufi, E Swanzey, A Pasquarella, E. Miraldi, R. Bonneau

During class switch recombination (CSR), B cells replace the Igh Cμ or δ exons with another downstream constant region exon (CH), altering the antibody isotype. CSR occurs through the introduction of AID-mediated double-strand breaks (DSBs) in switch regions and subsequent ligation of broken ends. Here, we developed an assay to investigate the dynamics of DSB formation in individual cells. We demonstrate that the upstream switch region Sμ is first targeted during recombination and that the mechanism underlying this control relies on 53BP1. Surprisingly, regulation of break order occurs through residual binding of 53BP1 to chromatin before the introduction of damage and independent of its established role in DNA repair. Using chromosome conformation capture, we show that 53BP1 mediates changes in chromatin architecture that affect break order. Finally, our results explain how changes in Igh architecture in the absence of 53BP1 could promote inversional rearrangements that compromise CSR.
