Protein Targeting by Functional Linkage of Non-Homologous Proteins with examples from M. tuberculosis Genome-wide functional linkage map Structural Genomics.

Protein Targeting by Functional Linkage of Non-Homologous Proteins, with examples from M. tuberculosis. Genome-wide functional linkage map. Structural Genomics of Complexes: Identifying subunits of complexes by analyzing co-evolution of non-homologous proteins, from genome-wide functional linkage maps

Limitations of Relying Entirely on Homology-Based Targeting: Many (most?) proteins function in complexes made up of non-homologous proteins. Some (many?) proteins are crystallizable only with their functional partners. This suggests that targeting non-homologous, functionally linked proteins may offer a useful shortcut to learning protein structures and functions.

Identifying Subunits of Protein Complexes by Analyzing the Co-evolution of Non-homologous Proteins Structural Genomics of Protein Complexes

4 Methods to Infer Non-Homologous Protein Pairs that have Co-evolved and hence are Functionally Linked: Rosetta Stone (protein fusion); Phylogenetic Profile (protein co-occurrence); Gene Neighbor (constant separation); Operon (small separation).
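As an illustration of the phylogenetic profile idea (protein co-occurrence across genomes), here is a minimal sketch. The proteins, genomes, and the `homologs` table are hypothetical placeholders, not data from the talk.

```python
# Hypothetical presence/absence data: protein -> genomes that contain a homolog.
homologs = {
    "A": {"g1", "g2", "g3", "g5"},
    "B": {"g1", "g2", "g3", "g6"},
    "C": {"g4", "g6"},
}
genomes = ["g1", "g2", "g3", "g4", "g5", "g6"]


def profile(protein):
    """Return the phylogenetic profile as a tuple of 0/1 flags, one per genome."""
    return tuple(int(g in homologs[protein]) for g in genomes)


def shared_genomes(p, q):
    """Count the genomes in which both proteins have a homolog."""
    return sum(a & b for a, b in zip(profile(p), profile(q)))


print(profile("A"), profile("B"), shared_genomes("A", "B"))
```

Proteins A and B share most of their profile, which is the kind of pattern this method treats as evidence of a functional link; A and C never co-occur.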

Figure 7. M. Strong, T. Graeber et al.

Functional Linkages Between Genes of M. tuberculosis: Classical graphical representation of protein functional linkages. Requiring 2 or more functional linkages: 1,865 genes make 9,766 linkages. Research of Michael Strong and Morgan Beeby.
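The "2 or more functional linkages" filter can be sketched as follows. The gene names, method labels, and the `predictions` list are hypothetical; the slide gives only the resulting counts.

```python
from collections import defaultdict

# Hypothetical raw predictions: (gene_a, gene_b, method that linked them).
predictions = [
    ("geneA", "geneB", "operon"),
    ("geneB", "geneA", "gene_neighbor"),
    ("geneA", "geneC", "rosetta_stone"),
    ("geneC", "geneD", "phylogenetic_profile"),
    ("geneC", "geneD", "phylogenetic_profile"),  # repeated evidence, same method
]


def linked_by_at_least(predictions, min_methods=2):
    """Keep unordered gene pairs supported by at least min_methods distinct methods."""
    methods = defaultdict(set)
    for a, b, method in predictions:
        methods[frozenset((a, b))].add(method)
    return {pair: m for pair, m in methods.items() if len(m) >= min_methods}


kept = linked_by_at_least(predictions)
print(kept)
```

Pairs are stored as `frozenset` so that (A, B) and (B, A) count as the same linkage, and repeated evidence from one method is not counted twice.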

Hierarchical Clustering of the Combined Genome-Wide Linkage Map for M. tb Reveals Complexes and Pathways. The genome-wide functional linkage map based on 4 methods is converted into a clustered linkage map showing complexes and pathways. Cluster similar linkage patterns; each cluster is a complex or pathway.
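A minimal sketch of clustering genes by the similarity of their linkage patterns. The slides do not state the distance measure or linkage rule, so Jaccard distance, single linkage, and the `cutoff` value are assumptions made for illustration, and the `patterns` table is hypothetical.

```python
# Hypothetical linkage patterns: gene -> set of genes it is linked to (itself included).
patterns = {
    "a": {"a", "b", "c"},
    "b": {"a", "b", "c"},
    "c": {"a", "b", "c", "d"},
    "x": {"x", "y"},
    "y": {"x", "y"},
}


def jaccard_distance(s, t):
    """1 - |intersection| / |union|; 0 means identical linkage patterns."""
    union = s | t
    return 1.0 - len(s & t) / len(union) if union else 0.0


def cluster(patterns, cutoff=0.5):
    """Single-linkage agglomerative clustering; stops when the closest pair exceeds cutoff."""
    clusters = [{g} for g in sorted(patterns)]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(jaccard_distance(patterns[p], patterns[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cutoff:
            break
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


groups = cluster(patterns)
print(groups)
```

With these toy patterns the genes fall into two groups, mirroring the idea that each cluster corresponds to a complex or pathway.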

Fig 4. M. Strong, T. Graeber et al. Cluster labels: Detoxification; Polyketide and Non-ribosomal Peptide Synthesis; Energy Metabolism, Oxidoreductase; Degradation of Fatty Acids; Virulence; Energy Metabolism, Oxidoreductase; Amino Acid Biosynthesis; Energy Metabolism, Aerobic Respiration; Lipid Biosynthesis; Degradation of Fatty Acids; Amino Acid Biosynthesis (Branched); Synthesis and Modification of Macromolecules (rpl, rpm, rps); Biosynthesis of Cofactors, Prosthetic Groups; Purine, Pyrimidine Nucleotide Biosynthesis; Novel Group; Sugar Metabolism; Aromatic Amino Acid Biosynthesis; Energy Metabolism, Anaerobic Respiration; Two-Component Systems; Cell Envelope; Cytochrome P450; Chaperones; Biosynthesis of Cofactors; Cell Envelope, Cell Division; Transport/Binding Proteins; Energy Metabolism, TCA; Broad Regulatory, Serine Threonine Protein Kinase; Cell Envelope, Murein Sacculus and Peptidoglycan; Transport/Binding Proteins, Cations; Energy Metabolism, ATP Proton Motive Force.

Quantitative Assessment of Inferred Protein Complexes

Calculating Probabilities of Co-evolution. Methods: Phylogenetic Profile, Rosetta Stone, Gene Neighbor, Operon. Symbols: N = number of fully sequenced genomes; n = number of homologs of protein A; m = number of homologs of protein B; k = number of genomes shared in common; X = fractional separation of genes; n = intergenic separation (in the Operon method, where the symbol n is reused and no longer means the homolog count).
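The slide's formulas did not survive in this transcript. The symbols N, n, m, and k are what one would need for a hypergeometric chance model of co-occurrence, so the sketch below assumes that model; it is not confirmed as the authors' exact expression, and the function name is hypothetical.

```python
from math import comb


def prob_cooccurrence(N, n, m, k):
    """Chance probability that proteins A and B co-occur in at least k genomes.

    N: fully sequenced genomes; n: genomes with a homolog of A;
    m: genomes with a homolog of B; k: genomes observed to contain both.
    Assumes a hypergeometric null model (an assumption, not taken from the slide).
    """
    total = comb(N, m)
    upper = min(n, m)
    hits = sum(comb(n, i) * comb(N - n, m - i) for i in range(k, upper + 1))
    return hits / total


print(prob_cooccurrence(6, 4, 4, 3))
```

A small probability means the observed co-occurrence is unlikely to arise by chance, which is read as evidence of co-evolution.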

Combining Inferences of Co-evolution from 4 Methods: We use a Bayesian approach to combine the probabilities from the four methods to arrive at a single probability that two proteins co-evolve [equation not preserved in this transcript], where positive pairs are proteins with common pathway annotation and negative pairs are proteins with different annotation.
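One common way to combine evidence in a Bayesian fashion is to multiply likelihood ratios estimated from the positive and negative pairs, treating the methods as conditionally independent. The sketch below assumes that naive Bayes form; the slide's actual equation is not available, and the prior and ratios shown are made-up numbers.

```python
def combine(prior, likelihood_ratios):
    """Posterior probability of co-evolution from a prior and per-method likelihood ratios.

    Each ratio is P(evidence | positive pair) / P(evidence | negative pair).
    Assumes the methods are conditionally independent (naive Bayes assumption).
    """
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


# Hypothetical example: weak prior, two methods that each favor a linkage.
posterior = combine(prior=0.01, likelihood_ratios=[20.0, 15.0])
print(round(posterior, 3))
```

Under this scheme a method that carries no information contributes a ratio of 1 and leaves the posterior unchanged.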

Benchmarking this Approach Against Known Complexes (EcoCyc: Karp et al., NAR 30, 56 (2002)). True positive interactions are between subunits of known complexes and false positive ones are between subunits of different complexes. For high-confidence links, we find 1/3 of the true interactions with only 1/1000 of the false positive ones.
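The benchmark definition above can be sketched as a small calculation. The complex assignments and scores are hypothetical toy values, not EcoCyc data.

```python
# Hypothetical subunit -> complex assignments and scored protein pairs.
complex_of = {"a": 1, "b": 1, "c": 1, "x": 2, "y": 2}
scored_pairs = {
    ("a", "b"): 0.90, ("a", "c"): 0.40, ("b", "c"): 0.80, ("x", "y"): 0.95,
    ("a", "x"): 0.85, ("b", "y"): 0.10, ("c", "x"): 0.20, ("c", "y"): 0.30,
}


def benchmark(scored_pairs, complex_of, cutoff):
    """Return (fraction of true pairs recovered, fraction of false pairs accepted)."""
    tp = fp = pos = neg = 0
    for (a, b), score in scored_pairs.items():
        same_complex = complex_of[a] == complex_of[b]
        if same_complex:
            pos += 1
        else:
            neg += 1
        if score >= cutoff:
            if same_complex:
                tp += 1
            else:
                fp += 1
    return (tp / pos if pos else 0.0, fp / neg if neg else 0.0)


print(benchmark(scored_pairs, complex_of, cutoff=0.8))
```

Raising the cutoff trades coverage of true interactions for fewer false positives, which is the trade-off the slide summarizes for high-confidence links.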

Benchmarking our Approach Against Known Complexes True positive interactions are between subunits of known complexes and false positive ones are between subunits of different complexes. For the first few hundred pairs of high confidence links, about 50% are between subunits of known complexes

Example Complex: NADH Dehydrogenase I. 11 of 13 subunits detected; 3 false positives.

Assessing Inferred Linkages for M. tb Genome (Michael Strong, 2003). Conclusions on accuracy: a 100 bp operon threshold is adequate; a functional linkage by 2 or more methods is reasonably accurate.
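A sketch of how a 100 bp threshold might be applied when grouping genes into putative operons. Only the threshold comes from the slide; the same-strand requirement, the gene coordinates, and the function name are assumptions for illustration.

```python
# Hypothetical genes as (name, start, end, strand), sorted by start coordinate.
genes = [
    ("g1", 100, 500, "+"),
    ("g2", 550, 900, "+"),
    ("g3", 1200, 1500, "+"),
    ("g4", 1520, 1800, "-"),
    ("g5", 1850, 2100, "-"),
]


def predict_operons(genes, max_gap=100):
    """Group adjacent same-strand genes whose intergenic gap is at most max_gap bp."""
    operons = []
    for gene in genes:
        name, start, end, strand = gene
        if operons:
            prev = operons[-1][-1]
            same_strand = prev[3] == strand
            gap = start - prev[2]  # negative when genes overlap
            if same_strand and gap <= max_gap:
                operons[-1].append(gene)
                continue
        operons.append([gene])
    return [[g[0] for g in operon] for operon in operons]


print(predict_operons(genes))
```

Genes g3 and g4 are only 20 bp apart but lie on opposite strands, so under the assumed rule they are not grouped.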

Functional Linkages Among Cytochrome Oxidase Genes (CtaB, CtaC, CtaD, CtaE). Functional linkages relate all 3 components of the cytochrome oxidase complex and also CtaB, the cytochrome oxidase assembly factor. These genes are at four different chromosomal locations. Membrane proteins are linked to soluble proteins.

From Inferred Protein Complexes to their Structures

The Problem of PE and PPE Proteins in M. tb. PE, PE-PGRS, and PPE proteins in M. tuberculosis: 38 PE proteins; 61 PE-PGRS proteins; 68 PPE proteins. Together they comprise about 5% of the genome. No function is known, but some appear to be membrane bound. No structure is known: they are always insoluble when expressed. Goal: use functional linkages to predict a complex between a PE and a PPE protein, express the complex, and determine its structure. Research of Shuishu Wang and Michael Strong.

Construction of a co-expression vector to test for protein-protein interactions (Mike Strong). [Figure: pET 29b(+) vector with T7 promoter, lac operator, and a ribosome binding site (RBS) ahead of each of gene A and gene B; restriction sites NdeI, HindIII, KpnI, NcoI; thrombin site and His tag. Transcription gives a polycistronic mRNA; translation gives protein A and protein B (with His tag). Two outcomes are illustrated: the proteins interact (protein-protein interaction), or the proteins do not interact.]

When co-expressed, the PE and PPE proteins, inferred to interact, do form a soluble complex, Mr = 35,200. Sedimentation equilibrium experiments: Rv2430c + Rv2431c, fraction 49, in 20 mM HEPES, 150 mM NaCl, pH 7.8. Concentration OD , 0.45, 0.15. Expected Mr: Rv2431c (PE) 10,687 (from mass spec); Rv2430c + His tag (PPE) 24,072 (from mass spec). The two masses sum to 34,759, close to the measured Mr, which possibly suggests a 1:1 complex between these two proteins.
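The stoichiometry argument is simple arithmetic, sketched here with the masses quoted on the slide. The search over small copy numbers and the function name are illustrative choices, not part of the original analysis.

```python
pe_mass = 10_687    # Rv2431c (PE), from mass spectrometry
ppe_mass = 24_072   # Rv2430c + His tag (PPE), from mass spectrometry
observed = 35_200   # complex Mr from sedimentation equilibrium


def best_stoichiometry(observed, mass_a, mass_b, max_copies=3):
    """Return the (copies_a, copies_b) whose summed mass is closest to the observed Mr."""
    candidates = [(a, b)
                  for a in range(1, max_copies + 1)
                  for b in range(1, max_copies + 1)]
    return min(candidates,
               key=lambda ab: abs(ab[0] * mass_a + ab[1] * mass_b - observed))


expected_1_to_1 = pe_mass + ppe_mass
print(expected_1_to_1, best_stoichiometry(observed, pe_mass, ppe_mass))
```

The 1:1 sum differs from the measured value by 441, a little over 1%, while the next closest combination is off by roughly 10,000.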

Crystallization Trials of the Complex Between PE Protein Rv2431c and PPE Protein Rv2430c

Summary: Many functional linkages are revealed from genomic data (high coverage). Known subunits of E. coli complexes can be identified with high accuracy from functional linkages. Clustered genome-wide functional maps can reveal and organize information on complexes (and pathways). A protein complex suitable for structural studies has been revealed from functional linkages. The procedures for identifying and producing protein complexes can be adapted for high throughput.

Protein Interactions in M. tb. Analysis of M. tb Genome: Michael Strong, Debnath Pal, Sulmin Kim. Whole Genome Interaction Maps: Michael Strong, Tom Graeber, Huiying Li, Matteo Pellegrini. Methods of Inferring Interactions: Edward Marcotte, Matteo Pellegrini, Todd Yeates, Michael Thompson. PI of TB Structural Genomics Consortium: Tom Terwilliger.