Proteome and interactome Bioinformatics.

Slides:



Advertisements
Similar presentations
Molecular Biomedical Informatics Machine Learning and Bioinformatics Machine Learning & Bioinformatics 1.
Advertisements

PPI network construction and false positive detection Jin Chen CSE Fall 1.
Biological networks: Types and sources Protein-protein interactions, Protein complexes, and network properties.
Research Methodology of Biotechnology: Protein-Protein Interactions Yao-Te Huang Aug 16, 2011.
Analysis of ChIP-Seq Data
Bioinformatics and Evolutionary Genomics High throughput “functional” data / functional genomics / Omics.
A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae Article by Peter Uetz, et.al. Presented by Kerstin Obando.
Genome-wide prediction and characterization of interactions between transcription factors in S. cerevisiae Speaker: Chunhui Cai.
Gene expression analysis summary Where are we now?
Bio 402/502 Section II, Lecture 7 Systems Biology of the Nucleus Dr. Michael C. Yu.
Regulatory networks 10/29/07. Definition of a module Module here has broader meanings than before. A functional module is a discrete entity whose function.
Chip arrays and gene expression data. With the chip array technology, one can measure the expression of 10,000 (~all) genes at once. Can answer questions.
Biological networks: Types and origin Protein-protein interactions, complexes, and network properties Thomas Skøt Jensen Center for Biological Sequence.
Protein domains vs. structure domains - an example.
Introduction to biological networks. protein-gene interactions protein-protein interactions PROTEOME GENOME Citrate Cycle METABOLISM Bio-chemical reactions.
Chip arrays and gene expression data. Motivation.
Biological networks: Types and origin
Protein-Protein Interaction Screens. Bacterial Two-Hybrid System selectable marker RNA polymerase DNA binding protein bait target sequence target.
Affinity chromatography/mass spec Bait protein GST Page 252.
Expression Profiling Using DNA MicroArrays - Each cell type within an organism expresses a unique combination of genes – this is, in part, what makes cells.
Interaction Networks in Biology: Interface between Physics and Biology, Shekhar C. Mande, August 24, 2009 Interaction Networks in Biology: Interface between.
GTL User Facilities Facility II: Whole Proteome Analysis Michelle V. Buchanan.
Purification of Proteins Associated with Specific Genomic Loci Jérôme Déjardin and Robert E. Kingston.
A highly abbreviated introduction to proteomics
12. Lecture WS 2004/05Bioinformatics III1 Direct comparison of different data sets Bayesian Network approach V12: Reliability of Protein Interaction Networks.
Protein protein interactions
Mapping protein-DNA interactions by ChIP-seq Zsolt Szilagyi Institute of Biomedicine.
9/14/20151 EECS 730 Introduction to Bioinformatics Introduction to Proteomics Luke Huan Electrical Engineering and Computer Science
Research Methodology of Biotechnology: Protein-Protein Interactions
Protein analysis and proteomics (Part 2 of 2). Many of the images in this powerpoint presentation are from Bioinformatics and Functional Genomics by Jonathan.
(D) Crosslinking Interacting proteins can be identified by crosslinking. A labeled crosslinker is added to protein X in vitro and the cell lysate is added.
Interactions and more interactions
Gene expression and DNA microarrays Old methods. New methods based on genome sequence. –DNA Microarrays Reading assignment - handout –Chapter ,
Kristen Horstmann, Tessa Morris, and Lucia Ramirez Loyola Marymount University March 24, 2015 BIOL398-04: Biomathematical Modeling Lee, T. I., Rinaldi,
* only 17% of SNPs implicated in freshwater adaptation map to coding sequences Many, many mapping studies find prevalent noncoding QTLs.
Finish up array applications Move on to proteomics Protein microarrays.
Introduction to Proteomics 1. What is Proteomics? Proteomics - A newly emerging field of life science research that uses High Throughput (HT) technologies.
ChIP-on-Chip and Differential Location Analysis Junguk Hur School of Informatics October 4, 2005.
Vidyadhar Karmarkar Genomics and Bioinformatics 414 Life Sciences Building, Huck Institute of Life Sciences.
Discovering Macromolecular Interactions. An experimental strategy for identifying new molecular actors in a process candidate approach general screen.
Protein-protein interactions “The Interactome” Yeast two-hybrid analysis Yeast two-hybrid analysis Protein chips Protein chips Biochemical purification/Mass.
1 Having genome data allows collection of other ‘omic’ datasets Systems biology takes a different perspective on the entire dataset, often from a Network.
Protein Microarrays and detection via MALDI-TOF: Project Overview Kyle Nordquist Advisor: Dr. Andrew Link.
Lecture 9. Functional Genomics at the Protein Level: Proteomics.
From High Throughput Pull-Downs To Protein Complexes: Building a Model of the Physical Interactome of Yeast Shoshana Wodak Hospital for Sick Children
TAP(Tandem Affinity Purification) Billy Baader Genetics 677.
Genomics II: The Proteome Using high-throughput methods to identify proteins and to understand their function.
Lecture 7. Functional Genomics: Gene Expression Profiling using
Biol 729 – Proteome Bioinformatics Dr M. J. Fisher - Protein: Protein Interactions.
Proteome and Gene Expression Analysis Chapter 15 & 16.
Announcements: Note that there will be presentations and associated paper summaries for both Thursday and Tuesday classes. The Exam II mean is 81.6 and.
1 Protein-Protein Interactions High-throughput strategy –Prediction from sequence In silico analysis –Protein A from species A: domain 1 and 2 –Protein.
Integrated Genomic and Proteomic Analyses of a Systematically Perturbed Metabolic Network Science, Vol 292, Issue 5518, , 4 May 2001.
How many interactions are there? ~6,200 genes ~6,200 proteins x 2-10 interactions/protein ~12, ,000 interactions Yeast.
Gene expression and DNA microarrays No lab on Thursday. No class on Tuesday or Thursday next week –NCBI training Monday and Tuesday –Feb. 5 during class.
Analysis of ChIP-Seq Data Biological Sequence Analysis BNFO 691/602 Spring 2014 Mark Reimers.
Protein interactions: main methods for detection (all organisms) Two-hybrid8,446 (Co-)Immunoprecipitation567 Interaction adhesion assay225 In vitro binding138.
Transcriptome What is it - genome wide transcript abundance How do you obtain it - Arrays + MPSS What do you do with it when you have it - ?
Robustness, clustering & evolutionary conservation Stefan Wuchty Center of Network Research Department of Physics University of Notre Dame title.
1 Having genome data allows collection of other ‘omic’ datasets Systems biology takes a different perspective on the entire dataset, often from a Network.
PROTEIN INTERACTION NETWORK – INFERENCE TOOL DIVYA RAO CANDIDATE FOR MASTER OF SCIENCE IN BIOINFORMATICS ADVISOR: Dr. FILIPPO MENCZER CAPSTONE PROJECT.
Affinity chromatography based strategies “Fishing for partners” Immunoprecipitation approaches The TAP TAG system Two hybrid assay.
Organellar Proteomics: Turning Inventories into Insights
Functional organization of the yeast proteome by systematic analysis of protein complexes Presented by Nathalie Kirshman and Xinyi Ma.
Biological networks CS 5263 Bioinformatics.
GEB 406 Course Instructor: Sheikh Ahmad Shah Semester: Summer 2016
Today… Review a few items from last class
Protein Complex Discovery
Modeling cells with protein networks
Protein Complex Discovery
Presentation transcript:

Proteome and interactome Bioinformatics

Contents Protein-protein interactions  Two-hybrid assays  Mass spectrometry Cellular localization of proteins  GFP tags Protein-DNA interactions  ChIP-on-chip

Two-hybrid assays Functional genomics

Two-hybrid method DNA-binding ORF A DNA-binding Activation ORF B Activation Transcription factor A RNA pol B Hybrid constructions RNA pol A B RNA pol A B Interaction  reporter gene is expressed No interaction  reporter gene is not expressed Bait Prey Bait Prey Bait Prey

Two-hybrid Ito et al. (2001) PNAS 98: Uetz et al. (2000). Nature 403:

Comparison of the results When the second “comprehensive” analysis was published, the overlap between thee results obtained in the two independent studies was surprisingly low. How to interpret this ?  Problem of coverage ? Each study would only represent a fraction of what remains to be discovered.  Problem of noise ? Either or both studies might contain a large number of false positives.  Differences in experimental conditions ? Ito et al. (2001) PNAS 98:

Connectivity in protein interaction networks Jeong et al (2001) calculate connectivity in the protein interaction network revealed by the two-hybrid analysis of Uetz and co- workers. The connectivity follows a power law:  most proteins have a few connections;  a few proteins are highly connected Highly connected proteins correspond to essential proteins. Jeong, H., S.P. Mason, A.L. Barabasi, and Z.N. Oltvai Lethality and centrality in protein networks. Nature 411:

Mass-spectrometry Functional genomics

1. Construction of a bank of TAG-fused ORFs 2. Expression of the tagged baits in yeast tag Y ORF Y B E A Y C D 4. Affinity purification 3. Cell lysis anti-tag epitope + All cellular proteins,… Other proteins,… Isolation of protein complexes tagged bait Slide from Nicolas Simonis

B E A Y C D B E A Y C D 1 dimension SDS-PAGE B C isolation Mass spectrometry B C E E = YLR258w = YER133w = YER054c A Y D = YPR184w = YKL085w = YPR160w Mass spectrometry - Protein identification Slide from Nicolas Simonis

Protein complexes High-throughput mass-spectrometric protein complex identification (HMS-PCI) MDS proteomics 493 complexes Tandem Affinity Purification (TAP) CELLZOME: 232 complexes Ho et al. (1999). Nature 415: Gavin et al. (1999). Nature 415:

Network of complexes Gavin et al. (1999). Nature 415:

Assessment of interactome data Functional genomics

Assessment of interactome data von Mering et al (2002). Nature.

Comparison of large-scale interaction data von Mering et al (2002) compared the results from  Two-hybrid assays  Mass spectrometry (TAP and HMS-PCI)  Co-expression in microarray experiments  Synthetic lethality  Comparative genomics (conservation of operons, phylogenetic profiles, and gene fusion) Among 80,000 interactions, no more than 2,400 are supported by two different methods. Each method is more specifically related to some  functional classes  cellular location Reference: von Mering et al. (2002). Nature 750

Comparison of pairs of interacting proteins with functional classes von Mering et al (2002). Nature 750.

von Mering et al (2002). Nature. Validation with annotated complexes von Mering et al (2002) collected information on experimentally proven physical protein-protein interactions, and measured the coverage and positive predictive value of each predictive method  Coverage fraction of reference set covered by the data.  Positive predictive value Fraction of data confirmed by reference set. (Note: they call this “accuracy”, but this term is usually not used in this way) Beware: the scale is logarithmic !  This enforces the differences in the lower part of the percentages (0-10), but “compresses” the values between 10 and 50, which gives a false impression of good accuracy.

Cellular localization of proteins Bioinformatics

4156 proteins detected by fluorescence microscopy analysis Nature (2003) 425: Slide adapted from Bruno André

Global analysis of protein localization This analysis allowed to obtain information for thousands of proteins for which the cellular localization was previously unknown. Slide adapted from Bruno André

Localisation and ORF function For historical reasons, the yeast genome is “over-annotated”.  The method used for predicting genes from genome sequences included many false positives, especially among short predicted ORFs. Most of the questionable ORFs were unobserved in the global localization analysis. These mainly correspond to short ORFs. Source: Bruno André

Protein-DNA interactions ChIP-on-chip technology Functional genomics

The ChIP-on-chip method Chromatin Immuno-precipitation (ChIP)  Tagging of a transcription factor of interest with a protein fragment recognized by some antibody.  Immobilization of protein-DNA interactions with a fixative agent.  DNA fragmentation by ultrasonication.  Precipitation of the DNA-protein complexes.  Un-binding of the DNA-protein bounds. Measurement of DNA enrichment.  Two extracts are co-hybridized on a microarray (chip),where each spot contains one DNA fragment where a factor is likely to bind (e.g. an intergenic region, or a smaller fragment).. For the yeast S.cerevisiae, chips have been designed with all the intergenic regions (6000 regions, avg. 500bp/region) Recent technology allows to spot 3e+5 300bp DNA fragments on a single slide.  The first extract (labelled in red) is enriched in DNA fragments bound to the tagged transcription factor.  The second extract (labelled in green) has not been enriched.  The log-ratio between red and green channels indicate the enrichment for each intergenic region.

Lee et al (2002) In 2002, Lee et al publish a systematic characterization of the binding regions of 106 yeast transcription factors. Lee et al Science 298: