Published by Wilfrid Blake. Modified over 9 years ago.
1
Modeling Functional Genomics Datasets. CVM8890-101, Lesson 6, 11 July 2007. Bindu Nanduri
2
Lesson 6: Functional genomics modeling II: a pathway analysis example.
3
Introduction to protein interaction networks
5
Cancer Proliferation Differentiation Quiescence Programmed Cell Death Cell Differentiation
6
Proliferation Differentiation Quiescence Programmed Cell Death Anergy Activation CD4+ T "helper" Lymphocyte Lymphoma
7
AgBase protein annotation process: protein identifiers or FASTA format; GORetriever; annotated proteins; GOanna; proteins with no annotations; GOSlimViewer
9
[Figure: Potential CD4+ T lymphocyte biological processes. Categories shown: Proliferation, Angiogenesis, Apoptosis, Migration, Quiescence, Differentiation, Anergy, Activation, Senescence, Cell Cycle. The per-category percentages cannot be matched to their categories in this transcript.]
11
AP-1 dependent gene expression Metastasis Tumor invasion AP-1 Integrin Signaling Pathway
12
Hypothesis-driven data analysis. Exploration of data to identify pathways of interacting proteins. Protein-protein interaction (PPI) networks.
13
Why study PPIs? Proteins do not function alone! PPIs are inherent to the function of multiprotein complexes. PPIs can help infer function where functional information is available for one partner. Changes in normal PPIs can result in disease.
14
Types of PPI
15
PPI categories based on composition, affinity and timescale of interaction. Homo- and hetero-oligomeric complexes: interactions between identical or non-identical chains. Obligate PPI: protomers do not exist as stable structures in vivo; these complexes are functionally obligate. Example: the Arc repressor dimer, necessary for DNA binding. Non-obligate PPI: protomers can exist as stable structures and may co-localize for function or are co-localized. Example: the non-obligate homodimer sperm lysin.
16
PPI based on the lifetime of the complex: transient or permanent. Permanent interactions are stable and exist only as a complex. Transient interactions are marked by association/dissociation cycles in vivo. Weak transient interactions (sperm lysin) associate and dissociate continuously. Strong transient interactions require a molecular trigger: the heterotrimeric G protein dissociates into G-alpha and G-beta-gamma subunits when it binds GTP, whereas the GDP-bound form is a trimer.
17
Control of protein oligomerization. PPIs form a continuum between obligate and non-obligate states. Complex formation is driven by concentration and by the free energy of the complex relative to alternate states.
18
Take-home message on PPI types. PPIs form a continuum between obligate and non-obligate states. Complex formation is driven by concentration and by the free energy of the complex relative to alternate states.
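The dependence on concentration and free energy can be made concrete with a minimal sketch. It assumes a simple two-state homodimer equilibrium (2 M in equilibrium with D), a standard binding free energy given in kcal/mol relative to a 1 M standard state, and ideal behaviour; the function names and the numbers used are illustrative and do not come from the lecture.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_bind, temperature=298.15):
    """K_d (molar) from a standard binding free energy in kcal/mol.

    A negative delta_g_bind means the complex is favoured.
    """
    return math.exp(delta_g_bind / (R_KCAL * temperature))


def fraction_in_dimer(total_conc, delta_g_bind, temperature=298.15):
    """Fraction of protomers found in the dimer at a total protomer
    concentration total_conc (molar), for 2 M <-> D with K_d = [M]^2 / [D].
    """
    kd = dissociation_constant(delta_g_bind, temperature)
    # Mass balance: total = [M] + 2[D] = [M] + 2[M]^2 / K_d, solved for [M].
    monomer = kd * (math.sqrt(1.0 + 8.0 * total_conc / kd) - 1.0) / 4.0
    return 1.0 - monomer / total_conc


if __name__ == "__main__":
    for conc in (1e-9, 1e-6, 1e-3):
        print(conc, round(fraction_in_dimer(conc, -8.0), 3))
```

Under these assumptions the same pair of protomers looks transient at low concentration and nearly permanent at high concentration, which is one way to read the continuum described on the slide.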
19
How to identify PPI
Experimental: yeast two-hybrid (Y2H), TAP assays, gene coexpression, protein arrays
Computational: phylogenetic profile, gene cluster, sequence coevolution, Rosetta Stone method, text mining
20
PLoS Computational Biology March 2007, Volume 3 e42. Y2H assay. Eukaryotic transcription factors have a DNA-binding domain (BD) and an activation domain (AD). Physical association of these domains activates transcription. Create chimeric proteins fused to either the BD or the AD and transfect yeast. Gal4/LexA-based reporters. In vivo method that can detect transient PPIs.
21
TAP assay. The TAP tag consists of two IgG-binding domains of Staphylococcus protein A and a calmodulin-binding peptide, separated by a tobacco etch virus protease cleavage site. TAP provides direct information on protein complexes. O. Puig et al., Methods, 2001
22
PLoS Computational Biology March 2007, Volume 3 e42. Gene coexpression. Expression profile similarity is measured as the correlation coefficient between the relative expression levels of two genes/proteins, or as the normalized difference between their absolute expression levels. The distribution for target proteins is compared with the distributions for random noninteracting protein pairs. Expression levels of physically interacting proteins coevolve; coevolution of gene expression is a better predictor of protein interactions than coevolution of amino acid sequences. Good for studying permanent complexes: ribosome, proteasome.
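The two similarity measures named above can be sketched as follows. The Pearson coefficient is standard; the exact form of the "normalized difference" is not given on the slide, so the formula |a - b| / (a + b) used here is an assumption, as are the function names.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def normalized_difference(level_a, level_b):
    """Assumed form of the normalized difference between two positive
    absolute expression levels: 0 means equal, values near 1 mean very unequal.
    """
    return abs(level_a - level_b) / (level_a + level_b)
```

In practice the scores for candidate pairs would be compared against the score distribution of random noninteracting pairs, as the slide describes, rather than against a fixed cutoff.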
23
PLoS Computational Biology March 2007, Volume 3 e42. Protein microarrays/chips. Protein chips are disposable arrays of microwells in silicone elastomer sheets placed on top of microscope slides. Target proteins are overexpressed, immobilized, and probed with fluorescently labeled proteins. Can detect PPIs between actual proteins. H. Zhu et al. (2000) "Analysis of yeast protein kinases using protein chips", Nature Genetics 26: 283-289.
24
PLoS Computational Biology March 2007, Volume 3 e42
Database | URL/FTP | Type
DIP | http://dip.doe-mbi.ucla.edu | E,S
BIND | http://bind.ca | E,C,S
MPact/MIPS | http://mips.gsf.de/services/ppi | E,C,F
STRING | http://string.embl.de | E,P,F
MINT | http://mint.bio.uniroma2.it/mint | E,C
IntAct | http://www.ebi.ac.uk/intact | E,C
BioGRID | http://www.thebiogrid.org | E,C
HPRD | http://www.hprd.org | E,C
ProtCom | http://www.ces.clemson.edu/compbio/ProtCom | S,H
3did, Interprets | http://gatealoy.pcb.ub.es/3did/ | S,H
Pibase, Modbase | http://alto.compbio.ucsf.edu/pibase | S,H
CBM | ftp://ftp.ncbi.nlm.nih.gov/pub/cbm | S
SCOPPI | http://www.scoppi.org/ | S
iPfam | http://www.sanger.ac.uk/Software/Pfam/iPfam | S
InterDom | http://interdom.lit.org.sg | P
DIMA | http://mips.gsf.de/genre/proj/dima/index.html | F,S
Prolinks | http://prolinks.doe-mbi.ucla.edu/cgibin/functionator/pronav/ | F
Predictome | http://predictome.bu.edu/ | F
25
PLoS Computational Biology March 2007, Volume 3 e42. Key to the database table (same Database / URL/FTP / Type listing as the previous slide). Type of data: high-throughput experimental data (E), structural data (S), manual curation (C), functional predictions (F), and interface homology modeling (H). Unit of interaction: P is protein.
26
PPI database comparisons Proteins: Structure, Function and Bioinformatics 63:490-500 2006
27
Experimental PPI dataset overlap is small. High false-positive (FP) rate in high-throughput experiments; interactions are difficult to confirm by multiple sources.
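One simple way to quantify how little two interaction datasets agree is the Jaccard index over unordered protein pairs. The sketch below assumes each dataset is a list of two-member tuples of protein identifiers already mapped to a common namespace; the function names are illustrative.

```python
def as_unordered_pairs(interactions):
    """Treat (A, B) and (B, A) as the same interaction."""
    return {frozenset(pair) for pair in interactions}


def jaccard_overlap(dataset_a, dataset_b):
    """Shared interactions divided by all distinct interactions."""
    a = as_unordered_pairs(dataset_a)
    b = as_unordered_pairs(dataset_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def supported_by_both(dataset_a, dataset_b):
    """Interactions reported by both datasets (candidates for higher confidence)."""
    return as_unordered_pairs(dataset_a) & as_unordered_pairs(dataset_b)
```

Identifier mapping between databases is usually the harder part and is left out of this sketch.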
28
How to identify PPI
Experimental: yeast two-hybrid (Y2H), TAP assays, gene coexpression, protein arrays
Computational: phylogenetic profile, gene cluster/neighborhood, sequence coevolution, Rosetta Stone method, text mining
29
PLoS Computational Biology March 2007, Volume 3 e43. Phylogenetic profile (PP). Hypothesis: functionally linked and potentially interacting nonhomologous proteins co-evolve and have orthologs in the same subset of fully sequenced organisms.
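A phylogenetic profile can be represented as a presence/absence vector over a fixed list of genomes, and two proteins become candidate partners when their vectors match. The sketch below is a minimal illustration; the input layout (a dict from protein name to the set of genomes containing an ortholog), the Hamming-distance cutoff, and all names are assumptions.

```python
def build_profile(present_in, genomes):
    """1 where an ortholog exists in the genome, 0 otherwise."""
    return tuple(int(g in present_in) for g in genomes)


def hamming_distance(p, q):
    """Number of genomes in which the two profiles disagree."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same genomes")
    return sum(a != b for a, b in zip(p, q))


def candidate_links(orthologs, genomes, max_distance=0):
    """Protein pairs whose profiles differ in at most max_distance genomes."""
    profiles = {name: build_profile(found, genomes)
                for name, found in orthologs.items()}
    names = sorted(profiles)
    links = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if hamming_distance(profiles[a], profiles[b]) <= max_distance:
                links.append((a, b))
    return links
```

Profiles that are present in every genome or in none carry no signal, so a real analysis would normally filter those out before pairing.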
30
PLoS Computational Biology March 2007, Volume 3 e43. Gene cluster, gene neighborhood. Genes in a gene cluster/operon are co-regulated and participate in the same biological function.
31
PLoS Computational Biology March 2007, Volume 3 e43. Sequence co-evolution. Interacting proteins very often co-evolve: changes in one protein (loss of function or interaction) are compensated by correlated changes in another protein. The orthologs of co-evolving proteins tend to interact, which makes it possible to infer unknown interactions in other genomes. Co-evolution can be reflected in the similarity between the phylogenetic trees of two non-homologous interacting protein families.
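Tree similarity between two protein families is commonly approximated by correlating their pairwise distance matrices, with rows and columns ordered by the same species. The following is a minimal sketch of that idea, not necessarily the procedure used in the cited article; the matrices are assumed to be symmetric lists of lists and the function names are hypothetical.

```python
import math


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("constant distances have no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def tree_similarity(dist_a, dist_b):
    """Correlation between two distance matrices over the same species order."""
    if len(dist_a) != len(dist_b) or len(dist_a) < 3:
        raise ValueError("need two matrices over the same 3 or more species")
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))
```

A high score alone is weak evidence, since all families in the same set of species share the underlying species tree to some degree.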
32
PLoS Computational Biology March 2007, Volume 3 e43. Rosetta Stone method. Interacting proteins/domains have homologs in other genomes that are fused into one protein chain, a Rosetta Stone protein. Gene fusion occurs to optimize co-expression of genes encoding interacting proteins.
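The fusion idea can be sketched as a search over sequence-similarity hits: two query proteins are linked when both align to the same protein in another genome, in non-overlapping regions. The hit tuple layout (query, subject, start, end in subject coordinates) and the function name are assumptions for illustration; a real pipeline would take these from an alignment tool's output and apply score thresholds.

```python
from itertools import combinations


def rosetta_stone_links(hits):
    """Map each linked query pair to the candidate fused proteins supporting it.

    hits: iterable of (query, subject, start, end) alignment records.
    """
    by_subject = {}
    for query, subject, start, end in hits:
        by_subject.setdefault(subject, []).append((query, start, end))

    links = {}
    for subject, records in by_subject.items():
        for (q1, s1, e1), (q2, s2, e2) in combinations(records, 2):
            if q1 == q2:
                continue
            # Keep only pairs that cover different parts of the subject.
            if e1 < s2 or e2 < s1:
                links.setdefault(frozenset((q1, q2)), set()).add(subject)
    return links
```

Requiring non-overlapping regions avoids linking two queries that are simply homologous to the same domain.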
33
Text mining. Utilizes the wealth of publicly available data: search Medline or PubMed for words or word combinations. Co-occurrence of words is a simple metric, but it is prone to high false-positive rates. Natural language processing (NLP) methods are more specific, looking for statements such as "A binds to B", "A interacts with B", or "A associates with B"; these are difficult to detect, so NLP has a higher false-negative rate. Normally requires a list of known gene or protein names for a given organism.
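The trade-off between co-occurrence and pattern-based extraction can be illustrated with a small sketch. It assumes sentences are already split, that protein names are single tokens from a known list, and it uses a toy regular expression for the three phrasings quoted above; real NLP systems are considerably more elaborate, and the placeholder names below are not real proteins.

```python
import re
from itertools import combinations

# Toy pattern for "A binds to B", "A interacts with B", "A associates with B".
PATTERN = re.compile(
    r"\b(\w+)\s+(?:binds to|interacts with|associates with)\s+(\w+)\b"
)


def cooccurrence_pairs(sentences, names):
    """Every pair of known names that appears in the same sentence."""
    pairs = set()
    for sentence in sentences:
        present = sorted(set(re.findall(r"\w+", sentence)) & names)
        pairs.update(frozenset(p) for p in combinations(present, 2))
    return pairs


def pattern_pairs(sentences, names):
    """Only pairs of known names joined by an explicit interaction phrase."""
    pairs = set()
    for sentence in sentences:
        for a, b in PATTERN.findall(sentence):
            if a in names and b in names and a != b:
                pairs.add(frozenset((a, b)))
    return pairs


if __name__ == "__main__":
    names = {"ProtA", "ProtB", "ProtC"}
    sentences = [
        "ProtA interacts with ProtB.",
        "ProtA and ProtC were measured in the same sample.",
        "ProtB forms a complex together with ProtC.",
    ]
    print(len(cooccurrence_pairs(sentences, names)))
    print(len(pattern_pairs(sentences, names)))
```

In the toy input, co-occurrence reports the unrelated pair from the second sentence (a false positive), while the pattern misses the interaction phrased differently in the third (a false negative).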
34
GO ToolBox Genome Biol. 2004;5(12):R101.
35
ProtQuant tool