Modeling Functional Genomics Datasets, CVM8890-101, Lesson 6, 11 July 2007, Bindu Nanduri

Lesson 6: Functional genomics modeling II: a pathway analysis example.

Introduction to protein interaction networks

[Diagram labels: Cancer; Proliferation; Differentiation; Quiescence; Programmed Cell Death; Cell Differentiation]

[Diagram labels: CD4+ T "helper" lymphocyte; Lymphoma; Proliferation; Differentiation; Quiescence; Programmed Cell Death; Anergy; Activation]

AgBase protein annotation process
[Flowchart labels: protein identifiers or FASTA format; GORetriever; annotated proteins; GOanna; proteins with no annotations; GOSlimViewer]

Potential CD4+ T lymphocyte biological processes
[Chart categories: Proliferation; Angiogenesis; Apoptosis; Migration; Quiescence; Differentiation; Anergy; Activation; Senescence; Cell Cycle]

Integrin Signaling Pathway
[Diagram labels: AP-1; AP-1 dependent gene expression; Tumor invasion; Metastasis]

Hypothesis-driven data analysis
Exploration of data to identify pathways of interacting proteins
Protein-protein interaction (PPI) networks

Why study PPIs?
Proteins do not function alone!
PPIs are inherent to the function of multiprotein complexes.
PPIs can help infer function where functional information is available for one partner.
Changes in normal PPIs can result in disease.

Types of PPI

PPI categories based on composition, affinity and timescale of interaction
Homo- and hetero-oligomeric complexes: interactions between identical or non-identical chains
Obligate PPI: protomers do not exist as stable structures in vivo; these complexes are functionally obligate
Non-obligate PPI: protomers can exist as stable structures on their own; they may co-localize for function or are already co-localized
Example (obligate): Arc repressor dimer, necessary for DNA binding
Example (non-obligate homodimer): sperm lysin

PPI based on the lifetime of the complex: transient or permanent
Permanent interactions are stable and exist only as a complex.
Transient interactions are marked by association/dissociation cycles in vivo.
Weak transient interactions (sperm lysin) associate and dissociate.
Strong transient interactions require a molecular trigger: the heterotrimeric G protein dissociates into G-alpha and G-beta-gamma when it binds GTP; the GDP-bound form is a trimer.

Control of protein oligomerization
PPIs are a continuum of obligate and non-obligate states.
Interactions of complexes are driven by concentration and by the free energy of the complex relative to alternate states.

Take-home message of PPI types
PPIs are a continuum of obligate and non-obligate states.
Interactions of complexes are driven by concentration and by the free energy of the complex relative to alternate states.

How to identify PPI
Experimental: gene coexpression, TAP assays, yeast two-hybrid (Y2H), protein arrays
Computational: sequence coevolution, phylogenetic profile, gene cluster, Rosetta Stone method, text mining

Y2H Assay
Eukaryotic transcription factors have a DNA-binding domain (BD) and an activation domain (AD).
Physical association of these domains activates transcription.
Create chimeric proteins fused to either the BD or the AD and transfect yeast.
Gal4/LexA-based reporters
In vivo method that can detect transient PPIs
Source: PLoS Computational Biology, March 2007, Volume 3, e42

TAP Assay
The TAP tag consists of two IgG-binding domains of Staphylococcus protein A and a calmodulin-binding peptide, separated by a tobacco etch virus protease cleavage site.
TAP provides direct information on protein complexes.
O. Puig et al., Methods, 2001

Gene Coexpression
Expression profile similarity is measured as the correlation coefficient between the relative expression levels of two genes/proteins, or as the normalized difference between their absolute expression levels.
The distribution for target proteins is compared with the distributions for random noninteracting protein pairs.
Expression levels of physically interacting proteins coevolve; coevolution of gene expression is a better predictor of protein interactions than coevolution of amino acid sequences.
Good for studying permanent complexes: ribosome, proteasome
Source: PLoS Computational Biology, March 2007, Volume 3, e42
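
The correlation-based similarity on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the cited paper: the gene names and expression values are hypothetical, and Pearson correlation is assumed as the correlation coefficient.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)

# Hypothetical relative expression levels across five conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises with geneA
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
}

r_ab = pearson(profiles["geneA"], profiles["geneB"])
r_ac = pearson(profiles["geneA"], profiles["geneC"])
print(f"geneA vs geneB: {r_ab:.3f}")
print(f"geneA vs geneC: {r_ac:.3f}")
```

In practice the score for a candidate pair would be compared against the distribution of scores for random noninteracting pairs, as the slide notes, rather than against a fixed cutoff.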

Protein microarrays/chips
Protein chips are disposable arrays of microwells in silicone elastomer sheets placed on top of microscope slides.
Target proteins are overexpressed, immobilized, and probed with fluorescently labeled proteins.
Can detect PPIs between actual proteins
H. Zhu et al. (2000) "Analysis of yeast protein kinases using protein chips" Nature Genetics 26:
Source: PLoS Computational Biology, March 2007, Volume 3, e42

PPI databases (database: URL/FTP, type)
DIP
BIND: http://bind.ca (type E,C,S)
MPact/MIPS: http://mips.gsf.de/services/ppi (type E,C,F)
STRING
MINT
IntAct
BioGRID
HPRD
ProtCom
3did, Interprets
Pibase, Modbase
CBM: ftp://ftp.ncbi.nlm.nih.gov/pub/cbm (type S)
SCOPPI
iPfam
InterDom
DIMA
Prolinks
Predictome: http://predictome.bu.edu/ (type F)
Source: PLoS Computational Biology, March 2007, Volume 3, e42

PPI databases: key to the Type column
Type of data: high-throughput experimental data (E), structural data (S), manual curation (C), functional predictions (F), and interface homology modeling (H)
Unit of interaction: P is protein
Databases shown: DIP; BIND: http://bind.ca (type E,C,S); MPact/MIPS: http://mips.gsf.de/services/ppi (type E,C,F); STRING; IntAct; BioGRID; HPRD; ProtCom; 3did, Interprets; Pibase, Modbase; CBM: ftp://ftp.ncbi.nlm.nih.gov/pub/cbm (type S)
Source: PLoS Computational Biology, March 2007, Volume 3, e42

PPI database comparisons
Proteins: Structure, Function and Bioinformatics 63:

Experimental PPI dataset overlap is small
High false-positive (FP) rate in high-throughput experiments; interactions are difficult to confirm by multiple sources
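
One simple way to quantify how little two interaction datasets overlap is the Jaccard index over unordered protein pairs. The sketch below is an illustration only: the protein identifiers and both datasets are hypothetical, and the Jaccard index is an assumed choice of overlap measure, not one named on the slide.

```python
def normalize(pairs):
    """Treat each interaction as unordered, so (A, B) equals (B, A)."""
    return {frozenset(pair) for pair in pairs}

def jaccard(pairs_a, pairs_b):
    """Shared interactions divided by all interactions seen in either set."""
    a, b = normalize(pairs_a), normalize(pairs_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical interaction lists from two experiments.
dataset_1 = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
dataset_2 = [("P2", "P1"), ("P6", "P7"), ("P8", "P9")]

overlap = jaccard(dataset_1, dataset_2)
print(f"Jaccard overlap: {overlap:.2f}")
```

Only the P1/P2 interaction is shared here, out of five distinct interactions in total. Normalizing the pair order first matters because different databases may list the same interaction in either direction.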

How to identify PPI
Experimental: gene coexpression, TAP assays, yeast two-hybrid (Y2H), protein arrays
Computational: sequence coevolution, phylogenetic profile, gene cluster/neighborhood, Rosetta Stone method, text mining

Phylogenetic profile (PP)
Hypothesis: functionally linked and potentially interacting nonhomologous proteins co-evolve and have orthologs in the same subset of fully sequenced organisms.
Source: PLoS Computational Biology, March 2007, Volume 3, e43
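
A phylogenetic profile can be represented as a presence/absence vector over a fixed list of genomes, and two proteins compared by counting the positions where their profiles differ. The sketch below assumes that representation; the genome labels, protein names, ortholog assignments, and the mismatch threshold are all hypothetical.

```python
def profile(genomes_with_ortholog, genomes):
    """Binary vector: 1 if the protein has an ortholog in that genome."""
    return tuple(1 if g in genomes_with_ortholog else 0 for g in genomes)

def hamming(p, q):
    """Number of genomes in which the two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))

def linked_pairs(profiles, max_mismatches=0):
    """Protein pairs whose profiles differ in at most max_mismatches genomes."""
    names = sorted(profiles)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if hamming(profiles[a], profiles[b]) <= max_mismatches
    ]

# Hypothetical ortholog presence across six fully sequenced genomes.
genomes = ["g1", "g2", "g3", "g4", "g5", "g6"]
orthologs = {
    "protX": {"g1", "g3", "g4"},
    "protY": {"g1", "g3", "g4"},
    "protZ": {"g2", "g5", "g6"},
    "protW": {"g1", "g3", "g4", "g6"},
}
profiles = {name: profile(present, genomes) for name, present in orthologs.items()}

exact = linked_pairs(profiles)
relaxed = linked_pairs(profiles, max_mismatches=1)
print(exact)
print(relaxed)
```

Requiring identical profiles is the strictest reading of the hypothesis; allowing a small number of mismatches trades specificity for sensitivity.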

Gene Cluster, Gene Neighborhood
Genes in a gene cluster/operon are co-regulated and participate in the same biological function.
Source: PLoS Computational Biology, March 2007, Volume 3, e43

Sequence Co-evolution
Interacting proteins very often co-evolve: changes in one protein (loss of function or interaction) are compensated by correlated changes in another protein.
The orthologs of co-evolving proteins tend to interact, making it possible to infer unknown interactions in other genomes.
Co-evolution can be measured as the similarity between the phylogenetic trees of two non-homologous interacting protein families.
Source: PLoS Computational Biology, March 2007, Volume 3, e43
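
Tree similarity between two protein families is often approximated by correlating their pairwise evolutionary distance matrices over the same set of species. The sketch below assumes that approach and uses Pearson correlation; the species labels and all distance values are hypothetical, and computing real distances from alignments is outside its scope.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def distance_vector(dist, species):
    """Flatten a symmetric distance table into a fixed species-pair order."""
    return [dist[frozenset(pair)] for pair in combinations(species, 2)]

species = ["s1", "s2", "s3", "s4"]

# Hypothetical pairwise distances between orthologs of each family.
family_a = {
    frozenset(("s1", "s2")): 0.10, frozenset(("s1", "s3")): 0.40,
    frozenset(("s1", "s4")): 0.50, frozenset(("s2", "s3")): 0.35,
    frozenset(("s2", "s4")): 0.45, frozenset(("s3", "s4")): 0.20,
}
family_b = {
    frozenset(("s1", "s2")): 0.20, frozenset(("s1", "s3")): 0.70,
    frozenset(("s1", "s4")): 0.90, frozenset(("s2", "s3")): 0.65,
    frozenset(("s2", "s4")): 0.85, frozenset(("s3", "s4")): 0.30,
}

tree_similarity = pearson(
    distance_vector(family_a, species),
    distance_vector(family_b, species),
)
print(f"distance-matrix correlation: {tree_similarity:.3f}")
```

A high correlation is only suggestive: all proteins from the same species share some evolutionary history, so unrelated families can also show correlated distances.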

Rosetta Stone method
Interacting proteins/domains have homologs in other genomes that are fused into one protein chain, a Rosetta Stone protein.
Gene fusion occurs to optimize co-expression of genes encoding interacting proteins.
Source: PLoS Computational Biology, March 2007, Volume 3, e43
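
The fusion test can be illustrated with a toy example in which every protein is already annotated with the homolog families it contains. This is a deliberately simplified sketch: the family labels, protein names, and the assumption that homology has been precomputed are all hypothetical, whereas a real analysis would derive them from sequence similarity searches.

```python
from itertools import combinations

def rosetta_stone_links(query, reference):
    """Find query protein pairs whose families are fused in a reference protein.

    query:     protein name -> its single homolog family label
    reference: protein name -> set of family labels found in that chain
    Returns {(protein_1, protein_2): [fused reference proteins]}.
    """
    links = {}
    for (p, fam_p), (q, fam_q) in combinations(sorted(query.items()), 2):
        if fam_p == fam_q:
            continue  # the two proteins are homologs, not a candidate pair
        fused = sorted(
            name for name, families in reference.items()
            if {fam_p, fam_q} <= families
        )
        if fused:
            links[(p, q)] = fused
    return links

# Hypothetical annotations: refAB carries both famA and famB in one chain.
query = {"protA": "famA", "protB": "famB", "protC": "famC"}
reference = {
    "refAB": {"famA", "famB"},
    "refA": {"famA"},
    "refC": {"famC"},
}

links = rosetta_stone_links(query, reference)
print(links)
```

Here protA and protB are predicted to interact because refAB contains homologs of both, while protC has no fused partner.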

Text Mining
Utilizes the wealth of publicly available data: search Medline or PubMed for words or word combinations.
Co-occurrence of words is a simple metric but is prone to high false-positive rates.
Natural Language Processing (NLP) methods are more specific, matching statements such as "A binds to B", "A interacts with B", "A associates with B"; such statements are difficult to detect, so NLP has a higher false-negative rate.
Normally requires a list of known gene names or protein names for a given organism.
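
The trade-off between co-occurrence and pattern matching can be shown with a small sketch. The pattern matcher below uses regular expressions for the three phrases quoted on the slide and is only a stand-in for real NLP; the gene names and sentences are hypothetical.

```python
import re
from itertools import combinations

INTERACTION_VERBS = ("binds to", "interacts with", "associates with")

def cooccurring_pairs(sentence, names):
    """Every pair of known names mentioned anywhere in the sentence."""
    found = sorted(
        n for n in names if re.search(rf"\b{re.escape(n)}\b", sentence)
    )
    return set(combinations(found, 2))

def pattern_pairs(sentence, names):
    """Only pairs stated explicitly as '<name> <interaction verb> <name>'."""
    name_alt = "|".join(re.escape(n) for n in sorted(names, key=len, reverse=True))
    verb_alt = "|".join(re.escape(v) for v in INTERACTION_VERBS)
    pattern = re.compile(rf"\b({name_alt})\s+(?:{verb_alt})\s+({name_alt})\b")
    return {tuple(sorted(m.groups())) for m in pattern.finditer(sentence)}

# Hypothetical list of known gene names and example sentences.
names = {"GeneA", "GeneB", "GeneC"}
explicit = "GeneA binds to GeneB in vitro."
unrelated = "GeneA and GeneC were both measured in this study."
indirect = "GeneB was found in a complex with GeneA."

for text in (explicit, unrelated, indirect):
    print(text, cooccurring_pairs(text, names), pattern_pairs(text, names))
```

Co-occurrence reports a pair for all three sentences, including the unrelated mention (a false positive), while the pattern matcher catches only the explicit statement and misses the indirectly phrased interaction (a false negative).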

GO ToolBox Genome Biol. 2004;5(12):R101.

ProtQuant tool