WIIFM: examples of functional modeling GO Workshop 3-6 August 2010
Key points
Modeling is subordinate to the biological questions and hypotheses.
Together, the Gene Ontology and canonical genetic networks/pathways provide the central, complementary foundation for modeling functional genomics data.
Annotation follows information, and information changes daily: STEP 1 in analyzing functional genomics data is re-annotating your dataset.
Examples of how we do functional modeling of genomics datasets follow.
What is the Gene Ontology?
“a controlled vocabulary that can be applied to all organisms even as knowledge of gene and protein roles in cells is accumulating and changing”
It is the de facto standard for functional annotation.
It assigns functions to gene products at different levels, depending on how much is known about a gene product.
It is used for a diverse range of species.
It is structured to be queried at different levels, e.g.: find all the chicken gene products in the genome that are involved in signal transduction, then zoom in on all the receptor tyrosine kinases.
It is human readable, and each GO function has a digital tag to allow computational analysis of large datasets.
A COMPUTATIONALLY AMENABLE ENCYCLOPEDIA OF GENE FUNCTIONS AND THEIR RELATIONSHIPS
[Diagram: Functional Understanding draws on two sources. Ontologies: GO Cellular Component, GO Biological Process, GO Molecular Function. Canonical and other networks: BRENDA, Pathway Studio 5.0, Ingenuity Pathway Analyses, Cytoscape, interactome databases.]
Use GO for:
1. Determining which classes of gene products are over-represented or under-represented.
2. Grouping gene products.
3. Relating a protein’s location to its function.
4. Focusing on particular biological pathways and functions (hypothesis-testing).
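Point 1 (over- or under-representation) is commonly assessed with a hypergeometric test. The slides do not say which test is used, so the following is only a minimal sketch under that assumption. The function names and the example counts are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k): chance of seeing at least k annotated genes in a list of n,
    when K of the N background genes carry the GO term (over-representation)."""
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)) / total

def hypergeom_lower_tail(k, K, n, N):
    """P(X <= k): chance of seeing at most k annotated genes (under-representation)."""
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(0, k + 1)) / total

if __name__ == "__main__":
    # Hypothetical counts: 12 of 200 differentially expressed genes carry a term
    # that annotates 300 of the 20000 genes in the background.
    print("over-representation p:", hypergeom_upper_tail(12, 300, 200, 20000))
    print("under-representation p:", hypergeom_lower_tail(12, 300, 200, 20000))
```

In practice the p-values from many GO terms would also need a multiple-testing correction, which this sketch leaves out.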
[Figure: counts by year, 2000 to 2009. Axes: YEAR; No.; No. x 10^6.]
[Figure: Membrane proteins grouped by GO BP, for B-cells and stroma. Categories: ion/proton transport, cell migration, cell adhesion, cell growth, apoptosis, immune response, cell cycle/cell proliferation, cell-cell signaling, function unknown, development, endocytosis, proteolysis and peptidolysis, protein modification, signal transduction.]
LOCATION DETERMINES FUNCTION
GO is the “encyclopedia” of gene functions, captured, coded and put into a directed acyclic graph (DAG) structure. In other words, by collecting all of the known data about gene product biological processes, molecular functions and cell locations, GO has become the master “cheat-sheet” for our total knowledge of the genetic basis of phenotype. Because every GO annotation term has a unique digital code, we can use computers to mine the GO DAGs for granular functional information. Instead of ploughing through thousands of papers at the library, making notes, and then deciding what the differential gene expression from your microarray experiment means as a net effect, the aim is for GO to capture all the biological information so that it can be retrieved, compiled with your quantitative gene product expression data, and used to provide a net effect.
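To make "mining the DAG" concrete, here is a minimal sketch of how a query at a high-level term can pick up gene products annotated to more specific descendant terms. The term IDs, term names in the comments, and gene names are all placeholders, not real GO accessions or annotations.

```python
# Toy DAG: child term -> set of parent terms.
# IDs are placeholders for illustration, not real GO accessions.
PARENTS = {
    "TOY:0001": set(),                     # biological process (root)
    "TOY:0002": {"TOY:0001"},              # cell communication
    "TOY:0003": {"TOY:0001"},              # response to stimulus
    "TOY:0004": {"TOY:0002", "TOY:0003"},  # signal transduction (two parents, so a DAG, not a tree)
    "TOY:0005": {"TOY:0004"},              # receptor tyrosine kinase signaling
}

# Hypothetical gene products mapped to the terms they are directly annotated with.
ANNOTATIONS = {
    "geneA": {"TOY:0005"},
    "geneB": {"TOY:0004"},
    "geneC": {"TOY:0003"},
}

def ancestors(term, parents):
    """Return every term reachable by following parent links upward from `term`."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def genes_for(term, annotations, parents):
    """Gene products annotated to `term` directly or to any of its descendants."""
    return sorted(
        gene
        for gene, terms in annotations.items()
        if any(t == term or term in ancestors(t, parents) for t in terms)
    )

if __name__ == "__main__":
    print(genes_for("TOY:0004", ANNOTATIONS, PARENTS))  # broad query
    print(genes_for("TOY:0005", ANNOTATIONS, PARENTS))  # zoomed-in query
```

Querying the broad term returns more gene products than querying its specific child, which is the "query at different levels" behaviour described for GO.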
“GO Slim”
Many people use “GO Slims”, which capture only high-level terms; these are more often than not very poorly informative and not suitable for hypothesis-testing. In contrast, we need to use the deep, granular, information-rich data suitable for hypothesis-testing.
Shyamesh Kumar BVSc
[Figure: Non-MHC associated resistance and susceptibility. Mean total lesion score against days post infection for the susceptible (L7₂) and resistant (L6₁) genotypes, marking the critical time point in MD lymphomagenesis; panels stained with CD30 mAb and CD8 mAb. Burgess et al, Vet Pathol 38:2, 2001.]
Hypothesis: At the critical time point of 21 dpi, MD-resistant genotypes have a T-helper (Th)-1 microenvironment (consistent with CTL activity), but MD-susceptible genotypes have a T-reg or Th-2 microenvironment (antagonistic to CTL). 2008, 57:
[Workflow: Infection of chickens (L6₁ and L7₂); kill and post-mortem at 21 dpi; sample tissues. Whole tissue: RNA extraction. Cryosections: laser capture microdissection (LCM), then RNA extraction. Duplex QPCR.]
[Figure: Whole tissue mRNA expression, L6 (R) vs. L7 (S). y-axis: 40 - mean Ct value. x-axis (mRNA): IL-4, IL-10, IL-12, IL-18, IFNγ, TGFβ, GPR-83, SMAD-7, CTLA-4. Five comparisons are marked with *.]
[Figure: Microscopic lesion mRNA expression, L6 (R) vs. L7 (S). y-axis: 40 - mean Ct value. x-axis (mRNA): IL-4, IL-12, IL-18, TGFβ, GPR-83, SMAD-7, CTLA-4. Five comparisons are marked with *.]
[Diagram: CYTOKINES AND T HELPER CELL DIFFERENTIATION. Nodes: APC, naive CD4+ T cell, Th-1, Th-2, T reg.]
[Diagram: Nodes: APC, naive CD4+ T cell, Th-1, Th-2, T reg, CTL, macrophage, NK cell. Cytokines and regulators: IFNγ, IL-12, IL-18, IL-4, IL-10, TGFβ, Smad 7. Overlays: L6 whole, L7 whole, L7 micro. Question: Th-1, Th-2, T-reg? Inflammatory?]
Gene Ontology based hypothesis testing
[Workflow: QPCR data (relative mRNA expression data) and Gene Ontology annotation (Biological Process) feed into modeling and hypothesis testing.]
Step I. GO-based phenotype scoring.
Step II. Multiply by quantitative data for each gene product.
Step III. Inclusion of quantitative data in the phenotype scoring table and calculation of the net effect.
[Tables: rows are gene products (IL-2, IL-4, IL-6, IL-8, IL-10, IL-12, IL-13, IL-18, IFN-γ, TGF-β, CTLA-4, GPR-83, SMAD-7); columns are Th1, Th2, Treg, Inflammation; final row is Net Effect. ND = No data.]
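The three steps can be sketched as a small calculation. This is an illustration only: the gene names, scores, and expression values are invented, and the sign convention (+1 promotes, -1 inhibits, 0 no effect) is an assumption, since the slide's table values are not recoverable.

```python
PHENOTYPES = ("Th1", "Th2", "Treg", "Inflammation")

# Step I (hypothetical scores): +1 promotes the phenotype, -1 inhibits it,
# 0 means no known effect, None stands for ND (no data).
SCORES = {
    "geneX": {"Th1": 1, "Th2": -1, "Treg": 0, "Inflammation": 1},
    "geneY": {"Th1": -1, "Th2": 1, "Treg": 1, "Inflammation": None},
}

# Hypothetical quantitative data (e.g. relative mRNA expression) per gene product.
EXPRESSION = {"geneX": 2.0, "geneY": 0.5}

def net_effect(scores, expression):
    """Steps II and III: weight each score by the gene's quantitative value,
    then sum per phenotype. ND scores and genes without data are skipped."""
    totals = {p: 0.0 for p in PHENOTYPES}
    for gene, row in scores.items():
        value = expression.get(gene)
        if value is None:
            continue
        for phenotype in PHENOTYPES:
            score = row.get(phenotype)
            if score is not None:
                totals[phenotype] += score * value
    return totals

if __name__ == "__main__":
    for phenotype, total in net_effect(SCORES, EXPRESSION).items():
        print(phenotype, total)
```

Treating ND as "skip" rather than as zero keeps missing annotation distinct from a known absence of effect, although the two give the same sum here.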
[Figure: Whole tissue. Net effect for the Th-1, Th-2, T-reg and inflammation phenotypes, L6 (R) vs. L7 (S). Axis tick shown: -40.]
[Figure: Microscopic lesions (5 mm). Net effect by phenotype (Th-1, Th-2, T-reg, inflammation), L6 (R) vs. L7 (S).]
[Summary diagram: balance of pro-CTL and anti-CTL signals in L6 (resistant) and L7 (susceptible), whole tissue and lymphoma. Label groups: Pro T-reg, Pro Th-1, Anti Th-2; Pro T-reg, Pro Th-2, Anti Th-1.]
Translation to clinical research
Pig: total mRNA and protein expression were measured from quadruplicate samples of control, electroscalpel-treated and harmonic scalpel-treated tissue. Differentially expressed mRNAs and proteins were identified using Monte Carlo resampling (1). Using network and pathway analysis as well as Gene Ontology-based hypothesis testing, differences in specific physiological processes between electroscalpel-treated and harmonic scalpel-treated tissue were quantified and reported as net effects.
(1) Nanduri, B., P. Shah, M. Ramkumar, E. A. Allen, E. Swaitlo, S. C. Burgess*, and M. L. Lawrence*. Quantitative analysis of Streptococcus pneumoniae TIGR4 response to in vitro iron restriction by 2-D LC ESI MS/MS. Proteomics 8,
Bindu Nanduri
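The slide names Monte Carlo resampling but not the exact scheme. One common variant is a random permutation test on the difference in group means, sketched below under that assumption. The function name, the number of resamples, and the sample values are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, n_resamples=2000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in means.
    Group labels are shuffled at random and the observed difference is compared
    with the differences seen under shuffling."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)

if __name__ == "__main__":
    # Hypothetical quadruplicate expression values for one gene product.
    control = [1.0, 2.0, 3.0, 4.0]
    treated = [11.0, 12.0, 13.0, 14.0]
    print("p-value:", permutation_p_value(control, treated))
```

With only four samples per group there are just 70 distinct splits, so the smallest attainable two-sided p-value is about 0.03, which limits how strongly any single gene product can be called differentially expressed.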
[Figure: Proportional distribution of mRNA functions differentially expressed by electroscalpel and harmonic scalpel. Panels: Electroscalpel, Harmonic Scalpel. Total differentially expressed mRNAs, in the order shown: 4302; 1960. HYPOTHESIS TERMS: immunity (primarily innate), inflammation, wound healing, lipid metabolism, response to thermal injury, angiogenesis.]
[Figure: Net functional distribution of differentially expressed mRNAs, shown as relative bias toward electroscalpel or harmonic scalpel. Terms: immunity (primarily innate), classical inflammation (heat, redness, swelling, pain, loss of function), wound healing, lipid metabolism, response to thermal injury, angiogenesis, sensory response to pain, hemorrhage.]
[Figure: Proportional distribution of protein functions differentially expressed by electroscalpel and harmonic scalpel. Total differentially expressed proteins: 509 (electroscalpel); 433 (harmonic scalpel). HYPOTHESIS TERMS: immunity (primarily innate), inflammation, wound healing, lipid metabolism, response to thermal injury, angiogenesis.]
[Figure: Net functional distribution of differentially expressed proteins, shown as relative bias toward electroscalpel or harmonic scalpel. Terms: immunity (primarily innate), classical inflammation (heat, redness, swelling, pain, loss of function), wound healing, lipid metabolism, response to thermal injury, angiogenesis, sensory response to pain, hemorrhage.]