6 November 2007 © ETH Zürich | Genevestigator Gene expression analysis and network discovery: Genevestigator Philip Zimmermann, Genevestigator Team, ETH.

Presentation transcript:

6 November 2007 © ETH Zürich | Genevestigator Gene expression analysis and network discovery: Genevestigator Philip Zimmermann, Genevestigator Team, ETH Zurich

6 November 2007 P. Zimmermann / ETH Zurich / 2 Presentation flow  Gene networks – biological context  Microarray compendium: how, and what for?  Meta-profile analysis: concepts and validation  Genevestigator ® V3  Data integration  Summary & conclusion

6 November 2007 P. Zimmermann / ETH Zurich / 4 Gene networks - biological context  What is the interpretational value of a gene network derived by graphical modeling or correlation analysis?  a snapshot in time?  a snapshot in space?  an average trend?

6 November 2007 P. Zimmermann / ETH Zurich / 5 Gene networks - biological context  From what experiment(s) was this network derived?  time-course?  cell culture, whole organism?  stimulus, drug response?  anatomy part?  stage of development?  genetic modification?

6 November 2007 P. Zimmermann / ETH Zurich / 6 Context and dynamics of networks  Hypothesis: networks are dynamic and context-dependent  => networks evolve!  => networks may have different functions in different contexts!  Question: how can we quantify the role of the context in shaping the network?
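One way to make the question on this slide concrete is to build a co-expression network separately for each context and measure how much the edge sets overlap. The sketch below is only an illustration under stated assumptions: the gene names, the toy expression values, the Pearson threshold of 0.9 and the Jaccard overlap are all hypothetical choices, not the method used in the talk.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def coexpression_edges(expr, samples, threshold=0.9):
    """Gene pairs whose correlation over the given sample indices reaches the threshold."""
    edges = set()
    for g1, g2 in combinations(sorted(expr), 2):
        x = [expr[g1][i] for i in samples]
        y = [expr[g2][i] for i in samples]
        if pearson(x, y) >= threshold:
            edges.add((g1, g2))
    return edges


def jaccard(a, b):
    """Edge-set overlap: 1.0 for identical networks, 0.0 for disjoint ones."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy data: samples 0-3 come from one context, samples 4-7 from another.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0],
    "geneB": [1.1, 2.1, 2.9, 4.2, 4.0, 3.1, 2.0, 0.9],
    "geneC": [5.0, 5.1, 4.9, 5.2, 1.2, 2.1, 2.9, 4.1],
}
context1 = coexpression_edges(expr, samples=[0, 1, 2, 3])
context2 = coexpression_edges(expr, samples=[4, 5, 6, 7])
print(context1, context2, jaccard(context1, context2))
```

In this toy case geneA pairs with geneB in the first context and with geneC in the second, so the two networks share no edges. Pooling all eight samples would hide that difference, which is the motivation for keeping the context explicit.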

6 November 2007 P. Zimmermann / ETH Zurich / 7 Context: the time-space-response dimensions  Time => time-course, development  Space => anatomy parts, intracellular localization  Response => response to external perturbations => response to modifications in the genome

6 November 2007 P. Zimmermann / ETH Zurich / 8 Context and dynamics of networks  Modeling the time, space and response dimensions requires:  experiments testing time, space and response variables  storage of measurement data and its meta-data  developing analysis methods that incorporate these dimensions (→ meta-profiles)
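The idea of a meta-profile can be sketched as follows, assuming (this is not stated on the slides) that a meta-profile is simply the mean signal of one gene over all arrays annotated with the same term. The array identifiers, values and the helper name `meta_profile` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def meta_profile(signals, annotations):
    """Average one gene's signal over all arrays that share an annotation term.

    signals:     array id -> expression signal of the gene on that array
    annotations: array id -> annotation term (e.g. an anatomy part)
    """
    by_term = defaultdict(list)
    for array_id, value in signals.items():
        by_term[annotations[array_id]].append(value)
    return {term: mean(values) for term, values in by_term.items()}


# Hypothetical example: four arrays annotated along the "space" dimension.
signals = {"arr1": 10.0, "arr2": 12.0, "arr3": 4.0, "arr4": 6.0}
anatomy = {"arr1": "root", "arr2": "root", "arr3": "leaf", "arr4": "leaf"}
print(meta_profile(signals, anatomy))
```

The same function works for the time or response dimension by passing a different annotation mapping, which is why storing measurements together with systematic meta-data matters.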

6 November 2007 P. Zimmermann / ETH Zurich / 9 Presentation flow  Gene networks – biological context  Microarray compendium: how, and what for?  Meta-profile analysis: concepts and validation  Genevestigator ® V3  Data integration  Summary & conclusion

6 November 2007 P. Zimmermann / ETH Zurich / 10 Analysis versus meta-analysis  Microarray experiment  Data analysis: 100 genes – what to do next?  Data storage: 10 billion data points – what to do next?

6 November 2007 P. Zimmermann / ETH Zurich / 11 Data repositories  Data: heterogeneous datasets + Annotations: unsystematic or poor annotation = meta-analysis impossible!

6 November 2007 P. Zimmermann / ETH Zurich / 12 Data warehouses  Data quality control: ordered datasets + Expert annotation with systematic ontologies (anatomy, development, stimulus, mutation): systematic annotation = meta-analysis possible!

6 November 2007 P. Zimmermann / ETH Zurich / 13 Data quality control  RLE  NUSE  Border elements  Correlation matrix  Affy QC metrics  RNA degradation  Unprocessed values
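Of the quality-control measures listed, RLE (relative log expression) is easy to illustrate: for each probe set, the median log expression across arrays is subtracted, and an array whose RLE values are not centred near zero is suspect. The sketch below is a minimal illustration; the toy values and the cut-off of 0.5 are hypothetical and not taken from the Genevestigator pipeline.

```python
from statistics import median


def rle(log_expr):
    """Return, for each array, the list of relative log expression values.

    log_expr: probe set id -> list of log2 signals, one per array (same order).
    """
    n_arrays = len(next(iter(log_expr.values())))
    per_array = [[] for _ in range(n_arrays)]
    for values in log_expr.values():
        probe_median = median(values)
        for j, value in enumerate(values):
            per_array[j].append(value - probe_median)
    return per_array


def flag_arrays(log_expr, max_abs_median=0.5):
    """Indices of arrays whose median RLE deviates from zero by more than the cut-off."""
    return [
        j for j, values in enumerate(rle(log_expr))
        if abs(median(values)) > max_abs_median
    ]


# Hypothetical toy data: the fourth array is shifted upwards by about 2 log2 units.
log_expr = {
    "probe1": [7.0, 7.1, 6.9, 9.0],
    "probe2": [5.0, 5.2, 5.1, 7.1],
    "probe3": [9.0, 8.9, 9.1, 11.0],
}
print(flag_arrays(log_expr))
```

NUSE needs the standard errors from a probe-level model fit and is therefore not sketched here.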

6 November 2007 P. Zimmermann / ETH Zurich / 14 Ontologies – example of Anatomy  Mouse / Rat:  Edinburgh Mouse Atlas  Human:  mapping to Mouse and Rat anatomy tree  Arabidopsis / Barley:  terms from Plant Ontology  tree created by Genevestigator  (Expert annotation with systematic ontologies: anatomy, development, stimulus, mutation)

6 November 2007 P. Zimmermann / ETH Zurich / 15 Ontologies – example of Development  Mouse: Theiler stages  Rat: Witschi stages  Human: Carnegie table  Arabidopsis: Boyes key

6 November 2007 P. Zimmermann / ETH Zurich / 16 Meta-analysis tools  Who is most interested in mining these data?  Who can best interpret the results?  THE BIOLOGIST!  Genevestigator ® – a tool for biologists

6 November 2007 P. Zimmermann / ETH Zurich / 17 Presentation flow  Gene networks – biological context  Microarray compendium: how, and what for?  Meta-profile analysis: concepts and validation  Genevestigator ® V3  Data integration  Summary & conclusion

6 November 2007 P. Zimmermann / ETH Zurich / 18 Expression meta-profiles [space] [time] [response] [response]

6 November 2007 P. Zimmermann / ETH Zurich / 19 Data validation Category type Probe set e.g. heart ventricle e.g. Mm [space] [time] [response]

6 November 2007 P. Zimmermann / ETH Zurich / 20 Data validation Category type Probe set e.g. heart ventricle [space] [time] [response]

6 November 2007 P. Zimmermann / ETH Zurich / 21 Mouse anatomy meta-profiles [space]

6 November 2007 P. Zimmermann / ETH Zurich / 22 Data validation Category type Probe set e.g. Mm [space] [time] [response]

6 November 2007 P. Zimmermann / ETH Zurich / 23 Rnf33: transcription has been shown to occur already in the mouse oocyte, but not beyond the eight-cell stage or in adult tissues.  Hoxa1: expression starts at E7.5 and begins to retreat caudally by E8.5.  Hemopexin (hx): known to be expressed only at low levels in embryos and newborn mice; it does not reach its highest expression level until the first year of age.  Stages a – f: pre-natal; g – l: post-natal

6 November 2007 P. Zimmermann / ETH Zurich / 24 light-harvesting chlorophyll a/b binding protein (AT4G14690)  protochlorophyllide reductase A (AT5G54190)

6 November 2007 P. Zimmermann / ETH Zurich / 25 Presentation flow  Gene networks – biological context  Microarray compendium: how, and what for?  Meta-profile analysis: concepts and validation  Genevestigator ® V3  Data integration  Summary & conclusion

6 November 2007 P. Zimmermann / ETH Zurich / 26 Development of Genevestigator ®  14,500 Affymetrix arrays (Nov 2007)  Human, mouse, rat, Arabidopsis, barley  Metabolic and regulatory pathway maps for mouse and Arabidopsis  > 10,000 registered users  > 500 citations in peer-reviewed journals  [Diagram labels: Biological experiments; Microarray data; Public repositories; Curation & Quality control (Anatomy, Development, Stimulus, Mutation); Genevestigator database; Application server; Client Java application]

6 November 2007 P. Zimmermann / ETH Zurich / 27 Genevestigator ® V3  Website  Java Client Application  Database and Application Server Cluster

6 November 2007 P. Zimmermann / ETH Zurich / 28 Toolsets and tools

6 November 2007 P. Zimmermann / ETH Zurich / 29 [space] [time] [response]

6 November 2007 P. Zimmermann / ETH Zurich / 30

6 November 2007 P. Zimmermann / ETH Zurich / 31

6 November 2007 P. Zimmermann / ETH Zurich / 32 Biomarker Search toolset

6 November 2007 P. Zimmermann / ETH Zurich / 33 Abiotic stresses and hormonal responses salt (+) osmotic (+) cold (+) ABA (+) 2,4-D glucose salt (+) osmotic (+) ABA (+) norflurazon (-) mycorrhiza (-) anoxia (-) hypoxia (-) BL / H3BO3 (+) syringolin (-) cycloheximide (-) H2O2 (-) salt (-) osmotic (-) --- ozone (-) genotoxic (-) salt (+) drought (+) MeJA (+) syringolin (-) P. syringae (+) ozone (+) B. cinerea (+) hypoxia (-) ethylene (+) AVG (+) chitin (+)

6 November 2007 P. Zimmermann / ETH Zurich / 34 [space] [time] [response]

6 November 2007 P. Zimmermann / ETH Zurich / 35 Biclustering  Searches for subsets of genes coexpressed across subsets of conditions  BiMax algorithm  Finds all maximal bicliques [space] [time] [response]
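To show what "all maximal bicliques" means, the sketch below enumerates every inclusion-maximal all-ones submatrix of a small binary matrix (rows are genes, columns are conditions, 1 means "responds"). It is a naive brute-force enumeration over column subsets, exponential in the number of conditions, and is not the divide-and-conquer strategy of BiMax itself. The binarisation of the expression data is assumed to have been done beforehand, and the toy matrix is hypothetical.

```python
from itertools import combinations


def maximal_biclusters(matrix):
    """Enumerate inclusion-maximal all-ones submatrices of a binary matrix.

    Returns a set of (rows, columns) pairs, each a frozenset of indices.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    found = set()
    for size in range(1, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            # All genes that are 1 in every chosen condition.
            rows = frozenset(
                r for r in range(n_rows) if all(matrix[r][c] for c in cols)
            )
            if not rows:
                continue
            # Extend to every condition shared by those genes (closure step).
            closed = frozenset(
                c for c in range(n_cols) if all(matrix[r][c] for r in rows)
            )
            found.add((rows, closed))
    return found


# Hypothetical binarised data: 3 genes x 3 conditions.
binary = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
]
for rows, cols in sorted(maximal_biclusters(binary), key=lambda rc: sorted(rc[0])):
    print(sorted(rows), sorted(cols))
```

The closure step guarantees that neither the gene set nor the condition set of a reported bicluster can be enlarged without including a zero.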

6 November 2007 P. Zimmermann / ETH Zurich / 36 Example of a bicluster

6 November 2007 P. Zimmermann / ETH Zurich / 37 ABA response Beta-alanine Starch / sucrose Inositol phosphate Cold response Phenylalanine / Tyrosine Proline ABA biosynthesis [space] [time] [response]

6 November 2007 P. Zimmermann / ETH Zurich / 38 Presentation flow  Gene networks – biological context  Microarray compendium: how, and what for?  Meta-profile analysis: concepts and validation  Genevestigator ® V3  Data integration  Summary & conclusion

6 November 2007 P. Zimmermann / ETH Zurich / 39 Biomarker search [time]  Genes expressed specifically in seeds and germinating seedlings  De-novo identification of cis-regulatory elements

6 November 2007 P. Zimmermann / ETH Zurich / 40 Biomarker search [space] z = 18.2 z = 5.8 z = 5.4

6 November 2007 P. Zimmermann / ETH Zurich / 41 Biomarker search [response]  “Supervised biclustering”  isoxaben (+)  norflurazon (-)  light (+)  nitrate_low (-)

6 November 2007 P. Zimmermann / ETH Zurich / 42 Anatomy clustering and promoter analysis  Clusters of genes expressed specifically in:  cell suspension  petals  roots  seeds  stamen  xylem  z > 5.0
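The slides give the selection threshold (z > 5.0) but not how the score is computed. The sketch below therefore assumes one plausible definition: the signal in the target category minus the mean of all other categories, divided by their sample standard deviation. The function names, gene names and values are hypothetical.

```python
from statistics import mean, stdev


def specificity_z(profile, target):
    """z-score of the target category against all other categories of a meta-profile.

    Assumed definition (not from the slides): (target - mean(others)) / stdev(others).
    Requires at least two other categories with non-identical values.
    """
    others = [value for term, value in profile.items() if term != target]
    return (profile[target] - mean(others)) / stdev(others)


def specific_genes(profiles, target, z_min=5.0):
    """Genes whose specificity z-score for the target category exceeds z_min."""
    return sorted(
        gene for gene, profile in profiles.items()
        if specificity_z(profile, target) > z_min
    )


# Hypothetical anatomy meta-profiles (log2 signal per anatomy part).
profiles = {
    "geneX": {"roots": 12.0, "petals": 4.0, "seeds": 4.2, "stamen": 3.8, "xylem": 4.0},
    "geneY": {"roots": 6.0, "petals": 5.0, "seeds": 7.0, "stamen": 6.5, "xylem": 5.5},
}
print(specific_genes(profiles, "roots"))
```

Genes selected this way for one anatomy part form a cluster whose promoters can then be searched for shared motifs, as on the slide.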

6 November 2007 P. Zimmermann / ETH Zurich / 43 Development clustering and promoter analysis  Clusters of Arabidopsis genes expressed specifically at:  dev. stage 1  dev. stage 3  dev. stage 9  z > 5.0

6 November 2007 P. Zimmermann / ETH Zurich / 44 Stimulus clustering and promoter analysis  “Supervised biclustering” of stimulus meta-profiles:  cluster 1  cluster 2  cluster 4  cluster 5  cluster 7  z > 5.0

6 November 2007 P. Zimmermann / ETH Zurich / 45 Data integration: transcriptome - proteome  Transcripts: cell suspension, cotyledons, flowers, leaves, roots, seeds  Proteins: cell suspension, cotyledons, flowers, leaves, roots, seeds

6 November 2007 P. Zimmermann / ETH Zurich / 46 Arabidopsis leaf transcripts and proteins  [Figure; axis labels: protein quantification measure, transcript quantification measure, frequency. Legend: general background range for transcript quantification measure; proteins detected in leaves; proteins not detected in leaves but for which there is a probeset on the ATH1 array]

6 November 2007 P. Zimmermann / ETH Zurich / 47 Protein detection and transcript abundance  [Figure; axis labels: transcript abundance measure (log2 signal), number of transcripts/proteins, fraction of “present” transcripts that were detected on the protein level. Legend: probe sets called “absent” on ATH1 (p >= 0.05); probe sets called “present” on ATH1 (p < 0.05); leaf proteins detected by peptide identification]
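The quantity plotted on this slide, the fraction of "present" transcripts that were also detected as proteins, can be computed per abundance bin roughly as follows. Only the present/absent rule (detection p < 0.05 versus p >= 0.05) comes from the slide; the bin width, the input layout and the toy numbers are hypothetical.

```python
from collections import defaultdict


def detection_fraction_by_bin(probesets, bin_width=2.0, alpha=0.05):
    """Fraction of 'present' probe sets with a detected protein, per log2-signal bin.

    probesets: iterable of (log2_signal, detection_p_value, protein_detected).
    Probe sets with p >= alpha are called 'absent' and are left out.
    """
    counts = defaultdict(lambda: [0, 0])  # bin start -> [present, present and detected]
    for log2_signal, p_value, detected in probesets:
        if p_value >= alpha:
            continue
        start = (log2_signal // bin_width) * bin_width
        counts[start][0] += 1
        counts[start][1] += int(detected)
    return {start: hit / total for start, (total, hit) in sorted(counts.items())}


# Hypothetical leaf data: (log2 signal, detection p-value, protein detected?)
leaf = [
    (8.5, 0.01, False),
    (9.0, 0.20, False),   # called "absent", ignored
    (9.5, 0.03, True),
    (12.5, 0.004, False),
    (13.2, 0.001, True),
    (13.8, 0.002, True),
]
print(detection_fraction_by_bin(leaf))
```

A rising fraction across bins would indicate that abundant transcripts are more likely to have a detectable protein, which is the relationship the slide examines.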

6 November 2007 P. Zimmermann / ETH Zurich / 48 GO analysis  GO Cellular Component  n = 221 specific probesets with average signal in leaves >13  Compared sets: ATH1 array (control); proteins not detected but transcripts have high abundance (>13)  Categories: cell wall, chloroplast, cytosol, ER, extracellular, Golgi apparatus, mitochondria, nucleus, other cellular components, other cytoplasmic components, other intracellular components, other membranes, plasma membrane, plastid, ribosome

6 November 2007 P. Zimmermann / ETH Zurich / 49 GO analysis  GO Biological Process  n = 221 specific probesets with average signal in leaves >13  Compared sets: ATH1 array (control); proteins not detected but transcripts have high abundance (>13)  Categories: cell organization and biogenesis, developmental processes, DNA or RNA metabolism, electron transport or energy pathways, other biological processes, other cellular processes, other metabolic processes, protein metabolism, response to abiotic or biotic stimulus, response to stress, signal transduction, transcription, transport

6 November 2007 P. Zimmermann / ETH Zurich / 50 GO analysis  GO Molecular Function  n = 221 specific probesets with average signal in leaves >13  Compared sets: ATH1 array (control); proteins not detected but transcripts have high abundance (>13)  Categories: DNA or RNA binding, hydrolase activity, kinase activity, nucleic acid binding, nucleotide binding, other binding, other enzyme activity, other molecular functions, protein binding, receptor binding or activity, structural molecule activity, transcription factor activity, transferase activity, transporter activity

6 November 2007 P. Zimmermann / ETH Zurich / 51 Data integration – pathway analysis  Measures: protein abundance, transcript abundance  Pathways: Carotenoid biosynthesis, Phenylpropanoid metabolism, Chlorophyll / Porphyrin metabolism, Riboflavin metabolism, Mevalonate biosynthesis

6 November 2007 P. Zimmermann / ETH Zurich / 52 Relative protein-to-transcript ratio  Calvin cycle  Fatty acid biosynthesis  Serine, glycine, cysteine  Starch and sucrose metabolism

6 November 2007 P. Zimmermann / ETH Zurich / 53 Relative protein-to-transcript ratio  Chlorophyll / Porphyrin metabolism  Fatty acid biosynthesis  Glycolysis / Gluconeogenesis  Purine metabolism  Pyrimidine metabolism

6 November 2007 P. Zimmermann / ETH Zurich / 54 Proteomic and transcriptomic biomarkers  “Root-specific” expression  Search by scoring the proteomic dataset  Search by scoring the Genevestigator dataset

6 November 2007 P. Zimmermann / ETH Zurich / 55 Proteomic and transcriptomic biomarkers  Search by scoring the proteomic dataset  Search by scoring the Genevestigator dataset

6 November 2007 P. Zimmermann / ETH Zurich / 56 Presentation flow  Gene networks – biological context  Microarray compendium: how, and what for?  Meta-profile analysis: concepts and validation  Genevestigator ® V3  Data integration  Summary & conclusion

6 November 2007 P. Zimmermann / ETH Zurich / 57 Summary and conclusions  Biological networks: importance of the biological context  Meta-profiles: context-driven analysis  Biological validation of meta-profiles and clusters  Genevestigator – a tool for biologists!  Data integration: challenging biological complexity

6 November 2007 P. Zimmermann / ETH Zurich / 58 Experimental context?  Organism?  Data type?  Modes of interactions?  Network dynamics?  Reproducibility?

6 November 2007 P. Zimmermann / ETH Zurich / 59 Acknowledgements  ETH Zurich  Prof. Gruissem  Developer Team:  Tomas Hruz, Oliver Laule, Stefan Bleuler, Philip Zimmermann  Gabor Szabo, Frans Wessendorp, Lukas Oertle, Dominique Dümmler, Matthias Hirsch-Hoffmann

6 November 2007 P. Zimmermann / ETH Zurich / 60 Thanks for your attention!