Evidence networks for the analysis of biological systems Rainer Breitling IBLS – Molecular Plant Science group Bioinformatics Research Centre University.

1 Evidence networks for the analysis of biological systems Rainer Breitling IBLS – Molecular Plant Science group Bioinformatics Research Centre University of Glasgow, Scotland, UK

2 Background Datasets and evidence networks in post-genomic biology

3 Genomics Fully sequenced genomes (1995-2004): 18 archaea, 163 bacteria, 3 protozoa, 24 yeast species and fungi, 2 plants (Arabidopsis, rice), 2 insects (flies, honey bee), 2 worms (C. elegans, C. briggsae), 3 fish (fugu, puffer, zebrafish), chicken, cow, dog, mouse, rat, chimp, human; result: lots of “lists” of genes

4 Transcriptomics microarrays measure gene expression levels (mRNA concentrations); relative or absolute values in organisms, tissues, cells; produce gene lists (e.g., which genes are up-regulated by a disease, by drug treatment, in a certain tissue)

5 Proteomics 2D gels, liquid chromatography, and mass spectrometry measure protein concentrations in tissues, cells, organelles; detect chemical modifications and processing of proteins; produce lists of protein variants that differ among conditions

6 Metabolomics chromatography and mass spectrometry measure metabolite concentrations in tissues, cells, body fluids, cell culture medium; produce lists of affected metabolites

7 Evidence networks relate items (genes, proteins, metabolites) that “have something to do with each other” relationship is based on objective evidence represented as bipartite graphs –two classes of nodes: items and evidence –automated analysis of results possible –intuitive visualization and links to literature
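The bipartite representation described on this slide can be illustrated with a minimal Python sketch. The class name `EvidenceNetwork`, its methods, and the gene and evidence labels are all hypothetical and chosen only for illustration; the slides do not specify a data structure.

```python
from collections import defaultdict


class EvidenceNetwork:
    """Bipartite graph with two node classes: items and evidence.

    Illustrative sketch only; names are assumptions, not from the slides.
    """

    def __init__(self):
        self.items_by_evidence = defaultdict(set)
        self.evidence_by_item = defaultdict(set)

    def add(self, evidence, items):
        """Connect one evidence node to every item it supports."""
        for item in items:
            self.items_by_evidence[evidence].add(item)
            self.evidence_by_item[item].add(evidence)

    def neighbors(self, item):
        """Return all items sharing at least one evidence node with `item`."""
        related = set()
        for evidence in self.evidence_by_item.get(item, ()):
            related |= self.items_by_evidence[evidence]
        related.discard(item)
        return related


# Hypothetical genes annotated with two evidence nodes.
net = EvidenceNetwork()
net.add("tricarboxylic acid cycle", ["geneA", "geneB"])
net.add("glyoxylate cycle", ["geneB", "geneC"])
```

Projecting the bipartite graph onto the items (as `neighbors` does) gives the item-to-item network, while keeping the evidence nodes makes it possible to report why two items are linked.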

8 Types of evidence networks Relationship can be based on –physical neighborhood –phyletic pattern similarity –expressional correlation –biophysical similarity –chemical transformation –functional co-operation –literature co-citations

9 Types of evidence networks Relationship can be based on –physical neighborhood –phyletic pattern similarity –expressional correlation –biophysical similarity –chemical transformation –functional co-operation –literature co-citations
Example of phyletic pattern similarity: across the species codes A O M P K Z Y Q V D R L B C E F G H S N U J X I T W, the H+-ATPase subunits NtpA, NtpB, NtpD and NtpI (functional category C) all share the phyletic pattern aompkzy--d-l-----------it-
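As a rough illustration of how phyletic pattern similarity could be turned into an edge in an evidence network, the sketch below parses the pattern string shown for the H+-ATPase subunits and compares two patterns. The Jaccard index is an assumption made here for illustration (the slide does not name a similarity measure), and `OTHER` is a made-up pattern.

```python
# Species codes in the order shown on the slide.
SPECIES = "AOMPKZYQVDRLBCEFGHSNUJXITW"


def parse_pattern(pattern):
    """Return the set of species codes in which the gene family is present."""
    if len(pattern) != len(SPECIES):
        raise ValueError("pattern must have one character per species")
    return {code for code, ch in zip(SPECIES, pattern) if ch != "-"}


def jaccard(pattern_a, pattern_b):
    """Jaccard similarity of two phyletic patterns (assumed measure)."""
    a, b = parse_pattern(pattern_a), parse_pattern(pattern_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Pattern shared by NtpA, NtpB, NtpD and NtpI on the slide.
NTP = "aompkzy--d-l-----------it-"
# Hypothetical second pattern: present in the first seven species only.
OTHER = "aompkzy" + "-" * 19
```

Families whose similarity exceeds some chosen threshold would then be linked through a shared evidence node.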

15 What is the big picture? Graph-based iterative Group Analysis for the automated interpretation of biological datasets lists + graphs = understanding

16 What does this list mean?
Rank  Fold-Change  Gene Symbol  Gene Title
1     26.45        TNFAIP6      tumor necrosis factor, alpha-induced protein 6
2     25.79        THBS1        thrombospondin 1
3     23.08        SERPINE2     serine (or cysteine) proteinase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 2
4     21.5         PTX3         pentaxin-related gene, rapidly induced by IL-1 beta
5     18.82        THBS1        thrombospondin 1
6     16.68        CXCL10       chemokine (C-X-C motif) ligand 10
7     18.23        CCL4         chemokine (C-C motif) ligand 4
8     14.85        SOD2         superoxide dismutase 2, mitochondrial
9     13.62        IL1B         interleukin 1, beta
10    11.53        CCL20        chemokine (C-C motif) ligand 20
11    11.82        CCL3         chemokine (C-C motif) ligand 3
12    11.27        SOD2         superoxide dismutase 2, mitochondrial
13    10.89        GCH1         GTP cyclohydrolase 1 (dopa-responsive dystonia)
14    10.73        IL8          interleukin 8
15    9.98         ICAM1        intercellular adhesion molecule 1 (CD54), human rhinovirus receptor
16    9.97         SLC2A6       solute carrier family 2 (facilitated glucose transporter), member 6
17    8.36         BCL2A1       BCL2-related protein A1
18    7.33         TNFAIP2      tumor necrosis factor, alpha-induced protein 2
19    6.97         SERPINB2     serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 2
20    6.69         MAFB         v-maf musculoaponeurotic fibrosarcoma oncogene homolog B (avian)

17 iterative Group Analysis (iGA) iGA uses simple hypergeometric distribution to obtain p-values Breitling et al., BMC Bioinformatics, 2004, 5:34
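The slide only states that iGA obtains p-values from the hypergeometric distribution. A minimal sketch of that idea, under the assumption that the p-value is the one-sided tail probability of finding at least the observed number of group members among the top-ranked genes, minimized over all cut-off positions, might look as follows. The function names `hypergeom_tail` and `iga` are hypothetical, and no multiple-testing correction is included.

```python
from math import comb


def hypergeom_tail(total, group_size, drawn, observed):
    """P(X >= observed) when `drawn` of `total` genes are picked at random
    and `group_size` of the `total` genes belong to the group."""
    upper = min(group_size, drawn)
    hits = sum(
        comb(group_size, i) * comb(total - group_size, drawn - i)
        for i in range(observed, upper + 1)
    )
    return hits / comb(total, drawn)


def iga(ranked_genes, group):
    """Return (smallest p-value, list position at which it is reached).

    `ranked_genes` is ordered from most to least changed; `group` is any
    collection of gene names (e.g. one functional class).
    """
    total = len(ranked_genes)
    members = set(group) & set(ranked_genes)
    best_p, best_cut, seen = 1.0, 0, 0
    for position, gene in enumerate(ranked_genes, start=1):
        if gene in members:
            seen += 1
            p = hypergeom_tail(total, len(members), position, seen)
            if p < best_p:
                best_p, best_cut = p, position
    return best_p, best_cut


# Toy example: eight ranked genes, one group with three members.
RANKED = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
P_VALUE, CUTOFF = iga(RANKED, {"g1", "g2", "g7"})
```

Only positions occupied by a group member need to be evaluated, because the tail probability can only get worse at positions where no new member is added.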

18 Graph-based iGA Breitling et al., BMC Bioinformatics, 2004, 5:100

19 Graph-based iGA Step 1: build the network Breitling et al., BMC Bioinformatics, 2004, 5:100

20 Graph-based iGA Step 2: assign ranks to genes Breitling et al., BMC Bioinformatics, 2004, 5:100

21 Graph-based iGA Step 3: find local minima (p = 1/8 = 0.125, p = 2/8 = 0.25, p = 6/8 = 0.75) Breitling et al., BMC Bioinformatics, 2004, 5:100

22 Graph-based iGA Step 4: extend subgraph from minima (p = 1, p = 0.014, p = 0.018, p = 0.125) Breitling et al., BMC Bioinformatics, 2004, 5:100

23 Graph-based iGA Step 5: select p-value minimum (p = 1, p = 0.018, p = 0.125, p = 0.014) Breitling et al., BMC Bioinformatics, 2004, 5:100
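The five steps above can be sketched in Python as follows. This is a simplified illustration, not the published implementation: the subgraph is grown one node at a time by always adding the best-ranked neighbouring node, and the p-value of a subgraph with m members and worst rank r among N genes is assumed to be C(r, m) / C(N, m). That closed form is an inference made here because it reproduces the values shown on the slides for eight genes (1/8, 2/8, 6/8, 0.018, 0.014). The toy network, gene names, and function names are hypothetical.

```python
from math import comb


def subgraph_p(total, max_rank, members):
    """Assumed p-value: chance that `members` genes all rank <= max_rank."""
    return comb(max_rank, members) / comb(total, members)


def giga(adjacency, rank):
    """Return {local minimum: (best p-value, members of best subgraph)}."""
    total = len(rank)
    results = {}
    for seed, neighbours in adjacency.items():
        # Step 3: a local minimum has a better rank than all its neighbours.
        if any(rank[n] < rank[seed] for n in neighbours):
            continue
        subgraph, max_rank = {seed}, rank[seed]
        best_p, best_members = subgraph_p(total, max_rank, 1), {seed}
        while True:
            # Step 4: extend by the best-ranked node adjacent to the subgraph.
            frontier = {n for g in subgraph for n in adjacency[g]} - subgraph
            if not frontier:
                break
            nxt = min(frontier, key=rank.get)
            subgraph.add(nxt)
            max_rank = max(max_rank, rank[nxt])
            p = subgraph_p(total, max_rank, len(subgraph))
            # Step 5: remember the extension with the smallest p-value.
            if p < best_p:
                best_p, best_members = p, set(subgraph)
        results[seed] = (best_p, best_members)
    return results


# Steps 1 and 2 on a hypothetical eight-gene network; gene "gN" has rank N.
EDGES = [("g1", "g3"), ("g2", "g3"), ("g3", "g4"), ("g4", "g5"),
         ("g5", "g7"), ("g6", "g7"), ("g6", "g8")]
RANK = {f"g{i}": i for i in range(1, 9)}
ADJACENCY = {gene: set() for gene in RANK}
for a, b in EDGES:
    ADJACENCY[a].add(b)
    ADJACENCY[b].add(a)

RESULTS = giga(ADJACENCY, RANK)
```

In this toy network the local minima are g1, g2 and g6; starting from g1, the smallest p-value (1/70, about 0.014) is reached for the subgraph of the four best-ranked genes.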

24 Advantages of GiGA fast, unbiased and comprehensive analysis; assignment of statistical significance values to the interpretation; detection of significant changes even if data are too noisy to reliably detect changed genes; statistically meaningful interpretation even without replicate experiments; detection of patterns even for small absolute changes; flexible use of annotations + intuitive visualization

25 Example 1 Microarrays Gene expression changes during the yeast diauxic shift

26 Yeast diauxic shift study DeRisi et al. (1997) Science 278: 680-6

27 Yeast diauxic shift study Time points: 0h, 9.5h, 11.5h, 13.5h, 15.5h, 18.5h, 20.5h. UP: 6144 - purine base metabolism; 6099 - tricarboxylic acid cycle; 3773 - heat shock protein activity; 6099 - tricarboxylic acid cycle; 9277 - cell wall (sensu Fungi); 3773 - heat shock protein activity; 5749 - respiratory chain complex II (sensu Eukarya); 6099 - tricarboxylic acid cycle; 3773 - heat shock protein activity; 297 - spermine transporter activity; 6950 - response to stress; 6121 - oxidative phosphorylation, succinate to ubiquinone; 5977 - glycogen metabolism; 5749 - respiratory chain complex II (sensu Eukarya); 15846 - polyamine transport; 297 - spermine transporter activity; 8177 - succinate dehydrogenase (ubiquinone) activity; 6950 - response to stress; 6121 - oxidative phosphorylation, succinate to ubiquinone; 4373 - glycogen (starch) synthase activity; 3773 - heat shock protein activity; 4373 - glycogen (starch) synthase activity; 8177 - succinate dehydrogenase (ubiquinone) activity; 15846 - polyamine transport; 4373 - glycogen (starch) synthase activity; 4129 - cytochrome c oxidase activity; 6537 - glutamate biosynthesis; 5353 - fructose transporter activity; 7039 - vacuolar protein catabolism; 5751 - respiratory chain complex IV (sensu Eukarya); 6097 - glyoxylate cycle; 15578 - mannose transporter activity; 6950 - response to stress; 5749 - respiratory chain complex II (sensu Eukarya); 5750 - respiratory chain complex III (sensu Eukarya); 7039 - vacuolar protein catabolism; 4129 - cytochrome c oxidase activity; 6121 - oxidative phosphorylation, succinate to ubiquinone; 9060 - aerobic respiration; 8645 - hexose transport; 5751 - respiratory chain complex IV (sensu Eukarya); 8177 - succinate dehydrogenase (ubiquinone) activity; 4129 - cytochrome c oxidase activity

28 GiGA results – diauxic shift Down-regulated genes using GeneOntology-based network
locus    gene description ("anchor gene")                       p-value   members  max. rank
YHL015W  ribosomal protein S20                                  5.87E-86  39       48
YMR217W  GMP synthase                                           3.38E-13  9        172
YDR144C  aspartyl protease|related to Yap3p                     4.06E-08  6        242
YNL065W  multidrug resistance transporter                       4.02E-05  3        141
YLR062C                                                         6.41E-05  4        367
YGL225W  May regulate Golgi function and glycosylation in Golgi 1.12E-04  4        422
YPR074C  transketolase 1                                        1.44E-04  4        449
total genes measured in network: 4087.

29 small ribosomal subunit, large ribosomal subunit, nucleolar rRNA processing, translational elongation

30 GiGA case study – diauxic shift Up-regulated genes using metabolic network
locus    gene description                                            p-value   members  max. rank
YER065C  isocitrate lyase                                            4.96E-53  39       54
YGR088W  catalase T                                                  3.09E-10  11       106
YFR015C  glycogen synthase (UDP-glucose-starch glucosyltransferase)  2.08E-04  3        45
YJR073C  unsaturated phospholipid N-methyltransferase                3.85E-04  5        156
YDR001C  neutral trehalase                                           5.01E-04  3        60
YCR014C  DNA polymerase IV                                           5.44E-04  17       481
YIR038C  glutathione transferase                                     8.64E-04  5        183
total genes measured in network: 744.

31 glyoxylate cycle, citrate (TCA) cycle, oxidative phosphorylation (complex V), respiratory chain complex III, respiratory chain complex II

32 respiratory chain complex IV

33 Example 2 Metabolomics Changes in metabolic profiles in drug-treated trypanosomes

34 GiGA applied to metabolomics data Challenge: No annotation available Solution: Build evidence network based on hypothetical reactions between observed masses (=mass differences)
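One way to picture the mass-difference idea is the sketch below: two observed masses are linked whenever their difference matches the mass shift of a candidate reaction within a tolerance. The transformation list (standard monoisotopic mass shifts), the 0.002 Da tolerance, the function name, and all observed masses other than 257.1028 are assumptions for illustration; the slides do not say which reactions or tolerance were used.

```python
# Monoisotopic mass shifts of a few common transformations (in Da).
# Which transformations to include is an assumption of this sketch.
TRANSFORMATIONS = {
    "hydrogenation (H2)": 2.01565,
    "methylation (CH2)": 14.01565,
    "hydroxylation (O)": 15.99491,
    "hydration (H2O)": 18.01056,
}


def mass_difference_edges(masses, transformations=TRANSFORMATIONS, tol=0.002):
    """Link pairs of masses whose difference matches a known transformation.

    Returns a list of (lighter mass, heavier mass, transformation name).
    """
    ordered = sorted(masses)
    edges = []
    for i, light in enumerate(ordered):
        for heavy in ordered[i + 1:]:
            diff = heavy - light
            for name, delta in transformations.items():
                if abs(diff - delta) <= tol:
                    edges.append((light, heavy, name))
    return edges


# 257.1028 is the mass from the slides; the other masses are made up.
OBSERVED = [257.1028, 271.11845, 275.11336, 300.0]
EDGES = mass_difference_edges(OBSERVED)
```

The resulting edges play the role of the evidence links, so the same ranking-and-subgraph search used for genes can be applied to unannotated masses.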

35 Metabolite tree of mass 257.1028 (glycerylphosphorylcholine) 6 generations

36 Metabolite tree of mass 257.1028 4 generations

37 Metabolite tree of mass 257.1028 2 generations

38 Metabolite tree of mass 257.1028 colors indicate changes of metabolite signals compared to untreated samples after 60 min pentamidine (red = down, green = up)

39 GiGA metabolite trees for one experimental example

40 Choline tree found by GiGA (most significant subgraph, p < 10^-13) extracted from

41 Summary post-genomic technologies produce “lists”; neighborhood relationships yield “evidence networks” (graphs); lists + graphs = biological insights; GiGA graph analysis highlights and connects relevant areas in the “evidence network”

42 Acknowledgements Pawel Herzyk – Sir Henry Wellcome Functional Genomics Facility Anna Amtmann & Patrick Armengaud – IBLS Molecular Plant Science group Mike Barrett – IBLS Parasitology Research group FGF academic users: Wilhelmina Behan, Simone Boldt, Anna Casburn-Jones, Gillian Douce, Paul Everest, Michael Farthing, Heather Johnston, Walter Kolch, Peter O'Shaughnessy, Susan Pyne, Rosemary Smith, Hawys Williams

43 Contact Rainer Breitling Bioinformatics Research Centre Davidson Building A416 University of Glasgow, Scotland, UK R.Breitling@bio.gla.ac.uk http://www.brc.dcs.gla.ac.uk/~rb106x

