1 Developing Semantic Pathway Alignment Algorithms for Systems Biology Jonas Gamalielsson 2006-09-06.

2 Systems Biology
- Study the behaviour of complex biological systems
- Consider the interaction of all cellular/molecular parts
- Often studied over time
- Goal: develop models for system understanding
- Powerful computational tools are required
- Organisation: "Life's complexity pyramid" (Oltvai & Barabasi, 2002), from base to top:
  - genes, mRNA, proteins & metabolites
  - regulatory motifs & metabolic pathways
  - functional modules, which are hierarchically clustered
  - higher order networks, revealing functional modules

3 Thesis aim
- To develop semantic pathway alignment algorithms for systems biology
- Three related algorithms:
  - GOTEM (GO-based regulatory TEMplates)
  - GOSAP (GO-based Semantic Alignment of biological Pathways)
  - EGOSAP (Evolutionary GO-based Semantic Alignment of biological Pathways)

4 Gene Ontology (GO)
[Figure: the GO graph. The root Gene_Ontology has three sub-graphs: molecular_function (e.g. catalytic activity, transporter activity), biological_process (e.g. cellular process, development) and cellular_component (e.g. cell, extracellular). Gene products G1-G4 are annotated to terms in each of the function, process and component sub-graphs. Gx = gene product.]

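5 Semantic similarity
EXAMPLE: Resnik (1995)
- ms(A,B) = D (the minimum subsumer of A and B)
- p_ms(A,B) = p_D = 0.10
- SS(A,B) = -log2(p_ms(A,B)) = -log2(0.10) = 3.32
[Figure: is-a graph containing the nodes D, B, C and A, labelled with the probabilities p=0.10, p=0.03, p=0.07 and p=0.01. Note: not all nodes are shown in the graph.]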
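The Resnik calculation on this slide can be sketched in a few lines of Python. This is only an illustration: the `PARENTS` edges, the `root` node and the assignment of 0.03, 0.07 and 0.01 to B, C and A are assumptions made for the example (the slide only states p_D = 0.10 and ms(A,B) = D), and the function names are hypothetical.

```python
import math

# Hypothetical is-a graph (child -> parents). Edges are assumed for illustration.
PARENTS = {"A": ["C"], "B": ["D"], "C": ["D"], "D": ["root"], "root": []}
# Term probabilities; only p(D) = 0.10 is confirmed by the slide text.
PROB = {"A": 0.01, "B": 0.03, "C": 0.07, "D": 0.10, "root": 1.0}


def ancestors(term):
    """Return the term itself plus every term reachable over is-a edges."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def minimum_subsumer(a, b):
    """Shared ancestor with the lowest probability, i.e. the most informative one."""
    common = ancestors(a) & ancestors(b)
    return min(common, key=lambda term: PROB[term])


def resnik_similarity(a, b):
    """Semantic similarity SS(a, b) = -log2(p of the minimum subsumer)."""
    return -math.log2(PROB[minimum_subsumer(a, b)])
```

With these assumed edges, `minimum_subsumer("A", "B")` is `"D"` and `resnik_similarity("A", "B")` rounds to 3.32, matching the value on the slide.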

6 GOTEM: background 1(2)
- Highly desirable to derive gene regulatory networks using gene expression data
- Reverse engineering (RE) algorithms derive a model (set of rules) that fits the data
- Examples: Boolean networks, neural networks, Bayesian networks
- Limitations of RE algorithms:
  - Many derived model networks can fit the same data
  - Few derived networks are actually biologically feasible
  - RE algorithms do not distinguish between biologically plausible and implausible networks
- Reduce search space?

7 GOTEM: background 2(2)
- We propose GOTEM: GO-based regulatory TEMplates [1,2]
- Contribution:
  - Means to distinguish between plausible and implausible networks
  - GOTEM generalises knowledge about gene products using the molecular function part of Gene Ontology
  - Binary semantic templates encoding general knowledge of regulation are derived from documented pathways and used to assess the biological plausibility of regulatory hypotheses
[1] Gamalielsson, J., Olsson, B., Nilsson, P. (2005). A Gene Ontology based Method for Assessing the Biological Plausibility of Regulatory Hypotheses. Technical report, HS-IKI-TR-05-004, University of Skövde, Sweden.
[2] Gamalielsson, J., Nilsson, P., Olsson, B. (2006). A GO-based Method for Assessing the Biological Plausibility of Regulatory Hypotheses. In Proceedings of the 2nd International Workshop on Bioinformatics Research and Applications (IWBRA 2006), Reading, Great Britain (May 2006).

8 GOTEM
[Figure: GOTEM workflow diagram. Data/information boxes: GO, annotation databases, model pathway databases, enriched GO graph, binary relations, templates, regulatory hypotheses, scored & ranked hypotheses. Method/algorithm boxes: GO term probability calculation, extract binary pathway relations, template generation, hypothesis assessment.]

9 GOTEM: example
Generation
- Binary pathway relations:
  RAD24 [act] MEC3
  SWI4 [expr] CLN1
  SWI4 [expr] CLN2
  SWI6 [expr] CLN1
  SWI6 [expr] CLN2
  CLN1 [phos] SIC1
  CLN2 [phos] SIC1
  CDC28 [phos] SIC1
- Templates:
  T1: GO:0003689 [act] GO:0003677
  T2: GO:0003689 [act] GO:0003676
  T3: GO:0003689 [act] GO:0005488
  T4: GO:0003689 [act] GO:0003674
  T5: GO:0003677 [act] GO:0003677
  T6: GO:0003677 [act] GO:0003676
  T7: GO:0003677 [act] GO:0005488
- GO-score(Tx) = -log2((p(GOID_LHS) + p(GOID_RHS)) / 2)
Assessment
- Hypotheses:
  RAD24 [?] MEC3
  CLN1 [?] SWI4
  MBP1 [?] CLN2
- Top-matching templates (TM1) shown on the slide:
  TM1: GO:0003689 [act] GO:0003677 (GO-score=6.80)
  TM1: GO:0003674 [exp] GO:0003674 (GO-score=0)
  TM1: GO:0003700 [exp] GO:0016538 (GO-score=5.88)
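The GO-score formula on this slide can be written as a small Python function. This is a sketch under assumptions: the function names and the probability values in the usage note are hypothetical, and the ranking helper is one plausible way to pick a top-matching template, not necessarily how GOTEM does it.

```python
import math


def go_score(p_lhs, p_rhs):
    """GO-score(Tx) = -log2((p(GOID_LHS) + p(GOID_RHS)) / 2).

    p_lhs and p_rhs are the probabilities of the GO terms on the left-hand
    and right-hand side of a template. Rarer (more specific) terms give a
    higher score.
    """
    return -math.log2((p_lhs + p_rhs) / 2)


def best_template(candidates):
    """Pick the highest-scoring template from (name, p_lhs, p_rhs) tuples.

    Hypothetical helper: assumes the candidate templates matching a
    hypothesis have already been collected.
    """
    return max(candidates, key=lambda c: go_score(c[1], c[2]))
```

A template whose two terms both have probability 1 scores 0, which is consistent with the slide's `GO:0003674 [exp] GO:0003674 (GO-score=0)` entry, since GO:0003674 is the molecular_function root term. With made-up probabilities, `go_score(0.25, 0.25)` gives 2.0.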

10 GOTEM: results
- Test:
  - Templates created from KEGG S. cerevisiae cell cycle
  - Reverse engineered hypotheses from microarray gene expression data
  - Assess how well templates can separate true positive interactions from false positive ones
- Results:
  - Method can filter out a large proportion of implausible hypotheses
  - Hence, improves specificity of network reconstruction

11 GOSAP: background 1(2)
- Large base of biological pathways
- Need for pathway analysis methods:
  - Inter-species comparisons
  - Intra-species comparisons
  - Assess hypothetical pathways
- Limitations of related work:
  - Previous efforts on metabolic pathways
  - Little work on approximate matching by semantic similarity
  - EC hierarchy used before, which only covers the molecular function of enzymes

12 GOSAP: background 2(2)
- We propose GOSAP: GO-based Semantic Alignment of biological Pathways [3,4]
- Contribution:
  - GO has not been used before for semantic pathway alignment
  - GOSAP generalises about any kind of gene product using GO, not only enzymes
  - Richer semantic description of gene products by combining the function, process and component ontologies of GO in similarity calculations
[3] Gamalielsson, J., Olsson, B. (2005). GOSAP: Gene Ontology Based Semantic Alignment of Biological Pathways. Technical report, HS-IKI-TR-05-005, University of Skövde, Sweden.
[4] Gamalielsson, J., Olsson, B. (200x). GOSAP: GO-based Semantic Alignment of Biological Pathways. Manuscript in preparation.

13 GOSAP
[Figure: GOSAP workflow diagram. Data/information boxes: GO graph, organism annotation databases, enriched GO graph, model pathway database, model paths, query pathway database, query paths, parameter settings, scored & ranked path alignments. Procedure/algorithm boxes: GO term probability calculation, extraction of super-paths, path alignment.]

14 GOSAP: example
Path extraction
- Only super-paths, extracted by a depth-first based algorithm, e.g.:
  1. SWI4 [e]> CLN1
  2. SWI4 [e]> CLN2 [p]> SIC1
  3. MBP1 [e]> CLB5 [p]> CDC6
Path alignment
- Query paths:
  1. SWI4 [?]> CLN2
  2. MBP1 [?]> CLN1 [?]> CDC6
- Model paths (aligned against the query paths):
  1. SWI4 [e]> CLN1
  2. SWI4 [e]> CLN2 [p]> SIC1
  3. MBP1 [e]> CLB5 [p]> CDC6
Example alignment
  Q: FAR1 ?> SIC1 (GAP) CLN2 ?> SIC1
  M: FAR1 i> CLN1 p> SWI6 e> CLN2 p> SIC1
  F: GO:0004861 > GO:0019207 (GAP) GO:0016538 > GO:0019210
  P: GO:0007050|GO:0045786 > GO:0000079 (GAP) GO:0000320|GO:0000321 > GO:0000079
  C: GO:0005634 > GO:0005634 (GAP) GO:0005634 > GO:0005634
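The path-extraction step can be illustrated with a short depth-first search in Python. The slide does not define "super-path" precisely; the sketch below assumes it means a path that starts at a node with no incoming relation and is extended until it cannot be extended further, so that no extracted path is a prefix of another. The function name and the edge-triple representation are hypothetical.

```python
def super_paths(edges):
    """Enumerate maximal paths in a directed pathway graph by depth-first search.

    edges: list of (source, relation, target) triples, e.g. ("SWI4", "e", "CLN1").
    Returns paths as flat lists: [node, relation, node, relation, node, ...].
    Assumption: paths start at nodes without incoming edges; a graph that
    consists only of cycles therefore yields no paths.
    """
    graph, targets = {}, set()
    for source, relation, target in edges:
        graph.setdefault(source, []).append((relation, target))
        targets.add(target)
    roots = [node for node in graph if node not in targets]
    paths = []

    def dfs(node, path, visited):
        extended = False
        for relation, successor in graph.get(node, []):
            if successor not in visited:  # never revisit a node on this path
                extended = True
                dfs(successor, path + [relation, successor], visited | {successor})
        if not extended:  # dead end: the current path is maximal
            paths.append(path)

    for root in roots:
        dfs(root, [root], {root})
    return paths
```

Given the relations `SWI4 [e]> CLN1`, `SWI4 [e]> CLN2`, `CLN2 [p]> SIC1`, `MBP1 [e]> CLB5` and `CLB5 [p]> CDC6`, this returns the three model paths listed on the slide.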

15 GOSAP: results
- Test:
  - Model pathways: KEGG S. cerevisiae cell cycle, metabolic pathways
  - Query pathways: reverse engineered (RE) regulatory pathways, KEGG MAPK, metabolic pathways
  - Assess if GOSAP can find significant alignments of biological interest
- Results:
  - Method is able to:
    - detect significant alignments between RE paths and model paths, and between different metabolic pathways
    - suggest missing gene products in query paths
  - Combined ontologies resulted in significant alignments when molecular function alone did not

16 EGOSAP: background 1(2)
- Large base of biological pathways and microarray gene expression data
- Sometimes only hypothetical sets of gene products are known
- Highly desirable to derive interactions between gene products
- Limitations of related work:
  - Previous efforts merely map genes onto known pathways by identity
  - No work on approximate matching by semantic similarity
  - Related methods do not attempt to assemble hypothetical paths using a query set of gene products

17 EGOSAP: background 2(2)
- We propose EGOSAP: Evolutionary GO-based Semantic Alignment of biological Pathways [5]
- Contribution:
  - GO has not been used before for semantic pathway alignment
  - GOSAP generalises about any kind of gene product using GO, not only enzymes
  - Richer semantic description of gene products by combining the function, process and component ontologies of GO in similarity calculations
  - Hypothetical paths are assembled using an evolutionary algorithm and a query set of gene products
[5] Gamalielsson, J., Corne, D. W., Olsson, B. (200x). EGOSAP: Evolutionary Gene Ontology Based Semantic Alignment of Biological Pathways. Manuscript in preparation.

18 EGOSAP
[Figure: EGOSAP workflow diagram. Data/information boxes: GO graph, organism annotation databases, enriched GO graph, model pathway database, model paths, query set of gene products, parameter settings, path alignments. Procedure/algorithm boxes: GO term probability calculation, path extraction, evolution of path alignments.]

19 EGOSAP: example
Evolutionary algorithm:
  t <- 0
  initialise P(t)
  evaluate P(t)
  while (not term-cond) do
  begin
    t <- t+1
    select P(t) from P(t-1)
    alter P(t)
    evaluate P(t)
    apply elitism to P(t)
  end
- P(t): a set of gene product permutations initialised from the query alphabet
- evaluate: calculate fitness, i.e. the semantic similarity score between the model path and each evolved path in P(t)
- select: tournament selection
- alter: partially mapped crossover, mutation
Example alignment (fitness=0.73, p=0.01):
  Query (mouse): MEF2C > NR2F6 > NRBF2 > AFG3L2 > TRIM28
  Model (yeast): SWI6 > SWI4 > NDD1 > ACE2 > SFL1
  Function: GO:0003713 > GO:0003700 > GO:0016563 > GO:0008237 > GO:0016564
  Process: GO:0006366 > GO:0007049 > GO:0006357 > GO:0006508 > GO:0000122
  Component: GO:0005634 > GO:0005634 > GO:0005634 > GO:0016021 > GO:0005694
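The evolutionary loop on this slide can be sketched as runnable Python. This is a minimal illustration, not the EGOSAP implementation: the parameter names and defaults, the swap mutation and the way elitism is applied are assumptions, and `toy_fitness` is a hypothetical stand-in for the semantic similarity score between an evolved path and the model path.

```python
import random


def pmx(p1, p2, a, b):
    """Partially mapped crossover: copy p1[a..b], fill the rest from p2."""
    n = len(p1)
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    segment = set(p1[a:b + 1])
    for i in range(a, b + 1):
        gene = p2[i]
        if gene in segment:
            continue
        pos = i
        while a <= pos <= b:  # follow the mapping until we leave the segment
            pos = p2.index(p1[pos])
        child[pos] = gene
    for i in range(n):  # remaining positions are inherited from p2
        if child[i] is None:
            child[i] = p2[i]
    return child


def toy_fitness(path, target):
    """Hypothetical fitness: fraction of positions where path matches target."""
    return sum(x == y for x, y in zip(path, target)) / len(target)


def evolve(query, fitness, pop_size=30, generations=50, tournament=3,
           p_mut=0.2, seed=0):
    """Evolve permutations of the query gene products (list of >= 2 names)."""
    rng = random.Random(seed)
    # initialise P(0) with random permutations of the query alphabet
    pop = [rng.sample(query, len(query)) for _ in range(pop_size)]
    best = max(pop, key=fitness)  # evaluate P(0)
    for _ in range(generations):
        new_pop = []
        while len(new_pop) < pop_size:
            # select: tournament selection of two parents
            p1 = max(rng.sample(pop, tournament), key=fitness)
            p2 = max(rng.sample(pop, tournament), key=fitness)
            # alter: partially mapped crossover, then optional swap mutation
            a, b = sorted(rng.sample(range(len(query)), 2))
            child = pmx(p1, p2, a, b)
            if rng.random() < p_mut:
                i, j = rng.sample(range(len(child)), 2)
                child[i], child[j] = child[j], child[i]
            new_pop.append(child)
        new_pop[0] = best  # elitism: keep the best individual so far
        pop = new_pop
        best = max(pop, key=fitness)  # evaluate P(t)
    return best
```

A call such as `evolve(query, lambda path: toy_fitness(path, query))` returns a permutation of `query`. Because the best individual is carried over each generation, the best fitness never decreases. In a real setting the fitness function would score an evolved path against a model path using GO-based semantic similarity.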

20 EGOSAP: results
- Test:
  - Model pathway: S. cerevisiae regulatory chain motifs
  - Query set: differentially expressed genes for transgenic and knock-out mice
  - Assess if EGOSAP can evolve significant alignments (of biological interest)
- Results:
  - Method is able to detect significant alignments between evolved paths and model paths
  - As with GOSAP, combined ontologies resulted in significant alignments when molecular function alone did not

21 Conclusions
- Three methods for semantic analysis of biological pathways have been developed
- The methods:
  - assess the biological plausibility of derived pathways
  - compare different pathways for semantic similarities
  - evolve hypothetical pathways similar to model pathways
- Methods are novel
- Methods are believed to be useful to biologists

22 Write-up schedule
- September 2006: thesis contributions, thesis skeleton, set of chapters, draft of GOSAP paper
- October 2006: submission of GOSAP paper, redrafts of earlier material, set of new chapters, draft of EGOSAP paper
- November 2006: submission of EGOSAP paper, nearly complete thesis draft
- December 2006 - February 2007: continual refinement of thesis
- March 2007: submission of thesis

