Published by Basil Rice. Modified over 8 years ago.
1 Gastrointestinal systems biology and the challenge of integrating biological background knowledge
Astrid Lægreid
Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
2 Systems biology of the normal and diseased gastrointestinal system
[Diagram labels: biomedical problem; data analysis; modeling; hypothesis generation; screening; genome-wide; system-wide; hypothesis testing; high throughput / single gene]
3 Systems biology of the normal and diseased gastrointestinal system
[Diagram labels: biomedical problem; hypergastrinemia; gastric carcinoma; gastric carcinoids; gastrin-mediated gene expression; modulation by CREB/CREM/ATF]
4 Systems biology of the normal and diseased gastrointestinal system
[Figure labels: biomedical problem; gastrin; samples; time axis in hours (tick values fused in the transcript as 104824); chronic; transient; gastrin-mediated gene expression]
5 Systems biology of the normal and diseased gastrointestinal system
[Diagram labels: biomedical problem; hypergastrinemia; gastric carcinoma; gastric carcinoids; gastrin-mediated gene expression; modulation by CREB/CREM/ATF; regulators; effectors; diagnosis; prognosis; treatment]
6 Systems biology of the normal and diseased gastrointestinal system
- model for classification of gastric cancer
- modeling gene function with rule based methods
- modeling signal transduction from first principles
- retrieving information from databases, literature and ontologies
- modeling biological background information using case based reasoning
[Diagram labels: data analysis; modeling; hypothesis generation]
7 Model for classification of gastric cancer
- Nørsett K, Lægreid A, Midelfart H, Yadetie F, Falkmer S, Grønbeck J, Waldum HL, Komorowski J, Sandvik AK. Gene Expression based Classification of Gastric Carcinoma. Cancer Letters, 210:227-237, 2004.
- Midelfart H, Komorowski J, Nørsett K, Yadetie F, Sandvik AK, Lægreid A. Learning Rough Set Classifiers from Gene Expression and Clinical Data. Fundamenta Informaticae, 53:155-183, 2002.
8 Use of Gene Ontology in models predicting gene function from gene expression time profiles
Data: Iyer et al. The Transcriptional Program in the Response of Human Fibroblasts to Serum. Science, 283:83, 1999.
- ~500 genes
- ~200 with unknown function
9 Example rule:
0 - 4 (Increasing) AND 6 - 10 (Decreasing) AND 14 - 18 (Constant) => GO (cell proliferation)
1. Mine functional classes from an ontology
2. Extract features for learning
3. Induce minimal decision rules using rough sets
4. Predict the function of unknown genes using the rules
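The rule on this slide can be read as a set of trend templates matched against an expression time profile. The following is a minimal sketch, assuming log-ratio measurements at hypothetical sampling times and a hypothetical tolerance `TOL` for what counts as constant. It illustrates only feature extraction (step 2) and applying an already induced rule (step 4); the rough set induction of step 3 is not shown, and the function names are invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical tolerance below which a change counts as "Constant";
# this value does not come from the slides.
TOL = 0.3

Condition = Tuple[Tuple[int, int], str]   # ((start, end), trend label)
Rule = Tuple[List[Condition], str]        # (conditions, predicted GO class)

def trend(profile: Dict[int, float], start: int, end: int, tol: float = TOL) -> str:
    """Label the change in expression between two time points."""
    delta = profile[end] - profile[start]
    if delta > tol:
        return "Increasing"
    if delta < -tol:
        return "Decreasing"
    return "Constant"

# The rule shown on the slide, written as interval templates.
RULE: Rule = (
    [((0, 4), "Increasing"), ((6, 10), "Decreasing"), ((14, 18), "Constant")],
    "cell proliferation",
)

def matches(profile: Dict[int, float], rule: Rule) -> bool:
    """True if every interval of the rule shows the required trend."""
    conditions, _ = rule
    return all(trend(profile, s, e) == label for (s, e), label in conditions)

def predict(profile: Dict[int, float], rules: List[Rule]) -> List[str]:
    """Return the GO classes of all matching rules (a gene may get several)."""
    return [rule[1] for rule in rules if matches(profile, rule)]

# Made-up profile: rises early, falls back, then stays flat.
example = {0: 0.0, 4: 1.2, 6: 1.1, 10: 0.2, 14: 0.1, 18: 0.05}
print(predict(example, [RULE]))
```

Returning a list rather than a single label keeps the sketch compatible with a model that assigns multiple functions per gene.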
10 Conclusions
The model
- incorporates background biological knowledge
- predicts multiple functions per gene
- can provide possible new functions of the known genes
- can provide hypotheses about the function of unknown genes
Experimental work needs to be done to confirm our predictions.
References:
- Lægreid A, Hvidsten T, Midelfart H, Komorowski J, Sandvik AK. Predicting Gene Ontology Biological Process from Temporal Gene Expression Patterns. Genome Research, 13:965-979, 2003.
- Hvidsten TR, Lægreid A, Komorowski J. Learning rule-based models from gene expression time profiles annotated using Gene Ontology. Bioinformatics, 19:1116-23, 2003.
11 How to improve models for prediction of biological roles of genes/proteins?
Challenges:
More training examples
- more genes/proteins
- more measurements per gene/protein (time points, cell types, tissues, states, ...)
- more annotations (GO, sequence, protein structure, cell biology, physiology, pathology, ...)
Improved modeling strategies
- improved strategies to combine data from many different domains
12 Modeling Signal Transduction from first principles
13 Modeling Signal Transduction from first principles
[Diagram labels: Towards Evolvable Hardware Modeling; Search for models; Evaluate/run model]
Gunnar Tufte. Development of Digital Circuits on a Virtual Sblock FPGA. PhD thesis, NTNU, 2004.
14 How to improve signal transduction models?
Challenges:
More signal transduction data
- more genes/proteins
- more annotations (GO, sequence, protein structure, cell biology, physiology, pathology, ...)
- more measurements per gene/protein (time points, cell types, tissues, states, ...)
Improved modeling strategies
15 Systems biology of the normal and diseased gastrointestinal system
- model for classification of gastric cancer
- modeling gene function with rule based methods
- modeling signal transduction from first principles
- retrieving information from databases, literature and ontologies
[Diagram labels: data analysis; modeling; hypothesis generation]
16 Retrieval and management of biological background information from public databases
[Diagram: eGOn and the NMC Annotation Database]
- Input: UniGene Cluster IDs, GenBank Acc. Nr., Clone IDs, Affymetrix IDs
- Databases: Local Gene Annotation Database, Gene Ontology, LocusLink / Entrez Gene, UniGene, HomoloGene, SwissProt, NMC Annotation Database
- Application: eGOn, statistical tests, editable GO tree
- Output: Gene Annotations, file export
17 Retrieval and management of biological background information from public databases
[Diagram as on the previous slide, with the label "NMC" added]
18 Retrieval of biological background information from unstructured sources using natural language processing
- Tveit A, Sætre R, Steigedal TS, Lægreid A. ProtChew: Automatic Extraction of Protein Names from Biomedical Literature. Proceedings of the International Workshop on Biomedical Data Engineering, BMDE2005, Tokyo.
- Sætre R, Tveit A, Steigedal TS, Lægreid A. Semantic Annotation of Biomedical Literature using Google. Proceedings of the First International Workshop on Datamining and Bioinformatics, DMBIO2005, Singapore.
19-22 Retrieval of biological background information from unstructured sources using natural language processing: example output
23 New Gene Ontology structures for improved biological reasoning
Original Gene Ontology (terms listed from general to specific)
MOLECULAR FUNCTION
- transcription regulator activity
- transcription termination factor activity
- RNA polymerase I, II and III transcription termination factor activity (three terms)
BIOLOGICAL PROCESS
- cellular process
- cellular physiological process
- cellular metabolism
- nucleobase, nucleoside, nucleotide and nucleic acid metabolism
- transcription
- transcription, DNA dependent
- transcription from RNA polymerase I, II and III promoter (three terms)
- transcription termination from RNA polymerase I, II and III promoter (three terms)
CELLULAR COMPONENT
- cell
- intracellular
- intracellular organelle
- intracellular membrane-bound organelle
- nucleus
24 New Gene Ontology structures for improved biological reasoning
The Second Gene Ontology Layer offers relations between
- molecular function and biological process
- molecular function and cellular component
Example terms:
- MOLECULAR FUNCTION: RNA polymerase I, II and III transcription termination factor activity
- BIOLOGICAL PROCESS: transcription termination from RNA polymerase I, II and III promoter
- CELLULAR COMPONENT: nucleus
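One way to read the Second Layer is as a lookup from molecular function terms to related biological process and cellular component terms, so that new annotations can be derived from existing molecular function annotations. The following is a minimal sketch under stated assumptions: the three function and process terms are paired one-to-one by polymerase, and each function is linked to the nucleus. The slide groups these terms, but the transcript does not preserve the individual links, and the gene name, dictionary layout and function name are hypothetical.

```python
from typing import Dict, List, Set

# Assumed Second Layer links for the example terms on the slide.
# The one-to-one pairing by polymerase is an assumption of this sketch.
SECOND_LAYER: Dict[str, Dict[str, List[str]]] = {
    f"RNA polymerase {p} transcription termination factor activity": {
        "biological_process": [
            f"transcription termination from RNA polymerase {p} promoter"
        ],
        "cellular_component": ["nucleus"],
    }
    for p in ("I", "II", "III")
}

def derive_annotations(
    mf_annotations: Dict[str, Set[str]],
) -> Dict[str, Dict[str, Set[str]]]:
    """Derive process and component annotations from function annotations."""
    derived: Dict[str, Dict[str, Set[str]]] = {}
    for gene, mf_terms in mf_annotations.items():
        bp: Set[str] = set()
        cc: Set[str] = set()
        for term in mf_terms:
            links = SECOND_LAYER.get(term)
            if links is None:
                continue  # no Second Layer relation known for this term
            bp.update(links["biological_process"])
            cc.update(links["cellular_component"])
        derived[gene] = {"biological_process": bp, "cellular_component": cc}
    return derived

# Hypothetical gene carrying one molecular function annotation.
genes = {"geneA": {"RNA polymerase II transcription termination factor activity"}}
print(derive_annotations(genes))
```

Terms without a Second Layer entry are skipped rather than guessed, so the derived annotations only ever complement the existing set.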
25 New Gene Ontology structures for improved biological reasoning
Second Layer contributes to a more complete and powerful biological knowledge network suitable for e.g. automatic reasoning.
[Diagram: the molecular function, biological process and cellular component branches of the Gene Ontology for the transcription termination example, from transcription regulator activity, cellular process and cell down to the RNA polymerase I, II and III terms and nucleus]
26 New Gene Ontology structures for improved biological reasoning
Second Gene Ontology Layer complements existing annotation sets
- data set: 4223 genes differentially expressed in fibroblast serum response
- 7589 publicly available molecular function annotations
- new annotations from the Second Layer: biological process 2623, cellular component 2112
[Diagram: example terms. MOLECULAR FUNCTION: RNA polymerase I, II and III transcription termination factor activity. BIOLOGICAL PROCESS: transcription termination from RNA polymerase I, II and III promoter. CELLULAR COMPONENT: nucleus]
27 New Gene Ontology structures for improved biological reasoning
Second Gene Ontology Layer complements existing annotation sets
28 How to improve models for prediction of biological roles of genes/proteins?
Challenges:
More training examples
- more genes/proteins
- more measurements per gene/protein (time points, cell types, tissues, states, ...)
- more annotations (GO, sequence, protein structure, cell biology, physiology, pathology, ...)
Improved modeling strategies
- improved strategies to combine data from many different domains
29 Modeling biological background information using case based reasoning
Kusnierczyk W, Aamodt A, Lægreid A. Knowledge-Based Support for Smart Pathway Building Tools. Proceedings of the International Conference on Case-Based Reasoning 2005, in press.
30 Modeling biological background information using case based reasoning
Kusnierczyk W, Aamodt A, Lægreid A. Knowledge-Based Support for Smart Pathway Building Tools. Proceedings of the International Conference on Case-Based Reasoning 2005, in press.
PathwayAssist networks can be used to generate cases for the case-based reasoning (CBR) tool.
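The slide does not say how cases are represented or compared. As a minimal sketch of the retrieve step of case-based reasoning, assume each network is reduced to a set of directed interaction edges and that similarity is the Jaccard index. Both choices are assumptions of this sketch, and the case names and edges are invented for illustration; they are not PathwayAssist output.

```python
from typing import Dict, FrozenSet, Tuple

Edge = Tuple[str, str]     # (source node, target node)
Case = FrozenSet[Edge]     # a network reduced to its set of edges

def jaccard(a: Case, b: Case) -> float:
    """Shared edges divided by all edges; 1.0 for two empty networks."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def retrieve(query: Case, case_base: Dict[str, Case]) -> Tuple[str, float]:
    """Return the name and similarity of the stored case closest to the query."""
    best = max(case_base, key=lambda name: jaccard(query, case_base[name]))
    return best, jaccard(query, case_base[best])

# Invented case base with abstract node names.
case_base: Dict[str, Case] = {
    "case_1": frozenset({("A", "B"), ("B", "C"), ("C", "D")}),
    "case_2": frozenset({("A", "E"), ("E", "F")}),
}
query: Case = frozenset({("A", "B"), ("B", "C")})

name, score = retrieve(query, case_base)
print(name, round(score, 3))
```

A real tool would likely weight edges by interaction type or evidence; set overlap is used here only because it is the simplest similarity that works on network-shaped cases.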
31 Arne K. Sandvik Helge L. Waldum Liv Thommesen Torunn Bruland Ola Ween Fekadu Yadetie Eva Hofsli Kristin Nørsett Kristine Misund Tonje S. Steigedal Norwegian Microarray Consortium Vidar Beisvåg Berit Doseth Eitrem Hallgeir Bergum Frode Jünge Lars Jølsum Vladimir Yankovski www.microarray.no Jan Komorowski Torgeir Hvidsten Herman Midelfart Mette Langaas Turid Follestad Bjørn Alsberg Arnar Flatberg Lars Giskehaug Einar Ryeng Agnar Aamodt Waclaw Kusnierczyk Kjell Bratbergsengen Heri Ramampiaro Yan Hua Chen Tore Amble Rune Sætre Torulf Mollestad Henrik Tveit Simen Myhre Sonja Ylving Berit Johansen Katarina Jørgensen Pauline Haddow Gunnar Tufte