1
An Ontology for Protein-Protein Interaction Data
Karen Jantz
CIS Honors Project
December 7, 2006
2
Overview
o Problem Statement
o Objectives
o Approach
o Background
o Methodology
o Evaluation
o Demonstration
o Conclusion
3
Problem Statement
o Several sources for protein-protein interaction data
  o Different schemata
  o Different purposes
  o Different strengths/weaknesses
4
Objectives
o Unify the data
o Enable data mining
o Evaluate reliability of data across data sources
o Gain new information about the entire data set
o Enable others to easily add other data sources to the set
5
Approach: ontology
o ontology – n. 1. that which exists (philosophy) 2. that which is represented (artificial intelligence)
o A descriptive data model
o Defines the entities and relationships within a domain
o Based upon data
o Human-readable
6
Approach: ontology
o Data integration
  o Enables simultaneous querying across multiple databases
o Data transformation
  o Enables interchange between database formats
o Data mining
  o Enables reasoning and learning over the entire data set
7
Background: Data Sources
o DIP (Jing Xia)
  o Database of Interacting Proteins
  o Most reliable data set
o BIND (Abhijit Erande, Aaron Schoenhofer)
  o Biomolecular Interaction Network Database
  o Very large data set
  o Contains interactions, molecular complexes, and pathways
8
Background: Data Sources
o MINT
  o Molecular INTeraction database
  o Experimentally verified protein interactions
  o Evaluates confidence level
o IntAct
  o Not limited to binary interactions
  o Allows user submissions
o MIPS CYGD
  o Munich Information Center for Protein Sequences: Comprehensive Yeast Genome Database
  o Limited to yeast
  o Focuses on sequencing
9
Background: Tools
o Protégé
  o Open-source project
  o Graphical ontology editor
  o Interacts with OWL Reasoner
  o Detailed API for modifying ontologies programmatically
10
Background: Tools
o Prompt
  o A Protégé plugin
  o Enables ontology mapping
  o Enables ontology comparison
11
Background: Related Work
o PSI-MI
  o Controlled vocabulary for PPI data
  o Not a proposed database structure
  o Decreases the strength of information
  o Helpful in defining relationships and keys
12
Methodology: Overview
[Diagram; components as labeled on the slide]
o Data sources: DIP, BIND, MIPS, MINT, IntAct
o Unified Ontology
o Unified Data Set
o Web Interface
o Example queries
  o Q: What interactions have been observed with protein A?
  o Q: What experiments give evidence for a given interaction?
13
Methodology: Design
o Review the singular database schemata and determine strengths/weaknesses
  o View data files
    o Native formats
    o PSI-MI formats
o Create a unified schema of the data sources
o Create the unified ontology in Protégé
o Create each singular database as a subset of the unified ontology
14
Protégé Screenshot
15
Methodology: Data Import
o DOMParser
  o Load data from XML
o Protégé-OWL API
  o Insert entities into singular databases
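
The import step can be illustrated with a minimal sketch. The project itself used a DOMParser together with the Protégé-OWL API; the Python below only mirrors the DOM-parsing half using the standard library, and the XML layout, element names, and identifiers are hypothetical simplifications rather than the actual PSI-MI schema.

```python
from xml.dom.minidom import parseString

# Hypothetical, heavily simplified interaction file. Real PSI-MI documents
# are namespaced and much richer than this.
SAMPLE_XML = """
<entrySet>
  <interaction id="i1">
    <participant ref="P12345"/>
    <participant ref="Q67890"/>
    <experiment method="two hybrid"/>
  </interaction>
  <interaction id="i2">
    <participant ref="P12345"/>
    <participant ref="O11111"/>
    <experiment method="coimmunoprecipitation"/>
  </interaction>
</entrySet>
"""


def load_interactions(xml_text):
    """Parse the XML into a DOM tree and return one dict per interaction."""
    dom = parseString(xml_text.strip())
    records = []
    for node in dom.getElementsByTagName("interaction"):
        participants = [p.getAttribute("ref")
                        for p in node.getElementsByTagName("participant")]
        methods = [e.getAttribute("method")
                   for e in node.getElementsByTagName("experiment")]
        records.append({
            "id": node.getAttribute("id"),
            "participants": participants,
            "methods": methods,
        })
    return records


interactions = load_interactions(SAMPLE_XML)
```

Each resulting dict stands in for an entity that would then be inserted into the corresponding singular database.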
16
Methodology: Transformation
o Use Prompt to create a mapping for each specific data source to the unified ontology
o Use Prompt mappings to insert individuals from each singular ontology into the unified model
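
As a rough illustration of what a per-source mapping does, the sketch below renames source-specific fields to unified ones. Prompt builds such mappings between ontologies interactively; this plain-dictionary version is only a stand-in, and every field name in it is an assumption made for the example.

```python
# Hypothetical mappings: source-specific field name -> unified field name.
FIELD_MAPPINGS = {
    "DIP": {
        "interactor_a": "protein_a",
        "interactor_b": "protein_b",
        "method": "detection_method",
    },
    "MINT": {
        "partner1": "protein_a",
        "partner2": "protein_b",
        "detection": "detection_method",
    },
}


def to_unified(source, record):
    """Translate one source record into the unified field names.

    Fields without a mapping are dropped; the originating source is kept
    so that reliability can later be compared across data sources.
    """
    mapping = FIELD_MAPPINGS[source]
    unified = {"source": source}
    for field, value in record.items():
        if field in mapping:
            unified[mapping[field]] = value
    return unified


dip_record = {"interactor_a": "P12345", "interactor_b": "Q67890",
              "method": "two hybrid", "internal_flag": "x"}
unified_record = to_unified("DIP", dip_record)
```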
17
Methodology: Transformation
o Duplicate data
  o Need to fill in attributes on existing records
  o Write ‘Algorithm Plugin’ for Prompt to determine when individuals are the same
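
One possible shape for the duplicate-handling logic is sketched below. The slides do not say how the plugin decides that two individuals are the same, so the rule used here (two records describe the same interaction when they name the same unordered pair of proteins) is purely an assumption, as are the field names.

```python
def interaction_key(record):
    # Assumed identity rule: the unordered pair of participating proteins.
    return frozenset((record["protein_a"], record["protein_b"]))


def merge_records(records):
    """Merge duplicate interactions and fill in attributes that are missing."""
    merged = {}
    for record in records:
        target = merged.setdefault(interaction_key(record), {"sources": []})
        if record["source"] not in target["sources"]:
            target["sources"].append(record["source"])
        for field, value in record.items():
            if field == "source":
                continue
            # Fill gaps on the existing record; never overwrite a known value.
            if target.get(field) is None and value is not None:
                target[field] = value
    return list(merged.values())


records = [
    {"source": "DIP", "protein_a": "P12345", "protein_b": "Q67890",
     "detection_method": None},
    {"source": "MINT", "protein_a": "Q67890", "protein_b": "P12345",
     "detection_method": "two hybrid"},
    {"source": "DIP", "protein_a": "P12345", "protein_b": "O11111",
     "detection_method": "coimmunoprecipitation"},
]
merged = merge_records(records)
```

Keeping the list of contributing sources on each merged record is a design choice made here so that agreement between databases remains visible after merging.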
18
Prompt Screenshot - Mapping
19
Methodology: Query Interface
o Export Protégé data into MySQL
o Web interface for collecting data
o Working with domain experts to determine useful views, queries
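
The two example questions from the methodology overview could be answered with queries along these lines. This is a sketch only: it uses Python's built-in sqlite3 in place of MySQL so that it runs without a server, and the two-table schema, column names, and sample rows are hypothetical, not the project's exported schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE interaction (
    id        INTEGER PRIMARY KEY,
    protein_a TEXT NOT NULL,
    protein_b TEXT NOT NULL
);
CREATE TABLE evidence (
    interaction_id INTEGER NOT NULL REFERENCES interaction(id),
    source         TEXT NOT NULL,
    method         TEXT NOT NULL
);
""")
conn.executemany("INSERT INTO interaction VALUES (?, ?, ?)", [
    (1, "P12345", "Q67890"),
    (2, "O11111", "P12345"),
    (3, "Q67890", "O22222"),
])
conn.executemany("INSERT INTO evidence VALUES (?, ?, ?)", [
    (1, "DIP", "two hybrid"),
    (1, "MINT", "coimmunoprecipitation"),
    (2, "BIND", "two hybrid"),
])


def partners_of(conn, protein):
    """Q: What interactions have been observed with a given protein?"""
    rows = conn.execute(
        """
        SELECT CASE WHEN protein_a = ? THEN protein_b ELSE protein_a END
        FROM interaction
        WHERE protein_a = ? OR protein_b = ?
        ORDER BY 1
        """,
        (protein, protein, protein),
    )
    return [row[0] for row in rows]


def evidence_for(conn, interaction_id):
    """Q: What experiments give evidence for a given interaction?"""
    rows = conn.execute(
        "SELECT source, method FROM evidence "
        "WHERE interaction_id = ? ORDER BY source",
        (interaction_id,),
    )
    return rows.fetchall()
```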
20
Evaluation
o Performance
  o Transformation time in Protégé
  o Query time for web interface
o Size
  o Minimize redundancy in data model
  o Minimize duplicate data
21
Evaluation
o Correctness
  o Domain experts: Dr. Brown, Dr. Wang
  o Maintain proper data relationships
o Utility
  o Enrich data
22
Evaluation
23
Demonstration
24
Future Work
o Complete transformations
o Import data
o Evaluate ontology
o Add other databases to model
25
Conclusions
o Adequate start
o Needs improvement, evolution, more data sources
o As the project matures, the ontology will be ready for use in the biological domain
o Will be able to more easily gain information about protein-protein interactions
26
References
o AAAI.org - AITopics: “Ontology”: http://www.aaai.org/AITopics/html/ontol.html
o Protégé: http://protege.stanford.edu/overview/protege-owl.html
o Prompt: http://protege.cim3.net/cgi-bin/wiki.pl?Prompt
o PSI-MI: http://psidev.sourceforge.net/mi/xml/doc/user
27
References
o BIND: http://www.bind.ca
o DIP: http://www.dip.doe-mbi.ucla.edu
o IntAct: http://www.ebi.ac.uk/intact/site/
o MINT: http://mint.bio.uniroma2.it/mint/Welcome.do
o MIPS: http://mips.gsf.de/genre/proj/yeast
28
Q & A