1 Ontology Generation Based on a User-Specified Ontology Seed
Cui Tao
Data Extraction Research Group
Department of Computer Science
Brigham Young University
Supported by NSF
www.deg.byu.edu
2 Introduction
Motivation:
  Traditional search engines: return documents
  Ontology-based data extraction: return information
Problem: Build an extraction ontology that meets users’ needs
Goal: Automatically build ontologies for users’ needs
3 Example
Example: a biologist is interested in information about large proteins in humans and their functions
Possible queries:
  Find proteins in humans that are >20 kDa
  Find all the proteins in humans that serve as receptors
  ...
Information sources: various online databases
  NCBI
  GeneCards
  The Gene Ontology
  GPM Proteomics Database
  …
4 Extraction Ontology
Molecular Weight
  Regular Expression: ^\d{1,5}(\.\d{1,2})?
  Unit: kilodaltons?|kdas?|kds?|das?|daltons?
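As a rough illustration of how such a recognizer could be applied, the sketch below combines the value pattern and the unit pattern from this slide into one Python regular expression. The function names, the dropped leading `^` (so the pattern can match inside longer text), and the reading of the unit alternation as `kds?|das?` are assumptions made for this example, not details taken from the presentation.

```python
import re

# Value pattern from the slide, without the leading ^ so that it can
# match inside a longer string (an assumption for this sketch).
VALUE = r"\d{1,5}(?:\.\d{1,2})?"

# Unit alternation; "kds?|das?" is an assumed reading of the slide text.
UNIT = r"kilodaltons?|kdas?|kds?|daltons?|das?"

MOLECULAR_WEIGHT = re.compile(rf"\b({VALUE})\s*({UNIT})\b", re.IGNORECASE)


def extract_molecular_weight(text):
    """Return (value, unit) for the first molecular weight found, else None."""
    match = MOLECULAR_WEIGHT.search(text)
    if match is None:
        return None
    return float(match.group(1)), match.group(2).lower()


def to_kilodaltons(value, unit):
    """Normalize a value to kDa; units starting with 'k' are already kDa."""
    return value if unit.startswith("k") else value / 1000.0


value, unit = extract_molecular_weight("Molecular Weight 29175 Daltons")
is_large = to_kilodaltons(value, unit) > 20  # the ">20 kDa" query from slide 3
```

Normalizing to one unit before comparing is what lets a query such as ">20 kDa" be answered from a source that reports the weight in daltons.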
5 User Interface
Select a title for the forms
6 User Interface
Binary Relationship: Name
Protein
  Name
7 User Interface
Binary Relationship: Molecular Weight
Protein
  Name
Protein
  Molecular weight
8 User Interface
N-ary Relationship
Chromosome number, Start, End, Orientation
Chromosome location: Chromosome number, Start, End, Orientation
9 User Interface
N-ary Relationship
GO: GO phrase, GO ID
GO ID, GO term
10 Overall Form
Protein
  Molecular Weight
  Name
  Chromosome location: Chromosome number, Start, End, Orientation
  GO: GO ID, GO term
11 Ontology View
Protein, Name, Molecular weight, Chromosome location, Chromosome number, Start, End, Orientation, GO, GO phrase, GO ID
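One way to picture the step from the form to the ontology view is as a small transformation from form fields to relationship sets. The sketch below is only an illustration: the `FormField` class, the `form_to_relationships` function, and the rule that each field yields a binary relationship with the form title while multi-column fields also yield an n-ary relationship are assumptions for this example, not the tool's actual algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class FormField:
    """One field of the user-built form (hypothetical structure)."""
    name: str
    columns: list = field(default_factory=list)  # empty for single-value fields


def form_to_relationships(title, fields):
    """Derive relationship sets from a form.

    Assumed rule: every field is related to the form title by a binary
    relationship; a field with columns additionally yields one n-ary
    relationship over the field and its columns.
    """
    relationships = []
    for f in fields:
        relationships.append((title, f.name))
        if f.columns:
            relationships.append((f.name, *f.columns))
    return relationships


protein_form = [
    FormField("Name"),
    FormField("Molecular weight"),
    FormField("Chromosome location",
              ["Chromosome number", "Start", "End", "Orientation"]),
    FormField("GO", ["GO ID", "GO term"]),
]

for rel in form_to_relationships("Protein", protein_form):
    print(" - ".join(rel))
```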
12 Fill in the Form
Protein
  Molecular Weight
  Name
  Chromosome location: Chromosome number, Start, End, Orientation
  GO: GO ID, GO term
13 Fill in the Form
Protein
  Molecular Weight: 29175 Daltons
  Name:
    14-3-3 protein epsilon
    Mitochondrial import stimulation factor L subunit
    Protein kinase C inhibitor protein-1
    KCIP-1
    14-3-3E
  Chromosome location:
    Chromosome number: 17
    Start: 1,250,267
    End: 1,194,558
    Orientation: minus
  GO:
    GO ID: GO:0019899, GO term: enzyme binding
    GO ID: GO:0019904, GO term: protein domain specific binding
14 Mapping
Name:
  14-3-3 protein epsilon
  Mitochondrial import stimulation factor L subunit
  Protein kinase C inhibitor protein-1
  KCIP-1
  14-3-3E
15 Mapping
Name:
  14-3-3 protein epsilon
  Mitochondrial import stimulation factor L subunit
  Protein kinase C inhibitor protein-1
  KCIP-1
  14-3-3E
16 Mapping
Name
17 Data Frame Generation
Choose from data frame library
  Data frames for basic values
    Numbers within different ranges
    Integers, floats, etc.
    Emails, phone numbers, addresses, etc.
  Domain-specific values (DNA sequences)
  Units
Build lexicon files
18 Data Frame Generation
Find the best-matching data frame from the library
Find the correct units
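A minimal sketch of what "best match" could mean: score each data frame in a small library by the fraction of the user's sample values it recognizes, and keep the highest scorer. The library contents, the names `LIBRARY` and `best_data_frame`, and the scoring rule are all assumptions for illustration; the presentation does not say how the match is computed.

```python
import re

# Hypothetical data frame library: name -> recognizer pattern.
LIBRARY = {
    "integer": r"\d+",
    "float": r"\d+\.\d+",
    "go_id": r"GO:\d{7}",
    "dna_sequence": r"[ACGT]+",
}


def best_data_frame(samples, library=LIBRARY):
    """Return (name, score) of the data frame matching the most samples.

    The score is the fraction of samples the pattern matches in full.
    Ties go to the entry listed first in the library.
    """
    if not samples:
        raise ValueError("at least one sample value is required")

    def score(name):
        recognizer = re.compile(library[name])
        hits = sum(1 for s in samples if recognizer.fullmatch(s))
        return hits / len(samples)

    best = max(library, key=score)
    return best, score(best)


print(best_data_frame(["GO:0019899", "GO:0019904"]))
print(best_data_frame(["17"]))
```

Here the sample values come from the filled-in form on slide 13, so a single completed form already gives the selection step something to score against.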
19 Build Lexicon Files
Name
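For a concept such as Name, which has no regular pattern, a lexicon can stand in for a regular expression. The sketch below assumes a lexicon is simply a list of known strings, seeded here with the names from the filled-in form on slide 13, and compiles it into a single case-insensitive matcher. The function name `build_lexicon_pattern` and the longest-first ordering are choices made for this example, not details from the presentation.

```python
import re


def build_lexicon_pattern(entries):
    """Compile lexicon entries into one case-insensitive alternation.

    Longer entries are tried first so that a short entry never wins over
    a longer entry that starts with the same characters.
    """
    ordered = sorted(set(entries), key=len, reverse=True)
    return re.compile("|".join(re.escape(e) for e in ordered), re.IGNORECASE)


# Names taken from the filled-in form on slide 13.
names = [
    "14-3-3 protein epsilon",
    "Mitochondrial import stimulation factor L subunit",
    "Protein kinase C inhibitor protein-1",
    "KCIP-1",
    "14-3-3E",
]

pattern = build_lexicon_pattern(names)
found = pattern.findall("KCIP-1 is also known as 14-3-3 protein epsilon.")
print(found)
```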
20 Contribution
Automatically generates ontologies depending on users’ requests
  Provides a tool for users to easily provide ontology seeds
  Automatically generates ontology views from ontology seeds
  Automatically maps ontology concepts to source databases