1
Semi-Automatic Semantic Annotation for Hidden-Web Tables. Cui Tao & David W. Embley, Data Extraction Research Group, Department of Computer Science, Brigham Young University. Supported by NSF.
2
www.deg.byu.edu Semantic Annotation. The Hidden Web: hidden behind forms; hard to query: “cdk-4”
3
Semantic Annotation. The Hidden Web: hidden behind forms; hard to query, e.g., to find the protein and amino-acid information for gene “cdk-4”
4
Semantic Annotation. The Hidden Web: hidden behind forms; hard to query. Semantic annotation: machine-“understandable”; publicly accessible.
5
System Overview. Initial semantic annotation: manually annotate a sample page with respect to a selected ontology. Table interpretation: automatic; tables from hidden web pages. Final semantic annotation: automatic; annotate interpreted tables.
6
Initial Semantic Annotation. SMORE: Semantic Markup, Ontology and RDF Editor [Maryland Information and Network Dynamics Lab]
8
Table Interpretation: locate labels and values; pair each label with its value; remember the path. TISP: Table Interpretation by Sibling Pages.
9
TISP
10
Interpretation Technique: Sibling Page Comparison. Same.
11
Interpretation Technique: Sibling Page Comparison. Almost Same.
12
Interpretation Technique: Sibling Page Comparison. Different; Same.
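The Same / Almost Same / Different callouts suggest that cells whose text repeats across sibling pages act as labels, while cells whose text changes act as values. A minimal Python sketch of that comparison follows. It assumes the cells of two sibling pages have already been aligned position by position. The function name `classify_cells` and the sample cell texts (other than “cdk-4”) are hypothetical, and the sketch does not reproduce TISP itself.

```python
from typing import List, Tuple


def classify_cells(page_a: List[str], page_b: List[str]) -> List[Tuple[str, str]]:
    """Tag each aligned cell of page_a as 'label' (same text on both
    sibling pages) or 'value' (text differs between the pages)."""
    if len(page_a) != len(page_b):
        raise ValueError("sibling pages must have the same number of aligned cells")
    return [(a, "label" if a == b else "value") for a, b in zip(page_a, page_b)]


# Hypothetical cell texts from two sibling pages of the same hidden-web site.
page_a = ["Gene name", "cdk-4", "Species", "sample-species-1"]
page_b = ["Gene name", "example-gene", "Species", "sample-species-2"]

classified = classify_cells(page_a, page_b)
for text, kind in classified:
    print(f"{kind:5s} {text}")
```

A real comparison would also have to handle the "almost same" case, for example tables whose rows are added or dropped between pages; this sketch only covers strictly aligned cells.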
13
Interpretation Technique: Sibling Page Comparison. Structure Pattern of a Table. Label Path = Identification.Gene model(s).Gene Model; XPath = html[1]/…/table[3]/tr[1]/td[2]/table[1]/tr[6]/td[2]/table[1]/tr[2]/td[1]
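A structure pattern, as shown on this slide, pairs a label path with the XPath of the value cell, so the same value can be located on every sibling page. The sketch below illustrates the idea with Python's standard `xml.etree.ElementTree`, which supports a small XPath subset including positional predicates. The `StructurePattern` class, the `extract_value` helper, the tiny sample page, and its shortened XPath are all hypothetical; the XPath on the slide is elided and is not reproduced here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class StructurePattern:
    label_path: str  # dot-separated labels, outermost first
    xpath: str       # location of the value cell, relative to the <html> root


def extract_value(page: ET.Element, pattern: StructurePattern) -> str:
    """Return the text of the cell that the pattern's XPath points at."""
    cell = page.find(pattern.xpath)
    if cell is None:
        raise LookupError(f"no cell at {pattern.xpath}")
    return "".join(cell.itertext()).strip()


# Hypothetical, well-formed stand-in for one sibling page (nested tables).
SIBLING_PAGE = """<html><body>
<table>
  <tr><td>Identification</td><td>
    <table>
      <tr><td>Gene model(s)</td><td>
        <table>
          <tr><td>Gene Model</td></tr>
          <tr><td>hypothetical-model-1</td></tr>
        </table>
      </td></tr>
    </table>
  </td></tr>
</table>
</body></html>"""

pattern = StructurePattern(
    label_path="Identification.Gene model(s).Gene Model",
    xpath="body/table[1]/tr[1]/td[2]/table[1]/tr[1]/td[2]/table[1]/tr[2]/td[1]",
)

root = ET.fromstring(SIBLING_PAGE)
labels = pattern.label_path.split(".")
value = extract_value(root, pattern)
print(labels, "->", value)
```

Real hidden-web pages are rarely well-formed XML, so a production version would need an HTML parser in front of this step.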
14
Annotation: Protein Name
15
Annotation (Split): Nucleotide Size
16
Annotation (Merge): Protein Information
17
Annotation (Union): Name
18
Annotation (Selection): Molecular Function
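The slides name four mapping cases (split, merge, union, selection) without defining them. The sketch below shows one plausible reading as small Python helpers: split breaks one table value into several ontology values, merge combines several table values into one, union gathers values from several labels under one concept, and selection keeps only the values that belong to the concept. All function names, separators, and sample strings are assumptions made for illustration and are not taken from the authors' system.

```python
import re
from typing import Callable, Iterable, List


def split_value(value: str, separator_pattern: str) -> List[str]:
    """Split: one table value yields several ontology values."""
    return [part for part in re.split(separator_pattern, value) if part]


def merge_values(values: Iterable[str], joiner: str = " ") -> str:
    """Merge: several table values are combined into one ontology value."""
    return joiner.join(v for v in values if v)


def union_values(*columns: Iterable[str]) -> List[str]:
    """Union: values found under several labels all map to one concept.
    Duplicates are dropped and first-seen order is kept."""
    seen: List[str] = []
    for column in columns:
        for v in column:
            if v not in seen:
                seen.append(v)
    return seen


def select_values(values: Iterable[str], keep: Callable[[str], bool]) -> List[str]:
    """Selection: only the values accepted by `keep` map to the concept."""
    return [v for v in values if keep(v)]


# Hypothetical placeholder strings, not real gene data.
split_result = split_value("sizeA; sizeB", r"\s*;\s*")
merge_result = merge_values(["partA", "partB"])
union_result = union_values(["name1"], ["name2", "name1"])
select_result = select_values(
    ["function: x", "process: y"], lambda v: v.startswith("function:")
)
print(split_result, merge_result, union_result, select_result)
```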
19
Generated RDF Annotation
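The generated RDF itself is not included in this transcript. As a rough illustration of what such output could look like, the sketch below serializes label-value annotations as N-Triples lines using only the Python standard library. The namespaces under `http://example.org/`, the concept names `geneName` and `proteinName`, and the protein value are hypothetical; only the gene name “cdk-4” comes from the slides.

```python
from typing import List, Tuple

ONTO = "http://example.org/onto#"  # hypothetical ontology namespace
DATA = "http://example.org/gene/"  # hypothetical instance namespace


def literal(text: str) -> str:
    """Quote a plain string as an N-Triples literal (escapes \\ and ")."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(subject: str, annotations: List[Tuple[str, str]]) -> List[str]:
    """Emit one triple per (ontology concept, extracted value) pair."""
    return [
        f"<{subject}> <{ONTO}{concept}> {literal(value)} ."
        for concept, value in annotations
    ]


lines = to_ntriples(
    DATA + "cdk-4",
    [("geneName", "cdk-4"), ("proteinName", "hypothetical-protein")],
)
print("\n".join(lines))
```

The literal escaping here covers only backslashes and double quotes; values containing newlines or other control characters would need the remaining N-Triples escapes.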
20
Querying Annotated Data: find the protein and amino-acid information for gene “cdk-4”
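The slides do not show the query language used. As a stand-in, the sketch below answers the example question with plain triple-pattern matching in Python, which is the basic operation that RDF query languages build on. The predicate names, subject identifiers, and placeholder values are hypothetical; only the gene name “cdk-4” comes from the slides.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def protein_and_amino_acids(
    triples: List[Triple], gene_name: str
) -> List[Tuple[List[str], List[str]]]:
    """For every subject whose geneName equals gene_name, collect its
    protein values and its aminoAcids values."""
    results = []
    for gene, _, _ in match(triples, p="geneName", o=gene_name):
        proteins = [o for _, _, o in match(triples, s=gene, p="protein")]
        amino = [o for _, _, o in match(triples, s=gene, p="aminoAcids")]
        results.append((proteins, amino))
    return results


# Hypothetical annotated data; values are placeholders, not real biology.
TRIPLES: List[Triple] = [
    ("gene1", "geneName", "cdk-4"),
    ("gene1", "protein", "placeholder-protein"),
    ("gene1", "aminoAcids", "placeholder-amino-acid-info"),
    ("gene2", "geneName", "other-gene"),
    ("gene2", "protein", "other-protein"),
]

answer = protein_and_amino_acids(TRIPLES, "cdk-4")
print(answer)
```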
21
Summary: semi-automatic semantic annotation for hidden-web tables; facilitates large-scale annotation of the web.