

1 Semi-Automatic Semantic Annotation for Hidden-Web Tables
Cui Tao & David W. Embley
Data Extraction Research Group, Department of Computer Science, Brigham Young University
Supported by NSF

2 Semantic Annotation (www.deg.byu.edu)
• The Hidden Web:
  • Hidden behind forms
  • Hard to query
[figure callout: "cdk-4"]

3 Semantic Annotation
• The Hidden Web:
  • Hidden behind forms
  • Hard to query
Example query: find the protein and amino-acid information for gene "cdk-4"

4 Semantic Annotation
• The Hidden Web:
  • Hidden behind forms
  • Hard to query
• Semantic annotation
  • Machine-"understandable"
  • Publicly accessible

5 System Overview
• Initial semantic annotation
  • Manually annotate a sample page
  • With respect to a selected ontology
• Table interpretation
  • Automatic
  • Tables from hidden web pages
• Final semantic annotation
  • Automatic
  • Annotate interpreted tables

6 Initial Semantic Annotation
• SMORE: Semantic Markup, Ontology and RDF Editor [Maryland Information and Network Dynamics Lab]


8 Table Interpretation
• Table interpretation
  • Locate labels and values
  • Pair each label with its value
  • Remember the path
• TISP – Table Interpretation by Sibling Pages
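
The three steps on this slide (locate, pair, remember the path) can be sketched in Python. This is a minimal illustration, not the TISP implementation: it assumes the table has already been reduced to rows of cell strings, that labels sit in column 0 with values in column 1, and the function name `pair_labels_and_values` and the sample rows are hypothetical.

```python
from typing import List, Tuple

Pair = Tuple[str, str, Tuple[int, int]]


def pair_labels_and_values(rows: List[List[str]]) -> List[Pair]:
    """Pair each label with the value to its right and remember where the value sits.

    Assumption (not from the slides): labels are in column 0, values in column 1.
    """
    pairs: List[Pair] = []
    for r, row in enumerate(rows):
        if len(row) < 2:
            continue  # a single-cell row cannot hold a label-value pair
        label = row[0].strip().rstrip(":")
        value = row[1].strip()
        pairs.append((label, value, (r, 1)))  # (row, column) stands in for the path
    return pairs


# Placeholder rows; only the gene name "cdk-4" comes from the slides.
rows = [["Gene:", "cdk-4"], ["Species:", "placeholder species"], ["header only"]]
pairs = pair_labels_and_values(rows)
```

Keeping the position next to each pair is what lets the same pattern be replayed on other pages with the same layout.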

9 TISP

10 Interpretation Technique: Sibling Page Comparison [figure callout: Same]

11 Interpretation Technique: Sibling Page Comparison [figure callout: Almost Same]

12 Interpretation Technique: Sibling Page Comparison [figure callouts: Different, Same]
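
The idea behind sibling page comparison can be sketched as follows: cells that stay the same across pages generated from the same template are likely labels, and cells that differ are likely values. This is a simplified illustration under the assumption that both tables have identical shape; the "almost same" case on the slides would need a tolerant structural match that is not shown here. `classify_cells` and the sample tables are hypothetical.

```python
from typing import List, Tuple

Table = List[List[str]]
Position = Tuple[int, int]


def classify_cells(page_a: Table, page_b: Table) -> Tuple[List[Position], List[Position]]:
    """Return (label_positions, value_positions) for two same-shaped sibling tables."""
    labels: List[Position] = []
    values: List[Position] = []
    for r, (row_a, row_b) in enumerate(zip(page_a, page_b)):
        for c, (cell_a, cell_b) in enumerate(zip(row_a, row_b)):
            if cell_a.strip() == cell_b.strip():
                labels.append((r, c))  # unchanged across siblings: likely a label
            else:
                values.append((r, c))  # varies across siblings: likely a value
    return labels, values


# Placeholder sibling tables; only "cdk-4" comes from the slides.
page_a = [["Gene", "cdk-4"], ["Status", "value-A"]]
page_b = [["Gene", "gene-B"], ["Status", "value-B"]]
labels, values = classify_cells(page_a, page_b)
```

A value that happens to be identical on both pages would be misread as a label by this sketch, so comparing more than two sibling pages is a natural refinement.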

13 Interpretation Technique: Sibling Page Comparison
Structure Pattern of a Table
Label Path = Identification.Gene model(s).Gene Model
XPath = html[1]/…/table[3]/tr[1]/td[2]/table[1]/tr[6]/td[2]/table[1]/tr[2]/td[1]
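
A positional path of the kind shown on this slide can be derived by walking the element tree and counting same-named siblings. The sketch below uses Python's standard `xml.etree.ElementTree`, which only accepts well-formed markup, so real hidden-web HTML would likely need a more tolerant parser first. The function name `positional_paths` and the sample snippet are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def positional_paths(root: ET.Element) -> Dict[str, str]:
    """Map each leaf cell's text to a positional path such as table[1]/tr[1]/td[2]."""
    paths: Dict[str, str] = {}

    def walk(node: ET.Element, prefix: str) -> None:
        counts: Dict[str, int] = {}
        for child in node:
            # XPath positions are 1-based and counted per tag name.
            counts[child.tag] = counts.get(child.tag, 0) + 1
            path = f"{prefix}/{child.tag}[{counts[child.tag]}]"
            text = (child.text or "").strip()
            if len(child) == 0 and text:
                paths[text] = path
            walk(child, path)

    walk(root, f"{root.tag}[1]")
    return paths


# Hypothetical, well-formed stand-in for one cell pair of a real page.
snippet = "<table><tr><td>Gene Model</td><td>placeholder-value</td></tr></table>"
paths = positional_paths(ET.fromstring(snippet))

# The dotted label path joins the labels met on the way down the nested tables.
label_path = ".".join(["Identification", "Gene model(s)", "Gene Model"])
```

Here `paths["placeholder-value"]` is the location to revisit on sibling pages, and `label_path` reproduces the dotted form from the slide.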

14 Annotation [figure callout: Protein Name]

15 Annotation – Split [figure callout: Nucleotide Size]

16 Annotation – Merge [figure callout: Protein Information]

17 Annotation – Union [figure callout: Name]

18 Annotation – Selection [figure callout: Molecular Function]
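
The slides name four mapping operations (Split, Merge, Union, Selection) without defining them. The sketch below shows one plausible reading, offered purely as an assumption: split breaks one table value into several ontology values, merge combines several table values into one, union collects values from several labels under one concept, and selection keeps only the values that belong to a concept. All function names and sample inputs are hypothetical.

```python
from typing import Callable, List


def split(value: str, sep: str = ",") -> List[str]:
    """Break one table value into several parts (assumed meaning of Split)."""
    return [part.strip() for part in value.split(sep) if part.strip()]


def merge(values: List[str], sep: str = " ") -> str:
    """Combine several table values into one (assumed meaning of Merge)."""
    return sep.join(v.strip() for v in values if v.strip())


def union(*value_lists: List[str]) -> List[str]:
    """Collect values from several labels, dropping duplicates but keeping order."""
    seen = set()
    out: List[str] = []
    for values in value_lists:
        for v in values:
            if v not in seen:
                seen.add(v)
                out.append(v)
    return out


def select(values: List[str], keep: Callable[[str], bool]) -> List[str]:
    """Keep only the values that satisfy a predicate (assumed meaning of Selection)."""
    return [v for v in values if keep(v)]


# Placeholder inputs, not data from the slides.
parts = split("part-1, part-2")
joined = merge(["first", "second"])
names = union(["name-a", "name-b"], ["name-b", "name-c"])
chosen = select(["keep:x", "drop:y"], lambda v: v.startswith("keep:"))
```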

19 Generated RDF Annotation
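
The generated RDF itself is not reproduced in the transcript. As a rough illustration of what emitting annotations could look like, the sketch below serializes label-value pairs as N-Triples lines using only string formatting. The namespace `http://example.org/annotation#`, the property name, and the values are hypothetical, and the sketch assumes the caller passes local names that are already safe to use inside an IRI.

```python
from typing import Iterable, Tuple

BASE = "http://example.org/annotation#"  # hypothetical namespace, not from the slides


def escape_literal(text: str) -> str:
    """Escape the characters that must not appear raw in an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(subject: str, pairs: Iterable[Tuple[str, str]]) -> str:
    """Serialize (property, value) pairs about one subject as N-Triples lines."""
    lines = []
    for prop, value in pairs:
        lines.append(f'<{BASE}{subject}> <{BASE}{prop}> "{escape_literal(value)}" .')
    return "\n".join(lines)


# Placeholder property and value; only the gene name comes from the slides.
doc = to_ntriples("gene_cdk-4", [("proteinName", "placeholder protein")])
```

A production pipeline would more likely use an RDF library and the properties of the selected ontology; plain strings are used here only to keep the example dependency-free.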

20 Querying Annotated Data
Example query: find the protein and amino-acid information for gene "cdk-4"
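
Once the data is annotated, the example query reduces to a pattern match over triples. The sketch below is an assumption-laden stand-in for a real RDF query engine: it matches triples held in a Python list, and the identifiers `gene:cdk-4`, `hasProtein`, and `hasAminoAcids`, along with every value, are placeholders rather than results from the presentation.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every component that is not None."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical annotated data; values are placeholders, not real results.
triples = [
    ("gene:cdk-4", "hasProtein", "PLACEHOLDER_PROTEIN"),
    ("gene:cdk-4", "hasAminoAcids", "PLACEHOLDER_AMINO_ACIDS"),
    ("gene:other", "hasProtein", "PLACEHOLDER_OTHER"),
]

wanted = {"hasProtein", "hasAminoAcids"}
answer = {p: o for _, p, o in match(triples, s="gene:cdk-4") if p in wanted}
```

The point of the annotation step is that this lookup no longer depends on filling in a form on the source site.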

21 Summary
• Semi-automatic semantic annotation for hidden-web tables
• Facilitates large-scale annotation of the web

