OWL-AA: Enriching OWL with Instance Recognition Semantics for Automated Semantic Annotation 2006 Spring Research Conference Yihong Ding
2 Semantic Web and Automated Semantic Annotation Semantic Web: the web containing machine-processable web data Semantic Annotation: adds formal metadata to web pages Metadata links data in a web page to defined concepts in an ontology Annotated data becomes machine-processable Annotation needs automation to be scalable
3 “Main Drawback” of Current Automated Semantic Annotation Problem: “post-processing and mapping of the IE [information extraction] results to an ontology” [Kiryakov 2004] Needs human intervention Decreases system automation and scalability Solution: “use ontolog[ies] more directly during the process of extraction” [Kiryakov 2004] Does work (as our ontology-based annotation shows) But …
4 A Hidden Problem: Compatible with Standards A solution should be compatible with semantic web standards OWL (Web Ontology Language): standard Solutions must be OWL-compatible Current Solution OSMX (Object-oriented Systems Model in XML): not a standard, not OWL-compatible Declarative instance recognition semantics Needed by automated annotation process Lacking in OWL
5 Instance Recognition Semantics in Extraction Ontologies Instance recognition semantics: machine-processable recognizers of instances that belong to the extension of a concept in a specified domain. Examples in extraction ontologies External Representation Price: \d+|\d?\d?\d,\d\d\d Make: CarMake.lexicon Contextual Representation Context phrases (left, right), e.g. \$? Context keywords: e.g. price | obo | neg(\.|otiable)
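The recognizers on this slide can be read as executable patterns. The following is a minimal Python sketch of that reading, not the OSMX implementation: the function names, the 20-character context window, and the three-entry make list (a stand-in for CarMake.lexicon, whose contents the slides do not show) are assumptions made for illustration. The two price alternatives are reordered so that the comma form is tried first, since Python's `re` takes the first alternative that matches.

```python
import re

# External representation for Price, taken from the slide; the optional "$"
# is the slide's left context phrase. The comma form is listed first so that
# "12,500" is not cut short at "12".
PRICE_VALUE = re.compile(r"\$?(\d?\d?\d,\d\d\d|\d+)")

# Context keywords for Price, taken from the slide.
PRICE_KEYWORDS = re.compile(r"price|obo|neg(\.|otiable)", re.IGNORECASE)

# Hypothetical stand-in for CarMake.lexicon (actual contents not shown).
CAR_MAKES = {"ford", "honda", "toyota"}


def recognize_price(text, window=20):
    """Return number strings that look like prices.

    A number counts as a price if it carries the "$" left context or if a
    context keyword occurs within `window` characters of it (assumed rule).
    """
    hits = []
    for m in PRICE_VALUE.finditer(text):
        start = max(0, m.start() - window)
        context = text[start:m.end() + window]
        if m.group(0).startswith("$") or PRICE_KEYWORDS.search(context):
            hits.append(m.group(1))
    return hits


def recognize_make(text):
    """Return the words of `text` that appear in the make lexicon."""
    return [w for w in re.findall(r"[A-Za-z]+", text) if w.lower() in CAR_MAKES]
```

Applied to a car-ad string such as `"2001 Ford Taurus, $12,500 obo"`, `recognize_price` keeps the number with the dollar sign and skips the year, and `recognize_make` picks out `Ford`. The fixed character window is a crude proxy for the left and right context phrases; a real extraction ontology would declare these per concept.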
6 OWL: Lacks Instance Recognition Semantics In general, OWL Declares class, property, hierarchical relationship, restriction. Declares instantiations. Does not support declaration of “instance recognition” Consequently, Not enough declarative semantics in OWL directly useable by automated annotation Mixture of knowledge declaration and knowledge processing Domain experts must know program implementation; Or, program developers must be domain experts. No annotation integrity checking Taurus is legal, though it is incorrect; And, machines cannot catch this error.
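The integrity-checking point can be sketched in a few lines of Python. This assumes that the slide's example refers to "Taurus" (a car model) being asserted as an instance of a Make concept; the recognizer table, the lexicon contents, and the function name are hypothetical and stand in for whatever declarations an annotation system would actually load.

```python
# Hypothetical stand-in for a make lexicon (contents are not in the slides).
CAR_MAKES = {"ford", "honda", "toyota"}

# One recognizer per concept: a predicate that accepts valid instance strings.
RECOGNIZERS = {
    "Make": lambda value: value.lower() in CAR_MAKES,
}


def check_annotations(annotations):
    """Return the (concept, value) pairs rejected by the concept's recognizer.

    Concepts without a declared recognizer are not checked.
    """
    return [
        (concept, value)
        for concept, value in annotations
        if concept in RECOGNIZERS and not RECOGNIZERS[concept](value)
    ]
```

With only class and instance declarations, both `("Make", "Ford")` and `("Make", "Taurus")` would be accepted; once a recognizer is declared for the concept, the second pair can be flagged by a machine.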
7 OWL-AA (RDF Schema)
9 OWL-AA Declarations
10 Implementation Jena API converts OWL-AA ontologies to OSMX ontologies Use OSMX ontologies to do automated annotation
11 Conclusion OWL-AA is a way to extend OWL to provide for automated semantic annotation. OWL-AA overcomes the “main drawback” of automated semantic annotation. OWL-AA allows us to separate the creation of domain knowledge from the implementation of a processor to use domain knowledge for the purpose of annotating web pages. OWL-AA provides for annotation integrity checking.
12 Instantiation Declaration Declaration vs. Instantiation
13 Instance Recognition Semantics Machine-processable recognizers of instances that belong to the extension of a concept in a specified domain. IRecS of Left Concept: has line in eye IRecS of Right Concept: no line in eye