Towards Linguistically Grounded Ontologies
Paul Buitelaar, Philipp Cimiano, Peter Haase, and Michael Sintek
Proceedings of the 6th European Semantic Web Conference (ESWC’09), Heraklion, Greece, May/June 2009
pl 11/27/2012 Buitelaar et al. 1
1 Introduction
Ontologies need linguistic grounding because:
– Easier for human developers
– Automatic information extraction is easier
– Helps in “verbalizing” an ontology
RDFS, OWL, and SKOS (W3C standards) are not adequate
Present a unified model, LexInfo, based on:
– LingInfo
– LexOnto
– Lexical Markup Framework (LMF), an ISO standard
Basis for future Semantic Web standardization
2 Motivation
Separation between Linguistic and Ontological Level
Flexible Coupling of the Ontological and Language Systems
Subcategorization and Predicate-Argument Structure
Why Related Work is Not Enough
Separation of Levels
rdfs:label is not good enough: cat, cats, Katze, Katzen
Fails to capture linguistic relationships
Linguistic data does not belong in domain ontology
Capture in a separate linguistic model: a lexicon
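The separation argued for on this slide can be sketched in a few lines of Python. This is only an illustration, not the LexInfo data model: the class names `WordForm` and `LexicalEntry`, the `lookup` helper, and the `http://example.org/` URI are all hypothetical. The point is that inflected forms in several languages live in a lexicon and merely point at one ontology class, rather than being stored as flat labels on that class.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical URI for the ontology class; not taken from the slides.
CAT = "http://example.org/ontology#Cat"


@dataclass(frozen=True)
class WordForm:
    written: str  # surface string, e.g. "cats"
    number: str   # "singular" or "plural"


@dataclass
class LexicalEntry:
    lemma: str
    language: str
    pos: str
    reference: str  # URI of the ontology element this entry denotes
    forms: List[WordForm] = field(default_factory=list)


# The lexicon is kept apart from the domain ontology and only refers to it.
lexicon = [
    LexicalEntry("cat", "en", "noun", CAT,
                 [WordForm("cat", "singular"), WordForm("cats", "plural")]),
    LexicalEntry("Katze", "de", "noun", CAT,
                 [WordForm("Katze", "singular"), WordForm("Katzen", "plural")]),
]


def lookup(surface: str) -> List[Tuple[str, str, str]]:
    """Return (ontology reference, language, number) for each matching form."""
    return [
        (entry.reference, entry.language, form.number)
        for entry in lexicon
        for form in entry.forms
        if form.written == surface
    ]
```

With plain labels, "cats" and "Katzen" would be four unrelated strings on one class; here `lookup("Katzen")` recovers the class together with the language and grammatical number of the form.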
Flexible Coupling of Layers
Subcategorization and Predicate Arguments
Part-of-speech information is essential:
– (Germany, capital, Berlin)
– capital is a noun
Need subcategorization frames:
– (Rhein, flowsThrough, Karlsruhe)
– flow is intransitive, requires through phrase, flow => flows
Must capture variation of expression:
– locatedAt: passes by, connects, goes through
Map verb arguments to predicate arguments:
– [The A8: subject] connects [Karlsruhe: direct object] => (Karlsruhe, locatedAt, A8)
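The mapping from verb arguments to predicate arguments can be sketched as a small lookup table. This is a minimal sketch under assumptions: the `Frame` class, the slot names (`subject`, `direct_object`, `pp_through`), and the `to_triple` function are hypothetical and do not come from LexInfo or LMF. Only the two example sentences and their triples are taken from the slide.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Frame:
    """A hypothetical subcategorization frame tied to an ontology property."""
    lemma: str           # verb lemma, e.g. "flow"
    predicate: str       # ontology property the verb expresses
    subject_slot: str    # syntactic slot that fills the triple's subject
    object_slot: str     # syntactic slot that fills the triple's object


FRAMES: Dict[str, Frame] = {
    # "The Rhein flows through Karlsruhe": intransitive verb plus a through-phrase.
    "flow": Frame("flow", "flowsThrough", "subject", "pp_through"),
    # "The A8 connects Karlsruhe": the direct object becomes the triple's subject.
    "connect": Frame("connect", "locatedAt", "direct_object", "subject"),
}


def to_triple(lemma: str, args: Dict[str, str]) -> Tuple[str, str, str]:
    """Turn parsed verb arguments into a (subject, predicate, object) triple."""
    frame = FRAMES[lemma]
    return (args[frame.subject_slot], frame.predicate, args[frame.object_slot])
```

For example, `to_triple("connect", {"subject": "A8", "direct_object": "Karlsruhe"})` yields the slide's triple `("Karlsruhe", "locatedAt", "A8")`. The argument order is reversed relative to the sentence, which is why the frame has to record the slot mapping explicitly rather than assume subject-to-subject.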
Why Related Work is Not Enough
More expressive models are needed:
– Capture morphology separately
– Represent decomposition and linking of components
– Model complex linguistic patterns, e.g. subcategorization frames
– Specify meaning with respect to a domain ontology
– Clearly separate linguistic and ontological levels
SKOS, LMF, LexOnto, NLP frameworks, and LWF each fail to meet some of these requirements
3 Towards an Ontological and Linguistic Joint Model
Previous Work
– LingInfo: direct connection of linguistic information to classes and properties
– LexOnto: subcategorization frames and relation to properties
– Lexical Markup Framework (LMF): core package plus extensions for morphology, syntax, and semantics
The LexInfo Model
– built on LMF, integrates LingInfo and LexOnto models
The LexInfo Model
Req. 1: Morphology Relations
– Already done in LMF
Req. 2: Decomposition of Complex Terms
– ListOfComponents extends LMF morphology
– Make owl:Entity subclass of lmf:Sense
Req. 3: Subcategorization Frames
– Link lmf:SyntacticBehavior to lmf:PredicativeRepresentation
– Additional subclasses for LMF classes
Req. 4: Relate to Domain Ontologies
– Automatic by linking to domain ontologies
Req. 5: Separation Between Linguistics and Ontologies
– Fully separate, related by OWL2 meta-ontology
4 Conclusions
Language/knowledge interface too complex for RDFS/OWL/SKOS alone
LingInfo allows publishing reusable models
Other models fall short of requirements
LexInfo integrates LingInfo and LexOnto models using LMF as the “glue”
Ontologies and Java API available on Web
Intend to continue developing and working with the LMF working group
Basis for further standardization