Integrating lexical units, synsets and ontology in the Cornetto Database
Piek Vossen (1, 2), Isa Maks (1), Roxane Segers (1), Hennie van der Vliet (1)
1: Faculty of Arts, Vrije Universiteit Amsterdam
2: Irion Technologies, Delft

LREC, Marrakech 28-29-30 May 2008

Project Cornetto
Financed by:
- NTU Dutch Language Union
- STEVIN: Dutch-Flemish Research Programme for Dutch Language and Speech Technology (2004-2011)
Consortium partners:
- VUA (Vrije Universiteit Amsterdam, General Linguistics Department)
- UvA (University of Amsterdam, Informatics Institute)
- K.U. Leuven (Katholieke Universiteit Leuven, Department of Computer Science)
- Irion Technologies BV, Delft

Overview
- Goals of the project
- What's in the Cornetto database?
- Integrating the ontology: SUMO terms and new axioms

Goals of the Cornetto project
Cornetto: COmbinatorial Relational NEtwork voor Taal TOepassingen (Combinatorial Relational Network for Language Applications)
Goal: to develop a lexical semantic database for Dutch:
- 40K entries: the generic and central part of the language
- Rich horizontal and vertical semantic relations
- Combinatoric information
- Ontological information

Approach
Combine the information from two existing Dutch lexical resources:
- The Dutch wordnet (DWN): synsets and lexical semantic relations
- The Referentiebestand Nederlands (RBN): morpho-syntactic information, semantic information, pragmatic information, frame structures, lexical functions and combinatorics
Link to English WordNet
Link to WordNet Domains
Link to SUMO

Project overview
[Diagram labels: Dutch Wordnet, Referentie Bestand, English Wordnet, SUMO (KIF), DOLCE (KIF), WN-DOMAINS, Align/Merge, Cornetto, Ontology: Dolce, Sumo, Acquisition Toolkit, Corpus, Validation, Editing]
Cornetto entry:
- LU/Synset
- Pos
- DWN data
- RBN data
- SUMO-pointer
- PWN-pointer
- Domain

Data Organization
- Lexical Unit (LU): corresponds to a word-meaning pair; covers form, morphology, syntax, semantics, pragmatics and usage examples
- Synset: synonyms; models meaning relations
- Collection of Terms and Axioms: SUMO, MILO
[Further diagram labels: Internal relations, Princeton Wordnet, Domains, Spanish Wordnet, Czech Wordnet, German Wordnet, French Wordnet, Korean Wordnet, Arabic Wordnet]
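
To make the split between lexical units and synsets concrete, here is a minimal Python sketch of the two record types. All class and field names are hypothetical and only loosely follow the labels on the slides; the actual Cornetto schema is not shown in the presentation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class LexicalUnit:
    """One word-meaning pair (hypothetical fields, named after the slide labels)."""
    lu_id: str
    form: str
    pos: str
    morphology: Dict[str, str] = field(default_factory=dict)
    usage_examples: List[str] = field(default_factory=list)


@dataclass
class Synset:
    """A set of synonymous lexical units plus its outward links."""
    synset_id: str
    lu_ids: List[str]
    # (relation name, target synset id): the internal meaning relations
    relations: List[Tuple[str, str]] = field(default_factory=list)
    # (relation, SUMO term): e.g. ("+", "Teacher")
    sumo_mappings: List[Tuple[str, str]] = field(default_factory=list)
    pwn_pointer: Optional[str] = None
    domain: Optional[str] = None


def synonyms(synset: Synset, units: Dict[str, LexicalUnit]) -> List[str]:
    """Return the word forms of all lexical units in a synset."""
    return [units[lu_id].form for lu_id in synset.lu_ids]


# Illustrative data: the identifiers are invented, the word forms come from the slides.
units = {
    "lu1": LexicalUnit("lu1", "leidingwater", "noun"),
    "lu2": LexicalUnit("lu2", "gemeentepils", "noun"),
    "lu3": LexicalUnit("lu3", "kraanwater", "noun"),
}
tap_water = Synset("syn1", ["lu1", "lu2", "lu3"], sumo_mappings=[("+", "Water")])
tap_water_forms = synonyms(tap_water, units)
print(tap_water_forms)
```

Keeping the lexical unit and the synset as separate records mirrors the idea that word-specific information (form, morphology, usage) belongs to the LU, while meaning relations and ontology pointers belong to the synset.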

Integrating the ontology: SUMO terms and new axioms

Rationale for an ontological layer
- Formal and fundamental model of meaning
- Detection of inconsistencies
- Formal reasoning
- Global semantic grid

SUMO/MILO as ontological framework
Based on pragmatic grounds:
- availability, size, coverage
- linking to English Wordnet
- mapping to other Wordnet-like projects

KIF Expressions vs triplets
- Axioms in SUMO are written in SUO-KIF
- Cornetto: replaced by triplets, based on first-order logic

SUMO:
(and
  (exists ?L ?W)
  (instance, ?W, Water)
  (instance, ?L, Liquid)
  (Attribute, ?L, ?W))

Cornetto triplets:
(instance, 0, Water)
(instance, 1, Liquid)
(Attribute, 1, 0)
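
The step from variable-based atoms to numbered triplets can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's actual conversion: the function name `to_triplets` is hypothetical, and the numbering rule (variables are numbered in order of first appearance) is inferred from the water example on this slide.

```python
from typing import Dict, List, Tuple, Union

Atom = Tuple[str, str, str]
Triplet = Tuple[str, Union[int, str], Union[int, str]]


def to_triplets(atoms: List[Atom]) -> List[Triplet]:
    """Replace each ?variable by an integer, numbered by first appearance."""
    index: Dict[str, int] = {}

    def convert(arg: str) -> Union[int, str]:
        if arg.startswith("?"):
            if arg not in index:
                index[arg] = len(index)
            return index[arg]
        return arg  # a SUMO term such as Water stays a string

    return [(rel, convert(a), convert(b)) for rel, a, b in atoms]


# The atoms of the water example, with the quantifier left out.
water_atoms = [
    ("instance", "?W", "Water"),
    ("instance", "?L", "Liquid"),
    ("Attribute", "?L", "?W"),
]
water_triplets = to_triplets(water_atoms)
print(water_triplets)
# → [('instance', 0, 'Water'), ('instance', 1, 'Liquid'), ('Attribute', 1, 0)]
```

The existential quantifier is implicit in the triplet form: every integer stands for an existentially bound variable.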

Mapping to SUMO
Relations: subsumption (+), equivalence (=), instance
- tea (drink): (+,, Tea)
- tea (shrub): (+,, FloweringPlant)
- date (fruit): (=,, Datefruit)
- Marrakech: (instance,, City)
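
As a small illustration of how such mapping expressions could be read programmatically, the Python sketch below parses the notation used on this slide. The names `SumoMapping`, `parse_mapping` and `RELATION_NAMES` are hypothetical, and the meaning of the middle slot, which is empty on the slides, is left open.

```python
from typing import NamedTuple

# Relation symbols as they appear on the slide, with their readings.
RELATION_NAMES = {"+": "subsumption", "=": "equivalence", "instance": "instance"}


class SumoMapping(NamedTuple):
    relation: str
    middle: str  # empty on the slides; kept as-is
    term: str


def parse_mapping(text: str) -> SumoMapping:
    """Parse a string such as '(+,, Tea)' into its three slots."""
    stripped = text.strip()
    if not (stripped.startswith("(") and stripped.endswith(")")):
        raise ValueError(f"not a parenthesised mapping: {text!r}")
    parts = [part.strip() for part in stripped[1:-1].split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three slots, got {len(parts)}: {text!r}")
    relation, middle, term = parts
    if relation not in RELATION_NAMES:
        raise ValueError(f"unknown relation: {relation!r}")
    return SumoMapping(relation, middle, term)


tea = parse_mapping("(+,, Tea)")
city = parse_mapping("(instance,, City)")
print(tea, RELATION_NAMES[tea.relation])
print(city, RELATION_NAMES[city.relation])
```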

Ontology mapping: female/male variants
- Teacher (a person whose occupation is teaching): SUMO: equivalent to Teacher
- In Dutch: no neutral form
- leraar (male teacher): (+,, Teacher), (instance,, Man)
- lerares (female teacher): (+,, Teacher), (instance,, Woman)

Synsets versus Ontology Types
Many synsets are lexicalizations that can name instances of the same SUMO Type in different contexts:
- water used for a purpose (dishwater)
- water occurring somewhere or originating from somewhere (tap water)
- water being the result of a process (meltwater)
Such lexicalizations do not warrant the introduction of new Types in the ontology.

Complex ontology mapping
theewater (water for making tea)
(exists (?A ?W)
  (and
    (instance ?W Water)
    (hasPurposeForAgent ?W
      (exists (?T)
        (and
          (instance ?T Tea)
          (part ?W ?T))))))

Simplified representation as a list of triplets:
(instance, 0, Water)
(instance, 1, Tea)
(instance, 2, Making)
(component, 0, 1)
(resource, 0, 2)
(result, 1, 2)
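
One way to read a triplet list is as a flat, existentially quantified conjunction. The Python sketch below renders the theewater triplets in that form. It is an illustrative reading only: the function `to_kif` and the variable names `?V0`, `?V1`, ... are hypothetical, and the output is deliberately flatter than the nested SUO-KIF expression on the slide.

```python
from typing import List, Tuple, Union

Triplet = Tuple[str, Union[int, str], Union[int, str]]


def to_kif(triplets: List[Triplet]) -> str:
    """Render numbered triplets as one flat (exists (...) (and ...)) expression."""
    variables = sorted(
        {arg for _, a, b in triplets for arg in (a, b) if isinstance(arg, int)}
    )

    def show(arg: Union[int, str]) -> str:
        return f"?V{arg}" if isinstance(arg, int) else arg

    atoms = " ".join(f"({rel} {show(a)} {show(b)})" for rel, a, b in triplets)
    names = " ".join(f"?V{v}" for v in variables)
    return f"(exists ({names}) (and {atoms}))"


theewater = [
    ("instance", 0, "Water"),
    ("instance", 1, "Tea"),
    ("instance", 2, "Making"),
    ("component", 0, 1),
    ("resource", 0, 2),
    ("result", 1, 2),
]
theewater_kif = to_kif(theewater)
print(theewater_kif)
```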

Complex ontology mapping
leidingwater, gemeentepils, kraanwater (water out of the tap)
(exists (?W ?F ?R)
  (and
    (instance ?W Water)
    (instance ?F Faucet(=Device))
    (instance ?R Removing)
    (origin ?R ?F)
    (patient ?R ?W)))

Triplets:
(instance, 0, Water)
(instance, 1, Device)
(instance, 2, Removing)
(origin, 2, 1)
(patient, 2, 0)
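
Because every number in a triplet list stands for a variable, a simple consistency check is that each variable used in a relation also has an `instance` triplet giving its type. The Python sketch below implements that check; it is an assumed sanity test for illustration, with a hypothetical function name, and is not described in the presentation.

```python
from typing import List, Tuple, Union

Triplet = Tuple[str, Union[int, str], Union[int, str]]


def undeclared_variables(triplets: List[Triplet]) -> List[int]:
    """Return the variables that are used but never typed by an instance triplet."""
    declared = {
        a for rel, a, _ in triplets if rel == "instance" and isinstance(a, int)
    }
    used = {arg for _, a, b in triplets for arg in (a, b) if isinstance(arg, int)}
    return sorted(used - declared)


tap_water = [
    ("instance", 0, "Water"),
    ("instance", 1, "Device"),
    ("instance", 2, "Removing"),
    ("origin", 2, 1),
    ("patient", 2, 0),
]
# Same list with the Device declaration dropped, to show a failing case.
broken = [t for t in tap_water if t != ("instance", 1, "Device")]

print(undeclared_variables(tap_water))  # → []
print(undeclared_variables(broken))     # → [1]
```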

Some more triplets for water
kwelwater (groundwater coming to the surface by the pressure of water, especially occurring close to a dike)
(instance, 0, GroundWater)
(instance, 1, StationaryArtifact (=Dike))
(instance, 2, StreamWaterArea)
(instance, 3, MotionUpward)

But what to do with…
Grondwater (groundwater)
SUMO term: GroundWater ("Groundwater is the subclass of Water that is found in deposits in the earth.")
But is groundwater a subclass of Water, or is it an instance of water with a certain place, usage or origin?
- 'The groundwater got polluted.'
- 'They used groundwater for crop irrigation.'

The end…