Presentation transcript:

Integrating lexical units, synsets and ontology in the Cornetto Database
Piek Vossen (1, 2), Isa Maks (1), Roxane Segers (1), Hennie van der Vliet (1)
1: Faculty of Arts, Vrije Universiteit Amsterdam
2: Irion Technologies, Delft

LREC conference, Marrakech, May
Project Cornetto
Financed by NTU (Dutch Language Union)
STEVIN: Dutch-Flemish Research Programme for Dutch Language and Speech Technology
Consortium partners:
- VUA (Vrije Universiteit Amsterdam, General Linguistics Department)
- UvA (University of Amsterdam, Informatics Institute)
- K.U. Leuven (Katholieke Universiteit Leuven, Department of Computer Science)
- Irion Technologies BV, Delft

Overview
- Goals of the project
- What's in the Cornetto database?
- Integrating the ontology: SUMO terms and new axioms

Goals of the Cornetto project
COmbinatorial Relational NEtwork voor Taal TOepassingen (Combinatorial and Relational Network for Language Applications)
Goal: to develop a lexical semantic database for Dutch:
- 40K entries: the generic and central part of the language
- Rich horizontal and vertical semantic relations
- Combinatoric information
- Ontological information

Approach
Combine the information from two existing Dutch lexical resources:
- The Dutch wordnet (DWN): synsets and lexical semantic relations
- The Referentiebestand Nederlands (RBN): morpho-syntactic information, semantic information, pragmatic information, frame structures, lexical functions and combinatorics
Link to English WordNet
Link to Wordnet Domains
Link to SUMO

Project overview
[Diagram. Labels: Dutch Wordnet, Referentie Bestand, English Wordnet, SUMO (KIF), DOLCE (KIF), WN-Domains; Align/Merge; Cornetto; Ontology: DOLCE, SUMO; Acquisition Toolkit; Corpus; Validation; Editing.]
Entry fields shown in the diagram:
- LU/Synset
- POS
- DWN data
- RBN data
- SUMO pointer
- PWN pointer
- Domain

Data Organization
[Diagram. Labels grouped by component:]
- Lexical Unit (LU): corresponds to a word-meaning pair; form, morphology, syntax, semantics, pragmatics, usage examples
- Synset: synonyms; models meaning relations; internal relations
- SUMO/MILO: collection of terms and axioms
- Linked resources: Princeton Wordnet, Domains, Spanish Wordnet, Czech Wordnet, German Wordnet, French Wordnet, Korean Wordnet, Arabic Wordnet
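The split between lexical units and synsets can be pictured as two record types. The following is only a minimal sketch: the class names, field names and identifiers are hypothetical and chosen for illustration, and the actual Cornetto schema may be organized differently.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LexicalUnit:
    """A word-meaning pair plus the kinds of information listed for an LU."""
    lu_id: str
    form: str
    morphology: str = ""
    syntax: str = ""
    semantics: str = ""
    pragmatics: str = ""
    usage_examples: List[str] = field(default_factory=list)


@dataclass
class Synset:
    """A set of synonymous lexical units plus its outgoing links."""
    synset_id: str
    synonyms: List[LexicalUnit] = field(default_factory=list)
    # (relation name, target synset id) pairs; relation names are placeholders.
    internal_relations: List[Tuple[str, str]] = field(default_factory=list)
    # Ontology triplets such as ("instance", 0, "Water").
    ontology_triplets: List[Tuple[str, int, object]] = field(default_factory=list)


# Hypothetical identifiers; the three word forms are taken from the slides.
tap_water = Synset(
    synset_id="syn-tap-water",
    synonyms=[
        LexicalUnit(lu_id=f"lu-{word}", form=word)
        for word in ("leidingwater", "gemeentepils", "kraanwater")
    ],
    ontology_triplets=[("instance", 0, "Water")],
)
print([lu.form for lu in tap_water.synonyms])
```

Keeping the word-level description on the lexical unit and the relational and ontological links on the synset mirrors the division of labels in the diagram.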

Integrating the ontology: SUMO terms and new axioms

Rationale for an ontological layer
- Formal and fundamental model of meaning
- Detection of inconsistencies
- Formal reasoning
- Global semantic grid

SUMO/MILO as ontological framework
Chosen on pragmatic grounds:
- availability, size, coverage
- linking to English Wordnet
- mapping to other Wordnet-like projects

KIF expressions vs triplets
Axioms in SUMO are written in SUO-KIF.
Cornetto: replaced by triplets, based on first-order logic.
SUMO:
(and (exists ?L ?W) (instance, ?W, Water) (instance, ?L, Liquid) (Attribute, ?L, ?W))
Cornetto triplets:
(instance, 0, Water) (instance, 1, Liquid) (Attribute, 1, 0)
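Comparing the two notations, the number 0 takes the place of ?W and 1 the place of ?L. A minimal sketch of that renumbering follows, assuming the variable that stands for the synset's own referent gets 0 and the remaining variables are numbered in order of first appearance. The slide does not state the numbering rule, so this is an assumption, and the function name is hypothetical.

```python
def to_triplets(atoms, referent):
    """Replace KIF-style variables (?X) with integers.

    Assumption: `referent` is numbered 0; other variables get 1, 2, ...
    in the order in which they first appear.
    """
    numbering = {referent: 0}
    triplets = []
    for relation, *args in atoms:
        numbered = []
        for arg in args:
            if isinstance(arg, str) and arg.startswith("?"):
                if arg not in numbering:
                    numbering[arg] = len(numbering)
                numbered.append(numbering[arg])
            else:
                # Ontology terms such as "Water" are kept unchanged.
                numbered.append(arg)
        triplets.append((relation, *numbered))
    return triplets


# The atoms of the slide's example, written as Python tuples.
atoms = [
    ("instance", "?W", "Water"),
    ("instance", "?L", "Liquid"),
    ("Attribute", "?L", "?W"),
]
print(to_triplets(atoms, referent="?W"))
```

With ?W as the referent, this yields the three triplets shown on the slide.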

Mapping to SUMO
Subsumption, equivalence, instance:
- tea (drink): (+, , Tea)
- tea (shrub): (+, , FloweringPlant)
- date (fruit): (=, , Datefruit)
- Marrakech: (instance, , City)

Ontology mapping: female/male variants
Teacher (a person whose occupation is teaching)
SUMO: equivalent to Teacher
In Dutch: no neutral form
- leraar (male teacher): (+, , Teacher), (instance, , Man)
- lerares (female teacher): (+, , Teacher), (instance, , Woman)
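The mapping examples can be held in a small lookup table. This is only a sketch: it reads "+" as subsumption and "=" as equivalence, following the order in which the slide lists the relation types, it drops the empty middle slot of each mapping, and the variable and function names are hypothetical.

```python
# Assumed reading of the relation symbols used in the mapping examples.
MAPPING_RELATIONS = {"+": "subsumption", "=": "equivalence", "instance": "instance"}

# Synset label -> list of (relation, SUMO term) pairs, copied from the slides.
mappings = {
    "tea (drink)": [("+", "Tea")],
    "tea (shrub)": [("+", "FloweringPlant")],
    "date (fruit)": [("=", "Datefruit")],
    "Marrakech": [("instance", "City")],
    "leraar": [("+", "Teacher"), ("instance", "Man")],
    "lerares": [("+", "Teacher"), ("instance", "Woman")],
}


def synsets_mapped_to(term):
    """Return the synsets linked to `term` by subsumption or equivalence."""
    return sorted(
        label
        for label, pairs in mappings.items()
        if any(rel in ("+", "=") and target == term for rel, target in pairs)
    )


print(synsets_mapped_to("Teacher"))
```

Both Dutch variants are found under Teacher, while their extra instance constraints (Man, Woman) keep them apart.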

Synsets versus Ontology Types
Many synsets are lexicalizations that can name instances of the same SUMO type in different contexts:
- water used for a purpose (dishwater)
- water occurring somewhere or originating from somewhere (tap water)
- water being the result of a process (meltwater)
Such synsets do not warrant the introduction of new types in the ontology.

Complex ontology mapping
theewater (water for making tea)
(exists (?A ?W)
  (and
    (instance ?W Water)
    (hasPurposeForAgent ?W
      (exists (?T)
        (and
          (instance ?T Tea)
          (part ?W ?T))))))
Simplified representation as a list of triplets:
(instance, 0, Water) (instance, 1, Tea) (instance, 2, Making)
(component, 0, 1) (resource, 0, 2) (result, 1, 2)

Complex ontology mapping
leidingwater, gemeentepils, kraanwater (water out of the tap)
(exists (?W ?F ?R)
  (and
    (instance ?W Water)
    (instance ?F Faucet(=Device))
    (instance ?R Removing)
    (origin ?R ?F)
    (patient ?R ?W)))
Triplets:
(instance, 0, Water), (instance, 1, Device), (instance, 2, Removing)
(origin, 2, 1) (patient, 2, 0)
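For a flat case like the tap water example, the triplet list can be read back as an existentially quantified conjunction. The sketch below assumes exactly that reading; the function name and the ?V0, ?V1, ... variable names are made up for illustration, and nested expressions such as the hasPurposeForAgent axiom for theewater are not covered.

```python
def triplets_to_kif(triplets):
    """Render numbered triplets as a KIF-like (exists (...) (and ...)) string."""
    # Collect every integer argument; each becomes one quantified variable.
    variables = sorted(
        {arg for _, *args in triplets for arg in args if isinstance(arg, int)}
    )

    def render(arg):
        return f"?V{arg}" if isinstance(arg, int) else str(arg)

    atoms = []
    for relation, *args in triplets:
        rendered_args = " ".join(render(arg) for arg in args)
        atoms.append(f"({relation} {rendered_args})")

    variable_list = " ".join(f"?V{v}" for v in variables)
    return f"(exists ({variable_list}) (and {' '.join(atoms)}))"


tap_water_triplets = [
    ("instance", 0, "Water"),
    ("instance", 1, "Device"),
    ("instance", 2, "Removing"),
    ("origin", 2, 1),
    ("patient", 2, 0),
]
print(triplets_to_kif(tap_water_triplets))
```

Apart from the variable names and the Device term standing in for Faucet, the output has the same shape as the KIF expression given for tap water.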

Some more triplets for water
kwelwater (groundwater coming to the surface under water pressure, occurring especially close to a dike)
(instance, 0, GroundWater), (instance, 1, StationaryArtifact (=Dike)), (instance, 2, StreamWaterArea), (instance, 3, MotionUpward)

But what to do with...
grondwater (groundwater)
SUMO term: GroundWater ("Groundwater is the subclass of Water that is found in deposits in the earth.")
But is groundwater a subclass of Water, or is it an instance of water with a certain place, usage or origin?
- ‘The groundwater got polluted.’
- ‘They used groundwater for crop irrigation.’

The end