TUNING HIERARCHIES IN PRINCETON WORDNET AHTI LOHK | CHRISTIANE D. FELLBAUM | LEO VÕHANDU THE 8TH MEETING OF THE GLOBAL WORDNET CONFERENCE IN BUCHAREST.


TUNING HIERARCHIES IN PRINCETON WORDNET AHTI LOHK | CHRISTIANE D. FELLBAUM | LEO VÕHANDU THE 8TH MEETING OF THE GLOBAL WORDNET CONFERENCE IN BUCHAREST JANUARY 27-30, 2016

Outline
- A classified overview of the methods used to validate wordnet hierarchies
- Graph-based methods
- The advantages of graph-based methods
- Which of them have been applied to Princeton WordNet
- Some new patterns and their examples
- Does it make sense to apply graph-based methods to other wordnets?
- Summary
- Encouragement to wordnet developers to use these methods

What kinds of methods have different developers used?

Group of methods     | Use of corpus data, lexical resources | Use of the contents of a synset | Popularity
Corpus-based methods | +                                     | +                               | High
Rule-based methods   | -                                     | +                               | Medium
Graph-based methods  | -                                     | -                               | Low (yet!)

Corpus-based methods
Different techniques for extracting the relevant information have been applied. Some of the well-known approaches include:
- Lexico-syntactic patterns (Hearst, 1992), (Nadig et al., 2008)
- Similarity measurements (Sagot and Fišer, 2012)
- Mapping and comparing to wordnet (Pedersen et al., 2013)
- Applying wordnet in NLP tasks (Saito et al., 2002)
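As an illustration of the lexico-syntactic pattern idea, here is a minimal sketch of one Hearst-style pattern ("X such as Y") written as a regular expression. The function name `such_as_pairs` and the single-word matching are assumptions made for this example; real systems match noun phrases, not single tokens, and use several patterns.

```python
import re

# One Hearst-style pattern: "<hypernym> such as <hyponym>".
# Single-word matching is a simplification assumed for this sketch.
SUCH_AS = re.compile(r"\b(\w+),? such as (\w+)")

def such_as_pairs(text):
    """Return (hyponym, hypernym) candidates found by the 'such as' pattern."""
    return [(hypo, hyper) for hyper, hypo in SUCH_AS.findall(text)]

pairs = such_as_pairs("She likes fruits such as apples.")
print(pairs)
```

Pairs extracted this way are only candidates; they could be compared against the hypernym links already present in a wordnet.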

Rule-based methods
These methods for validating hierarchies rely on lexical relations (word-word), semantic relations (concept-concept) and the rules among them. This includes the rules applied to the construction of WordNet (Fellbaum, 1998), and additional rules, such as the following:
- Metaproperties (rigidity, identity, unity and dependence) described in ontology construction (Guarino and Welty, 2002)
- Top Ontology concepts or “unique beginners” (Atserias et al., 2005; Miller, 1998)
- Specific rules for particular error detections (Gupta, 2002; Nadig et al., 2008). For instance, a rule proposed by Nadig et al. (2008): “If one term of a synset X is a proper suffix of a term in a synset Y, X is a hypernym of Y”
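The suffix rule quoted above can be sketched in a few lines. This is a minimal illustration, not the implementation of Nadig et al.: the synset identifiers and the toy data are hypothetical, and reading "proper suffix" as a plain string suffix is an assumption.

```python
def suffix_rule_candidates(synsets):
    """Return (x, y) pairs where a term of synset x is a proper suffix
    of a term of synset y, i.e. x is a candidate hypernym of y.

    `synsets` maps a synset id to its list of terms (hypothetical format).
    """
    pairs = set()
    for x_id, x_terms in synsets.items():
        for y_id, y_terms in synsets.items():
            if x_id == y_id:
                continue
            # Proper suffix: b ends with a, and a is shorter than b.
            if any(b != a and b.endswith(a) for a in x_terms for b in y_terms):
                pairs.add((x_id, y_id))
    return pairs

# Toy data, invented for this example.
toy = {
    "s_tree": ["tree"],
    "s_apple_tree": ["apple tree"],
    "s_dog": ["dog", "domestic dog"],
}
print(suffix_rule_candidates(toy))
```

A plain string suffix also matches accidental cases (for example "at" and "cat"), so the output is best treated as a list of candidates for a lexicographer to review.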

The advantages of graph-based methods
- Test patterns are applicable to wordnets in every language
- Test patterns highlight substructures that point to possible errors, and they simplify the work of the expert lexicographer (Lohk et al., 2012a), (Lohk et al., 2012b), (Lohk et al., 2014b)
- Using a test is always quicker than “[doing] a full revision in top-down or alphabetical order” (Čapek, 2012).

Graph-based methods
These methods are purely formal and do not take into account the semantics among word forms. Specific substructures of a wordnet’s hierarchies are checked and validated. Target substructures include:
- Cycles (Smrž, 2004), (Kubis, 2012)
- Shortcuts (Fischer, 1997)
- Rings (Liu et al., 2004; Richens, 2008)
- Dangling uplinks (Koeva et al., 2004; Smrž, 2004)
- Orphan nodes (null graphs) (Čapek, 2012)
[Figure: examples of a cycle, a shortcut, a ring and a dangling uplink]
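Several of these substructures can be checked with simple graph traversals. The sketch below is one possible reading, not the procedure used in the cited papers. It assumes the hierarchy is stored as a dictionary from each synset to the set of its direct hypernyms, and it assumes these working definitions: a dangling uplink points to a synset missing from the dictionary, an orphan has neither hypernyms nor hyponyms, a cycle member can reach itself, and a shortcut is a direct link to a hypernym that is also reachable through another direct hypernym. All names and data are hypothetical.

```python
def ancestors(hypernyms, node):
    """All synsets reachable from `node` by following hypernym links."""
    seen, stack = set(), list(hypernyms.get(node, ()))
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(hypernyms.get(cur, ()))
    return seen

def dangling_uplinks(hypernyms):
    """Links whose target synset is not defined in the hierarchy."""
    return {(c, p) for c, ps in hypernyms.items() for p in ps if p not in hypernyms}

def orphans(hypernyms):
    """Synsets with no hypernym and no hyponym."""
    has_child = {p for ps in hypernyms.values() for p in ps}
    return {n for n, ps in hypernyms.items() if not ps and n not in has_child}

def cycle_members(hypernyms):
    """Synsets that are their own (indirect) hypernym."""
    return {n for n in hypernyms if n in ancestors(hypernyms, n)}

def shortcuts(hypernyms):
    """Direct links (c, p) where p is also reachable via another direct hypernym of c."""
    found = set()
    for c, ps in hypernyms.items():
        for p in ps:
            if any(p in ancestors(hypernyms, q) for q in ps if q != p):
                found.add((c, p))
    return found

# Toy hierarchy, invented for this example.
toy = {
    "entity": set(),
    "animal": {"entity"},
    "dog": {"animal", "entity"},  # shortcut: dog -> entity
    "a": {"b"},
    "b": {"a"},                   # cycle: a <-> b
    "ghost": {"missing"},         # dangling uplink
    "lonely": set(),              # orphan
}
print(dangling_uplinks(toy), orphans(toy), cycle_members(toy), shortcuts(toy))
```

Because the checks ignore what the synsets mean, each hit is a place for a lexicographer to look, not a confirmed error.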

An artificial hierarchy and specific substructures
1. Short cut
2. Heart-shaped substructure
3. Ring
4. Closed subset
5. Dense component
6. Connected roots
+ 4 substructures
Specific substructures = test patterns

Dense component

Heart-shaped substructure

„Compound“ pattern

Connected roots

Wordnets in comparison
Columns: Wordnet | Noun roots | Verb roots | Multiple inheritance cases | Short cuts | Rings | Synsets with many roots | Heart-shaped substructure | Dense component | „Compound“ pattern | The largest closed subsets
Princeton WordNet Version ,453402, ,333×167
Finnish Wordnet Version ,453402, ,334×167
Cornetto Version , ,309621, ,032×589
Polish Wordnet Version , , ,254 5, ,794×4,683
Estonian Wordnet Version x4
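Some of the counts compared here can be computed directly from a hypernym graph. The sketch below is only an illustration under assumed definitions: a root is a synset without a hypernym, a multiple inheritance case is a synset with more than one direct hypernym, and a synset "with many roots" reaches more than one root. The dictionary format and all names are hypothetical, and the toy data is unrelated to the wordnets in the table.

```python
def hierarchy_stats(hypernyms):
    """Count roots, multiple inheritance cases and synsets reaching several roots.

    `hypernyms` maps each synset to the set of its direct hypernyms.
    """
    roots = {n for n, ps in hypernyms.items() if not ps}
    multi = {n for n, ps in hypernyms.items() if len(ps) > 1}

    def reachable_roots(node):
        # Walk upwards and collect every root encountered.
        seen, stack, found = {node}, [node], set()
        while stack:
            cur = stack.pop()
            if cur in roots:
                found.add(cur)
            for p in hypernyms.get(cur, ()):
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return found

    many_roots = {n for n in hypernyms
                  if n not in roots and len(reachable_roots(n)) > 1}
    return {
        "roots": len(roots),
        "multiple_inheritance": len(multi),
        "many_roots": len(many_roots),
    }

# Toy hierarchy, invented for this example.
toy_hierarchy = {
    "r1": set(),
    "r2": set(),
    "a": {"r1"},
    "b": {"r1", "r2"},
    "c": {"b"},
}
print(hierarchy_stats(toy_hierarchy))
```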