How dominant is the commonest sense of a word? Adam Kilgarriff Lexicography MasterClass Univ of Brighton.

What do you think? (zero-freq senses don’t count)

The WSD task
- select correct sense in context
- sense inventory given in a dictionary
- old problem
- corpus methods are best

Lower bound
- Gale, Church & Yarowsky 1992: baseline system always chooses the commonest sense; around 70%
- Only a small sample available
- SEMCOR: bigger sample, still too small
- SENSEVAL: big problem
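The "always choose commonest" baseline can be sketched as below. The sense-tagged pairs are invented toy data standing in for a corpus like SEMCOR, not real counts:

```python
from collections import Counter

# Toy sense-tagged instances (hypothetical): (word, sense) pairs.
tagged = [("bank", "finance"), ("bank", "finance"), ("bank", "riverside"),
          ("pike", "fish"), ("pike", "fish"), ("pike", "weapon")]

def commonest_sense_baseline(tagged):
    """Accuracy of the baseline that always picks each word's commonest sense."""
    by_word = {}
    for word, sense in tagged:
        by_word.setdefault(word, Counter())[sense] += 1
    # Every instance of a word's commonest sense is scored correct.
    correct = sum(counts.most_common(1)[0][1] for counts in by_word.values())
    return correct / len(tagged)

print(commonest_sense_baseline(tagged))  # 4 of the 6 instances are correct
```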

Overview
- Mathematical model
- Evaluation (against SEMCOR)
- Implications for WSD evaluation

Model: assumptions
- Meanings unrelated
- Word sense frequency distribution same as word frequency distribution

Model
- All k word senses in a bag
- Randomly select 2 for a 2-sense word
- k(k-1)/2 possible 2-sense words
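A minimal check of the pair count (nothing here comes from the talk beyond the formula itself):

```python
import math

# With k senses in the bag, an unordered pair of distinct senses makes a
# 2-sense "word", so there are C(k, 2) = k*(k-1)/2 possible 2-sense words.
def possible_two_sense_words(k):
    return k * (k - 1) // 2

assert possible_two_sense_words(5) == math.comb(5, 2) == 10
```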

Set the frequency
For a 2-sense word with freq 101, possibilities include:
- 100:1 split: how many times?
- 50:51 split: how many times?

Words to model word senses
- Brown, or BNC
- How many types for each frequency
- Smooth to give a monotonically decreasing curve
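The talk does not say which smoothing was used; one common way to turn raw frequency-of-frequency counts into a monotonically decreasing curve (as in Simple Good-Turing smoothing) is a least-squares line in log-log space. A sketch under that assumption, on synthetic power-law data:

```python
import math

def fit_loglog(freqs, counts):
    """Fit counts ~ exp(a) * f**b by least squares in log-log space.
    For b < 0 the returned curve is monotonically decreasing."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda f: math.exp(a + b * math.log(f))

# Synthetic frequency-of-frequency data following an exact power law:
freqs = list(range(1, 11))
counts = [1000.0 / f ** 2 for f in freqs]
smooth = fit_loglog(freqs, counts)
```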

[Table: for each frequency, the number of word types having that frequency, in four columns: Brown raw, Brown smooth, BNC raw, BNC smooth; figures not recoverable from the transcript]

Using Brown frequencies
- 100:1 split: how many times? 16278 * 11.03 = 179,546
- 50:51 split: how many times? 43.13 * 41.86 = 1805
- Ratio 179,546:1805 = 99:1, so the 100:1 split is 99 times likelier than the 51:50 split
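The two products on this slide can be reproduced directly, assuming 16278 and 11.03 are the (smoothed) Brown type counts for frequencies 1 and 100, and 43.13 and 41.86 those for frequencies 50 and 51:

```python
# Smoothed Brown frequency-of-frequency values quoted on the slide:
# fof[f] = number of word types occurring f times (assumed mapping).
fof = {1: 16278.0, 50: 43.13, 51: 41.86, 100: 11.03}

def split_weight(m, n):
    """Relative likelihood of an m:(n-m) split for a 2-sense word of freq n."""
    return fof[m] * fof[n - m]

w_skewed = split_weight(100, 101)  # 16278 * 11.03, about 179,546
w_even = split_weight(51, 101)     # 43.13 * 41.86, about 1805
print(round(w_skewed / w_even))    # the skewed split is ~99 times likelier
```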

Generalising
For a 2-sense word with freq n:
- select the 'commonest' sense, freq m, with n/2 < m < n
- select the other sense from the subset with freq n-m
- find all possible selections
- calculate the average ratio, commonest:other, to answer the title question
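The generalisation step can be sketched as below. The Zipf-style fof(f) proportional to 1/f^2 is a hypothetical stand-in for the smoothed Brown/BNC counts the talk actually uses, and the dominance is reported as the commonest sense's share rather than a ratio:

```python
def fof(f):
    # Hypothetical smoothed frequency-of-frequency curve (Zipf-style);
    # the talk uses smoothed Brown/BNC counts here instead.
    return 1.0 / f ** 2

def expected_dominance(n):
    """Weighted average share of the commonest sense, over all splits of a
    2-sense word with total frequency n (commonest gets m, n/2 < m < n)."""
    total_w = weighted_share = 0.0
    for m in range(n // 2 + 1, n):
        w = fof(m) * fof(n - m)   # likelihood weight of the m:(n-m) split
        total_w += w
        weighted_share += w * (m / n)
    return weighted_share / total_w

print(expected_dominance(3))  # the only split is 2:1, so exactly 2/3
```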

Model: answers (BNC)
[Table: for each total frequency n, the model's predicted dominance of the commonest sense for 2-sense, 3-sense and 4-sense 'words'; figures not recoverable from the transcript]

SEMCOR
- 250,000-word corpus
- Manually sense-tagged
- WordNet senses

Evaluate model against SEMCOR
[Table: for each frequency class n, the number (#) of 2-sense and 3-sense words, their observed dominance (%), and the BNC model's prediction; figures not recoverable from the transcript]

Discussion
- Same trend
- Assumption untrue; SFIP principle: a reading must be sufficiently frequent, insufficiently predictable to get into a dictionary
- generous vs pike
  - generous: donation / person / helping
  - pike: fish or weapon or hill or turnpike

Discussion
- More data, more meanings (without end): not changing ratios for known senses, but addition of new senses
- Models pike, not generous
- Dominated by singletons

SENSEVAL
- Evaluation exercise for WSD: 1998, 2001, 2004
- Two task-types:
  - Lexical sample: choose a small sample of words and disambiguate multiple instances of each
  - All-words: choose a text or two, disambiguate all words

Lower bound and SENSEVAL
- All-words: samples too small to see the extent of the skew (if the freq of a 2-sense word is 3, the lower bound is 67%)
- Lexical sample: skew in manual sample selection; "good" candidate words show "balance" (amazing)
- Are systems better than baseline? In SENSEVAL-3 systems scarcely beat the baseline: not proven (and not likely)

What is the commonest sense?
- Varies with domain
- More mileage here than in disambiguation; cf. the default strategy in commercial MT
- McCarthy, Koeling, Weeds & Carroll, ACL-04
- A 3-sentence window does not allow domain-identification methods
- A domain-id task would be more interesting and worthwhile than WSD

Thank you