Word Sense Disambiguation (WSD)


Word Sense Disambiguation (WSD) By Asma Kausar and Joshim Uddin

Introduction to WSD
In computational linguistics, word sense disambiguation is the process of identifying which meaning ('sense') of a word is being used in context. It is an open problem of Natural Language Processing: a single word, written or pronounced, can have several different meanings, and the surrounding context determines which sense applies.

Problems
Take the following example: "The rebel seized the opportunity to rebel." Here "rebel" has two meanings with the same spelling but different pronunciations: the first use is a noun indicating the person who resists authority, control or convention, and the second is a verb denoting the action taken by that person.

Problems
"I read a book and it had a red cover." Here "read" and "red" share a pronunciation but are spelt differently and have different meanings. To a human it is common sense which is which, but developing algorithms that replicate this ability is difficult, as is further illustrated by the implicit ambiguity between "read" (past tense: "Best book I've read") and "read" (present tense: "I read the newspaper"). In Machine Translation, WSD was identified as a distinct task early on: one of the first problems those systems faced was word sense ambiguity, and it became apparent that semantic disambiguation was needed at the lexical level.

Early Approaches To WSD
Three early attempts to solve WSD:
Preference Semantics: in simple terms, a representation of the entire sentence is built up from the representations of the individual words through a process of semantic interpretation.
Word Expert Parsing: a more highly lexicalized approach, based on the assumption that human language understanding rests on knowledge about words rather than knowledge about rules.
Polaroid Words: a less lexicalized system, containing the usual NLP modules such as a grammar, parser, lexicon and semantic interpreter.
Issues with all three: lack of a lexical hierarchy and a high degree of lexicalization.

Approaches To WSD
Dictionary Based Approach - by Michael E. Lesk, the first method to use a machine-readable dictionary. The Lesk Algorithm is based on the assumption that words in a given "neighborhood" will tend to share a common topic. For example, consider the phrase 'pine cone'. 'Pine' has two major senses in the Oxford dictionary: 'tree with needle-shaped leaves' and 'waste away through sorrow or illness'. 'Cone' has three senses: 'solid body which narrows to a point', 'something of this shape whether solid or hollow', and 'fruit of certain evergreen trees'. The algorithm chooses the pair of senses whose dictionary definitions overlap the most, as sketched after this slide.
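Below is a minimal sketch of that overlap computation in the spirit of simplified Lesk, using the slide's abbreviated glosses. The sense labels, stop-word list and crude plural stripping are illustrative assumptions, not Lesk's original implementation.

```python
# Minimal simplified-Lesk sketch: pick the sense pair whose glosses share the most words.
PINE_SENSES = {
    "pine/tree": "tree with needle-shaped leaves",
    "pine/grieve": "waste away through sorrow or illness",
}
CONE_SENSES = {
    "cone/shape": "solid body which narrows to a point",
    "cone/object": "something of this shape whether solid or hollow",
    "cone/fruit": "fruit of certain evergreen trees",
}

STOP = {"of", "the", "a", "with", "which", "to", "or", "this", "whether", "through", "away"}

def gloss_words(gloss: str) -> set:
    """Lower-case the gloss, drop stop words, and crudely strip plural -s."""
    words = set()
    for w in gloss.lower().replace("-", " ").split():
        if w not in STOP:
            words.add(w[:-1] if w.endswith("s") else w)
    return words

def overlap(gloss_a: str, gloss_b: str) -> int:
    """Number of content words the two glosses share."""
    return len(gloss_words(gloss_a) & gloss_words(gloss_b))

# Choose the pair of senses whose dictionary definitions overlap the most.
best = max(
    ((a, b) for a in PINE_SENSES for b in CONE_SENSES),
    key=lambda pair: overlap(PINE_SENSES[pair[0]], CONE_SENSES[pair[1]]),
)
print(best)  # ('pine/tree', 'cone/fruit'): 'tree(s)' occurs in both glosses
```

With the fuller Oxford glosses ('kinds of evergreen tree with needle-shaped leaves') the winning pair shares both 'evergreen' and 'tree', which is exactly the intuition behind Lesk's pine-cone example.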

Machine Learning Approaches to WSD
Tagging with thesaurus categories: developed by Masterman around 1960. It used a simple algorithm based on the repetition of thesaurus categories associated with the words in the same sentence. The approach was initially unsuccessful, but it was later improved by using a statistical model of the categories. The later method was successful when tested on 12 polysemous words, disambiguating about 92% of the occurrences correctly. The category-counting idea is sketched after this slide.
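Below is a minimal sketch of the category-repetition idea. The toy thesaurus, the category names and the example sentences are made-up assumptions for illustration only.

```python
from collections import Counter

# Toy thesaurus: word -> thesaurus categories it can belong to (illustrative only).
THESAURUS = {
    "bank":  {"FINANCE", "GEOGRAPHY"},
    "money": {"FINANCE"},
    "loan":  {"FINANCE"},
    "river": {"GEOGRAPHY"},
}

def disambiguate(target: str, sentence: list) -> str:
    """Pick the target's category that is repeated most by the other words in the sentence."""
    counts = Counter()
    for word in sentence:
        if word != target:
            counts.update(THESAURUS.get(word, set()))
    return max(THESAURUS[target], key=lambda cat: counts[cat])

print(disambiguate("bank", ["the", "bank", "approved", "the", "loan"]))        # FINANCE
print(disambiguate("bank", ["we", "walked", "along", "the", "river", "bank"]))  # GEOGRAPHY
```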

Machine Learning Approaches to WSD
Clustering word usages - uses the Yarowsky algorithm, an unsupervised learning method that finds hidden structure in unlabeled data. It relies on two properties of human language: one sense per collocation and one sense per discourse. The decision list is built on the 'one sense per collocation' property: start with a large set of candidate collocations and compute, for each collocation, the log-likelihood ratio of the word-sense probabilities, log(P(sense1 | collocation) / P(sense2 | collocation)). The higher the log-likelihood ratio, the more predictive the evidence, so the most reliable collocations are applied first. The scoring step is sketched after this slide.
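Below is a minimal sketch of the decision-list scoring step for an ambiguous word such as 'plant'. The seed counts, sense names and smoothing constant are made-up assumptions for illustration, not Yarowsky's actual data or implementation.

```python
import math

# Toy seed counts: collocation -> occurrences of each sense (illustrative only).
counts = {
    "plant life":          {"living": 120, "factory": 2},
    "manufacturing plant": {"living": 1,   "factory": 85},
    "plant species":       {"living": 60,  "factory": 1},
}

def decision_list(counts, smoothing=0.1):
    """Score each collocation by the log-likelihood ratio of the two senses,
    then sort so the most predictive collocations come first."""
    rules = []
    for colloc, c in counts.items():
        living = c.get("living", 0) + smoothing   # smoothing avoids division by zero
        factory = c.get("factory", 0) + smoothing
        score = math.log(living / factory)
        sense = "living" if score > 0 else "factory"
        rules.append((abs(score), colloc, sense))
    return sorted(rules, reverse=True)

for score, colloc, sense in decision_list(counts):
    print(f"{score:5.2f}  {colloc!r:>24} -> {sense}")
```

To disambiguate a new occurrence, the algorithm walks down this sorted list and applies the first rule whose collocation matches the context.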

Summary
WSD is a long-standing problem in language processing. Early approaches suffered from poor coverage due to a lack of lexical resources, and they were not fully successful because of issues such as the lack of a lexical hierarchy and the high degree of lexicalization required by a large vocabulary. The most successful have been the machine learning approaches.