A Technical Seminar on WORD SENSE DISAMBIGUATION
Presented By Roshan R. Karwa
Guided By Dr. M. B. Chandak

MOTIVATION
One of the open problems in NLP: computationally determining which sense of a word is activated by its use in a particular context. E.g. I went to the bank to withdraw some money.
Needed in:
Machine Translation: for correct lexical choice.
Information Retrieval: for resolving ambiguity in queries.
Information Extraction: for accurate analysis of text.

OUTLINE
Knowledge Based Approaches: WSD using Selectional Preferences; Overlap Based Approaches
Machine Learning Based Approaches: Supervised Approaches; Unsupervised Algorithms
Hybrid Approaches
Summary
Conclusion
Future Work

Knowledge Based Approaches: WSD USING SELECTIONAL PREFERENCES AND ARGUMENTS
SENSE 1: This airline serves dinner in the evening flight. serve (Verb): agent; object - edible.
SENSE 2: This airline serves the sector between Agra & Delhi. serve (Verb): agent; object - sector.
Requires exhaustive enumeration of:
the argument structure of verbs;
descriptions of the properties of words, so that meeting the selectional preference criteria can be decided. E.g. This flight serves the "region" between Mumbai and Delhi.

Knowledge Based Approaches: OVERLAP BASED APPROACHES
Require a Machine Readable Dictionary (MRD).
Find the overlap between the features of the different senses of an ambiguous word (the sense bag) and the features of the words in its context (the context bag).
The sense with the maximum overlap is selected as the contextually appropriate sense.

OVERLAP BASED APPROACHES: LESK'S ALGORITHM
ASH: SENSE 1: the residue that remains when something is burned. SENSE 2: timber tree. SENSE 3: strong elastic wood of any of various ash trees; used for furniture.
"On burning coal we get ash." Here Sense 1 of ash is the winner sense.
"On burning the ash, we found that its roots were deep in the ground." Simple Lesk again picks Sense 1, but the intended meaning is the tree sense, which exposes a weakness of the algorithm.
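A minimal sketch of simplified Lesk on the slide's "ash" example; the gloss texts are paraphrased from the slide, and the crude suffix-stripping stemmer is an assumption added so that "burning" matches "burned":

```python
import re

# Toy sense inventory for "ash", paraphrased from the slide.
SENSES_OF_ASH = {
    "residue": "the residue that remains when something is burned",
    "timber tree": "timber tree",
    "wood": "strong elastic wood of any of various ash trees; used for furniture",
}

def stem(word):
    # Very crude suffix stripping, just enough for this toy example.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    return {stem(w) for w in re.findall(r"[a-z]+", text.lower())}

def simplified_lesk(context, target, senses):
    """Pick the sense whose gloss overlaps most with the context bag."""
    context_bag = tokenize(context) - {stem(target)}  # exclude the target word
    return max(senses, key=lambda s: len(context_bag & tokenize(senses[s])))

print(simplified_lesk("On burning coal we get ash.", "ash", SENSES_OF_ASH))
# → residue (Sense 1, as on the slide)
```

Running it on the second sentence from the slide shows the failure mode: function words like "the" and "that" in the residue gloss still win the overlap count.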

OVERLAP BASED APPROACHES: WALKER'S ALGORITHM
A thesaurus-based approach.
Step 1: For each sense of the target word, find the thesaurus category to which that sense belongs.
Step 2: Calculate the score for each sense using the context words. A context word adds 1 to the score of a sense if the thesaurus category of the word matches that of the sense.
E.g. The money in this bank fetches an interest of 8% per annum.
Target word: bank. Clue words from the context: money, interest, annum, fetch.

Clue word | Sense 1: Finance | Sense 2: Location
Money     | +1               | 0
Interest  | +1               | 0
Fetch     | 0                | 0
Annum     | +1               | 0
Total     | 3                | 0
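The two steps above can be sketched directly; the tiny word-to-category thesaurus mapping below is invented for illustration:

```python
# Invented toy thesaurus: word -> set of thesaurus categories.
THESAURUS = {
    "money": {"FINANCE"}, "interest": {"FINANCE"}, "annum": {"FINANCE"},
    "fetch": set(), "river": {"LOCATION"}, "shore": {"LOCATION"},
}

# Step 1: each sense of the target word mapped to its thesaurus category.
SENSE_CATEGORIES = {"bank/finance": "FINANCE", "bank/location": "LOCATION"}

def walker_score(context_words, sense_categories, thesaurus):
    """Step 2: a context word adds 1 to a sense whose category it shares."""
    scores = {sense: 0 for sense in sense_categories}
    for word in context_words:
        for sense, category in sense_categories.items():
            if category in thesaurus.get(word, set()):
                scores[sense] += 1
    return scores

scores = walker_score(["money", "interest", "fetch", "annum"],
                      SENSE_CATEGORIES, THESAURUS)
print(scores)  # finance scores 3, location 0, matching the table above
```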

WSD USING CONCEPTUAL DENSITY
Select a sense based on the relatedness of that word-sense to the context.
Relatedness is measured in terms of conceptual distance (i.e. how close the concept represented by the word and the concept represented by its context words are).
This approach uses a structured hierarchical semantic net (WordNet) for finding the conceptual distance.
The smaller the conceptual distance, the higher the conceptual density (i.e. if all words in the context are strong indicators of a particular concept, then that concept will have a higher density).

CONCEPTUAL DENSITY (EXAMPLE)
"The jury(2) praised the administration(3) and operation(8) of Atlanta Police Department(1)."
Step 1: Lattice making. Step 2: Compute CD. Step 3: Select the highest CD. Step 4: Select the concept.
[Figure: WordNet sub-hierarchies for the candidate senses (administrative unit, division, committee/jury, police department, local department, government department, body) with the computed CD values; the numeric values did not survive transcription.]

Supervised Approach: NAÏVE BAYES
ŝ = argmax_{s ∈ senses} Pr(s | V_w)
'V_w' is a feature vector consisting of:
POS of w;
semantic & syntactic features of w;
a collocation vector (set of words around it), typically the next word (+1), the next-to-next word (+2), -2, -1 and their POS's;
a co-occurrence vector (number of times w occurs in the bag of words around it).
Applying Bayes' rule and the naïve independence assumption:
ŝ = argmax_{s ∈ senses} Pr(s) · Π_{i=1..n} Pr(V_w^i | s)

NAÏVE BAYES Example
I went to the bank to withdraw some money.
Collocation vector: the words immediately around 'bank' and their POS tags.
Co-occurrence vector: considering a window of 2 words before and 2 words after 'bank', 'bank' appears once.
V_{sense of bank}: [feature vector shown on the slide]

NAÏVE BAYES Example (CONTD.)
P(V_bank | sense of bank) = ?
P(V_bank | sense of bank) = P(N | sense of bank) · P(org | sense of bank) · P(plural-s | sense of bank) · P(went | sense of bank) · P(withdraw | sense of bank) · P(money | sense of bank)
Say, P(org | sense1 of bank) > P(org | sense2 of bank).

ESTIMATING PARAMETERS
The parameters in probabilistic WSD are Pr(s) and Pr(V_w^i | s).
Senses are marked with respect to a sense repository (WordNet).
Pr(s) = count(s, w) / count(w)
Pr(V_w^i | s) = Pr(V_w^i, s) / Pr(s) = c(V_w^i, s, w) / c(s, w)
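The estimates above can be sketched as a toy classifier. The mini sense-tagged corpus, the bag-of-words features, and the add-one smoothing (which the slides do not mention, but which avoids zero probabilities for unseen features) are all assumptions for illustration:

```python
import math
from collections import Counter, defaultdict

# Invented toy sense-tagged corpus for the word "bank":
# (sense, context features).
TAGGED = [
    ("finance", ["withdraw", "money", "account"]),
    ("finance", ["deposit", "money", "interest"]),
    ("river", ["water", "fish", "shore"]),
]

def train(tagged):
    """Collect count(s, w) and count(feature, s, w) from the tagged data."""
    sense_counts = Counter(sense for sense, _ in tagged)
    feat_counts = defaultdict(Counter)
    for sense, feats in tagged:
        feat_counts[sense].update(feats)
    return sense_counts, feat_counts

def disambiguate(feats, sense_counts, feat_counts):
    """argmax over senses of log Pr(s) + sum_i log Pr(f_i | s)."""
    total = sum(sense_counts.values())
    best, best_lp = None, float("-inf")
    for sense, c in sense_counts.items():
        lp = math.log(c / total)  # prior: count(s, w) / count(w)
        for f in feats:
            lp += math.log((feat_counts[sense][f] + 1) / (c + 1))  # crude add-one
        if lp > best_lp:
            best, best_lp = sense, lp
    return best

sense_counts, feat_counts = train(TAGGED)
print(disambiguate(["withdraw", "money"], sense_counts, feat_counts))
# → finance
```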

Supervised Approach: DECISION LIST ALGORITHM
Based on the 'one sense per collocation' property: nearby words provide strong and consistent clues to the sense of a target word.
Collect a large set of collocations for the ambiguous word.
Calculate word-sense probability distributions for all such collocations.
Calculate the log-likelihood ratio: Log( Pr(Sense-A | Collocation_i) / Pr(Sense-B | Collocation_i) )
A higher log-likelihood means more predictive evidence.
Collocations are ordered in a decision list, with the most predictive collocations ranked highest.
This assumes there are only two senses for the word; of course, it can easily be extended to 'k' senses.

DECISION LIST ALGORITHM (CONTD.)
Classification of a test sentence is based on the highest-ranking collocation found in the test sentence. E.g. ... plucking flowers affects plant growth ...
[Figure: training data and the resultant decision list.]
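The ranking and first-match classification described above can be sketched as follows; the collocation counts and the smoothing constant are invented for illustration:

```python
import math

# Invented counts: (collocation, count with sense A "flora",
# count with sense B "factory") for the ambiguous word "plant".
COUNTS = [
    ("plant growth", 98, 2),
    ("plant employee", 1, 99),
    ("flowers", 40, 1),
]

def build_decision_list(counts, alpha=0.1):
    """Rank collocations by |log-likelihood ratio|, most predictive first."""
    entries = []
    for colloc, a, b in counts:
        llr = math.log((a + alpha) / (b + alpha))  # smoothed log ratio
        sense = "A" if llr > 0 else "B"
        entries.append((abs(llr), colloc, sense))
    return sorted(entries, reverse=True)

def classify(collocations_in_sentence, decision_list):
    for _, colloc, sense in decision_list:  # first matching rule wins
        if colloc in collocations_in_sentence:
            return sense
    return None

dlist = build_decision_list(COUNTS)
print(classify({"plant growth", "flowers"}, dlist))  # → A (the flora sense)
```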

Unsupervised Approach: HYPERLEX
KEY IDEA: instead of using "dictionary defined senses", extract the "senses from the corpus" itself.
These "corpus senses" or "uses" correspond to clusters of similar contexts for a word, e.g. (river, water, flow, electricity) vs. (victory, team, cup, world).
Example: "Outre la production d'électricité, le barrage permettra de réguler le cours du fleuve" (In addition to the production of electricity, the dam will regulate the river flow).
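The clustering idea can be sketched as hub detection in a co-occurrence graph of the target word's contexts. The toy contexts and the greedy highest-degree hub selection are assumptions; the full HyperLex algorithm additionally weights edges and builds a minimum spanning tree:

```python
from collections import defaultdict
from itertools import combinations

# Invented toy contexts of the French word "barrage" (dam vs. play-off).
CONTEXTS = [
    ["river", "water", "flow", "electricity"],
    ["water", "flow", "river"],
    ["victory", "team", "cup", "world"],
    ["team", "cup", "victory"],
]

def find_hubs(contexts, max_hubs=2):
    """Greedily pick high-degree 'hub' words; each hub anchors one 'use'."""
    graph = defaultdict(set)
    degree = defaultdict(int)
    for ctx in contexts:
        for a, b in combinations(set(ctx), 2):
            if b not in graph[a]:  # count each co-occurrence edge once
                graph[a].add(b); graph[b].add(a)
                degree[a] += 1; degree[b] += 1
    hubs = []
    remaining = set(degree)
    while remaining and len(hubs) < max_hubs:
        hub = max(remaining, key=lambda w: degree[w])  # best-connected word
        hubs.append(hub)
        remaining -= graph[hub] | {hub}  # its neighbourhood joins this use
    return hubs

print(find_hubs(CONTEXTS))  # one hub per induced use of "barrage"
```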

HYBRID: AN ITERATIVE APPROACH TO WSD
Extracts collocational and contextual information from WordNet (glosses) and a small amount of tagged data.
Monosemous words in the context serve as a seed set of disambiguated words.
Combines information obtained from multiple knowledge sources.
It would be interesting to exploit other semantic relations available in WordNet.

STRUCTURAL SEMANTIC INTERCONNECTIONS (SSI)
An iterative approach. Uses the following relations:
hypernymy (car#1 is a kind of vehicle#1), denoted by (kind-of);
hyponymy (the inverse of hypernymy), denoted by (has-kind);
meronymy (room#1 has-part wall#1), denoted by (has-part);
holonymy (the inverse of meronymy), denoted by (part-of);
attribute (dry#1 value-of wetness#1), denoted by (attr);
gloss, denoted by (gloss);
context, denoted by (context);
domain, denoted by (dl).
Monosemous words serve as the seed set for disambiguation.

Structural Semantic Interconnections (SSI) contd.
[Figure: a semantic relations graph for the two senses of the word bus (i.e. vehicle and connector).]

RESOURCES FOR WSD
Sense inventory: dictionaries; thesauri (Roget's Thesaurus); lexical KBs (WordNet).
Corpora: raw (Brown Corpus); sense-tagged (SemCor); automatically tagged corpora (Open Dictionary Project).

CONCLUSION
Using a diverse set of features improves WSD accuracy.
WSD results are better when the degree of polysemy is reduced.
HyperLex (unsupervised, corpus-based) and SSI (hybrid) look promising for resource-poor Indian languages.

FUTURE WORK
Use unsupervised or hybrid approaches to develop a WSD engine (focusing on MT).
Automatically generate sense-tagged data.
Explore whether it is possible to evaluate the role of WSD in MT.

REFERENCES
Agirre, Eneko & German Rigau, 1996. "Word sense disambiguation using conceptual density", in Proceedings of the 16th International Conference on Computational Linguistics (COLING), Copenhagen, Denmark.
Ng, Hwee T. & Hian B. Lee, 1996. "Integrating multiple knowledge sources to disambiguate word senses: An exemplar-based approach", in Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics (ACL), Santa Cruz, U.S.A.
Ng, Hwee T., 1997. "Exemplar-based word sense disambiguation: Some recent improvements", in Proceedings of the 2nd Conference on Empirical Methods in Natural Language Processing (EMNLP), Providence, U.S.A.

Rada Mihalcea and Dan Moldovan, 2000. "An Iterative Approach to Word Sense Disambiguation", in Proceedings of the Florida Artificial Intelligence Research Society Conference (FLAIRS 2000), Orlando, FL, May 2000.
Roberto Navigli and Paolo Velardi, 2005. "Structural Semantic Interconnections: A Knowledge-Based Approach to Word Sense Disambiguation", IEEE Transactions on Pattern Analysis and Machine Intelligence, July 2005.
Yee Seng Chan, Hwee Tou Ng and David Chiang, 2007. "Word Sense Disambiguation Improves Statistical Machine Translation", in Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), Prague, 2007.
Ping Chen, Chris Bowes (University of Houston-Downtown), Wei Ding and Max Choly (University of Massachusetts, Boston), 2012. "Word Sense Disambiguation with Automatically Acquired Knowledge", IEEE Intelligent Systems, IEEE Computer Society.

THANK YOU!

APPENDIX: Conceptual Density
CD(c, m) = ( Σ_{i=0}^{m-1} nhyp^i ) / ( Σ_{i=0}^{h-1} nhyp^i )
where nhyp is the mean number of hyponyms per node in the sub-hierarchy rooted at concept c, h is the height of that sub-hierarchy, and m is the number of relevant word senses below c.
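The formula can be transcribed directly as a small sketch (this is the basic form from Agirre & Rigau; the parameter values below are arbitrary):

```python
def conceptual_density(nhyp, m, h):
    """CD(c, m) = sum_{i=0}^{m-1} nhyp**i / sum_{i=0}^{h-1} nhyp**i."""
    expected = sum(nhyp ** i for i in range(m))  # ideal tree for m senses
    actual = sum(nhyp ** i for i in range(h))    # size of the sub-hierarchy
    return expected / actual

# A sub-hierarchy that is more fully covered by relevant senses scores higher:
print(conceptual_density(nhyp=2, m=3, h=4))  # (1+2+4)/(1+2+4+8) = 7/15 ≈ 0.467
```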

Unsupervised: LIN'S APPROACH
Two different words are likely to have similar meanings if they occur in identical local contexts. E.g. The facility will employ 500 new employees.
Senses of FACILITY: installation, proficiency, adeptness, readiness, toilet/bathroom.
Subjects of "employ":

Word          | Freq | Log Likelihood
ORG           |      |
Plant         |      |
Company       |      |
Industry      | 9    | 14.6
Unit          | 9    | 9.32
Aerospace     | 2    | 5.81
Memory device | 1    | 5.79
Pilot         | 2    | 5.37

(Some frequency and log-likelihood values did not survive transcription.)
In this case Sense 1 of facility (installation) would be the winner sense.