1
Word Sense Disambiguation and Information Retrieval By: Guitao Gao, Qing Ma Prof: Jian-Yun Nie
2
Outline Introduction WSD Approaches Conclusion
3
Introduction Task of Information Retrieval Content Representation Indexing Bag-of-words indexing Problems: –Synonymy: query expansion –Polysemy: Word Sense Disambiguation
4
WSD Approaches Disambiguation based on manually created rules Disambiguation using machine readable dictionaries Disambiguation using thesauri Disambiguation based on unsupervised machine learning with corpora
5
Disambiguation based on manually created rules Weiss’ approach [Weiss 1973]: –set of rules to disambiguate five words –context rule: within 5 words –template rule: specific location –accuracy: 90% –IR improvement: 1% Small & Rieger’s approach [Small 1982]: –Expert system
6
Disambiguation using machine readable dictionaries Lesk’s approach [Lesk 1988]: –Senses are represented by different definitions –Look up the definitions of context words –Find co-occurring words –Select the most similar sense –Accuracy: 50% - 70% –Problem: not enough overlapping words between definitions
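Lesk’s definition-overlap idea can be sketched as follows. This is a minimal illustration with an invented two-sense mini-dictionary for “bank”; Lesk used full machine-readable dictionary definitions, not this toy data.

```python
# Minimal Lesk-style sketch: score each sense of a word by the overlap
# between its definition and the definitions of the context words.
# The mini-dictionary below is invented for illustration.

def lesk_overlap(sense_definition, context_definitions):
    """Count words shared between a sense definition and context definitions."""
    sense_words = set(sense_definition.lower().split())
    context_words = set()
    for definition in context_definitions:
        context_words.update(definition.lower().split())
    return len(sense_words & context_words)

def disambiguate(senses, context_definitions):
    """Return the sense whose definition overlaps most with the context."""
    return max(senses, key=lambda s: lesk_overlap(senses[s], context_definitions))

# Hypothetical definitions of "bank" and a financial context.
senses = {
    "river": "sloping land beside a body of water",
    "finance": "an institution that accepts deposits and lends money",
}
context = ["money kept as deposits in an institution",
           "to lend money at interest"]
print(disambiguate(senses, context))  # prints "finance"
```

The shared words (money, deposits, institution, an) favour the financial sense; the low overlap counts also illustrate Lesk’s sparseness problem noted above.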
7
Disambiguation using machine readable dictionaries Wilks’ approach [Wilks 1990]: –Attempts to solve Lesk’s problem by expanding dictionary definitions –Uses the Longman Dictionary of Contemporary English (LDOCE) –More word co-occurrence evidence collected –Accuracy: between 53% and 85%
8
Wilks’ approach [Wilks 1990] Commonly co-occurring words in LDOCE. [Wilks 1990]
9
Disambiguation using machine readable dictionaries Luk’s approach [Luk 1995]: –Statistical sense disambiguation –Use definitions from LDOCE –Co-occurrence data collected from the Brown corpus –Defining concepts: the 1792 words used to write the definitions of LDOCE –LDOCE pre-processed: conceptual expansion
10
Luk’s approach [Luk 1995]: Noun “sentence” and its conceptual expansion [Luk 1995]: –LDOCE entry 1: “(an order given by a judge which fixes) a punishment for a criminal found guilty in court” – conceptual expansion: {order, judge, punish, crime, criminal, find, guilt, court} –LDOCE entry 2: “a group of words that forms a statement, command, exclamation, or question, usu. contains a subject and a verb, and (in writing) begins with a capital letter and ends with one of the marks . ! ?” – conceptual expansion: {group, word, form, statement, command, question, contain, subject, verb, write, begin, capital, letter, end, mark}
11
Luk’s approach [Luk 1995] cont. Collect co-occurrence data of defining concepts by constructing a two-dimensional Concept Co-occurrence Data Table (CCDT) –Brown corpus divided into sentences –Collect conceptual co-occurrence data for each defining concept occurring in the sentence –Insert the collected data into the Concept Co-occurrence Data Table
12
Luk’s approach [Luk 1995] cont. –Score each sense S with respect to context C [Luk 1995]
13
Luk’s approach [Luk 1995] cont. –Select sense with the highest score –Accuracy: 77% –Human accuracy: 71%
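The CCDT construction and scoring above can be sketched roughly as follows. The corpus sentences, concept sets, and the simple sum-of-counts score are invented for illustration; Luk’s actual score is a probabilistic measure computed from Brown-corpus co-occurrence data, not a raw count sum.

```python
# Simplified sketch of Luk's Concept Co-occurrence Data Table (CCDT).
# All data below is invented; the real CCDT is built from the Brown corpus
# over LDOCE's defining concepts.
from collections import defaultdict

# ccdt[a][b] = number of sentences in which concepts a and b co-occur.
ccdt = defaultdict(lambda: defaultdict(int))

def record_sentence(concepts):
    """Update the CCDT with one sentence's defining concepts (symmetric)."""
    for a in concepts:
        for b in concepts:
            if a != b:
                ccdt[a][b] += 1

def score_sense(sense_concepts, context_concepts):
    """Sum co-occurrence evidence between a sense's expansion and the context.
    (Stand-in for Luk's probabilistic score.)"""
    return sum(ccdt[a][b] for a in sense_concepts for b in context_concepts)

# Toy corpus sentences reduced to defining concepts.
record_sentence(["judge", "court", "punish"])
record_sentence(["word", "verb", "letter"])

legal_sense = {"order", "judge", "punish", "criminal", "court"}
grammar_sense = {"group", "word", "statement", "verb", "letter"}
context = {"judge", "court"}

# The legal sense of "sentence" scores higher in a courtroom context.
best = max([("legal", legal_sense), ("grammar", grammar_sense)],
           key=lambda s: score_sense(s[1], context))
print(best[0])  # prints "legal"
```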
14
Approaches using Roget's Thesaurus [Yarowsky 1992] Resources used: –Roget's Thesaurus –Grolier Multimedia Encyclopedia Senses of a word: categories in Roget's Thesaurus 1042 broad categories covering areas such as tools/machinery or animals/insects
15
Approaches using Roget's Thesaurus [Yarowsky 1992] cont. tool, implement, appliance, contraption, apparatus, utensil, device, gadget, craft, machine, engine, motor, dynamo, generator, mill, lathe, equipment, gear, tackle, tackling, rigging, harness, trappings, fittings, accoutrements, paraphernalia, equipage, outfit, appointments, furniture, material, plant, appurtenances, a wheel, jack, clockwork, wheelwork, spring, screw Some words placed into the tools/machinery category [Yarowsky 1992]
16
Approaches using Roget's Thesaurus [Yarowsky 1992] cont. Collect context for each category: –From the Grolier Encyclopedia –For each occurrence of each member of the category –Extract the 100 surrounding words Sample occurrence of words in the tools/machinery category [Yarowsky 1992]
17
Approaches using Roget's Thesaurus [Yarowsky 1992] cont. Identify and weight salient words: Sample salient words for Roget categories 348 and 414 [Yarowsky 1992] To disambiguate a word: sum the weights of all salient words appearing in its context Accuracy: 92% disambiguating 12 words
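The category-scoring step can be sketched as follows. The salient words and their weights are invented for illustration; Yarowsky derives the weights as log-probability estimates from Grolier training contexts.

```python
# Sketch of Yarowsky's Roget-category scoring: a word is disambiguated by
# summing the weights of the salient words of each candidate category that
# appear in its context. Salient-word lists and weights below are invented.

# salient[category][word] = weight of word as evidence for that category.
salient = {
    "tools/machinery": {"equipment": 2.1, "engine": 1.9, "wheel": 1.4},
    "animals/insects": {"species": 2.3, "wing": 1.7, "larva": 1.6},
}

def score_category(category, context_words):
    """Sum the weights of the category's salient words found in context."""
    return sum(salient[category].get(w, 0.0) for w in context_words)

def disambiguate(context_words):
    """Pick the Roget category with the highest total salient-word weight."""
    return max(salient, key=lambda c: score_category(c, context_words))

context = ["the", "engine", "drives", "a", "wheel", "through", "equipment"]
print(disambiguate(context))  # prints "tools/machinery"
```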
18
Introduction to WordNet(1) Online thesaurus system Synsets: sets of synonymous words Hierarchical relationships
19
Introduction to WordNet(2) [Sanderson 2000]
20
Voorhees’ Disambiguation Experiment Calculation of semantic distance: synset vs. context words Word’s sense: the synset closest to the context words Retrieval result: worse than without disambiguation
21
Gonzalo’s IR experiment(1) Two Questions Can WordNet really offer any potential for text retrieval? How is text retrieval performance affected by disambiguation errors?
22
Gonzalo’s IR experiment(2) Text Collection: Summaries and Documents Experiments 1. Standard SMART run 2. Indexed in terms of word senses 3. Indexed in terms of synsets 4. Introduction of disambiguation errors
23
Gonzalo’s IR experiment(3) Experiment: % correct documents retrieved –Indexed by synsets: 62.0 –Indexed by word senses: 53.2 –Indexed by words: 48.0 –Indexed by synsets (5% errors): 62.0 –Id. with 10% errors: 60.8 –Id. with 20% errors: 56.1 –Id. with 30% errors: 54.4 –Id. with all possible senses: 52.6 –Id. with 60% errors: 49.1
24
Gonzalo’s IR experiment(4) Disambiguation with WordNet can improve text retrieval The solution lies in reliable automatic WSD techniques
25
Disambiguation With Unsupervised Learning Yarowsky’s Unsupervised Method One sense per collocation, e.g. plant (manufacturing/life) One sense per discourse, e.g. defense (war/sports)
26
Yarowsky’s Unsupervised Method cont. Algorithm Details Step 1: Store the word and its contexts, one per line, e.g. “…zonal distribution of plant life…” Step 2: Identify a few seed words that represent each sense, e.g. plant (manufacturing/life) Step 3a: Learn rules from the training set: plant + X => A (weight); plant + Y => B (weight) Step 3b: Use the rules created in 3a to classify all occurrences of plant in the sample set
27
Yarowsky’s Unsupervised Method cont. Step 3c: Use the one-sense-per-discourse rule to filter or augment these additions Step 3d: Repeat Steps 3a-3c iteratively Step 4: The training converges on a stable residual set Step 5: The result is a set of rules used to disambiguate the word “plant”, e.g. plant + growth => life; plant + car => manufacturing
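The bootstrapping loop of Steps 3a-3d can be sketched as follows. The contexts and seed rules are toy data; the real algorithm ranks decision-list rules by log-likelihood and applies the one-sense-per-discourse filter across whole documents, which this sketch omits.

```python
# Toy sketch of Yarowsky's bootstrapping: start from a few seed collocation
# rules, classify occurrences, harvest new collocation rules from the newly
# labelled contexts, and repeat until the labelling stabilises.

def bootstrap(contexts, seeds, rounds=3):
    """contexts: list of word sets around 'plant'; seeds: word -> sense."""
    rules = dict(seeds)                  # collocation word -> sense
    labels = [None] * len(contexts)
    for _ in range(rounds):
        # Step 3b: classify every occurrence with the current rules.
        for i, ctx in enumerate(contexts):
            for word, sense in rules.items():
                if word in ctx:
                    labels[i] = sense
        # Step 3a: harvest new collocation rules from labelled contexts.
        for ctx, sense in zip(contexts, labels):
            if sense is not None:
                for word in ctx:
                    rules.setdefault(word, sense)
    return labels

contexts = [{"zonal", "distribution", "life"},
            {"life", "growth"},
            {"car", "assembly"},
            {"assembly", "workers"}]
seeds = {"life": "living", "car": "manufacturing"}
print(bootstrap(contexts, seeds))
```

Note how the last context contains neither seed word: it is labelled only after “assembly” is learned as a manufacturing collocation from the third context, which is the residual-set convergence of Step 4 in miniature.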
28
Yarowsky’s Unsupervised Method cont. Advantages of this method: Better accuracy compared to other unsupervised methods No need for costly hand-tagged training sets (as in supervised methods)
29
Schütze and Pedersen’s approach [Schütze 1995] Source of word sense definitions –Not using a dictionary or thesaurus –Using only the corpus to be disambiguated (Category B TREC-1 collection) Thesaurus construction –Collect a (symmetric) term-term matrix C –Entry c_ij: number of times that words i and j co-occur in a symmetric window of total size k –Use SVD to reduce the dimensionality
30
Schütze and Pedersen’s approach [Schütze 1995] cont. –Thesaurus vector: columns –Semantic similarity: cosine between columns –Thesaurus: associate each word with its nearest neighbors –Context vector: summing thesaurus vectors of context words
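The thesaurus construction can be sketched with a toy corpus. The sentences below are invented and the window is a whole sentence; the real system built a large sparse matrix over the TREC-1 collection and kept far more SVD dimensions.

```python
# Sketch of Schütze & Pedersen's thesaurus vectors: build a symmetric
# term-term co-occurrence matrix, reduce it with SVD, and compare words by
# cosine in the reduced space. Toy corpus invented for illustration.
import numpy as np

words = ["court", "judge", "trial", "law", "verb", "noun"]
idx = {w: i for i, w in enumerate(words)}

# Symmetric term-term matrix C: C[i, j] = co-occurrences of words i and j.
C = np.zeros((len(words), len(words)))
for sent in [["court", "judge", "law"], ["court", "trial", "law"], ["verb", "noun"]]:
    for a in sent:
        for b in sent:
            if a != b:
                C[idx[a], idx[b]] += 1

# SVD reduces dimensionality; rows of U * s are the thesaurus vectors.
U, s, Vt = np.linalg.svd(C)
k = 2
vectors = U[:, :k] * s[:k]

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# "judge" and "trial" never co-occur, but share the neighbours court and
# law, so their thesaurus vectors coincide (second-order similarity).
print(round(cosine(vectors[idx["judge"]], vectors[idx["trial"]]), 2))  # → 1.0
```

This second-order effect is why the derived thesaurus can relate words a plain co-occurrence count would treat as unrelated.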
31
Schütze and Pedersen’s approach [Schütze 1995] cont. Disambiguation algorithm –Identify context vectors corresponding to all occurrences of a particular word –Partition them into regions of high density –Tag a sense for each such region –Disambiguating a word: Compute context vector of its occurrence Find the closest centroid of a region Assign the occurrence the sense of that centroid
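The sense-assignment step can be sketched as follows. The thesaurus vectors and region centroids are invented; in the paper, context vectors are sums of thesaurus vectors and the centroids come from clustering the context vectors of all occurrences of the target word.

```python
# Sketch of the disambiguation step: compute the context vector of an
# occurrence and assign it the sense of the nearest region centroid.
# All vectors below are hand-picked toy values, not learned ones.
import numpy as np

thesaurus = {
    "judge": np.array([1.0, 0.0]), "law": np.array([0.9, 0.1]),
    "verb": np.array([0.0, 1.0]), "noun": np.array([0.1, 0.9]),
}

def context_vector(context_words):
    """Sum the thesaurus vectors of the words around an occurrence."""
    return sum(thesaurus[w] for w in context_words if w in thesaurus)

# Centroids of two high-density regions, each hand-tagged with a sense.
centroids = {"legal": np.array([1.0, 0.05]), "grammar": np.array([0.05, 1.0])}

def assign_sense(context_words):
    """Give the occurrence the sense of the closest region centroid."""
    v = context_vector(context_words)
    v = v / np.linalg.norm(v)
    return max(centroids, key=lambda sense: float(v @ centroids[sense]))

print(assign_sense(["the", "judge", "cited", "law"]))  # prints "legal"
```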
32
Schütze and Pedersen’s approach [Schütze 1995] cont. Accuracy: 90% Application to IR –Replacing the words by word senses –Sense-based retrieval’s average precision over 11 points of recall increased 4% with respect to word-based retrieval –Combining the rankings for each document: average precision increased 11% –Assigning each occurrence n (2, 3, 4, 5) senses: average precision increased 14% for n=3
34
Conclusion How much can WSD help improve IR effectiveness? Open question –Weiss: 1%; Voorhees’ method: negative –Krovetz and Croft, Sanderson: only useful for short queries –Schütze and Pedersen’s approach and Gonzalo’s experiment: positive results WSD must be accurate to be useful for IR Schütze and Pedersen’s and Yarowsky’s algorithms: promising for IR Luk’s approach: robust to data sparseness, suitable for small corpora
35
References [Krovetz 1992] R. Krovetz & W.B. Croft. “Lexical Ambiguity and Information Retrieval”, in ACM Transactions on Information Systems, 10(1), 1992 [Gonzalo 1998] J. Gonzalo, F. Verdejo, I. Chugur and J. Cigarran. “Indexing with WordNet synsets can improve Text Retrieval”, in Proceedings of the COLING/ACL ’98 Workshop on Usage of WordNet for NLP, Montreal, 1998 [Lesk 1988] M. Lesk. “They said true things, but called them by wrong names” – vocabulary problems in retrieval systems, in Proc. 4th Annual Conference of the University of Waterloo Centre for the New OED, 1988 [Luk 1995] A.K. Luk. “Statistical sense disambiguation with relatively small corpora using dictionary definitions”, in Proceedings of the 33rd Annual Meeting of the ACL, Columbus, Ohio, June 1995. Association for Computational Linguistics [Salton 1983] G. Salton & M.J. McGill. Introduction to Modern Information Retrieval. The SMART and SIRE experimental retrieval systems. New York: McGraw-Hill, 1983 [Sanderson 1997] M. Sanderson. Word Sense Disambiguation and Information Retrieval, PhD Thesis, Technical Report TR-1997-7, Department of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK, 1997 [Sanderson 2000] M. Sanderson. “Retrieving with Good Sense”, http://citeseer.nj.nec.com/sanderson00retrieving.html, 2000
36
References cont. [Schütze 1995] H. Schütze & J.O. Pedersen. “Information retrieval based on word senses”, in Proceedings of the Symposium on Document Analysis and Information Retrieval, 4: 161-175, 1995 [Small 1982] S. Small & C. Rieger. “Parsing and comprehending with word experts (a theory and its realisation)”, in Strategies for Natural Language Processing, W.G. Lehnert & M.H. Ringle, Eds., LEA: 89-148, 1982 [Voorhees 1993] E.M. Voorhees. “Using WordNet to disambiguate word sense for text retrieval”, in Proceedings of ACM SIGIR Conference, (16): 171-180, 1993 [Weiss 1973] S.F. Weiss. “Learning to disambiguate”, in Information Storage and Retrieval, 9: 33-41, 1973 [Wilks 1990] Y. Wilks, D. Fass, C. Guo, J.E. McDonald, T. Plate, B.M. Slator. “Providing Machine Tractable Dictionary Tools”, in Machine Translation, 5: 99-154, 1990 [Yarowsky 1992] D. Yarowsky. “Word sense disambiguation using statistical models of Roget’s categories trained on large corpora”, in Proceedings of COLING Conference: 454-460, 1992 [Yarowsky 1994] D. Yarowsky. “Decision lists for lexical ambiguity resolution: Application to Accent Restoration in Spanish and French”, in Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, NM, 1994 [Yarowsky 1995] D. Yarowsky. “Unsupervised word sense disambiguation rivaling supervised methods”, in Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, pages 189-196, Cambridge, MA, 1995