Published by Deanna Braley. Modified over 10 years ago.
1 Extended Gloss Overlaps as a Measure of Semantic Relatedness
Satanjeev Banerjee, Carnegie Mellon University
Ted Pedersen, University of Minnesota Duluth
Supported by NSF Grants: #0092784, REC-9979894
2 Semantic Relatedness
► Some pairs of words are closer in meaning than others
  E.g. car – tire are strongly related
  car – tree are not strongly related
► Relatedness between words can consist of
  Synonymy [e.g. car – automobile]
  Is-a/has-a relationships [e.g. car – tire]
  Co-occurrence [e.g. car – insurance]
3 Goal of this Paper
► Create a measure to quantify semantic relatedness
  Most existing work measures noun-noun only.
  ► Resnik (1995), Lin (1997), Jiang-Conrath (1997), Leacock-Chodorow (1998)
  We can measure across parts of speech.
  Based on WordNet definitions and relations.
► Evaluate
  Using word sense disambiguation.
  Compare to human relatedness judgments (in paper)
4 Description of WordNet
► Online English lexical database.
► Like dictionaries, contains word senses and their definitions or glosses
  E.g.: sentence: “the penalty meted out to one adjudged guilty”
► Word senses that mean the same are grouped into synonym sets or synsets
  E.g.: {sentence, conviction, condemnation}
5-9 Semantic Relations in WordNet
► Synsets are connected to other synsets through “semantic relations”
► sentence: “the penalty meted out to one adjudged guilty”
► [hypernym] a “sentence” is a …
  final judgment: “a judgment disposing of the case before the court of law”
► [hyponym] … is a “sentence”
  hard time: “term served in a maximum security prison”
  death penalty: “punishment by death via execution”
10-12 Gloss Overlaps ≈ Relatedness
► Lesk’s (1986) idea: Related word senses are (often) defined using the same words. E.g.:
  bank(1): “a financial institution”
  bank(2): “sloping land beside a body of water”
  lake: “a body of water surrounded by land”
► Gloss overlaps = # content words common to two glosses ≈ relatedness
  Thus, relatedness (bank(2), lake) = 3
  And, relatedness (bank(1), lake) = 0
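The counting on this slide can be reproduced with a minimal Python sketch. The stop-word list and the whitespace tokenizer are assumptions made here for illustration; the slides do not say which function words are excluded.

```python
# Toy stop-word list: an assumption of this sketch, not taken from the slides.
STOP_WORDS = {"a", "an", "the", "of", "by", "to", "in", "on"}

def content_words(gloss):
    """Lower-case the gloss, split on whitespace, and drop stop words."""
    return {w for w in gloss.lower().split() if w not in STOP_WORDS}

def gloss_overlap(gloss_a, gloss_b):
    """Number of distinct content words shared by the two glosses."""
    return len(content_words(gloss_a) & content_words(gloss_b))

# Glosses exactly as given on the slide.
GLOSSES = {
    "bank(1)": "a financial institution",
    "bank(2)": "sloping land beside a body of water",
    "lake": "a body of water surrounded by land",
}

print(gloss_overlap(GLOSSES["bank(2)"], GLOSSES["lake"]))  # 3 (land, body, water)
print(gloss_overlap(GLOSSES["bank(1)"], GLOSSES["lake"]))  # 0
```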
13 Limitations of (Lesk’s) Gloss Overlaps
► Most glosses are very short.
  So not enough words to find overlaps with.
► Solution: Extended gloss overlaps
  Add glosses of synsets connected to the input synsets.
14-16 Extending a Gloss
sentence: “the penalty meted out to one adjudged guilty”
bench: “persons who hear cases in a court of law”
# overlapped words = 0
Add the gloss of the hypernym of “sentence”:
final judgment: “a judgment disposing of the case before the court of law”
# overlapped words = 2
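A small Python sketch of this example: comparing the two glosses directly finds no shared content words, while adding the hypernym's gloss finds two ("court" and "law"). The stop-word list, the absence of stemming (so "case" and "cases" do not match), and the helper names are assumptions of this sketch.

```python
# Toy stop-word list: an assumption of this sketch.
STOP_WORDS = {"a", "the", "of", "to", "in", "out", "one", "who", "before"}

def content_words(gloss):
    """Distinct non-stop-word tokens of a gloss (no stemming)."""
    return {w for w in gloss.lower().split() if w not in STOP_WORDS}

def overlap(glosses_a, glosses_b):
    """Sum of shared content words over every pair of glosses."""
    return sum(
        len(content_words(ga) & content_words(gb))
        for ga in glosses_a
        for gb in glosses_b
    )

SENTENCE = "the penalty meted out to one adjudged guilty"
FINAL_JUDGMENT = "a judgment disposing of the case before the court of law"
BENCH = "persons who hear cases in a court of law"

# Plain gloss comparison: nothing in common.
print(overlap([SENTENCE], [BENCH]))                  # 0
# Extended with the gloss of the hypernym "final judgment".
print(overlap([SENTENCE, FINAL_JUDGMENT], [BENCH]))  # 2
```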
17 Creating the Extended Gloss Overlap Measure
► How to measure overlaps?
► Which relations to use for gloss extension?
18 How to Score Overlaps?
► Lesk simply summed up overlapped words.
► But matches involving phrases – phrasal matches – are rarer, and more informative
  E.g. “court of law”
► Aim: Score of n words in a phrase > sum of scores of n words in shorter phrases
► Solution: Give a phrase of n words a score of n²
  “court of law” gets score of 9.
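One way to realize this scoring is sketched below in Python: repeatedly find the longest run of consecutive words shared by the two glosses, add the square of its length, and blank it out so it cannot match again. Skipping runs made up only of stop words, the stop-word list itself, and the function names are assumptions of this sketch rather than details given on the slide.

```python
# Toy stop-word list: an assumption of this sketch.
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "on", "who", "before"}

def longest_common_phrase(a, b):
    """Return (start_a, start_b, length) of the longest shared run of
    consecutive tokens containing at least one content word; length 0 if none."""
    best = (0, 0, 0)
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            # Extend the match; None marks tokens already used by an earlier match.
            while (i + k < len(a) and j + k < len(b)
                   and a[i + k] is not None and a[i + k] == b[j + k]):
                k += 1
            has_content = any(w not in STOP_WORDS for w in a[i:i + k])
            if k > best[2] and has_content:
                best = (i, j, k)
    return best

def phrasal_score(gloss_a, gloss_b):
    """Sum of n*n over the shared phrases, longest phrase first."""
    a, b = gloss_a.lower().split(), gloss_b.lower().split()
    score = 0
    while True:
        i, j, n = longest_common_phrase(a, b)
        if n == 0:
            return score
        score += n * n
        a[i:i + n] = [None] * n   # blank out the matched phrase in both glosses
        b[j:j + n] = [None] * n

FINAL_JUDGMENT = "a judgment disposing of the case before the court of law"
BENCH = "persons who hear cases in a court of law"

print(phrasal_score(FINAL_JUDGMENT, BENCH))  # 9: "court of law" is one 3-word phrase
```

Scoring the three words separately would give 1 + 1 + 1 = 3, so the squared score of 9 rewards the phrasal match as the slide intends.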
19 Which Relations to Use?
► Hypernyms [ “car” → “vehicle” ]
► Hyponyms [ “car” → “convertible” ]
► Meronyms [ “car” → “accelerator” ]
► Holonym [ “car” → “train” ]
► Also-see relation [ “enter” → “move in” ]
► Attribute [ “measure” → “standard” ]
► Pertainym [ “centennial” → “century” ]
20 Extended Gloss Overlap Measure
► Input two synsets A and B
► Find phrasal gloss overlaps between A and B
► Next, find phrasal gloss overlaps between every synset connected to A, and every synset connected to B
► Compute phrasal scores for all such overlaps
► Add phrasal scores to get relatedness of A and B
► A and B can be from different parts of speech.
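The steps above can be sketched in Python over a toy lexicon. The `GLOSS` and `RELATED` dictionaries are hypothetical stand-ins for WordNet, holding only glosses quoted in these slides. Comparing every gloss on A's side with every gloss on B's side is one simple reading of the slide, assumed here; the stop-word handling is also an assumption.

```python
# Toy stop-word list: an assumption of this sketch.
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "on", "who", "before"}

def longest_common_phrase(a, b):
    """(start_a, start_b, length) of the longest shared token run that
    contains a content word; length 0 if there is none."""
    best = (0, 0, 0)
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while (i + k < len(a) and j + k < len(b)
                   and a[i + k] is not None and a[i + k] == b[j + k]):
                k += 1
            if k > best[2] and any(w not in STOP_WORDS for w in a[i:i + k]):
                best = (i, j, k)
    return best

def phrasal_score(gloss_a, gloss_b):
    """Sum of n*n over shared phrases, removing each phrase once matched."""
    a, b = gloss_a.lower().split(), gloss_b.lower().split()
    score = 0
    while True:
        i, j, n = longest_common_phrase(a, b)
        if n == 0:
            return score
        score += n * n
        a[i:i + n] = [None] * n
        b[j:j + n] = [None] * n

# Hypothetical miniature lexicon built from the glosses in the slides.
GLOSS = {
    "sentence": "the penalty meted out to one adjudged guilty",
    "final judgment": "a judgment disposing of the case before the court of law",
    "bench": "persons who hear cases in a court of law",
}
RELATED = {"sentence": ["final judgment"], "bench": []}  # e.g. hypernym links

def extended_gloss_overlap(a, b):
    """Add phrasal scores over the glosses of A, B, and their related synsets."""
    glosses_a = [GLOSS[a]] + [GLOSS[s] for s in RELATED.get(a, [])]
    glosses_b = [GLOSS[b]] + [GLOSS[s] for s in RELATED.get(b, [])]
    return sum(phrasal_score(ga, gb) for ga in glosses_a for gb in glosses_b)

print(extended_gloss_overlap("sentence", "bench"))  # 9, all from "court of law"
```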
21 Evaluation: On WSD
► Test semantic relatedness measures on Word Sense Disambiguation (WSD) task.
► WSD = determine the intended sense of a multi-sense word in a sentence
  E.g.: I sat on the bank of the lake.
► Our WSD algorithm: Pick that sense of the target word that is most strongly related to its neighboring words. (based on Lesk ’86)
22-25 Word sense disambiguation using a relatedness measure
the bench pronounced the sentence
bench: “persons who hear cases in a court of law”
bench: “a long seat for more than one person”
pronounce: “pronounce judgment on”
pronounce: “speak or utter in a certain way”
sentence: “the penalty meted out to one adjudged guilty”
sentence: “a string of words that satisfies grammar rules”
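A minimal Python sketch of sense selection for this example follows. Several things here are assumptions made for illustration: the sense labels such as `sentence#1` are hypothetical and are not WordNet sense numbers; only the first sense of "sentence" is extended, using the hypernym gloss quoted earlier in the deck; relatedness is a plain content-word count rather than the phrasal score; and each candidate sense is scored against all senses of each neighbor.

```python
# Toy stop-word list: an assumption of this sketch.
STOP_WORDS = {"a", "the", "of", "to", "in", "on", "or", "out", "one",
              "who", "for", "that", "before"}

def content_words(gloss):
    """Distinct non-stop-word tokens of a gloss."""
    return {w for w in gloss.lower().split() if w not in STOP_WORDS}

def relatedness(glosses_a, glosses_b):
    """Shared content words, summed over every pair of glosses."""
    return sum(len(content_words(ga) & content_words(gb))
               for ga in glosses_a for gb in glosses_b)

# Each (hypothetical) sense label maps to its list of glosses.
SENSES = {
    "bench": {
        "bench#1": ["persons who hear cases in a court of law"],
        "bench#2": ["a long seat for more than one person"],
    },
    "pronounce": {
        "pronounce#1": ["pronounce judgment on"],
        "pronounce#2": ["speak or utter in a certain way"],
    },
    "sentence": {
        # Own gloss plus the gloss of its hypernym "final judgment".
        "sentence#1": ["the penalty meted out to one adjudged guilty",
                       "a judgment disposing of the case before the court of law"],
        "sentence#2": ["a string of words that satisfies grammar rules"],
    },
}

def disambiguate(target, neighbors):
    """Pick the sense of `target` most related to the senses of its neighbors."""
    def score(sense):
        return sum(relatedness(SENSES[target][sense], glosses)
                   for word in neighbors
                   for glosses in SENSES[word].values())
    return max(SENSES[target], key=score)

print(disambiguate("sentence", ["bench", "pronounce"]))  # sentence#1
print(disambiguate("bench", ["pronounce", "sentence"]))  # bench#1
```

Without the hypernym gloss, every candidate sense of "sentence" scores zero against its neighbors, which is the short-gloss limitation the extended measure is meant to address.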
35 Evaluation Data
► Data from SENSEVAL-2 WSD exercise.
► 4,328 passages, each 2-3 sentences long and containing 1 multi-sense target word.
► Each target word labeled by humans with its most appropriate WordNet sense.
► WSD algorithm’s output senses compared against these human labels.
► Precision, recall, and f-measure reported.
36 Evaluation Results
Algorithm        Precision  Recall  F-measure
Sval-1st         0.402      0.401   0.401
Extended Gloss   0.351      0.342   0.346
Sval-2nd         0.293      0.293   0.293
Sval-3rd         0.247      0.244   0.245
Lesk             0.183      0.183   0.183
Random           0.141      0.141   0.141
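For reference, the three metrics can be computed as in the Python sketch below. The definitions assumed here are the usual ones for WSD evaluation: precision is correct answers over attempted instances, recall is correct answers over all instances, and f-measure is their harmonic mean. The counts in the first example are hypothetical, not SENSEVAL-2 figures; the second example checks the Extended Gloss row, which reports precision 0.351 and recall 0.342.

```python
def precision(correct, attempted):
    """Fraction of attempted instances that were answered correctly."""
    return correct / attempted if attempted else 0.0

def recall(correct, total):
    """Fraction of all instances that were answered correctly."""
    return correct / total if total else 0.0

def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Hypothetical counts, for illustration only.
p = precision(correct=45, attempted=90)   # 0.5
r = recall(correct=45, total=100)         # 0.45
print(p, r)

# Extended Gloss row: the reported precision and recall give the reported f-measure.
print(round(f_measure(0.351, 0.342), 3))  # 0.346
```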
37 Which WN Relations Help?
► Evaluation with a single relation at a time
  E.g., comparing only hypernyms, only hyponyms, etc.
► Result: No single comparison is a big source of information.
  No pair exceeded f-measure of 0.136, as compared to overall f-measure of 0.346
38 Which WN Relations Help?
► Most helpful were:
  Hyponym relation
  ► kinds of “car” → “compact”, “SUV”, “coupe”, etc.
  Meronym relation
  ► parts of “car” → “accelerator”, “wheel”, “hood”, etc.
► These relations are usually one-many. Thus they give access to many glosses.
► Implies: more glosses → more useful.
39 Conclusions
► We presented a new measure of semantic relatedness
  Can operate across parts of speech.
► We evaluated on the task of WSD.
  Performed much better than the Lesk baseline
  Performance comparable to other systems.
► Future work:
  Augment using corpus statistics.
  Evaluate on different task.
40 Resources
► WordNet::Similarity (relatedness measures) (http://search.cpan.org/dist/WordNet-Similarity)
  Extended gloss overlaps
  Resnik, Lin, Jiang-Conrath
  Leacock-Chodorow, Hirst-St. Onge
  Edge Counting, Random
► SenseRelate (WSD using relatedness) (http://www.d.umn.edu/~tpederse/senserelate.html)