Slide 1: Fine-grained and Coarse-grained Word Sense Disambiguation
Jinying Chen, Hoa Trang Dang, Martha Palmer
August 22, 2003

Slide 2: Outline
- Maxent Word Sense Disambiguator
- Coarse-grained WSD by Decision Tree
- Future Work

Slide 3: Maxent Word Sense Disambiguator (Martha, Hoa, Christiane, 2002)
- A deterministic model producing a probability over senses:
  p(sense | context) = exp( Σ_j λ_j f_j(sense, context) ) / Z(context)
  where f_j(sense, context) are binary features, λ_j is the weight of feature j, and Z(context) is the normalizing factor
- Can combine evidence from different knowledge sources
- Feature weights are determined automatically (GIS)
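
A minimal sketch of how a conditional maxent classifier of this form scores candidate senses, assuming binary features and pre-trained weights; the feature names, sense labels, and weight values below are invented for illustration, not the actual model:

```python
import math

def maxent_probs(active_features, senses, weights):
    """Score each candidate sense with a conditional maxent model.

    active_features: set of binary features that fire for this context
    senses: list of candidate senses for the target word
    weights: dict mapping (feature, sense) -> lambda_j (learned, e.g., by GIS)
    """
    scores = {}
    for sense in senses:
        # sum of lambda_j * f_j(sense, context); f_j is 1 iff the feature fires
        scores[sense] = math.exp(sum(weights.get((f, sense), 0.0)
                                     for f in active_features))
    z = sum(scores.values())            # Z(context): normalizing factor
    return {sense: s / z for sense, s in scores.items()}

# Illustrative call with made-up features and weights for the verb "call"
weights = {("prev_word=will", "call_sense2"): 0.8,
           ("has_sentential_comp", "call_sense2"): 1.2,
           ("obj_synset=person.n.01", "call_sense1"): 0.9}
probs = maxent_probs({"prev_word=will", "has_sentential_comp"},
                     ["call_sense1", "call_sense2"], weights)
```

Only the scoring step is shown; GIS training would iteratively adjust the λ_j so that the model's expected feature counts match the empirical counts in the training data.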

Slide 4: Features Used in the Maxent Model for English WSD
- Local contextual predicates
  - Collocational features, e.g., the target verb w; POS of w; POS of the words at positions -1, +1 w.r.t. w; the words at positions -2, -1, +1, +2 w.r.t. w
  - Syntactic features, e.g., active vs. passive voice; whether there is a sentential complement, subject, object, or indirect object; etc.
  - Semantic features, e.g., Named Entity tags (PER, ORG, LOC) for proper nouns, and WordNet synsets and hypernyms for all nouns in the above syntactic relations to w
- Topical contextual keywords
  - Select the 200-300 words k with the lowest entropy of P(sense | k), i.e., the most informative words, from anywhere in the context (a rough sketch of this selection follows)
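
A small sketch of the topical-keyword selection step, assuming labeled training contexts are available; the function name and return format are illustrative:

```python
import math
from collections import Counter, defaultdict

def low_entropy_keywords(training_data, top_n=300):
    """Rank candidate keywords by the entropy of the estimated P(sense | keyword).

    training_data: iterable of (context_words, sense) pairs
    Returns the top_n words whose sense distribution has the lowest entropy,
    i.e., the words most informative about the target word's sense.
    """
    sense_counts = defaultdict(Counter)
    for words, sense in training_data:
        for w in set(words):
            sense_counts[w][sense] += 1

    def entropy(counter):
        total = sum(counter.values())
        return -sum((c / total) * math.log2(c / total) for c in counter.values())

    ranked = sorted(sense_counts, key=lambda w: entropy(sense_counts[w]))
    return ranked[:top_n]
```

In practice a minimum keyword frequency would also be needed, since a word seen only once trivially gets zero entropy.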

Slide 5: Fine-grained and Coarse-grained WSD
Part of the results from (Martha, Hoa, Christiane, 2002):

  Verb      WN (corp) senses   WN (corp) groups   MX-fine   MX-group
  call      28 (14)            11 (7)             0.470     0.667
  develop   21 (16)            9 (6)              0.493     0.739
  live      7 (6)              4 (4)              0.687     0.746
  pull      18 (10)            10 (5)             0.533     0.667
  strike    20 (16)            12 (10)            0.333     0.444

Table 1: Performance of the Maxent word sense disambiguator on five verbs

Slide 6: Coarse-grained WSD by Decision Tree
- A simpler model compared with the Maxent model
- Uses semantic features from PropBank
- PropBank:
  - Each verb is defined by several framesets
  - All verb instances belonging to the same frameset share a common set of roles
  - Roles can be ARGn (n = 0, 1, ...) and ARGM-f
  - Framesets are consistent with verb sense groups
  - Frameset tags and roles serve as semantic features for verb sense grouping (see the sketch below)
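
To make the frameset/role structure concrete, a hypothetical PropBank-style instance might be represented as below; the field names and the annotation itself are invented for illustration:

```python
# A hypothetical PropBank-style instance for the verb "call":
# a frameset id plus the roles that occur in the sentence.
instance = {
    "verb": "call",
    "frameset": "01",               # frameset id (e.g., 01 for call.01)
    "voice": "active",
    "roles": {
        "ARG0": "the committee",    # numbered arguments ARGn
        "ARG1": "a meeting",
        "ARGM-TMP": "on Friday",    # adjunct-like ARGM-f roles
    },
}
```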

Slide 7: Building the Decision Tree
- Use C5.0 as the DT learner
- 3 feature sets; SF (the simple feature set) works best (sketched below):
  - VOICE: PAS, ACT
  - FRAMESET: 01, 02, ...
  - ARGn (n = 0, 1, 2, ...): 0 (does not occur), 1 (occurs)
  - CoreFrame: 01-ARG0-ARG1, 02-ARG0-ARG2, ...
  - ARGM: 0 (no ARGM), 1 (has ARGM)
  - ARGM-f (f = DIS, ADV, ...): i (occurs i times)
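
The original experiments used C5.0; as a rough stand-in, the sketch below encodes SF-style features (VOICE, FRAMESET, ARGn, ARGM, ARGM-f; CoreFrame omitted for brevity) and trains a scikit-learn decision tree. The instance format, sense-group labels, and data are invented for illustration:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier

def simple_features(instance):
    """Encode one verb instance with SF-style features: VOICE, FRAMESET, ARGn, ARGM."""
    feats = {"VOICE": instance["voice"], "FRAMESET": instance["frameset"]}
    for role in instance["roles"]:
        if role.startswith("ARGM"):
            feats["ARGM"] = 1                       # some ARGM is present
            feats[role] = feats.get(role, 0) + 1    # ARGM-f occurrence count
        else:
            feats[role] = 1                         # ARGn occurs
    return feats

# Two invented training instances with sense-group labels, just to show the shape
train = [
    ({"voice": "active", "frameset": "01",
      "roles": {"ARG0": "the committee", "ARG1": "a meeting", "ARGM-TMP": "on Friday"}},
     "group1"),
    ({"voice": "passive", "frameset": "02",
      "roles": {"ARG1": "the dog", "ARG2": "Rex"}},
     "group2"),
]

vec = DictVectorizer()                 # one-hot encodes string-valued features
X = vec.fit_transform([simple_features(inst) for inst, _ in train])
y = [label for _, label in train]
clf = DecisionTreeClassifier().fit(X, y)
```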

Slide 8: Experimental Results
- Table 2: Error rate of the decision tree on five verbs (table not reproduced in this transcript)

Slide 9: Discussion
- A simple feature set and a simple DT algorithm work well
- Potential sparse-data problem:
  - More complicated DT algorithms (e.g., with boosting) tend to overfit the data
  - Complex features are not utilized by the model
  - Solution: use a large corpus, e.g., a parsed BNC corpus without frameset annotation

Slide 10: Future Work
- Train DT or other models for coarse-grained WSD on a large corpus without frameset annotation
- Unsupervised frameset tagging by EM clustering
- Cluster nouns automatically instead of using WordNet to group them

