1
Automatic Verb Sense Grouping --- Term Project Proposal for CIS630 Jinying Chen 10/28/2002
2
Motivation
“Making fine-grained and coarse-grained sense distinctions, both manually and automatically” (Martha, Hoa, Christiane, 2002)
– The difficulty of finding consistent criteria for making fine-grained sense distinctions, either manually or automatically
– Well-defined sense groups can alleviate this problem
– Potential application in Machine Translation
5
Model
Unsupervised learning
EM algorithm (similar to Dan Gildea 2002, Walde 2000, Rooth 1999, Ted Pedersen 1997)
6
EM clustering algorithm
Soft clustering: P(v|c)
Each verb v_i is associated with a set of features {f_i1, f_i2, …, f_in}; there are m clusters {c_1, c_2, …, c_m}
Estimate P(v|c) by maximizing the log-likelihood
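The soft-clustering idea on this slide can be illustrated with a small sketch. The slide does not spell out the exact model, so the code below assumes a mixture of multinomials over feature counts, with add-alpha smoothing to avoid zero probabilities. The function name `em_soft_cluster`, the toy count matrix, and the parameter `alpha` are all hypothetical and are not taken from the proposal.

```python
import math
import random


def em_soft_cluster(counts, m, iters=50, alpha=0.1, seed=0):
    """EM for a mixture of multinomials (an assumed model, not the proposal's).

    counts[i][f] is how often feature f was seen with item i.
    Returns (priors, theta, resp), where resp[i][c] is the soft
    membership of item i in cluster c.
    """
    rng = random.Random(seed)
    n_items, n_feats = len(counts), len(counts[0])

    # Random soft initialisation of the memberships.
    resp = []
    for _ in range(n_items):
        row = [rng.random() + 1e-3 for _ in range(m)]
        total = sum(row)
        resp.append([r / total for r in row])

    priors, theta = [], []
    for _ in range(iters):
        # M-step: re-estimate cluster priors and per-cluster feature
        # distributions from the current soft memberships (smoothed).
        priors = [
            (alpha + sum(resp[i][c] for i in range(n_items)))
            / (n_items + alpha * m)
            for c in range(m)
        ]
        theta = []
        for c in range(m):
            weighted = [
                alpha + sum(resp[i][c] * counts[i][f] for i in range(n_items))
                for f in range(n_feats)
            ]
            total = sum(weighted)
            theta.append([w / total for w in weighted])

        # E-step: recompute memberships in log space for stability.
        for i in range(n_items):
            log_joint = [
                math.log(priors[c])
                + sum(counts[i][f] * math.log(theta[c][f]) for f in range(n_feats))
                for c in range(m)
            ]
            top = max(log_joint)
            log_norm = top + math.log(sum(math.exp(v - top) for v in log_joint))
            resp[i] = [math.exp(v - log_norm) for v in log_joint]
    return priors, theta, resp


# Toy data (invented): six items, four features, two obvious groups.
toy_counts = [
    [8, 1, 0, 0], [7, 2, 0, 0], [9, 0, 1, 0],
    [0, 0, 2, 7], [0, 1, 1, 8], [1, 0, 0, 9],
]
priors, theta, resp = em_soft_cluster(toy_counts, m=2)
for i, row in enumerate(resp):
    print(i, [round(p, 3) for p in row])
```

Each row of `resp` is a probability distribution over clusters, which is what makes the clustering soft. Which cluster index ends up with which group depends on the random initialisation.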
7
Two problems
How many clusters for a particular verb?
– Human knowledge of the rough number of verb sense groups is instructive in unsupervised learning
– Olga’s proposal
How many features for a particular verb?
– May not be a problem: hopefully the EM algorithm can do feature selection to some degree
– However, a well-restricted feature set can reduce the model complexity (O(nm)) and alleviate the effect of noisy data
– Borrow ideas from “Automatic Verb Classification Based on Statistical Distributions of Argument Structure” (Paola Merlo and Suzanne Stevenson, 2001)
8
Plan
Phase I --- Corpus analysis
– Automatically and manually
– Determine the range of feature set for each verb
Phase II --- Automatic verb sense grouping
– Implement EM clustering algorithm
– Evaluate the performance
Phase III --- Compare with other clustering methods
– Ward’s minimum-variance method (Ward, 1963)
– McQuitty’s similarity analysis (McQuitty, 1966)
– Spectral Clustering (Brew & Walde, 2002)
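As a point of reference for the Phase III comparison, here is a minimal sketch of Ward's minimum-variance method. Unlike EM, it produces a hard clustering. The sketch assumes feature vectors are compared with squared Euclidean distance and uses a naive search over all cluster pairs, which is only practical for small inputs. The function name `ward_clusters` and the toy vectors are hypothetical.

```python
def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion.

    Repeatedly merges the two clusters whose union gives the smallest
    increase in within-cluster sum of squares, until k clusters remain.
    Returns a list of k sorted lists of point indices.
    """
    dim = len(points[0])
    clusters = [[i] for i in range(len(points))]

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members) for d in range(dim)]

    def merge_cost(a, b):
        # Increase in within-cluster variance caused by merging a and b.
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    while len(clusters) > k:
        # Naive search over all pairs for the cheapest merge.
        _, i, j = min(
            (merge_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy feature vectors (invented): two obvious groups of three.
toy_points = [
    [8, 1, 0, 0], [7, 2, 0, 0], [9, 0, 1, 0],
    [0, 0, 2, 7], [0, 1, 1, 8], [1, 0, 0, 9],
]
groups = ward_clusters(toy_points, k=2)
print(groups)
```

For real data, a library routine such as SciPy's hierarchical clustering would be the more usual choice. The hand-written version is shown only to make the merge criterion explicit.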