Distributional clustering of English words
Authors: Fernando Pereira, Naftali Tishby, Lillian Lee
Presenter: Marian Olteanu

Introduction
- Method for automatic clustering of words
  - Based on their distribution in particular syntactic contexts
  - Deterministic annealing
    - Finds the lowest-distortion sets of clusters
    - As the annealing parameter increases, clusters subdivide, giving a hierarchical “soft” clustering
  - Clusters serve as class models of word co-occurrence

Introduction
- Simple tabulation of frequencies
  - Suffers from data sparseness
- Hindle proposed smoothing based on clustering
  - Estimate the likelihood of unseen events from the frequencies of “similar” events that have been seen
  - Example: estimate the likelihood of a particular direct object for a verb from the likelihood of that direct object for similar verbs

Introduction
- Hindle’s proposal
  - Words are similar if there is strong statistical evidence that they tend to participate in the same events
- This paper
  - Factor word association tendencies into associations of words to certain hidden classes and associations between the classes themselves
  - Derive the classes directly from the data

Introduction
- Classes
  - Probabilistic concepts or clusters c, with a membership probability p(c|w) for each word w
  - Different from classical “hard” Boolean classes
  - Thus the method is more robust: it is not strongly affected by errors in frequency counts
- Problem in this paper
  - Two word classes: V (verbs) and N (nouns)
  - Relation between a transitive main verb and the head noun of its direct object

Problem
- Raw knowledge
  - $f_{vn}$: frequency of occurrence of a particular pair (v, n) in the training corpus
- Unsmoothed probability (conditional density)
  - $p_n(v) = f_{vn} / \sum_{v'} f_{v'n}$
  - This is p(v|n)
- Problem
  - How to use $p_n$ to classify the nouns $n \in N$
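
As a minimal sketch of the unsmoothed estimate on this slide, the Python below normalizes pair counts into one verb distribution per noun. The function name and the toy counts are hypothetical and chosen only for illustration; they do not come from the paper's corpus.

```python
from collections import defaultdict

def conditional_verb_distributions(pair_counts):
    """Turn {(verb, noun): f_vn} into {noun: {verb: p_n(v)}}."""
    # Total count of each noun as a direct object, summed over verbs.
    noun_totals = defaultdict(float)
    for (verb, noun), count in pair_counts.items():
        noun_totals[noun] += count
    # p_n(v) = f_vn / sum over v' of f_v'n
    dists = defaultdict(dict)
    for (verb, noun), count in pair_counts.items():
        dists[noun][verb] = count / noun_totals[noun]
    return dict(dists)

# Hypothetical toy counts (not real corpus data).
toy_counts = {
    ("drink", "wine"): 3,
    ("pour", "wine"): 1,
    ("drink", "beer"): 2,
    ("read", "book"): 4,
}
p = conditional_verb_distributions(toy_counts)
print(p["wine"])
```

Verbs never seen with a noun simply have no entry, which is the data sparseness problem the clustering is meant to address.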

Methodology
- Measure of similarity between distributions
  - Kullback-Leibler distance (relative entropy)
- This problem
  - Unsupervised learning: learn the underlying distribution of the data
  - Objects have no internal structure; the only information is statistics about their joint appearance (a kind of supervised learning)
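
The Kullback-Leibler distance named on this slide is $D(p \| q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$. A minimal Python sketch follows, assuming distributions are stored as dictionaries and using natural logarithms; the function name is hypothetical.

```python
import math

def kl_divergence(p, q):
    """D(p || q) in nats for distributions given as {outcome: probability}."""
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue  # terms with p(x) = 0 contribute nothing
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf  # q rules out an outcome that p allows
        total += px * math.log(px / qx)
    return total

p = {"a": 0.5, "b": 0.5}
q = {"a": 0.25, "b": 0.75}
print(kl_divergence(p, q), kl_divergence(q, p))
```

The two printed values differ, which shows that the measure is not symmetric, so it is a "distance" only in a loose sense.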

Distributional Clustering
- Goal: find clusters such that $p_n(v)$ is approximated by $\hat{p}_n(v) = \sum_{c} p(c|n)\, p_c(v)$
- Solve by EM
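
The slide does not spell out the update equations, so the following is only a sketch of one EM-like alternation under stated assumptions: memberships are taken proportional to exp(-β D(p_n || p_c)), centroids are membership-weighted averages of the noun distributions, and all nouns are weighted equally. The function names, the starting centroids, and the toy data are hypothetical, and the paper's exact updates may differ.

```python
import math

def kl(p, q):
    """D(p || q) in nats; infinite if q gives zero mass where p does not."""
    total = 0.0
    for v, pv in p.items():
        if pv > 0.0:
            qv = q.get(v, 0.0)
            if qv == 0.0:
                return math.inf
            total += pv * math.log(pv / qv)
    return total

def soft_cluster(noun_dists, centroids, beta, iterations=20):
    """Alternate membership and centroid updates at a fixed beta > 0."""
    verbs = sorted({v for p_n in noun_dists.values() for v in p_n})
    for _ in range(iterations):
        # Membership step: p(c|n) proportional to exp(-beta * D(p_n || p_c)).
        member = {}
        for noun, p_n in noun_dists.items():
            dist = [beta * kl(p_n, c) for c in centroids]
            nearest = min(dist)  # shift before exponentiating to limit underflow
            weights = [math.exp(nearest - d) for d in dist]
            z = sum(weights)
            member[noun] = [w / z for w in weights]
        # Centroid step: membership-weighted average of the noun distributions.
        updated = []
        for k in range(len(centroids)):
            mass = sum(member[noun][k] for noun in noun_dists)
            centroid = {v: 0.0 for v in verbs}
            for noun, p_n in noun_dists.items():
                for v, pv in p_n.items():
                    centroid[v] += member[noun][k] * pv / mass
            updated.append(centroid)
        centroids = updated
    return member, centroids

# Hypothetical toy distributions p_n(v) and starting centroids.
toy = {
    "wine": {"drink": 0.8, "pour": 0.2},
    "beer": {"drink": 0.7, "pour": 0.3},
    "book": {"read": 0.6, "write": 0.4},
    "paper": {"read": 0.5, "write": 0.5},
}
start = [
    {"drink": 0.4, "pour": 0.3, "read": 0.2, "write": 0.1},
    {"drink": 0.1, "pour": 0.2, "read": 0.3, "write": 0.4},
]
member, final_centroids = soft_cluster(toy, start, beta=5.0)
print(member)
```

The sketch assumes every noun stays at a finite divergence from at least one centroid and that no cluster loses all of its membership mass; a fuller implementation would guard both cases.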

Hierarchical clustering
- Deterministic annealing
  - Sequence of phase transitions
  - Obtained by increasing the parameter β
    - β sets the local influence of each noun on the definition of centroids
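
To illustrate the role of β, the sketch below computes the membership of one noun in two fixed centroids for several values of β, assuming the exponential membership form p(c|n) ∝ exp(-β D(p_n || p_c)). The noun distribution, the centroids, and the β values are hypothetical.

```python
import math

def kl(p, q):
    """D(p || q) in nats, assuming q(v) > 0 wherever p(v) > 0."""
    return sum(pv * math.log(pv / q[v]) for v, pv in p.items() if pv > 0.0)

def membership(p_n, centroids, beta):
    """Soft membership of one noun in each centroid at a given beta."""
    dist = [kl(p_n, c) for c in centroids]
    nearest = min(dist)  # shift so the largest weight is exactly 1
    weights = [math.exp(-beta * (d - nearest)) for d in dist]
    z = sum(weights)
    return [w / z for w in weights]

p_wine = {"drink": 0.8, "pour": 0.2}
centroids = [{"drink": 0.7, "pour": 0.3}, {"drink": 0.3, "pour": 0.7}]
for beta in (0.01, 1.0, 10.0, 100.0):
    print(beta, [round(m, 4) for m in membership(p_wine, centroids, beta)])
```

At small β the noun is shared almost evenly between the centroids, so every noun influences every centroid; at large β the assignment becomes nearly hard and each noun affects only its nearest centroid.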

Results

Evaluation
- Relative entropy: $\sum_n D(t_n \| \hat{p}_n)$
  - where $t_n$ is the relative frequency distribution of verbs taking n as direct object in the test set
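
A minimal sketch of this measure, assuming the cluster model predicts $\hat{p}_n(v) = \sum_c p(c|n)\, p_c(v)$ and that the score is the sum over test nouns of $D(t_n \| \hat{p}_n)$. All names and numbers below are hypothetical.

```python
import math

def kl(p, q):
    """D(p || q) in nats; infinite if q gives zero mass where p does not."""
    total = 0.0
    for v, pv in p.items():
        if pv > 0.0:
            qv = q.get(v, 0.0)
            if qv == 0.0:
                return math.inf
            total += pv * math.log(pv / qv)
    return total

def model_distribution(membership, centroids):
    """p_hat_n(v): mixture of centroid distributions weighted by p(c|n)."""
    p_hat = {}
    for weight, centroid in zip(membership, centroids):
        for v, pv in centroid.items():
            p_hat[v] = p_hat.get(v, 0.0) + weight * pv
    return p_hat

def total_relative_entropy(test_dists, memberships, centroids):
    """Sum over test nouns n of D(t_n || p_hat_n)."""
    return sum(
        kl(t_n, model_distribution(memberships[n], centroids))
        for n, t_n in test_dists.items()
    )

# Hypothetical model and held-out distributions.
centroids = [{"drink": 0.75, "pour": 0.25}, {"read": 0.55, "write": 0.45}]
memberships = {"wine": [1.0, 0.0], "book": [0.0, 1.0]}
test_dists = {
    "wine": {"drink": 0.75, "pour": 0.25},
    "book": {"read": 0.5, "write": 0.5},
}
score = total_relative_entropy(test_dists, memberships, centroids)
print(score)
```

Lower is better: a noun whose held-out distribution matches the model exactly contributes zero, and a test verb to which the model gives zero probability makes the score infinite.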

Evaluation
- Check whether the model can disambiguate between two verbs, v and v’
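
The slide does not give the decision rule, so the sketch below assumes the simplest one: for a noun n, prefer whichever of the two verbs has the higher model probability $\hat{p}_n(v)$, and count how often that matches the verb actually observed. The function names, the trial format, and the numbers are hypothetical.

```python
def preferred_verb(p_hat_n, v, v_prime):
    """Return the verb the model finds more likely for this noun, or None on a tie."""
    a = p_hat_n.get(v, 0.0)
    b = p_hat_n.get(v_prime, 0.0)
    if a == b:
        return None
    return v if a > b else v_prime

def decision_accuracy(trials, model):
    """trials: list of (noun, observed_verb, alternative_verb) triples."""
    hits = sum(
        1
        for noun, observed, alternative in trials
        if preferred_verb(model[noun], observed, alternative) == observed
    )
    return hits / len(trials)

# Hypothetical model distributions p_hat_n(v) and test trials.
model = {
    "wine": {"drink": 0.75, "pour": 0.25},
    "book": {"read": 0.55, "write": 0.45},
}
trials = [("wine", "drink", "pour"), ("book", "write", "read")]
accuracy = decision_accuracy(trials, model)
print(accuracy)
```

Ties count as errors here, which keeps the measure from rewarding a model that cannot separate the two verbs.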
