Published by Marian Phelps. Modified over 8 years ago.
1
Lecture 24: Distributional-Based Similarity II
Topics: Distributional-based word similarity
Readings: NLTK book Chapter 2 (WordNet); Text Chapter 20
April 10, 2013
CSCE 771 Natural Language Processing
2
– 2 – CSCE 771 Spring 2013
Overview
Last Time (Programming):
  Examples of thesaurus-based word similarity
  path-similarity – memory fault; sim-path(c1,c2) = -log pathlen(c1,c2)
  Resnik, Lin
  extended Lesk – glosses of words need to include hypernyms
Today:
  Distributional methods
Readings: Text 19, 20; NLTK Book: Chapter 10
Next Time: Distributional-Based Similarity II
3
Figure 20.8: Summary of thesaurus similarity measures
Elderly moment IS-A memory fault IS-A mistake
sim-path is correct in the table
4
Example: computing PPMI
We need counts, so let's make some up.
We need to edit this table to have counts.
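A minimal Python sketch of the PPMI computation, following the slide's suggestion to make up counts. The words, context features, and counts below are hypothetical and chosen only for illustration; they are not the lecture's table. PPMI is taken here to be PMI with negative values (and zero counts) clipped to 0.

```python
import math

# Hypothetical word-by-context co-occurrence counts (made up for illustration).
counts = {
    "apricot":     {"computer": 0, "data": 0, "pinch": 1, "result": 0, "sugar": 1},
    "pineapple":   {"computer": 0, "data": 0, "pinch": 1, "result": 0, "sugar": 1},
    "digital":     {"computer": 2, "data": 1, "pinch": 0, "result": 1, "sugar": 0},
    "information": {"computer": 1, "data": 6, "pinch": 0, "result": 4, "sugar": 0},
}

def ppmi(counts):
    """Return a table of PPMI(w, f) = max(0, log2 P(w,f) / (P(w) P(f)))."""
    total = sum(sum(row.values()) for row in counts.values())
    word_totals = {w: sum(row.values()) for w, row in counts.items()}
    feat_totals = {}
    for row in counts.values():
        for f, c in row.items():
            feat_totals[f] = feat_totals.get(f, 0) + c

    table = {}
    for w, row in counts.items():
        table[w] = {}
        for f, c in row.items():
            if c == 0:
                # log2(0) is undefined; a zero count is treated as PPMI 0.
                table[w][f] = 0.0
                continue
            p_wf = c / total
            p_w = word_totals[w] / total
            p_f = feat_totals[f] / total
            table[w][f] = max(0.0, math.log2(p_wf / (p_w * p_f)))
    return table

ppmi_table = ppmi(counts)
print(round(ppmi_table["information"]["data"], 3))
```

With these made-up counts, "information" and "data" co-occur more often than chance would predict, so their PPMI is positive, while "information" and "computer" co-occur less often than chance and are clipped to 0.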
5
Associations
PMI-assoc: $\text{assoc}_{\text{PMI}}(w, f) = \log_2 \frac{P(w, f)}{P(w)\,P(f)}$
Lin-assoc (f is composed of a relation r and a word w'): $\text{assoc}_{\text{Lin}}(w, f) = \log_2 \frac{P(w, f)}{P(w)\,P(r \mid w)\,P(w' \mid w)}$
t-test-assoc: see (20.41)
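As a rough illustration, the sketch below computes two of these association scores directly from probabilities. The slide only cites equation 20.41 for the t-test measure; the form used here, (P(w,f) - P(w)P(f)) / sqrt(P(w)P(f)), is an assumption about what that equation states. The probabilities are hypothetical.

```python
import math

def assoc_pmi(p_wf, p_w, p_f):
    """PMI association: log2 of observed over expected joint probability."""
    return math.log2(p_wf / (p_w * p_f))

def assoc_ttest(p_wf, p_w, p_f):
    """t-test association (assumed form): difference between observed and
    expected joint probability, scaled by sqrt of the expected value."""
    expected = p_w * p_f
    return (p_wf - expected) / math.sqrt(expected)

# Hypothetical probabilities for one (word, feature) pair.
p_wf, p_w, p_f = 6 / 19, 11 / 19, 7 / 19

pmi_score = assoc_pmi(p_wf, p_w, p_f)
ttest_score = assoc_ttest(p_wf, p_w, p_f)
print(round(pmi_score, 3), round(ttest_score, 3))
```

Both scores are positive when the pair co-occurs more often than independence predicts; they differ in how strongly they reward rare events.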
6
Figure 20.10: Co-occurrence vectors
Dependency-based parser – special case of shallow parsing
Features identified from "I discovered dried tangerines." (20.32):
  discover (subject I)
  I (subject-of discover)
  tangerine (obj-of discover)
  tangerine (adj-mod dried)
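The feature extraction above can be sketched in Python as follows. The triples are written by hand for the example sentence rather than produced by a parser, and the function name and the "-of" naming for inverse relations are illustrative assumptions. The sketch emits a feature for both ends of every relation, so it produces a superset of the four features listed on the slide.

```python
from collections import Counter, defaultdict

# Hand-written (head, relation, dependent) triples for
# "I discovered dried tangerines." A real system would obtain these from a parser.
triples = [
    ("discover", "subject", "I"),
    ("discover", "obj", "tangerine"),
    ("tangerine", "adj-mod", "dried"),
]

def cooccurrence_vectors(triples):
    """Map each word to a sparse count vector over (relation, other-word) features."""
    vectors = defaultdict(Counter)
    for head, rel, dep in triples:
        vectors[head][(rel, dep)] += 1          # e.g. discover (subject I)
        vectors[dep][(rel + "-of", head)] += 1  # e.g. I (subject-of discover)
    return vectors

vectors = cooccurrence_vectors(triples)
for word, features in vectors.items():
    print(word, dict(features))
```

Over a large corpus, the counts accumulated this way form the rows of a word-by-feature co-occurrence matrix.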
7
Figure 20.11: Objects of the verb drink (Hindle 1990)
8
Vectors review: dot-product, length, sim-cosine
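A short Python sketch of the three quantities under review, using plain lists as vectors. The function names are illustrative; returning 0.0 for a zero-length vector is a convention chosen here, not something the slide specifies.

```python
import math

def dot(v, w):
    """Dot product: sum of element-wise products."""
    return sum(a * b for a, b in zip(v, w))

def length(v):
    """Euclidean length: square root of the vector's dot product with itself."""
    return math.sqrt(dot(v, v))

def sim_cosine(v, w):
    """Cosine similarity: dot product normalized by both vector lengths."""
    denom = length(v) * length(w)
    return dot(v, w) / denom if denom else 0.0

print(dot([1, 2, 3], [4, 5, 6]))
print(length([3, 4]))
print(sim_cosine([1, 2], [2, 4]))
```

Because cosine normalizes by length, two vectors pointing in the same direction score 1 regardless of how frequent the underlying words are.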
9
Figure 20.12: Similarity of vectors
10
Figure 20.13: Vector similarity summary
11
Figure 20.14: Hand-built patterns for hypernyms (Hearst 1992)
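To give a feel for how such hand-built patterns work, here is a minimal Python sketch of one Hearst-style pattern, "NP such as NP, NP and NP". It approximates each noun phrase with a single word, which is a strong simplification; the regular expression and function name are illustrative assumptions, not the patterns from the figure.

```python
import re

# One Hearst-style pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# Noun phrases are approximated as single words (a deliberate simplification).
SUCH_AS = re.compile(r"(\w+),? such as ((?:\w+, )*\w+(?:,? (?:and|or) \w+)?)")

def find_hyponyms(text):
    """Return (hyponym, hypernym) pairs suggested by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group(1)
        # Split the matched list on commas and on a final "and"/"or".
        for hyponym in re.split(r",? (?:and|or) |, ", match.group(2)):
            pairs.append((hyponym, hypernym))
    return pairs

pairs = find_hyponyms("He likes fruits such as apples, oranges and tangerines.")
print(pairs)
```

A practical system would match over chunked or parsed noun phrases instead of single words, and would combine several such patterns.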
12
Figure 20.15
13
Figure 20.16
14
How to do this in NLTK: http://www.cs.ucf.edu/courses/cap5636/fall2011/nltk.pdf
NLTK 3.0a1 released: February 2013. This version adds support for NLTK's graphical user interfaces. http://nltk.org/nltk3-alpha/
Question: Which similarity function in nltk.corpus.wordnet is appropriate for finding the similarity of two words? I want to use a function for word clustering and the Yarowsky algorithm for finding similar collocations in a large text.
Links:
  http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Linguistics
  http://en.wikipedia.org/wiki/Portal:Linguistics
  http://en.wikipedia.org/wiki/Yarowsky_algorithm
  http://nltk.googlecode.com/svn/trunk/doc/howto/wordnet.html
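As a starting point for comparing the candidates, the sketch below calls three of the similarity methods that NLTK's WordNet synsets provide (path_similarity, wup_similarity, lch_similarity). It assumes NLTK and its WordNet corpus are installed and returns None otherwise. Comparing only the first noun sense of each word is a simplification made here, and the helper name is hypothetical. It does not settle which measure is most appropriate, which depends on the task.

```python
def wordnet_similarities(word1, word2):
    """Return path, Wu-Palmer and Leacock-Chodorow similarity for the first
    noun senses of two words, or None if NLTK/WordNet is unavailable or
    either word has no noun sense."""
    try:
        from nltk.corpus import wordnet as wn
        senses1 = wn.synsets(word1, pos=wn.NOUN)
        senses2 = wn.synsets(word2, pos=wn.NOUN)
    except (ImportError, LookupError):
        # NLTK is not installed, or the WordNet data has not been downloaded.
        return None
    if not senses1 or not senses2:
        return None
    s1, s2 = senses1[0], senses2[0]
    return {
        "path": s1.path_similarity(s2),
        "wup": s1.wup_similarity(s2),
        "lch": s1.lch_similarity(s2),
    }

result = wordnet_similarities("dog", "cat")
print(result)
```

For clustering, one option is to compute such a score for every word pair and pass the resulting matrix to a clustering routine; taking the maximum over all sense pairs, rather than only the first senses, is a common variation.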