Lecture 22 Word Similarity


CSCE 771 Natural Language Processing — Lecture 22: Word Similarity
Topics: word similarity; thesaurus-based word similarity; introduction to distributional word similarity
Readings: NLTK book Chapter 2 (WordNet); text Chapter 20
April 8, 2013

Overview
Last time (programming): features in NLTK; NL queries → SQL; NLTK support for interpretations and models; propositional and predicate logic support; Prover9
Today: last lecture's slides 25-29; computational lexical semantics
Readings: text Chapters 19-20; NLTK book Chapter 10
Next time: Computational Lexical Semantics II

ACL Anthology - http://aclweb.org/anthology-new/

Figure 20.8: Summary of thesaurus-based similarity measures

WordNet similarity functions in NLTK:
path_similarity(), lch_similarity(), wup_similarity(), res_similarity(), jcn_similarity(), lin_similarity()
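
A minimal sketch of calling these from NLTK; the last three are information-content measures and need corpus statistics, here the Brown-corpus file shipped with NLTK's wordnet_ic data:

```python
from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

dog = wn.synset('dog.n.01')
cat = wn.synset('cat.n.01')

# Path-based measures: no extra resources needed
print(dog.path_similarity(cat))   # inverse of shortest path length
print(dog.lch_similarity(cat))    # Leacock-Chodorow
print(dog.wup_similarity(cat))    # Wu-Palmer

# Information-content measures need corpus counts
brown_ic = wordnet_ic.ic('ic-brown.dat')
print(dog.res_similarity(cat, brown_ic))  # Resnik
print(dog.jcn_similarity(cat, brown_ic))  # Jiang-Conrath
print(dog.lin_similarity(cat, brown_ic))  # Lin
```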

Examples — but first, a pop quiz: how do you get hypernyms from WordNet?
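
One answer, sketched with NLTK's WordNet interface:

```python
from nltk.corpus import wordnet as wn

dog = wn.synset('dog.n.01')
print(dog.hypernyms())
# [Synset('canine.n.02'), Synset('domestic_animal.n.01')]

# Walking all the way up the hierarchy:
hyper = lambda s: s.hypernyms()
print(list(dog.closure(hyper)))
```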

Example: counts (made-up) and P(c) values
[Figure sequence: a WordNet-style hierarchy — entity at the root, splitting into physical thing vs. abstraction (and idea), then living thing vs. non-living thing, then mammals / amphibians / reptiles, with leaves such as cat, dog, whale (right, minke), frog, snake, novel, pacifier#1, pacifier#2 — shown first with made-up counts and then annotated with the resulting P(c) values. Color code — blue: from WordNet; red: invented.]
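
The point of those P(c) values, as a sketch: P(c) is the probability that a corpus mention falls under concept c, information content is IC(c) = -log2 P(c), and Resnik similarity is the IC of the lowest common subsumer. A toy illustration with made-up counts (the node names, counts, and helper functions here are all hypothetical):

```python
import math

# Made-up counts of corpus mentions attached to each node itself
counts = {'entity': 5, 'living thing': 10, 'mammals': 10,
          'cat': 30, 'dog': 40, 'whale': 5}

# Hypothetical child lists encoding a tiny slice of the hierarchy
children = {'entity': ['living thing'],
            'living thing': ['mammals'],
            'mammals': ['cat', 'dog', 'whale']}

def subtree_count(c):
    """Mentions of c itself plus everything below it."""
    return counts.get(c, 0) + sum(subtree_count(k) for k in children.get(c, []))

total = subtree_count('entity')

def p(c):
    """P(c): probability that a mention falls under concept c."""
    return subtree_count(c) / total

def ic(c):
    """Information content of concept c."""
    return -math.log2(p(c))

# Resnik similarity of cat and dog = IC of their lowest common
# subsumer, which in this toy tree is 'mammals'
print(ic('mammals'))  # ~0.23
```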

simLesk(cat, dog) = ???
(42) S: (n) dog#1 (dog%1:05:00::), domestic dog#1 (domestic_dog%1:05:00::), Canis familiaris#1 (canis_familiaris%1:05:00::) — a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds: "the dog barked all night"
(18) S: (n) cat#1 (cat%1:05:00::), true cat#1 (true_cat%1:05:00::) — feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats
(1) S: (n) wolf#1 (wolf%1:05:00::) — any of various predatory carnivorous canine mammals of North America and Eurasia that usually hunt in packs
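
Lesk-style similarity just counts overlapping words between glosses. A rough sketch over the NLTK WordNet glosses (stopword handling kept deliberately crude):

```python
from nltk.corpus import wordnet as wn

def lesk_overlap(syn1, syn2):
    """Count word types shared by the two synsets' glosses."""
    words1 = set(syn1.definition().lower().split())
    words2 = set(syn2.definition().lower().split())
    return len(words1 & words2)

print(lesk_overlap(wn.synset('cat.n.01'), wn.synset('dog.n.01')))
```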

Problems with thesaurus-based methods:
- we don't always have a thesaurus
- even when we do, there are recall problems: missing words and phrases
- thesauri work less well for verbs and adjectives, which have less hyponymy structure
(Distributional Word Similarity, D. Jurafsky)

Distributional models of meaning (vector-space models of meaning) offer higher recall than hand-built thesauri, probably at the cost of lower precision. The intuition is illustrated by the next slide. (Distributional Word Similarity, D. Jurafsky)

Word similarity, distributional methods — the tezgüino example (20.31, from Nida):
A bottle of tezgüino is on the table.
Everybody likes tezgüino.
Tezgüino makes you drunk.
We make tezgüino out of corn.
What do you know about tezgüino?

Term-document matrix
- collection of documents
- identify a collection of important, discriminatory terms (words)
- matrix: terms × documents, holding term frequencies tf(t, d)
- each document is then a vector in Z^|V| (Z = integers; N = natural numbers would be more accurate, but perhaps misleading)
Example follows. (Distributional Word Similarity, D. Jurafsky)

Example term-document matrix, for the subset of terms {battle, soldier, fool, clown}:

            As You Like It   Twelfth Night   Julius Caesar   Henry V
battle            1                1                8            15
soldier           2                2               12            36
fool             37               58                1             5
clown             6              117                0             0

(Distributional Word Similarity, D. Jurafsky)
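
With those counts, word similarity falls out of comparing row vectors. A quick sketch of cosine similarity over the table above:

```python
import math

# Rows of the term-document matrix above
vectors = {
    'battle':  [1, 1, 8, 15],
    'soldier': [2, 2, 12, 36],
    'fool':    [37, 58, 1, 5],
    'clown':   [6, 117, 0, 0],
}

def cosine(v, w):
    """Cosine of the angle between two count vectors."""
    dot = sum(a * b for a, b in zip(v, w))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_w = math.sqrt(sum(b * b for b in w))
    return dot / (norm_v * norm_w)

print(cosine(vectors['fool'], vectors['clown']))   # high: similar usage
print(cosine(vectors['fool'], vectors['battle']))  # low: different usage
```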

Figure 20.9: term-in-context matrix for word similarity (co-occurrence vectors)
- window of 20 words (10 before, 10 after) from the Brown corpus — which words occur together
- a non-Brown example: "The Graduate School requires that all PhD students be admitted to candidacy at least one year prior to graduation. Passing …"
- small table from the Brown corpus, 10 before and 10 after
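
A minimal sketch of building such a word-context count matrix with a ±10-token window (tokenization kept naive on purpose):

```python
from collections import Counter, defaultdict

def cooccurrence(tokens, window=10):
    """Map each word to counts of words within `window` tokens on either side."""
    counts = defaultdict(Counter)
    for i, w in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[w][tokens[j]] += 1
    return counts

text = "the graduate school requires that all phd students be admitted to candidacy"
matrix = cooccurrence(text.split())
print(matrix['phd'].most_common(3))
```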

Pointwise Mutual Information
- tf-idf (inverse document frequency) weighting instead of raw counts; the idf intuition again
- pointwise mutual information (PMI): do events x and y co-occur more than they would if they were independent?
  PMI(x, y) = log2 [ P(x, y) / (P(x) P(y)) ]
- PMI between words; positive PMI (PPMI) clips negative values to zero: PPMI(x, y) = max(PMI(x, y), 0)

Computing PPMI
Given a matrix F with W rows (words) and C columns (contexts), where f_ij is the frequency of word w_i in context c_j:
  p_ij = f_ij / Σ_i Σ_j f_ij (joint probability)
  p_i* = Σ_j f_ij / Σ_i Σ_j f_ij (word marginal); p_*j = Σ_i f_ij / Σ_i Σ_j f_ij (context marginal)
  pmi_ij = log2 [ p_ij / (p_i* p_*j) ], and ppmi_ij = max(pmi_ij, 0)

Example: computing PPMI. We need counts, so let's make some up (see the sketch below).
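
A sketch of the PPMI computation above with NumPy, over a made-up 3×3 word-context count matrix (the counts are invented purely for illustration):

```python
import numpy as np

# Made-up counts: rows = words, columns = contexts
F = np.array([[10, 0, 2],
              [1, 5, 1],
              [0, 1, 8]], dtype=float)

total = F.sum()
P = F / total                        # joint probabilities p_ij
pw = P.sum(axis=1, keepdims=True)    # word marginals p_i*
pc = P.sum(axis=0, keepdims=True)    # context marginals p_*j

with np.errstate(divide='ignore'):   # log2(0) -> -inf, clipped below
    pmi = np.log2(P / (pw * pc))
ppmi = np.maximum(pmi, 0)
print(np.round(ppmi, 2))
```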

Associations
- PMI association: assoc_PMI(w, f) = log2 [ P(w, f) / (P(w) P(f)) ]
- Lin association — a feature f is composed of a relation r and a word w':
  assoc_Lin(w, f) = log2 [ P(w, f) / (P(r|w) P(w'|w)) ]
- t-test association (20.41): assoc_t-test(w, f) = (P(w, f) − P(w) P(f)) / √(P(f) P(w))

Figure 20.10: co-occurrence vectors from syntactic dependencies
- dependency-based parser (a special case of shallow parsing)
- features identified from "I discovered dried tangerines." (20.32):
  discover(subject I); I(subject-of discover); tangerine(obj-of discover); tangerine(adj-mod dried)
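
As a sketch of extracting such dependency features today (spaCy is an assumption here; the slides don't prescribe a parser, and any dependency parser would do):

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline
doc = nlp("I discovered dried tangerines.")

# Each (word, relation, head) triple is a candidate context feature
for token in doc:
    print(f"{token.text}({token.dep_}-of {token.head.text})")
```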

Figure 20.11: objects of the verb drink (Hindle, 1990)

Vectors review
- dot product: v · w = Σ_i v_i w_i
- length: |v| = √(Σ_i v_i²)
- cosine similarity: sim_cosine(v, w) = (v · w) / (|v| |w|)

Figure 20.12 Similarity of Vectors

Figure 20.13: Vector Similarity Summary

Figure 20.14: hand-built patterns for hypernyms (Hearst, 1992)
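
For flavor, a toy regex sketch of one classic Hearst pattern, "NP_h such as NP_1, …, NP_n" (real systems match over parsed text, not raw strings, and handle multiword NPs):

```python
import re

# Toy version of the "X such as A, B, and C" hypernym pattern
pattern = re.compile(
    r"(\w+)\s+such as\s+(\w+(?:,\s*\w+)*(?:,?\s*(?:and|or)\s+\w+)?)"
)

text = "He collects works by authors such as Shakespeare, Marlowe, and Jonson."
for hypernym, nps in pattern.findall(text):
    hyponyms = re.split(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+", nps)
    print(hypernym, "->", [h for h in hyponyms if h])
# authors -> ['Shakespeare', 'Marlowe', 'Jonson']
```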

Figure 20.15

Figure 20.16

How to do this in NLTK: http://www.cs.ucf.edu/courses/cap5636/fall2011/nltk.pdf
NLTK 3.0a1 (released February 2013): this version adds support for NLTK's graphical user interfaces. http://nltk.org/nltk3-alpha/
A question from the community: "Which similarity function in nltk.corpus.wordnet is appropriate for finding the similarity of two words? I want to use a function for word clustering and the Yarowsky algorithm to find similar collocations in a large text."
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Linguistics
http://en.wikipedia.org/wiki/Portal:Linguistics
http://en.wikipedia.org/wiki/Yarowsky_algorithm
http://nltk.googlecode.com/svn/trunk/doc/howto/wordnet.html