Lecture 22: Word Similarity
Topics: word similarity; thesaurus-based word similarity; introduction to distributional word similarity
Readings: NLTK book Chapter 2 (wordnet); Text Chapter 20
April 8, 2013, CSCE 771 Natural Language Processing

– 2 – CSCE 771 Spring 2013 Overview
Last Time (Programming): Features in NLTK; NL queries → SQL; NLTK support for interpretations and models; propositional and predicate logic support; Prover9
Today: last lecture's slides; Features in NLTK; Computational Lexical Semantics
Readings: Text Chapters 19, 20; NLTK Book Chapter 10
Next Time: Computational Lexical Semantics II

– 3 – CSCE 771 Spring 2013 ACL Anthology

– 4 – CSCE 771 Spring 2013 Figure 20.8 Summary of Thesaurus Similarity measures

– 5 – CSCE 771 Spring 2013 WordNet similarity functions
path_similarity()?
lch_similarity()?
wup_similarity()?
res_similarity()?
jcn_similarity()?
lin_similarity()?
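A minimal sketch of calling these functions (not from the original slides; assumes NLTK with the wordnet and wordnet_ic data downloaded):

```python
# Sketch: NLTK's WordNet similarity functions applied to one synset pair.
# Requires: nltk.download('wordnet'); nltk.download('wordnet_ic')
from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

dog = wn.synset('dog.n.01')
cat = wn.synset('cat.n.01')

print(dog.path_similarity(cat))   # shortest path in the hypernym graph
print(dog.lch_similarity(cat))    # Leacock-Chodorow
print(dog.wup_similarity(cat))    # Wu-Palmer

# The information-content measures also need an IC file, e.g. from Brown:
brown_ic = wordnet_ic.ic('ic-brown.dat')
print(dog.res_similarity(cat, brown_ic))  # Resnik
print(dog.jcn_similarity(cat, brown_ic))  # Jiang-Conrath
print(dog.lin_similarity(cat, brown_ic))  # Lin
```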

– 6 – CSCE 771 Spring 2013 Examples: but first, a pop quiz. How do you get hypernyms from WordNet?
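One possible answer, as a hedged sketch (assumes the WordNet corpus is available):

```python
# Sketch: hypernyms of a synset in NLTK's WordNet interface.
from nltk.corpus import wordnet as wn

dog = wn.synset('dog.n.01')
print(dog.hypernyms())
# [Synset('canine.n.02'), Synset('domestic_animal.n.01')]

# Full hypernym path(s) up to the root 'entity.n.01':
for path in dog.hypernym_paths():
    print(' -> '.join(s.name() for s in path))
```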

– 7 – CSCE 771 Spring 2013 Example: P(c) values
[Tree diagram: a WordNet-style hierarchy from entity through physical thing / abstraction, living thing / non-living thing, and mammals / amphibians / reptiles, down to leaves such as snake, frog, cat, dog, whale (right, minke), novel, idea, and pacifier#1/#2. Color code: blue = from WordNet; red = invented for the example.]

– 8 – CSCE 771 Spring 2013 Example: counts (made up)
[Same hierarchy diagram as the previous slide, annotated with made-up counts. Color code: blue = from WordNet; red = invented for the example.]

– 9 – CSCE 771 Spring 2013 Example: P(c) values
[Same hierarchy diagram, now annotated with the P(c) values derived from those counts. Color code: blue = from WordNet; red = invented for the example.]

– 10 – CSCE 771 Spring 2013 Example (continued)
[Same hierarchy diagram. Color code: blue = from WordNet; red = invented for the example.]

– 11 – CSCE 771 Spring 2013 sim_Lesk(cat, dog) ???
WordNet entries:
(42) S: (n) dog#1, domestic dog#1, Canis familiaris#1 (a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds) "the dog barked all night"
(18) S: (n) cat#1, true cat#1 (feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats)
(1) S: (n) wolf#1 (any of various predatory carnivorous canine mammals of North America and Eurasia that usually hunt in packs)
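Lesk-style similarity scores the overlap between glosses. A rough sketch (my own illustration, not the book's exact Extended Lesk, which also weights multi-word phrase overlaps; assumes the wordnet and stopwords corpora are downloaded):

```python
# Sketch: simple Lesk-style gloss overlap between two synsets,
# counting shared non-stopword tokens in their definitions.
from nltk.corpus import wordnet as wn
from nltk.corpus import stopwords

def gloss_overlap(s1, s2):
    stop = set(stopwords.words('english'))
    g1 = {w for w in s1.definition().lower().split() if w not in stop}
    g2 = {w for w in s2.definition().lower().split() if w not in stop}
    return len(g1 & g2)

print(gloss_overlap(wn.synset('cat.n.01'), wn.synset('dog.n.01')))
```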

– 12 – CSCE 771 Spring 2013 Problems with thesaurus-based similarity
- We don't always have a thesaurus
- Even when we do, there are recall problems: missing words, missing phrases
- Thesauri work less well for verbs and adjectives, which have less hyponymy structure
(Distributional Word Similarity, D. Jurafsky)

– 13 – CSCE 771 Spring 2013 Distributional models of meaning
- Vector-space models of meaning
- Offer higher recall than hand-built thesauri, though probably lower precision
- Intuition: words that occur in similar contexts tend to have similar meanings
(Distributional Word Similarity, D. Jurafsky)

– 14 – CSCE 771 Spring 2013 Word similarity, distributional methods: the tezguino example (Nida)
- A bottle of tezguino is on the table.
- Everybody likes tezguino.
- Tezguino makes you drunk.
- We make tezguino out of corn.
What do you know about tezguino?

– 15 – CSCE 771 Spring 2013 Term-document matrix
- Collection of documents
- Identify a collection of important, discriminatory terms (words)
- Matrix: terms × documents; term frequency tf_{w,d} = number of occurrences of word w in document d
- Each document is a vector in Z^|V| (Z = integers; N, the natural numbers, would be more accurate but perhaps misleading)
- Example on the next slide
(Distributional Word Similarity, D. Jurafsky)

– 16 – CSCE 771 Spring 2013 Example term-document matrix
Subset of terms = {battle, soldier, fool, clown}

           As You Like It   Twelfth Night   Julius Caesar   Henry V
battle            1                1               8           15
soldier           2                2              12           36
fool             37               58               1            5
clown             6              117               0            0

(Distributional Word Similarity, D. Jurafsky)
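As a sketch (my own illustration, using the counts above), the matrix can be held as a NumPy array, with each row serving as that word's vector:

```python
# Sketch: the term-document matrix above as a NumPy array.
# Rows = terms, columns = plays; each row is that word's vector.
import numpy as np

terms = ['battle', 'soldier', 'fool', 'clown']
docs = ['As You Like It', 'Twelfth Night', 'Julius Caesar', 'Henry V']
M = np.array([[ 1,   1,  8, 15],
              [ 2,   2, 12, 36],
              [37,  58,  1,  5],
              [ 6, 117,  0,  0]])

fool = M[terms.index('fool')]    # array([37, 58, 1, 5])
print(fool)
```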

– 17 – CSCE 771 Spring 2013 Figure 20.9: Term-in-context matrix for word similarity (co-occurrence vectors)
- Window of 20 words (10 before, 10 after) from the Brown corpus; count which words occur together
- A non-Brown example of such a window:
"The Graduate School requires that all PhD students be admitted to candidacy at least one year prior to graduation. Passing …"
- Small table from Brown, 10 before / 10 after
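A minimal sketch of building such co-occurrence counts (my own illustration; assumes nltk.download('brown'); the target word 'bread' is just an example choice):

```python
# Sketch: co-occurrence counts for one target word in the Brown corpus,
# using a window of 10 words before and 10 after each occurrence.
from collections import Counter
from nltk.corpus import brown

def cooccurrence_vector(target, window=10):
    words = [w.lower() for w in brown.words()]
    vec = Counter()
    for i, w in enumerate(words):
        if w == target:
            left = words[max(0, i - window):i]
            right = words[i + 1:i + 1 + window]
            vec.update(left + right)
    return vec

print(cooccurrence_vector('bread').most_common(10))
```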

– 18 – CSCE 771 Spring 2013 Pointwise Mutual Information
- Use tf-idf (term frequency with inverse document frequency) weighting instead of raw counts; recall the idf intuition
- Pointwise mutual information (PMI): do events x and y co-occur more than if they were independent?
PMI(x, y) = log2 [ P(x, y) / ( P(x) P(y) ) ]
- PMI between words
- Positive PMI between two words (PPMI)

– 19 – CSCE 771 Spring 2013 Computing PPMI
- Matrix F with W rows (words) and C columns (contexts); f_ij is the frequency of word w_i in context c_j
- p_ij = f_ij / Σ_i Σ_j f_ij   (joint probability)
- p_i = Σ_j f_ij / Σ_i Σ_j f_ij   (word marginal);   p_j = Σ_i f_ij / Σ_i Σ_j f_ij   (context marginal)
- PMI_ij = log2 ( p_ij / (p_i p_j) );   PPMI_ij = max(PMI_ij, 0)

– 20 – CSCE 771 Spring 2013 Example: computing PPMI
We need counts, so let's make some up (the original slide's table was left without counts).
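Continuing in that spirit, a hedged sketch with invented counts, implementing the PPMI definitions from the previous slide:

```python
# Sketch: PPMI over a word-context count matrix F (rows = words,
# columns = contexts). The counts are made up for illustration.
import numpy as np

F = np.array([[0., 0., 1., 1.],
              [0., 1., 0., 1.],
              [2., 1., 1., 1.],
              [1., 6., 0., 0.]])

total = F.sum()
p_ij = F / total
p_i = F.sum(axis=1, keepdims=True) / total   # word marginals (rows)
p_j = F.sum(axis=0, keepdims=True) / total   # context marginals (columns)

with np.errstate(divide='ignore'):           # log2(0) -> -inf, clipped below
    pmi = np.log2(p_ij / (p_i * p_j))
ppmi = np.maximum(pmi, 0)
print(ppmi)
```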

– 21 – CSCE 771 Spring 2013 Associations
- PMI association: assoc_PMI(w, f) = log2 [ P(w, f) / ( P(w) P(f) ) ]
- Lin association, where feature f is composed of a relation r and a word w′:
assoc_LIN(w, f) = log2 [ P(w, f) / ( P(r|w) P(w′|w) ) ]
- t-test association (Eq. 20.41)

– 22 – CSCE 771 Spring 2013 Figure: Co-occurrence vectors from syntactic dependencies
- Dependency-based parser, a special case of shallow parsing
- From "I discovered dried tangerines." (20.32), identify features such as:
discover(subject I); I(subject-of discover); tangerine(obj-of discover); tangerine(adj-mod dried)

– 23 – CSCE 771 Spring 2013 Figure: Objects of the verb drink (Hindle, 1990)

– 24 – CSCE 771 Spring 2013 Vectors review
- dot product: v · w = Σ_i v_i w_i
- length: |v| = sqrt( Σ_i v_i^2 )
- cosine similarity: sim_cosine(v, w) = (v · w) / ( |v| |w| )
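A small sketch (my own illustration) applying these formulas to the word vectors from the Shakespeare term-document example:

```python
# Sketch: cosine similarity between word vectors, per the formulas above.
import numpy as np

def cosine(v, w):
    return np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))

# Rows from the term-document matrix on slide 16:
fool = np.array([37, 58, 1, 5])
clown = np.array([6, 117, 0, 0])
battle = np.array([1, 1, 8, 15])
print(cosine(fool, clown))    # high (~0.87): both occur mostly in comedies
print(cosine(fool, battle))   # much lower (~0.15)
```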

– 25 – CSCE 771 Spring 2013 Figure: Similarity of Vectors

– 26 – CSCE 771 Spring 2013 Figure: Vector Similarity Summary

– 27 – CSCE 771 Spring 2013 Figure: Hand-built patterns for hypernyms (Hearst, 1992)

– 28 – CSCE 771 Spring 2013 Figure 20.15

– 29 – CSCE 771 Spring 2013 Figure 20.16

– 30 – CSCE 771 Spring 2013 How to do it in NLTK
NLTK 3.0a1 released February 2013. This version adds support for NLTK's graphical user interfaces.
A typical question: which similarity function in nltk.corpus.wordnet is appropriate for finding the similarity of two words? I want to use such a function for word clustering, and with the Yarowsky algorithm, to find similar collocations in a large text.
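One hedged sketch of an answer (my own illustration): the similarity functions operate on synsets, not words, so a common move is to take the maximum score over all synset pairs for the two words:

```python
# Sketch: word-level similarity as the max path_similarity over all
# synset pairs of the two words (path_similarity may return None
# for incomparable synsets, so those scores are filtered out).
from itertools import product
from nltk.corpus import wordnet as wn

def word_similarity(w1, w2):
    pairs = product(wn.synsets(w1), wn.synsets(w2))
    scores = [s1.path_similarity(s2) for s1, s2 in pairs]
    scores = [s for s in scores if s is not None]
    return max(scores, default=0.0)

print(word_similarity('dog', 'cat'))
```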