Incorporating Dictionary and Corpus Information into a Context Vector Measure of Semantic Relatedness. Siddharth Patwardhan. Advisor: Ted Pedersen. 07/18/2003


Semantic Relatedness Is “needle” more related to “thread” than it is to “pie”? Humans agree on the relatedness of most word pairs. Miller and Charles (1991) suggest that relatedness is based on the overlap of the contextual representations of words.

Measuring Relatedness Automatic measures attempt to imitate the human perception of the relatedness of words and concepts. A number of measures of relatedness, based on WordNet and corpus data, have been proposed. In this thesis, we compare various measures of semantic relatedness.

WordNet Semantic network. Nodes represent real-world concepts. Rich network of relationships between these concepts. Relationships such as “car is a kind of vehicle” and “high is the opposite of low”. Node = Synonyms + Definition (gloss)

WordNet – Schematic

WordNet – Is a Hierarchy

WordNet – Overview Four parts of speech: nouns, verbs, adjectives and adverbs. ~111,400 concepts. ~13 types of relationships. 9 noun is-a hierarchies; the verb is-a hierarchies are much shallower, with an average depth of 2.

The Adapted Lesk Algorithm Performs Word Sense Disambiguation (Banerjee and Pedersen 2002). Uses the overlaps of the dictionary definitions of word senses to determine the sense of the target word. Basic hypothesis: the correct sense of the target word is the one most related to the senses of the words in its context. Overlaps measure the relatedness.

Our Extension of Adapted Lesk Use any measure of relatedness in place of gloss overlaps and perform WSD. Extended Gloss Overlaps is then just one measure of semantic relatedness among several. (Diagram: the Adapted Lesk WSD algorithm takes the context and a pluggable relatedness measure, which draws on resources such as WordNet, and outputs the sense of the target word.)
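The extension can be sketched as a generic disambiguation loop in which the relatedness measure is a plug-in. This is an illustrative sketch, not the thesis code: the sense identifiers and data structures are hypothetical, and `relatedness` stands in for any of the measures discussed in these slides.

```python
# Sketch of the generalized Adapted Lesk loop: pick the sense of the target
# word that is most related to the senses of the words in its context.
# All names here are illustrative, not taken from the thesis implementation.

def disambiguate(target_senses, context_senses_list, relatedness):
    """Return the candidate sense of the target that best fits the context.

    target_senses: candidate senses of the target word
    context_senses_list: one list of candidate senses per context word
    relatedness: any function (sense_a, sense_b) -> float
    """
    def score(sense):
        total = 0.0
        for senses in context_senses_list:
            # Credit each context word with its best-matching sense.
            total += max(relatedness(sense, s) for s in senses)
        return total

    return max(target_senses, key=score)
```

Because the measure is passed in as a function, gloss overlaps, the information-content measures, or the vector measure can all be swapped in without changing the disambiguation loop.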

Extended Gloss Overlaps Measure (Diagram: two concepts c1 and c2, each connected by WordNet relations to further concepts c11, c12 and c21, c22; gloss overlaps are measured between the glosses of these related concepts.)

Gloss Overlaps – Scoring “A fruit bearing coniferous tree that grows in hilly regions” vs. “The fruit of a coniferous tree used in salad”: the overlaps are “fruit” (one word, score 1) and “coniferous tree” (two-word phrase, score 2² = 4), giving 1 + 4 = 5. Against “an artificial source of visible illumination” there is no overlap, so the score is 0.

Leacock-Chodorow Measure (1998) Based on simple edge counts in the is-a hierarchy of WordNet. Deals with nouns only. The path length is scaled by the depth of the taxonomy: Relatedness(c1, c2) = –log(path_length / (2·D)) where c1 and c2 are the concepts and D is the depth of the taxonomy.
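As a worked example of the formula, assuming the path length and taxonomy depth have already been computed (the depth 16 in the test is an illustrative value, not WordNet's actual depth):

```python
import math

def lch_relatedness(path_length, taxonomy_depth):
    """Leacock-Chodorow: -log(path_length / (2 * D)).

    Shorter paths between concepts yield higher relatedness scores.
    """
    return -math.log(path_length / (2.0 * taxonomy_depth))
```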

The Resnik Measure (1995) Deals with nouns only and is based on the is-a hierarchy of WordNet. Uses the Information Content of concepts, which indicates the specificity of a concept: IC(concept) = –log(P(concept)). The probability of occurrence of a concept is calculated from its frequency in a corpus: P(concept) = freq(concept)/freq(root)

Information Content Counting the frequency of concepts: an occurrence of a concept implies an occurrence of all of its subsuming concepts, so the root node includes the count of every concept in the hierarchy. Counting from text that is not sense-tagged raises some issues.

Information Content – An Example (Diagrams: a fragment of the is-a hierarchy containing *Root*, motor vehicle, bus, cab, minicab, car and racing car, with a frequency count attached to each node. Each new occurrence of a concept increments its own count and the counts of all of its ancestors, so the counts change from motor vehicle (327), car (73), racing car (12) to motor vehicle (329), car (75), racing car (13), with *Root* at 32,785.)
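The counting scheme can be sketched with a toy child-to-parent map; the hierarchy and counts below are illustrative, not WordNet's.

```python
# Sketch of concept counting for Information Content: every observed concept
# also counts toward all of its ancestors, up to the root.

import math
from collections import defaultdict

def propagate_counts(observed_counts, parent):
    """Fold each concept's count into all of its subsuming concepts.

    parent maps child -> parent; the root maps to None.
    """
    counts = defaultdict(int)
    for concept, n in observed_counts.items():
        node = concept
        while node is not None:
            counts[node] += n
            node = parent.get(node)
    return counts

def information_content(counts, concept, root):
    """IC(c) = -log(freq(c) / freq(root))."""
    return -math.log(counts[concept] / counts[root])
```

As in the slide's example, an occurrence of “racing car” increments “racing car”, “car”, “motor vehicle”, and the root.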

Resnik Measure Relatedness(c1, c2) = IC(lcs(c1, c2)), where lcs(c1, c2) is the lowest concept in the is-a hierarchy that subsumes both c1 and c2. (Diagram: “medium of exchange” subsumes “coin” and “credit card”, and “coin” subsumes “nickel” and “dime”, so lcs(nickel, dime) = coin.)

Jiang-Conrath Measure (1997) It is a measure of semantic distance: distance = IC(c1) + IC(c2) – 2 · IC(lcs(c1, c2)) We inverted it and used it as a measure of semantic relatedness: Relatedness(c1, c2) = 1 / distance Also deals with nouns only. Has a lower bound of 0, no upper bound.

Lin Measure (1998) Relatedness(c1, c2) = 2 · IC(lcs(c1, c2)) / (IC(c1) + IC(c2)) Ranges between 0 and 1. If either IC(c1) or IC(c2) is zero, the relatedness is zero.
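Given precomputed Information Content values, the three IC-based measures reduce to a few arithmetic expressions. This sketch assumes the ICs and the lowest common subsumer have already been obtained; it is not the thesis implementation.

```python
# The three Information-Content-based relatedness measures, written over
# precomputed IC values (ic1, ic2 for the concepts, ic_lcs for their lcs).

def resnik(ic_lcs):
    """Resnik: relatedness is the IC of the lowest common subsumer."""
    return ic_lcs

def jiang_conrath(ic1, ic2, ic_lcs):
    """JCn distance = IC(c1) + IC(c2) - 2*IC(lcs); relatedness = 1/distance."""
    distance = ic1 + ic2 - 2 * ic_lcs
    if distance == 0:          # identical concepts: maximally related
        return float("inf")
    return 1.0 / distance

def lin(ic1, ic2, ic_lcs):
    """Lin: 2*IC(lcs) / (IC(c1) + IC(c2)); in [0, 1], 0 if either IC is 0."""
    if ic1 == 0 or ic2 == 0:
        return 0.0
    return 2 * ic_lcs / (ic1 + ic2)
```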

Hirst-St.Onge Measure (1998) Word pairs can be related to each other by an extra-strong, a medium-strong or a strong relationship. All WordNet relations are categorized as upward, downward or horizontal relations. The measure assigns relatedness to words rather than to concepts; we modify it to apply to concepts.

Need for a New Measure Preliminary results (Patwardhan, Banerjee and Pedersen 2003) show that Extended Gloss Overlaps does very well at WSD. Preliminary results also show that Jiang-Conrath does very well at WSD. One uses WordNet glosses, while the other is based on statistics derived from a corpus of text. Extended Gloss Overlaps, however, is too exact: it rewards only exact matches between glosses.

A Vector Measure Represents glosses as multidimensional vectors of co-occurrence counts. Relatedness is defined as the cosine of the angle between the gloss vectors. This alternative representation overcomes the “exactness” of the Extended Gloss Overlaps measure. Based on the context vectors of Schütze (1998).

Vector Measure – Word Space We start by creating a “word space” – a list of words that will form the dimensions of the vector space. These words must be highly topical content words. We use a stop list and frequency cutoffs on the words in WordNet glosses to create this list (~54,000 words).

Vector Measure – Word Vectors A word vector is created for every content word w in the WordNet glosses. The words of the Word Space are the dimensions of this vector. The vector contains the counts of words co-occurring with w in a large corpus. For example, with dimensions (dollar, dime, movie, noon, cent, bank): coin = [35, 3, 0, 0, 56, 14]

Vector Measure – Gloss Vectors The gloss vector for a concept is created by adding the word vectors of all the content words in its gloss. The gloss may be augmented by concatenating the glosses of related concepts in WordNet (similar to Extended Gloss Overlaps). (Example gloss: “an artificial source of visible illumination”.)
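The construction of gloss vectors and their cosine comparison can be sketched with sparse dictionaries standing in for vectors. The word-vector table in the test is invented for illustration; the thesis builds its vectors from WordNet glosses and corpus co-occurrence counts.

```python
# Sketch of the vector measure: a gloss vector is the sum of the word
# vectors of the gloss's content words, and relatedness is their cosine.

import math

def gloss_vector(gloss, word_vectors, stopwords=frozenset()):
    """Sum the word vectors (sparse dicts) of the content words in a gloss."""
    vec = {}
    for w in gloss.lower().split():
        if w in stopwords or w not in word_vectors:
            continue
        for dim, count in word_vectors[w].items():
            vec[dim] = vec.get(dim, 0) + count
    return vec

def cosine(v1, v2):
    """Cosine of the angle between two sparse vectors; 0 for empty vectors."""
    dot = sum(v1.get(d, 0) * v2[d] for d in v2)
    n1 = math.sqrt(sum(x * x for x in v1.values()))
    n2 = math.sqrt(sum(x * x for x in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0
```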

Gloss Vector – An Example (Diagram: the gloss vector for “an artificial source of visible illumination” is the sum of the word vectors for “artificial”, “source”, “visible” and “illumination”, over dimensions such as eye, red, prison, gem and high.)

Visualizing the Vectors (Diagram: the same word vectors and their sum, the gloss vector, plotted against the dimensions “gem” and “red”.)

Vector Measure The values of this measure range between 0 and 1. The measure combines dictionary and corpus information to measure semantic relatedness.

Comparison of the Measures to Human Relatedness We use the 30 word pairs from Miller and Charles’ experiment. These are a subset of the 65 pairs used by Rubenstein and Goodenough (1965) in a similar experiment. We find the correlation of the rankings produced by the measures with the human ranking of the 30 pairs.

The Word Pairs Car-Automobile, Magician-Wizard, Tool-Implement, Cemetery-Woodland, Coast-Forest, Gem-Jewel, Midday-Noon, Brother-Monk, Food-Rooster, Lad-Wizard, Journey-Voyage, Furnace-Stove, Lad-Brother, Coast-Hill, Chord-Smile, Boy-Lad, Food-Fruit, Crane-Implement, Forest-Graveyard, Glass-Magician, Coast-Shore, Bird-Cock, Journey-Car, Shore-Woodland, Rooster-Voyage, Asylum-Madhouse, Bird-Crane, Monk-Oracle, Monk-Slave, Noon-String

Human Relatedness Study (correlation with the human rankings)

Measure | M&C | R&G
Vector | |
Jiang-Conrath | |
Extended Gloss Overlaps | 0.81 |
Hirst-St.Onge | |
Resnik | |
Lin | |
Leacock-Chodorow | |

Variations in the Vector Measure

Word vector dimensions | All relations | Gloss only
No frequency cutoffs | |
,000 most frequent words | |
Words with frequencies 5 to 1, | |

Variations on the Information Content based Measures

Source (words) | Res | Lin | Jcn
SemCor (200,000) | | |
Brown (1,000,000) | | |
Treebank (1,000,000) | | |
BNC (100,000,000) | | |

Effect of Smoothing and Counting (Information Content from the BNC)

Counting scheme | Res | Lin | Jcn
Our counting, no smoothing | | |
Our counting, smoothing | | |
Resnik counting, no smoothing | | |
Resnik counting, smoothing | | |

Application-oriented Comparison Using Adapted Lesk as a test-bed, we determine accuracy on SENSEVAL-2 data using each of the measures. A context window of 5 words was used.

WSD Results

Measure | Nouns Only | All POS
Jiang-Conrath | 0.46 | n/a
Ex. Gloss Overlaps | |
Lin | 0.39 | n/a
Vector | |
Hirst-St.Onge | |
Resnik | 0.29 | n/a
Leacock-Chodorow | 0.28 | n/a

Conclusions Modified the Adapted Lesk algorithm to use any measure of relatedness for WSD. Introduced Extended Gloss Overlaps as a measure of semantic relatedness. Created a new measure of relatedness based on context vectors. Compared these to five other measures of semantic relatedness.

Conclusions Comparison was done with respect to human perception of relatedness. An application-oriented comparison of the measures was also done.

Future Work Determining ways to get better word vectors (frequency cutoffs). Dimensionality reduction techniques, such as Singular Value Decomposition (SVD). Use of semantic relatedness in Medical Informatics (on-going). Principled approach to context selection (WSD).

WordNet::Similarity User base (a conservative estimate): United States: Carnegie Mellon University; Texas Tech University; University of California, Berkeley; Stanford University; University of Maryland, College Park; University of Minnesota, Duluth; University of North Texas; Mayo Clinic, Rochester. Canada: University of Alberta; University of Ottawa; University of Toronto; York University. Australia: Monash University. China: Microsoft Research. Spain: Basque Country University.