Leveraging Sentiment to Compute Word Similarity
GWC 2012, Matsue, Japan
Authors: Balamurali A R *,+, Subhabrata Mukherjee +, Akshat Malu +, Pushpak Bhattacharyya +
* IITB-Monash Research Academy, IIT Bombay
+ Dept. of Computer Science and Engineering, IIT Bombay
Slides Acknowledgement: Akshat Malu

Roadmap
- Similarity Metrics
- SenSim Metric: Our Approach
- Evaluation
- Results & Conclusion

Similarity Metrics

- An unavoidable component in many NLP systems
- Examples: word sense disambiguation (Banerjee & Pedersen, 2002), malapropism detection (Hirst & St-Onge, 1997)
- Underlying principle: words are similar if they are distributionally similar in terms of their meaning
- Example: Refuge and Asylum are similar
- Existing approaches: find the similarity between a word pair based on their meaning (definition)

Similarity Metrics – Is meaning alone enough?
(Slide illustration: the word pairs Refuge / Mad house and Asylum / Mad house)

Similarity & Sentiment
- Our hypothesis: "Knowing the sentiment content of the words is beneficial in measuring their similarity"

SenSim Metric: Our Approach

SenSim Metric
- Uses sentiment along with the meaning of the words to calculate their similarity
- The gloss of the synset is the most informative piece of information
- We leverage it to calculate both the meaning-based similarity and the sentiment similarity of the word pair
- We use a gloss-vector-based approach with cosine similarity in our metric

Gloss Vector
- The gloss vector is created by representing all the words of the gloss in the form of a vector
- Assumption: the synset for the word is already known
- Each dimension of the gloss vector represents the sentiment score of the respective content word
- Sentiment scores are obtained from different scoring functions based on an external lexicon
- SentiWordNet 1.0 is used as the external lexicon
- Problem: the vector thus formed is too sparse

Augmenting Gloss Vector
- To counter the sparsity of gloss vectors, they are augmented with the glosses of the related synsets
- The context is further extended by adding the glosses of the synsets of the words present in the gloss of the original word
- Example (slide diagram): the gloss of Refuge, "a shelter from danger or hardship", is progressively extended with the glosses of related synsets reached through hypernymy/hyponymy (Area, Country, Harborage, ...), e.g. "a shelter from danger or hardship; a particular geographical region of indefinite boundary; a place of refuge; a structure that provides privacy and protection from danger; ..."
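To make the augmentation step concrete, here is a minimal sketch built on NLTK's WordNet interface (it assumes the NLTK wordnet corpus is installed; the helper name augmented_gloss and the first-sense expansion of gloss words are illustrative choices, not the authors' implementation):

```python
from nltk.corpus import wordnet as wn

def augmented_gloss(synset, expand_gloss_words=True):
    """Concatenate a synset's gloss with the glosses of related synsets and,
    optionally, with the glosses of the (first-sense) synsets of the words
    appearing in its own gloss."""
    parts = [synset.definition()]
    # Glosses of directly related synsets (hypernyms and hyponyms).
    for rel in synset.hypernyms() + synset.hyponyms():
        parts.append(rel.definition())
    # Extend the context with glosses of synsets of words in the original gloss.
    if expand_gloss_words:
        for word in synset.definition().split():
            senses = wn.synsets(word)
            if senses:                     # skip words with no synset of their own
                parts.append(senses[0].definition())
    return "; ".join(parts)

refuge = wn.synsets('refuge', pos=wn.NOUN)[0]
print(augmented_gloss(refuge))
```

Other WordNet relations (e.g. meronymy) could be added to the same loop if a larger context is wanted.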

Scoring Functions
- Sentiment Difference (SD): difference between the positive and negative sentiment values
  Score_SD(A) = SWN_pos(A) − SWN_neg(A)
- Sentiment Max (SM): the greater of the positive and negative sentiment values
  Score_SM(A) = max(SWN_pos(A), SWN_neg(A))
- Sentiment Threshold Difference (TD): same as SD but with a minimum threshold value
  Score_TD(A) = sign(max(SWN_pos(A), SWN_neg(A))) × (1 + |SWN_pos(A) − SWN_neg(A)|)
- Sentiment Threshold Max (TM): same as SM but with a minimum threshold value
  Score_TM(A) = sign(max(SWN_pos(A), SWN_neg(A))) × (1 + |max(SWN_pos(A), SWN_neg(A))|)
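Written out in code, the four scoring functions need only the SentiWordNet positive and negative scores of a word; a small Python sketch (the function names are illustrative, not from the paper):

```python
def _sign(x):
    """Sign of x: +1, 0, or -1."""
    return (x > 0) - (x < 0)

def score_sd(pos, neg):
    """Sentiment Difference: positive minus negative SentiWordNet score."""
    return pos - neg

def score_sm(pos, neg):
    """Sentiment Max: the greater of the positive and negative scores."""
    return max(pos, neg)

def score_td(pos, neg):
    """Sentiment Threshold Difference: as SD, offset by 1 so scored words clear a minimum magnitude."""
    return _sign(max(pos, neg)) * (1 + abs(pos - neg))

def score_tm(pos, neg):
    """Sentiment Threshold Max: as SM, offset by 1."""
    return _sign(max(pos, neg)) * (1 + abs(max(pos, neg)))
```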

SenSim Metric
SenSim_x(A, B) = cosine(gloss_vec(sense(A)), gloss_vec(sense(B)))
where:
- gloss_vec = [1: score_x(1), 2: score_x(2), ..., n: score_x(n)]
- score_x(Y) = sentiment score of word Y using scoring function x
- x = scoring function of type SD / SM / TD / TM
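Putting the pieces together, SenSim is the cosine of two gloss vectors whose dimensions hold the chosen sentiment scores. A minimal sketch, assuming the NLTK sentiwordnet corpus is available and using a plain bag-of-words gloss vector (the helper names are illustrative, not the authors' code):

```python
from math import sqrt
from nltk.corpus import sentiwordnet as swn   # requires the 'sentiwordnet' corpus

def sentiment_score(word, scoring_fn):
    """Score a word via its first SentiWordNet entry; 0 if the word is unknown."""
    entries = list(swn.senti_synsets(word))
    if not entries:
        return 0.0
    return scoring_fn(entries[0].pos_score(), entries[0].neg_score())

def gloss_vector(gloss, vocab, scoring_fn):
    """One dimension per vocabulary word, holding its sentiment score if it occurs in the gloss."""
    present = set(gloss.lower().split())
    return [sentiment_score(w, scoring_fn) if w in present else 0.0 for w in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu, nv = sqrt(sum(a * a for a in u)), sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def sensim(gloss_a, gloss_b, scoring_fn):
    """SenSim_x(A, B): cosine of the sentiment-scored gloss vectors of A and B."""
    vocab = sorted(set(gloss_a.lower().split()) | set(gloss_b.lower().split()))
    return cosine(gloss_vector(gloss_a, vocab, scoring_fn),
                  gloss_vector(gloss_b, vocab, scoring_fn))
```

Calling sensim(gloss_a, gloss_b, score_sd) would score a word pair under the SD scoring function; the other scoring functions can be swapped in the same way.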

Evaluation

Evaluation setup (slide diagram)
- Intrinsic evaluation: comparing scores given by SenSim with those given by human annotators (all scores normalized to a scale of 1-5; 1 = least similar, 5 = most similar)
  - Correlation with human annotators: annotation based on meaning, and annotation based on sentiment and meaning combined
  - Correlation with other metrics
- Extrinsic evaluation: synset replacement using similarity metrics

- Unknown feature problem in supervised classification: a test synset may never have been seen in training
- Procedure (flowchart):
  1. Get Train_Synset_List from the training corpus and Test_Synset_List from the test corpus
  2. If a test synset T is not in Train_Synset_List, use similarity metric S to find a similar synset R
  3. Replace T with R to produce the new test corpus
- Metrics used: LIN (Lin, 1998), LCH (Leacock and Chodorow, 1998), Lesk (Banerjee and Pedersen, 2002)
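The replacement loop itself is short; a hedged sketch of the flowchart above (the similarity argument stands for any of the metrics listed, and the function name is illustrative, not the authors' code):

```python
def replace_unknown_synsets(test_synsets, train_synsets, similarity):
    """For each test synset T not seen in training, substitute the most similar
    training synset R under the given similarity metric (SenSim, LIN, LCH, or Lesk)."""
    train = list(set(train_synsets))
    replaced = []
    for t in test_synsets:
        if t in train:
            replaced.append(t)                       # known feature: keep as is
        else:
            best = max(train, key=lambda r: similarity(t, r))
            replaced.append(best)                    # unknown feature: replace T with R
    return replaced
```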

Results & Conclusion

Results – Intrinsic Evaluation (1/4)
- Sentiment as a parameter for finding similarity (1/2)
- Adding sentiment to the context yields better correlation among the annotators
- The decrease in correlation on adding sentiment in the case of NOUN may be because sentiment does not play as important a role for nouns
- Table: Pearson correlation coefficient between two annotators for the two annotation strategies (Meaning; Meaning+Sentiment); columns: Overall, NOUN, VERB, ADJECTIVE, ADVERB

Results – Intrinsic Evaluation (2/4)
- Sentiment as a parameter for finding similarity (2/2)
- Table: Pearson correlation (r) of the various metrics with the gold-standard data; rows: LESK, LIN, LCH, SenSim (SD), SenSim (SM), SenSim (TD), SenSim (TM), with NA entries for LIN and LCH on some parts of speech; columns: Overall, NOUN, VERB, ADJECTIVE, ADVERB
- All experiments performed on gold-standard data consisting of 48 sense-marked word pairs

Results – Extrinsic Evaluation (3/4)
- Effect of SenSim on the synset replacement strategy
- Baseline denotes the experiment in which no synset replacements are made
- Table: classification results of the synset replacement experiment using different similarity metrics; rows: Baseline, LESK, LIN, LCH, SenSim (SD), SenSim (SM), SenSim (TD), SenSim (TM); columns: Accuracy (%), PP, NP, PR, NR (PP = Positive Precision (%), NP = Negative Precision (%), PR = Positive Recall (%), NR = Negative Recall (%))

Results – Extrinsic Evaluation (4/4)
- Effect of SenSim on the synset replacement strategy
- The improvement is only marginal, as no complex features are used for training the classifier
- Table: same classification results as in (3/4); rows: Baseline, LESK, LIN, LCH, SenSim (SD), SenSim (SM), SenSim (TD), SenSim (TM); columns: Accuracy (%), PP, NP, PR, NR

Conclusions
- Proposed that sentiment content can aid in similarity measurement, which to date has been done on the basis of meaning alone
- Verified this hypothesis by measuring the correlation between annotators under different annotation strategies
- Annotator correlation when sentiment was included as an additional parameter for similarity measurement was higher than for semantic similarity alone
- SenSim, built on this hypothesis, performs better than the existing metrics, which fail to account for sentiment when calculating similarity

References

References (1/2)
- Balamurali, A., Joshi, A. & Bhattacharyya, P. (2011), Harnessing WordNet senses for supervised sentiment classification, in 'Proc. of EMNLP-2011'.
- Banerjee, S. & Pedersen, T. (2002), An adapted Lesk algorithm for word sense disambiguation using WordNet, in 'Proc. of CICLing-02'.
- Banerjee, S. & Pedersen, T. (2003), Extended gloss overlaps as a measure of semantic relatedness, in 'Proc. of IJCAI-03'.
- Esuli, A. & Sebastiani, F. (2006), SentiWordNet: A publicly available lexical resource for opinion mining, in 'Proceedings of LREC-06', Genova, IT.
- Grishman, R. (2001), Adaptive information extraction and sublanguage analysis, in 'Proc. of IJCAI-01'.
- Hirst, G. & St-Onge, D. (1997), 'Lexical chains as representations of context for the detection and correction of malapropisms'.
- Jiang, J. J. & Conrath, D. W. (1997), Semantic similarity based on corpus statistics and lexical taxonomy, in 'Proc. of ROCLING X'.
- Leacock, C. & Chodorow, M. (1998), Combining local context with WordNet similarity for word sense identification, in 'WordNet: A Lexical Reference System and its Application'.
- Leacock, C., Miller, G. A. & Chodorow, M. (1998), 'Using corpus statistics and WordNet relations for sense identification', Comput. Linguist. 24.
- Lin, D. (1998), An information-theoretic definition of similarity, in 'Proc. of ICML '98'.
- Partington, A. (2004), 'Utterly content in each other's company: semantic prosody and semantic preference', International Journal of Corpus Linguistics 9(1).

References (2/2)
- Patwardhan, S. (2003), Incorporating dictionary and corpus information into a context vector measure of semantic relatedness, Master's thesis, University of Minnesota, Duluth.
- Pedersen, T., Patwardhan, S. & Michelizzi, J. (2004), WordNet::Similarity: measuring the relatedness of concepts, in 'Demonstration Papers at HLT-NAACL'04'.
- Rada, R., Mili, H., Bicknell, E. & Blettner, M. (1989), 'Development and application of a metric on semantic nets', IEEE Transactions on Systems, Man, and Cybernetics 19(1).
- Resnik, P. (1995a), Disambiguating noun groupings with respect to WordNet senses, in 'Proceedings of the Third Workshop on Very Large Corpora', Somerset, New Jersey.
- Resnik, P. (1995b), Using information content to evaluate semantic similarity in a taxonomy, in 'Proc. of IJCAI-95'.
- Richardson, R., Smeaton, A. F. & Murphy, J. (1994), Using WordNet as a knowledge base for measuring semantic similarity between words, Technical report, Proc. of AICS-94.
- Sinclair, J. (2004), Trust the Text: Language, Corpus and Discourse, Routledge.
- Wan, S. & Angryk, R. A. (2007), Measuring semantic similarity using WordNet-based context vectors, in 'Proc. of SMC'07'.
- Wu, Z. & Palmer, M. (1994), Verb semantics and lexical selection, in 'Proc. of ACL-94', New Mexico State University, Las Cruces, New Mexico.
- Zhong, Z. & Ng, H. T. (2010), It Makes Sense: A wide-coverage word sense disambiguation system for free text, in 'ACL 2010 (System Demonstrations)'.

Backup Slides…

Metrics used for Comparison
- LIN: uses the information content individually possessed by two concepts in addition to that shared by them
  sim_LIN(A, B) = 2 × log Pr(lso(A, B)) / (log Pr(A) + log Pr(B))
- Lesk: based on the overlap of words in their individual glosses
- Leacock & Chodorow (LCH): based on the shortest path through the hypernymy relation
  sim_LCH(A, B) = −log(len(A, B) / (2D))
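Both LIN and LCH are available through NLTK's WordNet interface, which is one way to reproduce these comparison scores (a sketch assuming the wordnet and wordnet_ic corpora are downloaded; LIN additionally needs an information-content file, and both metrics require synsets of the same part of speech):

```python
from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

# Information-content counts needed by LIN (from the 'wordnet_ic' corpus).
brown_ic = wordnet_ic.ic('ic-brown.dat')

refuge = wn.synsets('refuge', pos=wn.NOUN)[0]
asylum = wn.synsets('asylum', pos=wn.NOUN)[0]

print(refuge.lin_similarity(asylum, brown_ic))  # information-content based
print(refuge.lch_similarity(asylum))            # path-length / taxonomy-depth based
```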

How? Use of WordNet Similarity Metrics
- Procedure (flowchart): get Train_Synset_List from the training corpus and Test_Synset_List from the test corpus; if a test synset T is not in Train_Synset_List, use similarity metric S to find a similar synset R and replace T with R, producing the new test corpus
- Metrics used: LIN (Lin, 1998), LCH (Leacock and Chodorow, 1998), Lesk (Banerjee and Pedersen, 2002)

Synset Replacement using Similarity Metric

Experimental Setup
- Datasets used:
  - Intrinsic evaluation: gold-standard data containing 48 sense-marked word pairs
  - Extrinsic evaluation: dataset provided by Balamurali et al. (2011)
- Word sense disambiguation carried out using the WSD engine by Zhong & Ng (2010) (82% accuracy)
- WordNet::Similarity 2.05 package used to compute the similarity scores of the other metrics
- Pearson correlation coefficient used to measure inter-annotator agreement
- Sentiment classification done using C-SVM; all results are averages of five-fold cross-validation accuracies
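A minimal sketch of the classification protocol described above, using scikit-learn's C-SVM with five-fold cross-validation (the synthetic feature matrix is only a stand-in to keep the sketch runnable; the actual experiments use synset features from the sense-marked review corpus, and this is not the authors' pipeline):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Stand-in data: in the experiments the features are synsets from the
# sense-marked corpus (after any replacements); random features used here.
X, y = make_classification(n_samples=200, n_features=50, random_state=0)

clf = SVC(kernel='linear', C=1.0)            # C-SVM classifier
scores = cross_val_score(clf, X, y, cv=5)    # five-fold cross-validation
print("mean accuracy over 5 folds:", scores.mean())
```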