Leveraging Sentiment to Compute Word Similarity
GWC 2012, Matsue, Japan
Authors: Balamurali A R *,+, Subhabrata Mukherjee +, Akshat Malu +, Pushpak Bhattacharyya +
* IITB-Monash Research Academy, IIT Bombay
+ Dept. of Computer Science and Engineering, IIT Bombay
Slides Acknowledgement: Akshat Malu
Roadmap
- Similarity Metrics
- SenSim Metric: Our Approach
- Evaluation
- Results & Conclusion
Similarity Metrics
- An essential component of many NLP systems. Examples: word sense disambiguation (Banerjee & Pedersen, 2002) and malapropism detection (Hirst & St-Onge, 1997).
- Underlying principle: words are similar if they are distributionally similar in terms of their meaning. Example: "refuge" and "asylum" are similar.
- Existing approaches find the similarity between a word pair based on their meaning (definition) alone.
Similarity Metrics – Is meaning alone enough?
(Slide figure: the word pairs "refuge"/"madhouse" and "asylum"/"madhouse" – meaning-based similarity links them, yet their sentiment differs.)
Similarity & Sentiment
Our hypothesis: "Knowing the sentiment content of the words is beneficial in measuring their similarity."
SenSim Metric: Our Approach
SenSim Metric
- Uses sentiment along with the meaning of the words to calculate their similarity.
- The gloss of a synset is its most informative piece; we leverage it to calculate both the meaning-based similarity and the sentiment similarity of the word pair.
- Our metric uses a gloss-vector based approach with cosine similarity.
Gloss Vector
- The gloss vector is created by representing all the words of the gloss in the form of a vector. Assumption: the synset for the word is already known.
- Each dimension of the gloss vector holds the sentiment score of the corresponding content word.
- Sentiment scores are obtained from different scoring functions based on an external lexicon; SentiWordNet 1.0 is used as that lexicon.
- Problem: the vector thus formed is too sparse.
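As a rough illustration (not the authors' implementation), the sketch below builds such a sentiment-scored gloss vector with NLTK's WordNet and SentiWordNet interfaces; the choice of scoring function (a simple positive-minus-negative difference) and the whitespace tokenization are our assumptions.

```python
# A rough sketch of a sentiment-scored gloss vector, assuming NLTK with the
# 'wordnet' and 'sentiwordnet' corpora downloaded.
from nltk.corpus import wordnet as wn, sentiwordnet as swn

def sentiment_score(word):
    """Sentiment score of a gloss word from SentiWordNet.
    Taking the first listed sense is a simplifying assumption."""
    senses = list(swn.senti_synsets(word))
    if not senses:
        return 0.0
    return senses[0].pos_score() - senses[0].neg_score()  # SD-style difference

def gloss_vector(synset, vocabulary):
    """One dimension per vocabulary word: its sentiment score if it occurs
    in the synset's gloss, else 0 -- which is why the raw vector is sparse."""
    gloss_words = set(synset.definition().lower().split())
    return [sentiment_score(w) if w in gloss_words else 0.0
            for w in vocabulary]

refuge = wn.synsets('refuge', pos=wn.NOUN)[0]
vocab = sorted(set(refuge.definition().lower().split()))
print(gloss_vector(refuge, vocab))
```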
Augmenting Gloss Vector
- To counter the sparsity of gloss vectors, they are augmented with the glosses of related synsets (via relations such as hypernymy and hyponymy).
- The context is further extended by adding the glosses of the synsets of the words present in the gloss of the original word.
- Example for "refuge" (gloss: "a shelter from danger or hardship"): pulling in related synsets such as "area", "country", and "harborage" grows the gloss step by step into "a shelter from danger or hardship; a particular geographical region of indefinite boundary; a place of refuge; a structure that provides privacy and protection from danger; the condition of being susceptible to harm or injury; a state of misfortune or affliction".
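A sketch of this augmentation step, again with NLTK's WordNet interface; which related synsets to include (here hypernyms and hyponyms, one level deep) is our assumption based on the relations named on the slide.

```python
# Sketch: augment a synset's gloss with glosses of related synsets.
from nltk.corpus import wordnet as wn

def augmented_gloss(synset):
    """Concatenate the synset's gloss with the glosses of related synsets,
    then with glosses of synsets of the words in the original gloss."""
    parts = [synset.definition()]
    for related in synset.hypernyms() + synset.hyponyms():
        parts.append(related.definition())
    for word in synset.definition().split():
        for s in wn.synsets(word)[:1]:  # first sense only; a simplification
            parts.append(s.definition())
    return '; '.join(parts)

refuge = wn.synsets('refuge', pos=wn.NOUN)[0]
print(augmented_gloss(refuge))
```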
Scoring Functions
- Sentiment Difference (SD): the difference between the positive and negative sentiment values.
  Score_SD(A) = SWN_pos(A) - SWN_neg(A)
- Sentiment Max (SM): the greater of the positive and negative sentiment values.
  Score_SM(A) = max(SWN_pos(A), SWN_neg(A))
- Sentiment Threshold Difference (TD): same as SD, but with a minimum threshold value.
  Score_TD(A) = sign(max(SWN_pos(A), SWN_neg(A))) * (1 + abs(SWN_pos(A) - SWN_neg(A)))
- Sentiment Threshold Max (TM): same as SM, but with a minimum threshold value.
  Score_TM(A) = sign(max(SWN_pos(A), SWN_neg(A))) * (1 + abs(max(SWN_pos(A), SWN_neg(A))))
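The four scoring functions transcribe directly into code; a minimal sketch (the threshold test itself is not spelled out on the slide, so we implement the formulas exactly as written):

```python
import math

def sign(x):
    """Sign of x (0 for 0), as used in the TD/TM formulas."""
    return math.copysign(1.0, x) if x != 0 else 0.0

def score_sd(pos, neg):
    """Sentiment Difference: positive minus negative SentiWordNet score."""
    return pos - neg

def score_sm(pos, neg):
    """Sentiment Max: the greater of the two scores."""
    return max(pos, neg)

def score_td(pos, neg):
    """Sentiment Threshold Difference, transcribed as written on the slide."""
    return sign(max(pos, neg)) * (1 + abs(pos - neg))

def score_tm(pos, neg):
    """Sentiment Threshold Max, transcribed as written on the slide."""
    return sign(max(pos, neg)) * (1 + abs(max(pos, neg)))

# e.g. for a word with SWN_pos = 0.625 and SWN_neg = 0.125:
print(score_sd(0.625, 0.125), score_sm(0.625, 0.125))  # 0.5 0.625
```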
SenSim Metric
SenSim_x(A, B) = cosine(gloss_vec(sense(A)), gloss_vec(sense(B)))
where
gloss_vec = [1: score_x(1), 2: score_x(2), ..., n: score_x(n)]
score_x(Y) = sentiment score of word Y using scoring function x
x = scoring function of type SD / SM / TD / TM
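A sketch tying the pieces together: the metric as cosine similarity over two sentiment-scored gloss vectors. `gloss_vector` refers to the earlier sketch, and the names are ours, not the authors'.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def sensim(synset_a, synset_b, vocabulary):
    """SenSim(A, B): cosine of the two sentiment-scored gloss vectors.
    gloss_vector (with its scoring function x) is from the earlier sketch."""
    return cosine(gloss_vector(synset_a, vocabulary),
                  gloss_vector(synset_b, vocabulary))
```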
Evaluation
Evaluation
- Intrinsic evaluation:
  - Correlation with human annotators: annotation based on meaning alone, and annotation based on sentiment and meaning combined.
  - Correlation with other metrics.
- Extrinsic evaluation: synset replacement using similarity metrics.
- SenSim scores* are compared with those given by human annotators.
* All scores normalized to a scale of 1-5 (1 = least similar, 5 = most similar).
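For the intrinsic side, agreement is measured with the Pearson correlation coefficient (as stated in the experimental setup); a minimal sketch using SciPy, where the score arrays are made-up placeholders:

```python
from scipy.stats import pearsonr

# Hypothetical similarity judgments on the 1-5 scale for the same word pairs.
annotator_1 = [5, 4, 2, 3, 1, 4]
annotator_2 = [5, 3, 2, 4, 1, 5]

r, p_value = pearsonr(annotator_1, annotator_2)
print(f"Pearson r = {r:.3f} (p = {p_value:.3f})")
```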
Unknown-feature problem in supervised classification
- Get Train_Synset_List from the training corpus (synsets) and Test_Synset_List from the test corpus (synsets).
- If a test synset T is not in Train_Synset_List: using similarity metric S, find a similar synset R and replace T with R.
- The result is a new test corpus (synsets).
Metrics used: LIN (Lin, 1998), LCH (Leacock and Chodorow, 1998), Lesk (Banerjee and Pedersen, 2002).
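A sketch of the replacement loop; the similarity metric is pluggable, and choosing the most similar training synset as the replacement is our reading of the flowchart.

```python
def replace_unknown_synsets(test_synsets, train_synsets, similarity):
    """For every test synset missing from the training vocabulary, substitute
    the most similar training synset under the given metric."""
    train_set = set(train_synsets)
    new_test = []
    for t in test_synsets:
        if t in train_set:
            new_test.append(t)
        else:
            # Using similarity metric S, find a similar synset R; replace T.
            new_test.append(max(train_synsets, key=lambda r: similarity(t, r)))
    return new_test
```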
Results & Conclusion
Results – Intrinsic Evaluation (1/4)
Sentiment as a parameter for finding similarity (1/2)
- Adding sentiment to the context yields better correlation between the annotators.
- The decrease in correlation for NOUN when sentiment is added may be because sentiment does not play as important a role for nouns.

Pearson correlation coefficient between two annotators for various annotation strategies:

Annotation Strategy  | Overall | NOUN  | VERB  | ADJECTIVE | ADVERB
Meaning              | 0.768   | 0.803 | 0.750 | 0.527     | 0.759
Meaning + Sentiment  | 0.799   | 0.750 | 0.889 | 0.720     | 0.844
Results – Intrinsic Evaluation (2/4)
Sentiment as a parameter for finding similarity (2/2)

Pearson correlation (r) of various metrics with the gold standard data*:

Metric Used  | Overall | NOUN | VERB  | ADJECTIVE | ADVERB
LESK         | 0.22    | 0.51 | -0.91 | 0.19      | 0.37
LIN          | 0.27    | 0.24 | 0.00  | NA        | NA
LCH          | 0.36    | 0.34 | 0.44  | NA        | NA
SenSim (SD)  | 0.46    | 0.73 | 0.55  | 0.08      | 0.76
SenSim (SM)  | 0.50    | 0.62 | 0.48  | 0.06      | 0.54
SenSim (TD)  | 0.45    | 0.73 | 0.55  | 0.08      | 0.59
SenSim (TM)  | 0.48    | 0.62 | 0.48  | 0.06      | 0.78

* All experiments performed on gold standard data consisting of 48 sense-marked word pairs.
Results – Extrinsic Evaluation (3/4)
Effect of SenSim on the synset replacement strategy
- Baseline denotes the experiment with no synset replacements.

Classification results of the synset replacement experiment using different similarity metrics*:

Metric Used  | Accuracy (%) | PP    | NP    | PR    | NR
Baseline     | 89.10        | 91.50 | 87.07 | 85.18 | 91.24
LESK         | 89.36        | 91.57 | 87.46 | 85.68 | 91.25
LIN          | 89.27        | 91.24 | 87.61 | 85.85 | 90.90
LCH          | 89.64        | 90.48 | 88.86 | 86.47 | 89.63
SenSim (SD)  | 89.95        | 91.39 | 88.65 | 87.11 | 90.93
SenSim (SM)  | 90.06        | 92.01 | 88.38 | 86.67 | 91.58
SenSim (TD)  | 90.11        | 91.68 | 88.69 | 86.97 | 91.23
SenSim (TM)  | 90.17        | 91.81 | 88.71 | 87.09 | 91.36

* PP = Positive Precision (%), NP = Negative Precision (%), PR = Positive Recall (%), NR = Negative Recall (%)
Results – Extrinsic Evaluation (4/4)
Effect of SenSim on the synset replacement strategy (same table as on the previous slide)
- The improvement is only marginal because no complex features are used for training the classifier.
Conclusions
- Proposed that sentiment content can aid similarity measurement, which to date has been done on the basis of meaning alone.
- Verified this hypothesis by measuring the correlation between annotators under different annotation strategies.
- Annotator correlation was higher when sentiment was included as an additional parameter for similarity measurement than with semantic similarity alone.
- SenSim, based on this hypothesis, performs better than the existing metrics, which fail to account for sentiment while calculating similarity.
References
References (1/2)
Balamurali, A., Joshi, A. & Bhattacharyya, P. (2011), Harnessing WordNet senses for supervised sentiment classification, in 'Proc. of EMNLP-2011'.
Banerjee, S. & Pedersen, T. (2002), An adapted Lesk algorithm for word sense disambiguation using WordNet, in 'Proc. of CICLing-02'.
Banerjee, S. & Pedersen, T. (2003), Extended gloss overlaps as a measure of semantic relatedness, in 'Proc. of IJCAI-03'.
Esuli, A. & Sebastiani, F. (2006), SentiWordNet: A publicly available lexical resource for opinion mining, in 'Proc. of LREC-06', Genova, IT.
Grishman, R. (2001), Adaptive information extraction and sublanguage analysis, in 'Proc. of IJCAI-01'.
Hirst, G. & St-Onge, D. (1997), 'Lexical chains as representations of context for the detection and correction of malapropisms'.
Jiang, J. J. & Conrath, D. W. (1997), Semantic similarity based on corpus statistics and lexical taxonomy, in 'Proc. of ROCLING X'.
Leacock, C. & Chodorow, M. (1998), Combining local context with WordNet similarity for word sense identification, in 'WordNet: A Lexical Reference System and its Application'.
Leacock, C., Miller, G. A. & Chodorow, M. (1998), 'Using corpus statistics and WordNet relations for sense identification', Comput. Linguist. 24.
Lin, D. (1998), An information-theoretic definition of similarity, in 'Proc. of ICML-98'.
Partington, A. (2004), 'Utterly content in each other's company: semantic prosody and semantic preference', International Journal of Corpus Linguistics 9(1).
References (2/2)
Patwardhan, S. (2003), Incorporating dictionary and corpus information into a context vector measure of semantic relatedness, Master's thesis, University of Minnesota, Duluth.
Pedersen, T., Patwardhan, S. & Michelizzi, J. (2004), WordNet::Similarity: measuring the relatedness of concepts, in 'Demonstration Papers at HLT-NAACL-04'.
Rada, R., Mili, H., Bicknell, E. & Blettner, M. (1989), 'Development and application of a metric on semantic nets', IEEE Transactions on Systems, Man, and Cybernetics 19(1).
Resnik, P. (1995a), Disambiguating noun groupings with respect to WordNet senses, in 'Proceedings of the Third Workshop on Very Large Corpora', Somerset, New Jersey.
Resnik, P. (1995b), Using information content to evaluate semantic similarity in a taxonomy, in 'Proc. of IJCAI-95'.
Richardson, R., Smeaton, A. F. & Murphy, J. (1994), Using WordNet as a knowledge base for measuring semantic similarity between words, Technical report, Proc. of AICS-94.
Sinclair, J. (2004), Trust the Text: Language, Corpus and Discourse, Routledge.
Wan, S. & Angryk, R. A. (2007), Measuring semantic similarity using WordNet-based context vectors, in 'Proc. of SMC-07'.
Wu, Z. & Palmer, M. (1994), Verb semantics and lexical selection, in 'Proc. of ACL-94', New Mexico State University, Las Cruces, New Mexico.
Zhong, Z. & Ng, H. T. (2010), It makes sense: A wide-coverage word sense disambiguation system for free text, in 'ACL 2010 System Demonstrations'.
Backup Slides…
Metrics used for Comparison
- LIN: uses the information content individually possessed by the two concepts in addition to that shared by them.
  sim_LIN(A, B) = 2 * log Pr(lso(A, B)) / (log Pr(A) + log Pr(B))
- Lesk: based on the overlap of words in the concepts' individual glosses.
- Leacock & Chodorow (LCH): based on the shortest path between the concepts through the hypernymy relation.
  sim_LCH(A, B) = -log(len(A, B) / (2D))
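The paper computed these with the WordNet::Similarity package; for illustration, NLTK exposes equivalent measures. A sketch (using Brown-corpus information content for LIN is our choice, not stated on the slide, and the first noun senses are placeholder picks):

```python
from nltk.corpus import wordnet as wn, wordnet_ic

brown_ic = wordnet_ic.ic('ic-brown.dat')  # corpus-based information content

refuge = wn.synsets('refuge', pos=wn.NOUN)[0]
asylum = wn.synsets('asylum', pos=wn.NOUN)[0]

print(refuge.lin_similarity(asylum, brown_ic))  # LIN
print(refuge.lch_similarity(asylum))            # LCH (needs same POS)
```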
How? Use of WordNet Similarity Metrics
- Get Train_Synset_List from the training corpus (synsets) and Test_Synset_List from the test corpus (synsets).
- If a test synset T is not in Train_Synset_List: using similarity metric S, find a similar synset R and replace T with R.
- The result is a new test corpus (synsets).
Metrics used: LIN (Lin, 1998), LCH (Leacock and Chodorow, 1998), Lesk (Banerjee and Pedersen, 2002).
Synset Replacement using Similarity Metric
Experimental Setup
- Datasets used:
  - Intrinsic evaluation: gold standard data containing 48 sense-marked word pairs.
  - Extrinsic evaluation: dataset provided by Balamurali et al. (2011).
- Word sense disambiguation carried out using the WSD engine of Zhong & Ng (2010) (82% accuracy).
- WordNet::Similarity 2.05 package used for computing the other metrics' similarity scores.
- Pearson correlation coefficient used to measure inter-annotator agreement.
- Sentiment classification done using C-SVM; all results are averages of five-fold cross-validation accuracies.
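As an illustration of this classification setup, a sketch of C-SVM with five-fold cross-validation using scikit-learn; the library, kernel, and the randomly generated feature matrix are our assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Hypothetical synset-feature matrix and polarity labels.
rng = np.random.default_rng(0)
X = rng.random((200, 50))           # 200 documents, 50 synset features
y = rng.integers(0, 2, size=200)    # 0 = negative, 1 = positive

clf = SVC(C=1.0, kernel='linear')   # C-SVM; the linear kernel is our choice
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean five-fold accuracy: {scores.mean():.3f}")
```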