Sentiment Analysis CMPT 733

Outline
- What is sentiment analysis?
- Overview of approach
- Feature representation: Term Frequency – Inverse Document Frequency (TF-IDF), Word2Vec skip-gram
- Model training: linear regression
- Assignment 2

What is sentiment analysis?
Wikipedia: sentiment analysis aims to determine the attitude of a speaker or writer with respect to some topic, or the overall contextual polarity of a document.
Examples:
- "Full of zany characters and richly applied satire, and some great plot twists": is this a positive or negative review?
- Public opinion on the stock market mined from tweets
- What do people think about a political candidate or issue?
- Can we predict election outcomes or market performance from sentiment analysis?

Overview of Approach
Running example: sentiment analysis of Amazon reviews.
- An Amazon review consists of both text and a rating.
- We learn the relationship between the text content and the rating.

Overview of Approach
[Figure: the pipeline. Review text ("I purchased one of these from Walmart …") → feature extraction → feature representation → linear regression → a score in [1, 5].]

TF-IDF
Term Frequency (TF): the number of times each word appears in a review.
Example review: "My small cat loves this carrier. It is very soft inside and it has a small window that my cat can use to look outside."
What are the potential problems with this representation?
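As a concrete illustration, here is a minimal Python sketch of raw term counting; the tokenizer is deliberately naive (lowercase, strip surrounding punctuation) and is not part of the assignment.

```python
from collections import Counter

review = ("My small cat loves this carrier. It is very soft inside and "
          "it has a small window that my cat can use to look outside.")

# naive tokenization: lowercase and strip surrounding punctuation
tokens = [w.strip(".,!?").lower() for w in review.split()]
tf = Counter(tokens)

print(tf["small"], tf["cat"])  # 2 2
```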

TF-IDF
- Raw term frequency gives too much weight to terms that appear in long reviews; we should give equal importance to each review.
- Some words are ubiquitous but carry little meaning, yet they receive undue importance: common words usually occur one to two orders of magnitude more often than uncommon words.
- We need to suppress less significant, ubiquitous words while boosting more significant, rare words.

TF-IDF
tf(term, review) = termFreqInReview(term, review) / totalTermsInReview(review)
How does this solve the problem of variable-length reviews?
idf(term) = log((totalReviews + 1) / (reviewsContainingTerm(term) + 1))
How does this solve the problem of ubiquitous versus rare words?
tf-idf(term, review) = tf(term, review) * idf(term)
How does this overall reflect the importance of a particular term in a particular review?
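These definitions translate directly into code. A minimal sketch in plain Python, interpreting the denominator of idf as the number of reviews containing the term (the standard document frequency); the helper names and the three-review toy corpus are illustrative:

```python
import math

def tf(term, review_tokens):
    # normalize by review length so long and short reviews weigh equally
    return review_tokens.count(term) / len(review_tokens)

def idf(term, corpus):
    # corpus: list of token lists; df = number of reviews containing the term
    df = sum(term in review for review in corpus)
    return math.log((len(corpus) + 1) / (df + 1))

def tf_idf(term, review_tokens, corpus):
    return tf(term, review_tokens) * idf(term, corpus)

corpus = [["my", "cat", "loves", "this", "carrier"],
          ["this", "carrier", "is", "great"],
          ["my", "dog", "hates", "this"]]
print(tf_idf("cat", corpus[0], corpus))   # rare word: positive weight
print(tf_idf("this", corpus[0], corpus))  # ubiquitous word: weight 0
```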

Can you spot any problems with the TF-IDF representation?

TF-IDF
- Pays no attention to word semantics:
- Words with similar meanings are treated as unrelated.
- Words with different meanings in different contexts are treated as the same.
- It would be nice to incorporate some word-semantics information into the feature representation.

Word2Vec
Word semantics are based on context, i.e., nearby words. Example:
- I love having cereal in the morning for breakfast.
- My breakfast is usually jam with butter.
- The best part of my day is the morning's fresh coffee with a hot breakfast.
'cereal', 'jam', 'butter', and 'coffee' are related. We need to represent each word such that similar words have similar representations.

Word2Vec: Skip-gram
Sentence: "Insurgents killed in ongoing fighting."
Bi-grams = {insurgents killed, killed in, in ongoing, ongoing fighting}
2-skip-bi-grams = {insurgents killed, insurgents in, insurgents ongoing, killed in, killed ongoing, killed fighting, in ongoing, in fighting, ongoing fighting}
Tri-grams = {insurgents killed in, killed in ongoing, in ongoing fighting}
2-skip-tri-grams = {insurgents killed in, insurgents killed ongoing, insurgents killed fighting, insurgents in ongoing, insurgents in fighting, insurgents ongoing fighting, killed in ongoing, killed in fighting, killed ongoing fighting, in ongoing fighting}
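These sets can be generated mechanically. A small Python sketch, assuming the usual definition that a k-skip-n-gram keeps tokens in order and skips at most k words in total:

```python
from itertools import combinations

def skip_ngrams(tokens, n, k):
    # in-order n-tuples whose indices span at most (n - 1) + k positions,
    # i.e. at most k skipped words in total
    return [tuple(tokens[i] for i in idx)
            for idx in combinations(range(len(tokens)), n)
            if idx[-1] - idx[0] <= (n - 1) + k]

tokens = "insurgents killed in ongoing fighting".split()
print(skip_ngrams(tokens, 2, 0))  # the 4 plain bi-grams
print(skip_ngrams(tokens, 2, 2))  # the 9 2-skip-bi-grams above
print(skip_ngrams(tokens, 3, 2))  # the 10 2-skip-tri-grams above
```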

Word2Vec: skip-gram
- A neural network is trained on the context of each word: given a word, it predicts the context.
- The trained network's internal representation of a word is used as that word's feature vector.
- We then need to combine the feature vectors of the words in a review to get the overall feature vector of the review.

Word2Vec: skip-gram
[Figure: skip-gram network for the input word 'cereal'. A 1×V one-hot input is multiplied by a V×N weight matrix to give a 1×N projection layer, which an N×V matrix maps to 1×V output distributions over context words such as 'I', 'love', 'for', 'morning'. V = number of distinct words.]

Word2Vec: skip-gram
- The input layer selects a single word among the V words in the vocabulary.
- The output layer gives C vectors (C = size of the context), each of which selects one word among the V words.
- A weight matrix W of dimension V×N transforms the input vector into a 1×N vector. N can informally be thought of as the number of characteristics of a word; the value at each position reflects how strongly that characteristic is present.
- A weight matrix W' of dimension N×V transforms the projection layer into each output vector: we are checking which output word at a particular skip position best matches the characteristics of the input word.
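The forward pass is just two matrix products. A minimal NumPy sketch with toy dimensions; training, i.e. fitting W and W' by backpropagating a loss over the C context words, is omitted:

```python
import numpy as np

V, N, C = 10, 4, 2  # toy vocabulary size, embedding size, context size
rng = np.random.default_rng(0)
W = rng.normal(size=(V, N))      # input -> projection; row i is word i's embedding
W_out = rng.normal(size=(N, V))  # projection -> output, shared across context positions

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x = np.zeros(V)
x[3] = 1.0                  # one-hot input word (1 x V)
h = x @ W                   # projection layer (1 x N): simply row 3 of W
scores = h @ W_out          # scores over the vocabulary (1 x V)
context_probs = [softmax(scores) for _ in range(C)]  # one distribution per context position
```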

Word2Vec: skip-gram
- Unsupervised learning
- Semantic representation: 'cat' will be close to 'kitten'
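With a library such as gensim (gensim 4 API assumed; the tiny stand-in corpus below is made up), this looks roughly as follows. On a real review corpus, 'kitten' would be expected to rank near 'cat':

```python
from gensim.models import Word2Vec

# stand-in corpus: an iterable of tokenized reviews, repeated so every
# word clears min_count; the assignment would use real review text
corpus = [["my", "cat", "loves", "this", "carrier"],
          ["my", "kitten", "loves", "this", "soft", "bed"],
          ["the", "dog", "hates", "the", "carrier"]] * 100
model = Word2Vec(corpus, vector_size=50, window=5, sg=1, min_count=5)  # sg=1: skip-gram

print(model.wv.most_similar("cat", topn=3))  # nearest neighbours in vector space
```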

Word2Vec: skip-gram
Suggest some ways to combine the feature vectors of the words appearing in a review to get the overall feature vector of the review.

Word2Vec: skip-gram
[Figure: the word vectors of "My cat loves this carrier" are averaged to form the feature vector of review r.]

Word2Vec: skip-gram
[Figure: word vectors in the embedding space. Related words sit close together (cat, kitten, kitty, pet; dog, puppy, pup, doggy; love, like, favor; happy, satisfied, fulfilled), with an × marking each cluster's center.]

Word2Vec: skip-gram
[Figure: the words of "My cat loves this carrier" are mapped through Word2Vec and then combined, either by averaging or by using a cluster representation.]

Word2Vec: skip-gram
- We need to represent an overall review with a feature vector, not just individual words.
- We could average the feature vectors of the individual words in the review.
- Or we could cluster the words in the corpus and use the degree of presence of each cluster in a review as the feature vector. (Both options are sketched below.)
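A sketch of both options; the toy embedding table, dimensions, and cluster count are made up for illustration, and in the assignment the vectors would come from the trained Word2Vec model:

```python
import numpy as np
from sklearn.cluster import KMeans

# toy embedding table standing in for trained Word2Vec vectors
rng = np.random.default_rng(0)
vocab = ["my", "cat", "loves", "this", "carrier", "dog", "great"]
wv = {w: rng.normal(size=8) for w in vocab}

def review_vector_avg(tokens, wv, dim=8):
    # option 1: average the vectors of the in-vocabulary words
    vecs = [wv[w] for w in tokens if w in wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

# option 2: cluster all word vectors once, then represent a review
# by how strongly each cluster is present in it
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(np.array(list(wv.values())))

def review_vector_clusters(tokens, wv, kmeans):
    # normalized histogram over the word clusters found in the review
    hist = np.zeros(kmeans.n_clusters)
    words = [w for w in tokens if w in wv]
    if not words:
        return hist
    for c in kmeans.predict(np.array([wv[w] for w in words])):
        hist[c] += 1
    return hist / len(words)

tokens = "my cat loves this carrier".split()
print(review_vector_avg(tokens, wv))
print(review_vector_clusters(tokens, wv, kmeans))
```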

Model Training
- Find the relationship between review feature vectors and rating scores by training a linear regression model.
- Use the model to predict the rating of a test review.
- Broader classification, such as positive versus negative, is also possible by applying a threshold to the predicted rating.
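A minimal scikit-learn sketch of this step; the synthetic data stands in for the review feature vectors and star ratings:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# synthetic stand-ins: X would hold one TF-IDF or averaged Word2Vec vector
# per review, y the 1-5 star ratings
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = np.clip(2.0 * X[:, 0] + 3.0 + rng.normal(scale=0.5, size=200), 1.0, 5.0)

model = LinearRegression().fit(X[:150], y[:150])   # train on 150 reviews
pred = model.predict(X[150:])                      # predict the rest

# coarser positive/negative labels via a threshold on the predicted rating
labels = ["positive" if p >= 3.0 else "negative" for p in pred]
```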

Assignment 2
- TF-IDF representation
- Train a linear regression model
- Train a Word2Vec representation
- Extract the average Word2Vec vector for each review
- Cluster the Word2Vec features