A Brief Overview
Contents
- Introduction to NLP
- Sentiment Analysis
- Subjectivity versus Objectivity
- Determining Polarity
- Statistical & Linguistic Approaches
- Application
- Conclusion
Introduction to NLP
What is Natural Language Processing?
- The use of computer technology
- To analyse written or spoken human language
- So that computers can process the supplied data
- And respond appropriately where necessary
Sentiment Analysis
What is Sentiment Analysis?
- An attempt to determine the direction of opinion
- Positive, negative or neutral
- In a body of written text
- Using computer technology
Subjectivity Versus Objectivity
Subjectivity: opinions. Objectivity: facts.
- We need to distinguish between the two
There are a number of ways to achieve this, for example:
- Manually tag words that express opinion, emotion, evaluation or speculation
- Give each tagged word a strength value: low, medium, high or extreme
- A sentence containing a word of medium strength or higher is subjective; all other sentences are objective
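The tagging rule on this slide can be sketched in a few lines of Python. This is only an illustration: the lexicon entries, the names `STRENGTH_LEXICON` and `is_subjective`, and the simple tokenisation are assumptions, not part of the original material.

```python
import re

# Hypothetical hand-tagged lexicon: word -> strength of the opinion cue.
# A real lexicon would be far larger and built by human annotators.
STRENGTH_LEXICON = {
    "think": "low",
    "good": "medium",
    "terrible": "high",
    "love": "high",
    "worst": "extreme",
}

# Order the strength labels so they can be compared.
STRENGTH_RANK = {"low": 0, "medium": 1, "high": 2, "extreme": 3}
THRESHOLD = STRENGTH_RANK["medium"]


def is_subjective(sentence):
    """Return True if any tagged word has medium strength or higher."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return any(
        STRENGTH_RANK[STRENGTH_LEXICON[word]] >= THRESHOLD
        for word in words
        if word in STRENGTH_LEXICON
    )


print(is_subjective("The battery life is terrible"))  # True
print(is_subjective("The laptop has two USB ports"))  # False
print(is_subjective("I think it ships on Monday"))    # False: only a low-strength cue
```

The last sentence shows why the threshold matters: a low-strength word alone does not make a sentence subjective under this rule.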
Determining Polarity
Main approaches to determining polarity
- Statistical approach
- Linguistic approach
Statistical Approach
Uses calculations to determine polarity
Example: Naïve Bayes classifier
- A set of training documents is manually assigned to classes (e.g. positive, negative)
- For each class, the classifier estimates the probability of each word appearing in documents of that class
- A new document is assigned to the class under which its words are most probable
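A minimal sketch of the three steps above, written in plain Python. The training sentences, the function names `train` and `classify`, and the use of add-one (Laplace) smoothing are assumptions made for illustration; the slide does not specify a particular variant of the classifier.

```python
import math
from collections import Counter, defaultdict


def train(labelled_docs):
    """Count documents per class and word occurrences per class."""
    class_doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labelled_docs:
        class_doc_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_doc_counts, word_counts, vocab


def classify(text, model):
    """Return the class with the highest log-probability for the text."""
    class_doc_counts, word_counts, vocab = model
    total_docs = sum(class_doc_counts.values())
    scores = {}
    for label, n_docs in class_doc_counts.items():
        # Prior: share of training documents in this class.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing so unseen class/word pairs are not zero.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical, manually classed training documents.
TRAINING = [
    ("great screen and great battery", "positive"),
    ("fast and reliable laptop", "positive"),
    ("slow and noisy laptop", "negative"),
    ("terrible battery and poor screen", "negative"),
]

model = train(TRAINING)
print(classify("great reliable battery", model))  # positive
print(classify("slow terrible screen", model))    # negative
```

Log-probabilities are summed rather than multiplying raw probabilities, which avoids numerical underflow on longer documents.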
Statistical Approach cont.
Limitation:
- Assumes that words occur independently of one another
- E.g. in a review of laptops, ‘small’ may be tagged as positive because it usually refers to the size of the laptop. But what if ‘small’ is followed by ‘computer’ and then ‘memory’?
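One way to see the limitation: in the usual bag-of-words formulation, a classifier that treats words independently sees only word counts, so the words surrounding ‘small’ are lost. The short sketch below uses two made-up review fragments (an assumption for illustration) that say opposite things about memory yet produce identical features.

```python
from collections import Counter


def bag_of_words(text):
    """Reduce a text to word counts, discarding word order."""
    return Counter(text.lower().split())


features_a = bag_of_words("small laptop with large memory")
features_b = bag_of_words("large laptop with small memory")

# Both fragments map to the same counts, so a classifier that relies
# only on these counts cannot tell them apart.
print(features_a == features_b)  # True
```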
Linguistic Approach
Uses the structure and meaning of language
Mostly uses the predetermined polarity of adjectives
Limitation
- Ambiguity of human language
- E.g. the word ‘sucks’ is tagged as negative: ‘the movie sucks’ versus ‘it appears that the baby sucks his thumb whenever he wants to sleep’
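The ambiguity problem is easy to reproduce with a toy lexicon-based scorer. The word list and the function name `polarity` below are hypothetical; the two sentences are the ones from the slide. This sketch looks up word polarity only and does no parsing, so it is a deliberately simplified stand-in for a linguistic approach.

```python
import re

# Hypothetical polarity lexicon: +1 for positive words, -1 for negative.
POLARITY = {"good": 1, "great": 1, "bad": -1, "boring": -1, "sucks": -1}


def polarity(text):
    """Sum the predetermined polarity of each known word."""
    words = re.findall(r"[a-z']+", text.lower())
    total = sum(POLARITY.get(word, 0) for word in words)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


print(polarity("the movie sucks"))  # negative, as intended
# Also negative, although the sentence expresses no opinion at all:
print(polarity("it appears that the baby sucks his thumb whenever he wants to sleep"))
```

Both sentences come out negative because the lexicon has no way to tell the two senses of ‘sucks’ apart.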
Application
Influencing decisions
- From an individual trying to purchase an item to a government trying to please its people
Market intelligence
- Getting information on competitors’ strong points
Dealing with customer experience
- People may write about their experience online but not fill in questionnaires
Conclusion
- A difficult area of study
- Huge ongoing effort to solve its challenges
- Given its importance, I believe these challenges will be solved in the future
- Still a cost-effective way to retrieve opinion
Questions?