Insight Ahmad Jabi | Yazan Shakhshir | Saleem Abu Dhair

Insight Ahmad Jabi | Yazan Shakhshir | Saleem Abu Dhair Supervised by Dr. Nizar Awartani

Contents Abstract; Introduction; What is NLP; Sentiment Analysis; Scope of “Insight”; Insight Objectives; Methodology; Tools: Stanford CORENLP Model, APIs, Elasticsearch; Machine Learning; Problem Statement

Abstract Companies today have large numbers of clients following them on social media, so they need to make use of the feedback these users give. This can be achieved with software that collects the data and analyzes it, in order to learn how these clients feel about a particular thing and to make predictions.

Introduction “Insight” is all about bringing in data from social media and the NYT website and analyzing it. Analyzing data from social media has many benefits for companies: it helps cut through vast amounts of data to understand audience perception and, therefore, to determine the most strategic response.

Introduction Social media sentiment analysis can be an excellent source of information and can provide insights that can: Determine marketing strategy. Improve campaign success. Improve customer service. It is based on natural language processing.

Natural Language Processing (NLP) What is Natural Language Processing? The application of computational techniques to the analysis of natural language and speech. The use of computers to process written and spoken language for practical and useful purposes.

Natural Language Processing (NLP) Applications of NLP: Question answering. Information extraction. Sentiment analysis. Machine translation. We used the Stanford CORENLP Model for sentiment analysis.

Sentiment Analysis The process of computationally identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer's attitude towards a particular topic, product, etc. is positive, negative, or neutral.

Sentiment Analysis Why sentiment analysis? Movies: is this review positive or negative? Products: what do people think about the new iPhone? Politics: what do people think about this candidate or issue? Prediction: predict election outcomes or market trends from sentiment.

Scope of “Insight” Companies. Normal users.

Scope of “Insight” Companies: make use of customers’ feedback on social media. Examples: Commercial brands: following clients’ sentiment about a product allows it to be improved immediately. Digital marketing agencies: impress your clients with advanced analytical reports on the marketing campaigns you ran and how effective they were on social media, showing your work results to the customer.

Scope of “Insight” Normal users: It is also aimed at people who want to explore the general impression or sentiment about a certain topic. It is also useful for people who want a direct recommendation on whether a certain movie is worth watching.

“Insight” Objectives The main idea of our project is to develop an application that provides a service for companies that need to improve campaign success and customer care. The decisions with the best impact are those that rest on accurate and precise data, not on intuition and guessing.

“Insight” Objectives Ease the task of reading big data. Get the most use out of the data you have. Enhance the utilization of the huge amount of available data by analyzing it and extracting information, along with the general sentiment this information gives about a certain topic.

Methodology Learn about analysis (sentiment analysis), indexing, and information retrieval. Use social media APIs.

Tools LingPipe's language classification framework; Stanford CORENLP Model; APIs: Twitter API, New York Times API; Elasticsearch

LingPipe's language classification framework LingPipe is a toolkit for processing text using computational linguistics. LingPipe is used for tasks like: Finding the names of people, organizations or locations in news. Automatically classifying Twitter search results into categories. Suggesting correct spellings of queries. Classifying opinions in text into categories like "positive" or "negative", based on logistic regression, which is a discriminative probabilistic classification model.

LingPipe's logistic regression model Logistic regression is one of the best probabilistic classifiers. It is sometimes also described as a neural network, because binary logistic regression is equivalent to a one-layer, single-output neural network with a logistic activation function.
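The equivalence stated on this slide can be sketched in a few lines: a single "neuron" computes a weighted sum of the input features and passes it through the logistic (sigmoid) function, and the result is read as the probability of the positive class. This is only an illustration and not LingPipe's API; the function names, the two features, and the weights are made up for the example.

```python
import math

def sigmoid(z):
    """Logistic activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_positive(features, weights, bias):
    """One-layer, single-output network = binary logistic regression."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical features: [count of positive words, count of negative words]
weights = [2.0, -2.0]  # illustrative values, not trained
bias = 0.0

p_positive_text = predict_positive([1, 0], weights, bias)
p_neutral_text = predict_positive([0, 0], weights, bias)
print(p_positive_text, p_neutral_text)
```

In a trained model the weights and bias would be learned from a labeled corpus rather than set by hand.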

LingPipe's language classification framework LingPipe's architecture is designed to be efficient, scalable, reusable, and robust. Features: Highly configurable. Can be trained in several languages.

LingPipe's language classification framework An essential part of creating a sentiment analysis algorithm is having a comprehensive dataset, or corpus, to learn from, so that the accuracy of the algorithm meets the expected standards. We used a corpus of tweets already classified by sentiment. The corpus sources are: University of Michigan; Sanders Analytics LLC. The dataset contains 1,578,627 classified tweets.

Stanford CORENLP Model Most sentiment prediction systems work just by looking at words in isolation, giving positive points for positive words and negative points for negative words and then summing up these points. That way, the order of words is ignored and important information is lost. In contrast, Stanford's new deep learning model actually builds up a representation of whole sentences based on the sentence structure.

Stanford CORENLP Model It computes the sentiment based on the structure of the sentence. This way, the model is not as easily fooled as other models. For example, the Stanford model learned that funny and witty are positive, but the following sentence is still negative overall: “This movie was actually neither that funny, nor super witty.” The underlying technology of this model is based on a new type of Recursive Neural Network that builds on top of grammatical structures.
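To see why word-by-word scoring gets this sentence wrong, here is a minimal sketch of the "sum up the points" approach described above. It is not the Stanford model or any real library; the tiny word lists and the function name are assumptions made for the illustration.

```python
# Tiny hand-made lexicons, for illustration only.
POSITIVE = {"funny", "witty"}
NEGATIVE = {"boring", "dull"}

def bag_of_words_score(sentence):
    """+1 per positive word, -1 per negative word; word order is ignored."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

sentence = "This movie was actually neither that funny, nor super witty."
score = bag_of_words_score(sentence)
print(score)  # positive, although the sentence is negative overall
```

The scorer counts "funny" and "witty" as positive and has no way to notice that "neither ... nor" negates both, which is the kind of information a structure-based model can use.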

APIs What Is an API? API stands for application programming interface. An API is essentially a way for programmers to communicate with a certain application.

Twitter API Provides programmatic access to read and write Twitter data. Responses are in JSON format.
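Because responses arrive as JSON, reading a tweet comes down to parsing the text and picking out fields. The sketch below parses a hand-written sample instead of calling the API; the sample values are invented, and the field names (`id`, `text`, `user.screen_name`) are assumed to follow the classic status object layout, so check them against the API version you use.

```python
import json

# Invented sample payload, standing in for one status in an API response.
sample_response = """
{
  "id": 1,
  "text": "gr8 job john",
  "user": {"screen_name": "example_user"}
}
"""

tweet = json.loads(sample_response)      # JSON text -> Python dict
text = tweet["text"]                     # the part sent to sentiment analysis
author = tweet["user"]["screen_name"]
print(author, text)
```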

New York Times API The Article Search API is a way to find any article. You can search New York Times articles from 1851 up to today.
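A request to the Article Search API is an HTTP GET with the search terms in the query string. The sketch below only builds the URL and sends nothing; the endpoint and the `q`, `begin_date` and `api-key` parameter names are given as we understand the public v2 API and should be verified against the current documentation. `build_search_url` and `YOUR_API_KEY` are placeholders introduced for the example.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Article Search API (v2); verify before use.
BASE_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"

def build_search_url(query, api_key, begin_date=None):
    """Return the GET URL for a search; dates use the YYYYMMDD form."""
    params = {"q": query, "api-key": api_key}
    if begin_date:
        params["begin_date"] = begin_date
    return BASE_URL + "?" + urlencode(params)

url = build_search_url("elections", "YOUR_API_KEY", begin_date="18510101")
print(url)
```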

Elasticsearch We used Elasticsearch for storing data and for information retrieval processes. Elasticsearch uses a structure called an inverted index, which is designed to allow very fast full-text searches. An inverted index consists of a list of all the unique words that appear in any document and, for each word, a list of the documents in which it appears. Structured and unstructured data.

Elasticsearch (Inverted index) DOC1: Good morning. DOC2: Good job. Dictionary and postings lists: Good: DOC1, DOC2; morning: DOC1; job: DOC2.
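The two-document example on this slide can be reproduced with a small in-memory inverted index. This is a sketch of the data structure only, not of how Elasticsearch implements it; the function name and the simple lowercase-and-split tokenization are assumptions made for the example.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each unique term to the sorted list of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Simplistic tokenization: lowercase, drop periods, split on spaces.
        for token in text.lower().replace(".", " ").split():
            index[token].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

docs = {"DOC1": "Good morning.", "DOC2": "Good job."}
index = build_inverted_index(docs)

for term, postings in index.items():
    print(term, postings)
```

Looking up a word is then a single dictionary access that returns its postings list, which is why full-text search over such an index is fast.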

Elasticsearch Why use Elasticsearch? Speed. Support for unstructured data. Scalability.

Machine learning Using the training data set, the system is trained with the following machine learning algorithm: Recursive Neural Networks.

Problem Statement NLP is hard. The main NLP challenges are: 1. Ambiguity is pervasive: “Fed raises interest rates”. Which is the verb, “raises” or “interest”? 2. Segmentation issues: “The New York-New Haven Railroad” can be segmented as (a) [The] [New] [York-New] [Haven] [Railroad] or (b) [The] [New York]-[New Haven] [Railroad].
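The segmentation problem shows up as soon as the text is tokenized. In the sketch below, splitting on whitespace yields the first (wrong) segmentation, while a tokenizer that is given a list of known place names yields the second. The place list and the function name are hypothetical; real systems do not rely on a hand-written list like this.

```python
import re

text = "The New York-New Haven Railroad"

# Naive approach: split on whitespace only.
naive_tokens = text.split()

# Hypothetical list of multi-word names the tokenizer should keep together.
KNOWN_PLACES = ["New York", "New Haven"]

def place_aware_tokens(text, places):
    """Try known place names first, then fall back to single words."""
    pattern = "|".join(re.escape(p) for p in places) + r"|\w+"
    return re.findall(pattern, text)

aware_tokens = place_aware_tokens(text, KNOWN_PLACES)
print(naive_tokens)
print(aware_tokens)
```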

Problem Statement 3. Non-standard English: Twitter status: “gr8 job john, youve a good jop in contest #acm2016 :)” 4. Neologisms: “Unfriend”, “Retweet”. 5. Idioms: “days are flying”, “get cold feet”.

Problem Statement NLP was a totally new topic for us; we had to learn about NLP and IR from scratch. Existing social media sentiment analysis algorithms are known to be not very reliable and sometimes inaccurate, so we did our best to use the most reliable algorithms. Another big challenge is that we are dealing with a huge amount of data, which means a lot of processing and heavy use of resources.
