Insight Ahmad Jabi | Yazan Shakhshir | Saleem Abu Dhair


1 Insight Ahmad Jabi | Yazan Shakhshir | Saleem Abu Dhair
Supervised by Dr. Nizar Awartani

2 Contents Abstract, Introduction, What is NLP, Sentiment Analysis, Scope of “Insight”, “Insight” Objectives, Methodology, Tools, Stanford CoreNLP Model, APIs, Elasticsearch, Machine Learning, Problem Statement

3 Abstract Companies today have large numbers of clients following them on social media, so they need to make use of the feedback these users give. This can be achieved with software that collects the data and analyzes it, in order to learn how clients feel about a particular subject and to make predictions.

4 Introduction “Insight” is about collecting data from social media and the New York Times (NYT) website and analyzing it. Analyzing social media data has many benefits for companies: it helps cut through vast amounts of data to understand audience perception and, therefore, to determine the most strategic response.

5 Introduction Social media sentiment analysis can be an excellent source of information and can provide insights that help to: Determine marketing strategy. Improve campaign success. Improve customer service. It is based on natural language processing.

6 Natural Language Processing (NLP)
What is Natural Language Processing? The application of computational techniques to the analysis of natural language and speech. The use of computers to process written and spoken language for practical and useful purposes and applications.

7 Natural Language Processing (NLP)
Applications of NLP: Question answering. Information extraction. Sentiment analysis. Machine translation. We used the Stanford CoreNLP model for sentiment analysis.

8 Sentiment Analysis The process of computationally identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer's attitude towards a particular topic, product, etc. is positive, negative, or neutral.

9 Sentiment Analysis Why sentiment analysis?
Movies: is this review positive or negative? Products: what do people think about the new iPhone? Politics: what do people think about this candidate or issue? Prediction: predict election outcomes or market trends from sentiment.

10 Scope of “Insight” Companies. Normal users.

11 Scope of “Insight” Companies: make use of customers' feedback on social media. Examples: Commercial brands: following clients' sentiment about a product allows it to be improved promptly. Digital marketing agencies: impress your clients with advanced analytical reports on the marketing campaigns you ran and how effective they were on social media, showing your work results to the customer.

12 Scope of “Insight” Normal users:
It is also aimed at people who want to explore the general impression or sentiment about a certain topic. It is also useful for people who want a direct recommendation on whether or not to watch a certain movie.

13 “Insight” Objectives The main idea of our project is to develop an application that provides a service for companies that need to improve campaign success and customer care. The decisions with the best impact are those based on accurate and precise data, not on intuition and guessing.

14 “Insight” Objectives Ease the task of reading big data.
Get the most value out of the data you have. Enhance the utilization of the huge amount of available data by analyzing it and extracting information and the general sentiment it conveys about a certain topic.

15 Methodology Learn about sentiment analysis, indexing, and information retrieval.
Use social media APIs.

16 Tools LingPipe's language classification framework.
Stanford CoreNLP model. APIs: Twitter API, New York Times API. Elasticsearch.

17 LingPipe's language classification framework
LingPipe is a toolkit for processing text using computational linguistics. LingPipe is used for tasks such as: Finding the names of people, organizations, or locations in news. Automatically classifying Twitter search results into categories. Suggesting correct spellings of queries. Classifying opinions in text into categories like "positive" or "negative". The classification is based on logistic regression, a discriminative probabilistic classification model.

18 LingPipe's Logistic regression model
Logistic regression is a widely used probabilistic classifier. It is closely related to neural networks: binary logistic regression is equivalent to a one-layer, single-output neural network with a sigmoid (logistic) activation.
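To make the equivalence concrete, here is a minimal sketch of binary logistic regression written as a single sigmoid unit. It is not LingPipe's implementation; the toy features (counts of positive and negative words), the learning rate, and the function names are assumptions chosen for illustration.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """One layer, one output: a weighted sum followed by a sigmoid."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def train(samples, labels, lr=0.5, epochs=200):
    """Fit the weights by stochastic gradient descent on the log-loss."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = predict_proba(weights, bias, x) - y  # gradient of log-loss w.r.t. z
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

# Hypothetical toy data: [positive-word count, negative-word count] -> 1 = positive
X = [[2, 0], [1, 0], [0, 1], [0, 2]]
y = [1, 1, 0, 0]
w, b = train(X, y)
print(predict_proba(w, b, [1, 0]) > 0.5)
```

The output of `predict_proba` is the estimated probability of the "positive" class, so thresholding it at 0.5 turns the network into a two-way classifier.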

19 LingPipe's language classification framework
LingPipe's architecture is designed to be efficient, scalable, reusable, and robust. Features: Highly configurable. Can be trained for several languages.

20 LingPipe's language classification framework
An essential part of creating a sentiment analysis algorithm is having a comprehensive dataset, or corpus, to learn from, to ensure that the accuracy of the algorithm meets the expected standards. We used a corpus of tweets already classified by sentiment. The corpus sources are: University of Michigan. Sanders Analytics LLC. The dataset contains 1,578,627 classified tweets.

21 Stanford CoreNLP Model
Most sentiment prediction systems work just by looking at words in isolation, giving positive points for positive words and negative points for negative words and then summing up these points. That way, the order of words is ignored and important information is lost. In contrast, Stanford's new deep learning model actually builds up a representation of whole sentences based on the sentence structure.

22 Stanford CoreNLP Model
It computes the sentiment based on the sentence as a whole. This way, the model is not as easily fooled as other models. For example, the Stanford model learned that "funny" and "witty" are positive, but the following sentence is still negative overall: “This movie was actually neither that funny, nor super witty.” The underlying technology of this model is based on a new type of Recursive Neural Network that builds on top of grammatical structures.
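The word-counting approach that this model improves on can be sketched in a few lines. The lexicon below is a tiny hand-made assumption, not a real resource; the point is only to show how summing per-word scores misjudges the example sentence.

```python
import re

# Tiny hand-made lexicon for illustration only (an assumption, not a real resource).
LEXICON = {"funny": 1, "witty": 1, "great": 1, "boring": -1, "bad": -1}

def bag_of_words_score(sentence: str) -> int:
    """Sum per-word scores, ignoring word order and negation."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum(LEXICON.get(token, 0) for token in tokens)

sentence = "This movie was actually neither that funny, nor super witty."
print(bag_of_words_score(sentence))  # prints 2
```

The score is positive because "funny" and "witty" each add a point, while "neither" and "nor" contribute nothing, so the negation is lost.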

23 APIs What Is an API? API stands for application programming interface.
An API is essentially a way for programmers to communicate with a certain application.

24 Twitter API Provides programmatic access to read and write Twitter data. Responses are in JSON format.
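As a rough illustration of working with JSON responses, the sketch below parses a hand-written payload shaped like a list of tweet objects. The sample data, the field selection, and the helper name `extract_texts` are assumptions for illustration; it does not call the Twitter API.

```python
import json

# Hand-written sample shaped like a list of tweet objects; not a real API response.
raw_response = """
[
  {"id_str": "1", "text": "gr8 job john", "user": {"screen_name": "example_user"}},
  {"id_str": "2", "text": "Good morning", "user": {"screen_name": "another_user"}}
]
"""

def extract_texts(payload: str) -> list:
    """Return (screen_name, text) pairs from a JSON list of tweet-like objects."""
    tweets = json.loads(payload)
    return [(tweet["user"]["screen_name"], tweet["text"]) for tweet in tweets]

pairs = extract_texts(raw_response)
for screen_name, text in pairs:
    print(screen_name, "->", text)
```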

25 New York Times API The Article Search API is a way to find any article. You can search New York Times articles from 1851 up to today.
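A request to the Article Search API is an HTTP GET with query parameters. The sketch below only builds the URL and does not send it; the endpoint and parameter names reflect our understanding of version 2 of the API and should be checked against the current NYT developer documentation, and `YOUR_API_KEY` is a placeholder.

```python
from urllib.parse import urlencode

# Assumed v2 endpoint; verify against the NYT developer documentation.
BASE_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"

def build_article_search_url(query, api_key, begin_date=None):
    """Build an Article Search request URL without sending it."""
    params = {"q": query, "api-key": api_key}
    if begin_date:
        params["begin_date"] = begin_date  # assumed format: YYYYMMDD
    return BASE_URL + "?" + urlencode(params)

url = build_article_search_url("election", "YOUR_API_KEY", begin_date="20160101")
print(url)
```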

26 Elasticsearch We used Elasticsearch for storing data and for information retrieval. Elasticsearch uses a structure called an inverted index, which is designed to allow very fast full-text searches. An inverted index consists of a list of all the unique words that appear in any document and, for each word, a list of the documents in which it appears. It handles both structured and unstructured data.

27 Elasticsearch (Inverted index)
DOC1: Good morning. DOC2: Good job.
Dictionary -> Postings lists
Good -> DOC1, DOC2
morning -> DOC1
job -> DOC2
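The two-document example on this slide can be reproduced with a small in-memory inverted index. This is a simplified sketch of the data structure, not how Elasticsearch implements it internally; lowercasing the terms is our own assumption.

```python
import re
from collections import defaultdict

def build_inverted_index(docs):
    """Map each unique lowercase term to the list of document IDs containing it."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for term in sorted(set(re.findall(r"\w+", text.lower()))):
            index[term].append(doc_id)
    return dict(index)

def search(index, term):
    """Look up a single term; unknown terms return an empty postings list."""
    return index.get(term.lower(), [])

docs = {"DOC1": "Good morning.", "DOC2": "Good job."}
index = build_inverted_index(docs)
print(index)
```

A lookup such as `search(index, "Good")` reads one postings list instead of scanning every document, which is why full-text queries stay fast as the collection grows.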

28 Elasticsearch Why use Elasticsearch? Speed. Support for unstructured data.
Scalability.

29 Machine learning Using the training dataset, the system is trained with the following machine learning algorithm: Recursive Neural Networks.
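The core idea of a recursive neural network is to compute a vector for each phrase from the vectors of its two children, following the parse tree. The sketch below shows that composition step with untrained, made-up weights and 2-dimensional embeddings; all values and names are assumptions, and the published Stanford model uses a richer tensor-based composition function.

```python
import math

def compose(left, right, W, b):
    """Parent vector = tanh(W * [left; right] + b)."""
    children = left + right  # concatenate the two child vectors
    return [math.tanh(sum(w * c for w, c in zip(row, children)) + bias)
            for row, bias in zip(W, b)]

def encode(tree, embeddings, W, b):
    """Recursively encode a binary parse tree given as nested tuples of words."""
    if isinstance(tree, str):
        return embeddings[tree]
    left, right = tree
    return compose(encode(left, embeddings, W, b),
                   encode(right, embeddings, W, b), W, b)

# Hypothetical embeddings and untrained weights, for illustration only.
embeddings = {"not": [-1.0, 0.0], "very": [0.0, 0.5], "funny": [1.0, 0.5]}
W = [[0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5]]
b = [0.0, 0.0]

tree = ("not", ("very", "funny"))  # "very funny" is composed first, then negated
vector = encode(tree, embeddings, W, b)
print(vector)
```

In a trained model, a classifier applied to each node's vector predicts the sentiment of that phrase, so the sentence-level label depends on structure and not only on which words appear.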

30 Problem Statement NLP is hard. The main NLP challenges are:
1.    Ambiguity is pervasive: “Fed raises interest rates”. Which word is the verb: “raises” or “interest”? 2.    Segmentation issues: “The New York-New Haven Railroad” can be segmented as (a) [The] [New] [York-New] [Haven] [Railroad] or (b) [The] [New York]-[New Haven] [Railroad].
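The segmentation problem is easy to reproduce. In the sketch below, splitting on whitespace yields the first (wrong) segmentation, while a lookup against a small list of known place names recovers the second. The `GAZETTEER` contents and both function names are hypothetical.

```python
import re

# Hypothetical gazetteer of two-word place names.
GAZETTEER = {("new", "york"), ("new", "haven")}

def naive_tokens(text):
    """Split on whitespace only, which keeps 'York-New' glued together."""
    return text.split()

def gazetteer_tokens(text):
    """Merge adjacent words that form a known two-word place name."""
    words = re.findall(r"\w+|-", text)
    tokens, i = [], 0
    while i < len(words):
        pair = tuple(word.lower() for word in words[i:i + 2])
        if pair in GAZETTEER:
            tokens.append(" ".join(words[i:i + 2]))
            i += 2
        else:
            tokens.append(words[i])
            i += 1
    return tokens

name = "The New York-New Haven Railroad"
print(naive_tokens(name))
print(gazetteer_tokens(name))
```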

31 Problem Statement
3.    Non-standard English: Twitter status: “gr8 job john, youve a good jop in contest #acm2016 :)” 4.    Neologisms: “Unfriend”, “Retweet”. 5.    Idioms: “days are flying”, “get cold feet”.

32 Problem Statement NLP was a totally new topic for us.
We had to learn NLP and IR from scratch. Existing social media sentiment analysis algorithms are known to be not very reliable and sometimes inaccurate, so we tried our best to use the most reliable algorithms. Another big challenge is that we are dealing with a huge amount of data, which means a lot of processing and heavy resource use.

33 Insight Ahmad Jabi | Yazan Shakhshir | Saleem Abu Dhair
Supervised by Dr. Nizar Awartani

