Published by Kristopher O’Connor. Modified over 9 years ago.
1
GSAT (General Sentiment Analysis Tool) Design Review By Asaf Bruner
2
Problem Description
3
Big Data & Sentiment Analysis
O Let’s start with a short video: http://www.youtube.com/watch?v=ij5yC-moPCM
O Textual information is either facts or opinions.
O Until recently, very little research had been done on the processing of opinions. Yet opinions are so important that whenever we need to make a decision, we want to hear others’ opinions.
4
The specific problem I will be dealing with
O Currently there is no unified solution to the problem discussed above.
O I will design and build a system which does the following:
  O Automatically collects the talkbacks from websites
  O Analyzes the data using NLP tools
  O Draws conclusions from the gathered information
  O Displays them in an easy-to-understand way
  O Answers some very interesting and important questions.
5
Where else can we use GSAT?
O Individuals making purchasing decisions.
O Organizations can use this tool to replace opinion polls, surveys, and focus groups.
O Trend analysis.
6
General scheme of the proposed solution
7
The Data
O I am using an open source, Java-based web crawler, crawler4j (hosted on Google Code), to collect my data.
O Using regular expressions and DOM analysis, I extract the main text and talkbacks from each article while removing advertisements and unrelated text.
O The list of sites I am crawling is defined in advance.
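The regular-expression part of the extraction step could look roughly like the sketch below. It is only an illustration using the JDK: the `talkback` and `ad` class names are hypothetical markup, not the real structure of any crawled site, and `TalkbackExtractor` is an invented name rather than part of GSAT. Regular expressions do not handle nested `div` elements, which is presumably why the slide pairs them with DOM analysis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TalkbackExtractor {
    // Hypothetical markup: each talkback is assumed to sit in <div class="talkback">...</div>.
    private static final Pattern TALKBACK =
            Pattern.compile("<div class=\"talkback\">(.*?)</div>", Pattern.DOTALL);
    // Hypothetical markup: advertisements are assumed to sit in <div class="ad">...</div>.
    private static final Pattern AD_BLOCK =
            Pattern.compile("<div class=\"ad\">.*?</div>", Pattern.DOTALL);
    // Matches any remaining HTML tag so that only plain text is kept.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    /** Returns the plain text of every talkback block, with ad blocks removed first. */
    public static List<String> extractTalkbacks(String html) {
        String withoutAds = AD_BLOCK.matcher(html).replaceAll("");
        List<String> talkbacks = new ArrayList<>();
        Matcher m = TALKBACK.matcher(withoutAds);
        while (m.find()) {
            String text = TAG.matcher(m.group(1)).replaceAll("").trim();
            if (!text.isEmpty()) {
                talkbacks.add(text);
            }
        }
        return talkbacks;
    }

    public static void main(String[] args) {
        String html = "<div class=\"ad\">Buy now!</div>"
                + "<div class=\"talkback\">Great <b>article</b></div>"
                + "<div class=\"talkback\"> I disagree </div>";
        System.out.println(extractTalkbacks(html));
    }
}
```

In a real crawl, the HTML string would come from the pages fetched by the crawler, and the patterns would need to be adapted per site.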
8
The algorithm – Design review
O Integrate with a crawler and extract articles and their talkbacks
O Integrate with an NLP tool and analyze the articles and talkbacks for their entities and emotions
O Build a database to hold this information
O Design and build an algorithm to answer the above-mentioned questions
O Build an easy-to-interpret GUI to display the data and conclusions
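As one possible reading of the "algorithm" step, the sketch below averages sentiment scores per entity. It is an assumption-laden illustration, not GSAT's actual algorithm: `SentimentAggregator` and `Mention` are hypothetical names, and the score is assumed to be a number where negative means negative sentiment and positive means positive sentiment. In the real system the mentions would come from the NLP step and the database.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SentimentAggregator {
    /** Hypothetical record of one entity found in one talkback, with its sentiment score. */
    public static final class Mention {
        final String entity;
        final double score; // assumed convention: negative = negative sentiment

        public Mention(String entity, double score) {
            this.entity = entity;
            this.score = score;
        }
    }

    /** Computes the mean sentiment score for each entity, in first-seen order. */
    public static Map<String, Double> averageByEntity(List<Mention> mentions) {
        Map<String, double[]> sums = new LinkedHashMap<>(); // value = {sum, count}
        for (Mention m : mentions) {
            double[] acc = sums.computeIfAbsent(m.entity, k -> new double[2]);
            acc[0] += m.score;
            acc[1] += 1;
        }
        Map<String, Double> averages = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> e : sums.entrySet()) {
            averages.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    public static void main(String[] args) {
        List<Mention> mentions = new ArrayList<>();
        mentions.add(new Mention("Entity A", 0.5));
        mentions.add(new Mention("Entity A", -0.5));
        mentions.add(new Mention("Entity B", 1.0));
        System.out.println(averageByEntity(mentions));
    }
}
```

A per-entity average like this is the kind of figure the GUI step could then chart.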
9
The algorithm – Design review
10
The Tools and infrastructure I am using
O The program is written in Java (Eclipse IDE).
O Crawling using crawler4j.
O NLP & sentiment analysis using AlchemyAPI.
O Database using MySQL.
O GUI using the Google Visualization API.
11
Expected deliverables
12
What is actually going to be delivered and how it can be used
O I am going to present a specific use case: analyzing ynetnews.com and haaretz.com for political entities and the sentiment information relating to them.
O Other than that, this will be a fully functional program, meaning only slight changes will have to be made to generalize this use case.
13
Potential intellectual property that could come out of the project O Integration between several tools O Algorithm
14
Competing solutions
15
Well… O Currently no free open source tool is available that does what GSAT offers!
16
Other ways the problem can be solved
O Currently there are 30 US-based companies that offer paid sentiment analysis. None of them offers the combination of data mining and text analysis for free.
17
Characterization of the users
18
Initial group of users and the most general group
O Everyone who wants to know what is being written and thought about entities in which they have an interest.
O Everyone who has an interest in analyzing trends.
19
How do you think one could make money out of your product?
O Advertising market (campaign evaluation).
O Product comparison (retail companies).
O Trend analysis.
O And many more…