Published by Βαράκ Ζερβός. Modified over 5 years ago.
1
Analysing Public Perceptions of International Events by using Geo-located Twitter Data.
2
Big Data
The term Big Data first appeared towards the end of the 1990s and has become a buzzword in the last few years.
3
Storing and analysing Big Data
Traditional methods of storing and analysing data are unable to cope with the volume of data generated by a social media platform like Twitter. Big Data technologies excel at processing massive volumes of data at near real-time speed and can store both structured and unstructured data seamlessly, while running on commodity hardware.
4
Infrastructure
8 local machines:
6 Data Nodes.
2 Master Nodes (Name Node and Secondary Name Node).
5
Big Data Environment
6
Data Processing
Twitter data was collected as JSON, and a SerDe (Serialiser/Deserialiser) was used to structure it. Hive implements schema on read: the schema is applied when the data is queried, not when it is loaded.
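A minimal Python sketch of the schema-on-read idea, not the Hive SerDe used in the project: raw JSON lines are stored untouched and a schema is projected onto them only when they are read. The `SCHEMA` mapping, the `read_with_schema` helper and the sample records are hypothetical; the field names (`user.screen_name`, `user.time_zone`) are assumed to follow the classic Twitter JSON layout.

```python
import json

# Hypothetical schema: output column -> path into the raw tweet JSON.
SCHEMA = {
    "id": ("id",),
    "text": ("text",),
    "user": ("user", "screen_name"),
    "time_zone": ("user", "time_zone"),
}

def read_with_schema(raw_lines, schema=SCHEMA):
    """Parse raw JSON lines and project them onto the schema at read time."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for column, path in schema.items():
            value = record
            for key in path:
                # Missing or non-object fields become None instead of failing.
                value = value.get(key) if isinstance(value, dict) else None
            row[column] = value
        rows.append(row)
    return rows

# Illustrative raw records (invented for this sketch).
raw = [
    '{"id": 1, "text": "hello", "user": {"screen_name": "a", "time_zone": "Pretoria"}}',
    '{"id": 2, "text": "no user"}',
]
rows = read_with_schema(raw)
```

Because the raw lines are never rewritten, a different schema can be applied to the same stored data later without reloading it.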
7
Analysis of geo-location map
Twitter metadata contains time-zone data, location data and geo-coordinate data. A Global Map Table, containing each country, the related date time-stamp and regional information, was used to look up the data.
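The lookup can be pictured with a small Python sketch. It assumes the table is keyed by the tweet's time-zone label, which the slide does not state; the `GLOBAL_MAP` entries and the `locate` helper are hypothetical stand-ins for the real Global Map Table.

```python
# Hypothetical excerpt of a global map table, keyed by time-zone label.
GLOBAL_MAP = {
    "Pretoria": {"country": "South Africa", "region": "Africa"},
    "London": {"country": "United Kingdom", "region": "Europe"},
    "Brasilia": {"country": "Brazil", "region": "South America"},
}

UNKNOWN = {"country": "Unknown", "region": "Unknown"}

def locate(tweet, table=GLOBAL_MAP):
    """Return country and region for a tweet, or UNKNOWN if no match."""
    return table.get(tweet.get("time_zone"), UNKNOWN)

located = locate({"text": "Watching the trial", "time_zone": "Pretoria"})
missing = locate({"text": "No metadata"})
```

Tweets without usable metadata fall back to an explicit "Unknown" entry so they can be counted separately rather than silently dropped.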
8
Sentiment analysis
Sentiment analysis is the process of opinion mining by identifying and extracting subjective information from text. It is performed by “exploding”, or separating, each tweet into multiple sentences and then breaking each sentence down into individual words.
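A rough Python sketch of the "explode" step, assuming sentences end at `.`, `!` or `?` and words are runs of letters and apostrophes. The `explode` helper and both patterns are illustrative choices, not the project's actual Hive queries.

```python
import re

def explode(tweet):
    """Split a tweet into sentences, then each sentence into lowercase words."""
    # Break after sentence-ending punctuation that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", tweet) if s.strip()]
    # Keep only alphabetic tokens; punctuation and the '#' of hashtags are dropped.
    return [re.findall(r"[a-z']+", s.lower()) for s in sentences]

exploded = explode("Great match. Poor referee!")
```

Each inner list holds the words of one sentence, ready to be matched against a sentiment lexicon.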
9
Using a data dictionary
The Data Dictionary used is a sentiment lexicon containing 6800 English words, each with its part of speech (e.g. noun, verb, adjective) and a sentiment (positive, negative or neutral). Each word was given a polarity: positive = +1, negative = -1, neutral = 0.
10
An example of classification
‘Justice being served or not, being a victim carries a life sentence.’ #JubJub #DewaniTrial #Oscar
Justice (+1) being (0) served (+1) or (0) not (-1), being (0) a (0) victim (-1) carries (-1) a (0) life (-1) sentence (-1). #JubJub #DewaniTrial #Oscar
Total = -3: the tweet embeds a negative sentiment.
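The classification above can be sketched in Python as a lexicon lookup followed by a sum. The `LEXICON` below holds only the non-zero polarities listed in the example, not the full 6800-word dictionary; the `score` and `label` helpers are hypothetical names, and treating a total of zero as neutral is an assumption.

```python
# Polarities taken from the worked example; unlisted words count as neutral (0).
LEXICON = {
    "justice": 1, "served": 1,
    "not": -1, "victim": -1, "carries": -1, "life": -1, "sentence": -1,
}

def score(words, lexicon=LEXICON):
    """Sum the polarity of every word; unknown words contribute 0."""
    return sum(lexicon.get(word, 0) for word in words)

def label(total):
    """Map a summed polarity to a sentiment label (0 is assumed neutral)."""
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"

words = "justice being served or not being a victim carries a life sentence".split()
total = score(words)
sentiment = label(total)
```

A plain sum treats every word independently, so a negation such as "not" simply adds -1 rather than flipping the polarity of the word that follows it.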
11
Data analytics focus areas
Determine influential users. Determine influential tweets. Identify topics. Identify themes.
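The slides do not say how influence or topics were measured. As one possible approach, the Python sketch below assumes topics can be approximated by hashtag frequency and influential users by how often they are mentioned or retweeted; `top_hashtags`, `top_mentioned` and the sample tweets are hypothetical.

```python
import re
from collections import Counter

def top_hashtags(tweets, n=20):
    """Rank hashtags (case-insensitive) by how often they appear."""
    tags = Counter(tag.lower() for t in tweets for tag in re.findall(r"#\w+", t))
    return tags.most_common(n)

def top_mentioned(tweets, n=20):
    """Rank users by the number of @-mentions, which includes retweets."""
    users = Counter(u.lower() for t in tweets for u in re.findall(r"@\w+", t))
    return users.most_common(n)

# Invented sample tweets for illustration only.
tweets = [
    "#Oscar trial resumes #OscarTrial",
    "@a says the verdict is due #oscar",
    "RT @a: court adjourned, says @b",
]
hashtags = top_hashtags(tweets, 1)
mentioned = top_mentioned(tweets, 1)
```

Passing `n=20` would yield a Top 20 ranking of the kind shown on the later slides.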
12
Data Collection
13
Top Influential Users - Oscar Pistorius case
14
FIFA World Cup - Top 20
15
FIFA World Cup – Top Topics
16
Oscar Trial – Top 20
17
Oscar Trial - Top Themes