Presentation transcript:

Analysing Public Perceptions of International Events by using Geo-located Twitter Data.

Big Data
The term Big Data first appeared towards the end of the 1990s and has become a buzzword in the last few years.

Storing and analysing Big Data
Traditional methods of storing and analysing data are unable to cope with the volume of data generated by a social media platform like Twitter. Big Data platforms excel at processing massive volumes of data at near real-time speed and can store both structured and unstructured data seamlessly, all while running on commodity hardware.

Infrastructure
8 local machines: 6 Data Nodes and 2 Master Nodes (Name Node and Secondary Name Node).

Big Data Environment

Data Processing
Twitter data is collected as JSON, and a SerDe (Serialiser/Deserialiser) was used to structure the data. Hive implements schema on read: the schema is applied when the data is queried, not when it is loaded.
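
As a rough illustration of schema on read outside Hive, the Python sketch below keeps the raw JSON untouched and imposes a structure only when a record is read. The sample tweet and the field names (`user.screen_name`, `user.time_zone`, `coordinates`) are assumptions chosen for illustration, not the fields the project actually used.

```python
import json

# Hypothetical raw tweet as one JSON line (field names are assumptions).
raw_line = (
    '{"id": 1, "text": "Hello world", '
    '"user": {"screen_name": "alice", "time_zone": "Pretoria"}, '
    '"coordinates": null}'
)

def read_tweet(line):
    """Apply the schema at read time: pick out only the fields we need."""
    tweet = json.loads(line)
    user = tweet.get("user") or {}
    return {
        "id": tweet.get("id"),
        "text": tweet.get("text"),
        "screen_name": user.get("screen_name"),
        "time_zone": user.get("time_zone"),
        "coordinates": tweet.get("coordinates"),
    }

row = read_tweet(raw_line)
print(row)
```

Because the raw line is stored as-is, a different set of fields can be extracted later without reloading the data.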

Analysis of geo-location map
The metadata of Twitter data contains time-zone data, location data and geo-coordinate data. A Global Map Table, containing each country, the related date time-stamp and regional information, was used to look up the data.
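
The lookup step might look something like the sketch below. The table contents, its key (a time-zone name) and the `locate` helper are all hypothetical; the real Global Map Table is not shown in the slides.

```python
# Hypothetical excerpt of a Global Map Table, keyed by time-zone name.
GLOBAL_MAP = {
    "Pretoria": {"country": "South Africa", "region": "Africa"},
    "London": {"country": "United Kingdom", "region": "Europe"},
}

UNKNOWN = {"country": "Unknown", "region": "Unknown"}

def locate(tweet_row):
    """Map a tweet's time-zone metadata to a country and region."""
    return GLOBAL_MAP.get(tweet_row.get("time_zone"), UNKNOWN)

print(locate({"time_zone": "Pretoria"}))
print(locate({"time_zone": None}))
```

Tweets whose metadata cannot be matched fall back to an "Unknown" entry rather than being dropped, which is a design choice of this sketch.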

Sentiment analysis
Sentiment analysis is the process of opinion mining: identifying and extracting subjective information from text. It is performed by “exploding”, or separating, each tweet into multiple sentences, and then breaking each sentence down into individual words.
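
A minimal sketch of the "explode" step in Python, assuming sentences end in `.`, `!` or `?` and that words are runs of letters, digits or underscores (optionally prefixed by `#` or `@`). The splitting rules used in the project itself are not given in the slides.

```python
import re

def explode(tweet):
    """Split a tweet into sentences, then each sentence into lowercase words."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", tweet) if s.strip()]
    return [re.findall(r"[#@]?\w+", s.lower()) for s in sentences]

print(explode("Justice was served. Or was it?"))
```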

Using a data dictionary
The data dictionary used is a sentiment lexicon containing 6800 English words, their part of speech (e.g. noun, verb, adjective), and a sentiment (positive, negative or neutral). Each word was given a polarity (positive = +1, negative = -1, neutral = 0).
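
One way to represent such a lexicon in Python is sketched below. The three entries are illustrative only, and treating words missing from the lexicon as neutral is an assumption of this sketch.

```python
from typing import NamedTuple

class LexiconEntry(NamedTuple):
    word: str
    part_of_speech: str
    sentiment: str  # "positive", "negative" or "neutral"

# Sentiment label -> polarity, as described above.
POLARITY = {"positive": 1, "negative": -1, "neutral": 0}

# Illustrative entries only; the real lexicon holds about 6800 words.
LEXICON = {
    entry.word: entry
    for entry in [
        LexiconEntry("justice", "noun", "positive"),
        LexiconEntry("victim", "noun", "negative"),
        LexiconEntry("being", "verb", "neutral"),
    ]
}

def polarity(word):
    """Return +1, -1 or 0; words missing from the lexicon count as 0 (assumption)."""
    entry = LEXICON.get(word.lower())
    return POLARITY[entry.sentiment] if entry else 0

print(polarity("Justice"), polarity("victim"), polarity("being"))
```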

An example of classification
‘Justice being served or not, being a victim carries a life sentence.’ #JubJub #DewaniTrial #Oscar
Justice (+1) being (+0) served (+1) or (+0) not (-1), being (+0) a (+0) victim (-1) carries (-1) a (+0) life (-1) sentence (-1). #JubJub #DewaniTrial #Oscar
+1+0+1+0-1+0+0-1-1+0-1-1 = -3, so the tweet embeds a negative sentiment.
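
The classification can be sketched in Python as a sum of word polarities. The `WORD_POLARITY` table copies only the non-zero per-word values annotated in the example above; ignoring hashtags and scoring unlisted words as 0 are assumptions of this sketch.

```python
import re

# Non-zero polarities taken from the per-word annotations in the example.
WORD_POLARITY = {
    "justice": 1, "served": 1, "not": -1, "victim": -1,
    "carries": -1, "life": -1, "sentence": -1,
}

def classify(tweet):
    """Sum word polarities (hashtags skipped) and label the tweet by the sign."""
    tokens = re.findall(r"#?\w+", tweet.lower())
    words = [t for t in tokens if not t.startswith("#")]
    score = sum(WORD_POLARITY.get(w, 0) for w in words)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return score, label

tweet = ("Justice being served or not, being a victim carries a life sentence. "
         "#JubJub #DewaniTrial #Oscar")
score, label = classify(tweet)
print(label)
```

For the example tweet, the label comes out as negative.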

Data analytics focus areas
Determine influential users. Determine influential tweets. Identify topics. Identify themes.
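
The slides do not say how influence or topics were measured. Purely as an illustration, the sketch below ranks users by total retweets and topics by hashtag frequency; both measures, the record layout and the sample tweets are hypothetical.

```python
from collections import Counter

# Hypothetical sample records; field names and values are made up.
tweets = [
    {"user": "alice", "text": "Verdict today #OscarTrial", "retweets": 120},
    {"user": "bob", "text": "Watching #OscarTrial #Justice", "retweets": 15},
    {"user": "alice", "text": "Court adjourned #OscarTrial", "retweets": 80},
]

retweets_by_user = Counter()
topic_counts = Counter()
for t in tweets:
    # Assumed influence measure: total retweets received by a user.
    retweets_by_user[t["user"]] += t["retweets"]
    # Assumed topic measure: how often each hashtag appears.
    topic_counts.update(w.lower() for w in t["text"].split() if w.startswith("#"))

top_users = retweets_by_user.most_common(20)
top_topics = topic_counts.most_common(20)
print(top_users)
print(top_topics)
```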

Data Collection

Top Influential Users - Oscar Pistorius case

FIFA World Cup - Top 20

FIFA World Cup – Top Topics

Oscar Trial – Top 20

Oscar Trial - Top Themes