2.3 Methods for Big Data
- What is “Big Data”?
- Summarizing Big Data
The Flood of “Big Data”
- 90% of all data created by humankind has been created in the last 2 years.
Data Creation
[Figure panels: Data Flow a Decade Ago; Data Flow Now; Marketing Survey]
What Exactly is “BIG DATA”?
- BIG DATA refers to a collection of tools, techniques, and technologies that make it possible to work with data at any scale.
- BIG DATA is less about size and more about flow and velocity.
The 3 V’s of BIG DATA
1. Volume: larger than conventional databases can handle
2. Velocity: high rate at which data is generated, processed, and analyzed in real time
3. Variety: data formats are unstructured and inconsistent
Volume
- Walmart collects more than 2.5 petabytes of data every hour from its customer transactions.
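To give a feel for that volume, the hourly figure can be converted into a per-second and a per-day rate. This is only a back-of-the-envelope sketch: it assumes decimal (SI) units, where 1 petabyte = 10^15 bytes, and it treats the quoted 2.5 petabytes per hour as a steady rate. The function names are made up for illustration.

```python
# Assumption: decimal (SI) units, so 1 PB = 10**15 bytes and 1 GB = 10**9 bytes.
PETABYTE = 10**15
GIGABYTE = 10**9

# Figure quoted on the slide; treated here as a constant rate (an assumption).
HOURLY_VOLUME_PB = 2.5


def bytes_per_second(petabytes_per_hour: float) -> float:
    """Convert a petabytes-per-hour rate into bytes per second."""
    return petabytes_per_hour * PETABYTE / 3600


def petabytes_per_day(petabytes_per_hour: float) -> float:
    """Scale an hourly rate in petabytes up to a 24-hour day."""
    return petabytes_per_hour * 24


rate = bytes_per_second(HOURLY_VOLUME_PB)
print(f"{rate / GIGABYTE:.0f} GB per second")
print(f"{petabytes_per_day(HOURLY_VOLUME_PB):.0f} PB per day")
```

Under these assumptions the quoted rate works out to roughly 694 gigabytes every second, or 60 petabytes per day.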
Velocity
- Twitter
Variety: Data formats are Unstructured and Inconsistent
“Big Data” Technologies
- wer/
- cloud/solutions/big-data.aspx
- Word walls, word clouds, correlation wheels, heat maps, fusion tables, NoSQL, networks
Correlation Wheel (sort of)
- football-schedule/
Time Warner Outage 8/27
End of Section 2.3