1 Billion Gigabytes = 1 Exabyte
Origin
Name? Age? Where from? Strengths? Weaknesses? Oddities? Expertise? Bias? Politics? Groups? Relatives?
Integrity
Data is categorized:
N - Nominal (labels): Apples; Oranges; Bananas;...
O - Ordinal (ordered): Small; Medium; Large;...
Q - Interval (location of zero arbitrary): Jan 19, 2010; Oct 23, 2011;...
Q - Ratio (zero fixed)
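The four categories differ in which operations make sense on the values. A minimal Python sketch follows, assuming small made-up samples. The repeated fruit entries, the kilogram weights, and the SIZE_ORDER mapping are illustrative assumptions, not part of the slides.

```python
from datetime import date
from statistics import mode

# Hypothetical sample values, one list per scale (not from the slides).
fruit = ["Apples", "Oranges", "Bananas", "Apples"]        # N: nominal
sizes = ["Small", "Large", "Medium", "Small", "Large"]    # O: ordinal
SIZE_ORDER = {"Small": 0, "Medium": 1, "Large": 2}        # assumed ordering
dates = [date(2010, 1, 19), date(2011, 10, 23)]           # Q: interval
weights_kg = [2.0, 4.0, 6.0]                              # Q: ratio (assumed data)

# Nominal: labels can only be counted, so the mode is the usable summary.
most_common_fruit = mode(fruit)

# Ordinal: values can be ranked, so a median is meaningful (a mean is not).
ranked = sorted(sizes, key=SIZE_ORDER.get)
median_size = ranked[len(ranked) // 2]

# Interval: differences are meaningful, but "twice as late" is not.
gap_days = (dates[1] - dates[0]).days

# Ratio: zero is fixed, so ratios such as "twice as heavy" are meaningful.
weight_ratio = weights_kg[1] / weights_kg[0]

print(most_common_fruit, median_size, gap_days, weight_ratio)
```

The point of the sketch is only that the scale decides the summary: counting for N, ranking for O, differences for interval Q, and ratios for ratio Q.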
Variable Structured Unstructured Uncertain Margins of error Needs Context Sample
It is: sampled (Census); manipulated or biased (Politics); biased (Advocacy); error-prone (human mistakes)
So we evaluate, filter, clean, question, and attribute our data. SEE: Harvard Business Review
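One way the filter, clean, and attribute steps might look in practice is sketched below in Python. The Record type, the city names, the population strings, and the source label are all hypothetical, and the rules (drop rows with no source, drop rows whose value is not a number) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Record:
    """A hypothetical raw row as it might arrive from a collected file."""
    city: str
    population: Optional[str]  # raw text, may be malformed
    source: Optional[str]      # where the number came from


# Made-up rows for illustration only.
raw = [
    Record(" Springfield ", "30,720", "hypothetical census file"),
    Record("Shelbyville", "n/a", "hypothetical census file"),
    Record("Ogdenville", "12500", None),
]


def clean(records: List[Record]) -> List[Tuple[str, int, str]]:
    kept = []
    for r in records:
        # Attribute: a value with no known source is not kept.
        if not r.source:
            continue
        # Clean: strip thousands separators and surrounding whitespace.
        digits = (r.population or "").replace(",", "").strip()
        # Filter: skip anything that still is not a plain number.
        if not digits.isdigit():
            continue
        kept.append((r.city.strip(), int(digits), r.source))
    return kept


cleaned = clean(raw)
print(cleaned)
```

Evaluating and questioning the data remain human judgment calls; the code only covers the mechanical part.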
Any Tool Will Do: Tableau, JavaScript, D3, Google API, Many Others
Too much stuff
Data-ink Ratio = Data Ink / Total Ink
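As a worked example of the ratio, here is a small Python sketch. It assumes ink can be approximated by counting drawn pixels. The function name data_ink_ratio and the pixel counts are hypothetical.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Share of a graphic's ink that presents data (data ink / total ink)."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must be between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical pixel counts for the same chart before and after decluttering.
cluttered = data_ink_ratio(data_ink=1200, total_ink=4800)  # heavy grid, borders
decluttered = data_ink_ratio(data_ink=1200, total_ink=1500)  # grid and borders removed

print(cluttered, decluttered)
```

The data ink stays the same in both cases; removing non-data ink is what raises the ratio.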
Too fancy – Edward Tufte Example