Published by Opal Carr. Modified over 9 years ago.
1
Chapter 4: Correlation 2/2015
2
Correlation
− Statistical relationship between two data values
− Strong correlation: if A changes, B is likely to change
− Weak correlation: if A changes, B is unlikely to change in a predictable way
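The difference between a strong and a weak correlation can be made concrete with the Pearson correlation coefficient, one common measure of linear correlation. The slides do not name a specific measure, so its use here is an assumption, and the numbers below are invented purely for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Made-up values (assumption), not data from the presentation.
a = [1, 2, 3, 4, 5, 6]
b_strong = [2.1, 3.9, 6.2, 8.0, 9.9, 12.1]  # moves almost in step with a
b_weak = [5, 1, 4, 2, 6, 3]                 # barely related to a

r_strong = pearson(a, b_strong)  # close to +1: strong correlation
r_weak = pearson(a, b_weak)      # close to 0: weak correlation
print(f"strong: {r_strong:.3f}, weak: {r_weak:.3f}")
```

A coefficient near +1 or −1 means B tends to move with A; a value near 0 means knowing A says little about B, at least in a linear sense.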
3
Case: Walmart
− Tracked every product and analyzed every transaction
− Before hurricanes, sales of flashlights and Pop-Tarts increased
=> When a storm was approaching, Walmart increased its stock of these products
4
Correlation analysis: How to find suitable proxies?
− In the small-data world
  − Hypothesis-driven approach
  − Proxies are chosen first and then tested
  − Slow and repetitive
  − Subject to false intuition
  − Hard to find non-linear relationships because of small sample sizes
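Small samples are one obstacle to finding non-linear relationships; another is that a purely linear measure can miss them altogether. The sketch below uses constructed values (an assumption, not data from the slides) in which B is fully determined by A, yet the Pearson coefficient is zero.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson (linear) correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# B depends entirely on A, but through a U-shaped (quadratic) relationship.
a = [-3, -2, -1, 0, 1, 2, 3]
b = [x ** 2 for x in a]

r = pearson(a, b)
print(r)  # the positive and negative halves cancel out
```

A hypothesis that only asks "is A linearly related to B?" would reject this pair, even though A predicts B perfectly.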
5
Correlation analysis: How to find suitable proxies?
− In the big-data world
  − Data-driven approach (n = all)
  − Optimal proxies can be found by computer-driven analysis (e.g. Google Flu Trends and search terms)
  − No hypothesis needed
  − Possible to find complex non-linear relationships
− Predictive analysis
  − Use data to predict events before they happen
− Case: Health care
  − Very constant vital signs were observed before an infection, not the other way around
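The data-driven search for proxies can be sketched as follows: instead of picking one proxy and testing it, score every available candidate against the target and rank the results. This is a minimal illustration, not the method used by Google Flu Trends; the function `rank_proxies`, the search terms, and all figures are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def rank_proxies(target, candidates):
    """Return (name, r) pairs sorted by |r| against target, strongest first."""
    scored = [(name, pearson(series, target)) for name, series in candidates.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical weekly figures, invented for illustration only.
flu_cases = [10, 14, 22, 35, 30, 18]
search_terms = {
    "cough remedy": [100, 130, 210, 340, 300, 170],
    "football scores": [50, 48, 52, 49, 51, 50],
    "sunscreen": [90, 60, 75, 40, 55, 65],
}

ranking = rank_proxies(flu_cases, search_terms)
best_name, best_r = ranking[0]
for name, r in ranking:
    print(f"{name}: {r:+.2f}")
```

Ranking by absolute value keeps strongly negative correlations near the top as well, since a series that reliably falls when the target rises is also a usable proxy. With very many candidates, some will correlate by chance, so a top-ranked proxy still needs checking on fresh data.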
6
What, not why
− Data-driven analysis aims to find non-causal links (what)
− Possible to use mathematical and statistical methods (not possible with causality)
− Correlations can be used as a starting point to investigate whether there is causality between the links
7
The end of theory?
− We can just look at the data and not be limited by theories and hypotheses