CLAST Review Workshop: Statistics and Probability
5. Inferring Relations and Making Predictions from Statistical Data (skill III D 1)
Overview
- Trends in data
- Scatter plots and correlation
- Interpolation: estimating variation within the observed range
- Extrapolation: predicting variation beyond the observed range
11/27/2018 Dave Saha
Trends in Data
- Trends may be discerned best from graphs.
- Both interpolation and extrapolation are used.
- General trends are expected to continue.
Company Profit/Loss: Graphing the profit/loss over time clearly indicates that there is a trend of increasing profit.
[Figure: Company Profit/Loss. Profits (thousands) by year, ’87 to ’93; vertical axis from 10 to 30.]
Scatter Plots and Correlation
- The relationship, if any, between two variables is called correlation.
- A scatter diagram may reveal correlation.
- The correlation coefficient ranges from -1 to +1.
- The line of regression is the best linear approximation to how the two variables relate.
[Figure: scatter diagram of data points with the line of regression drawn through them.]
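The two quantities named on this slide can be computed directly. The sketch below is one common way to do it, assuming the Pearson correlation coefficient and an ordinary least-squares line; the slide does not say which definitions the workshop uses. The function names and the data values are hypothetical and chosen only for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient; always between -1 and +1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def regression_line(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data, not taken from the slides.
a = [1, 2, 3, 4, 5]
b = [2, 4, 5, 4, 5]

r = pearson_r(a, b)
slope, intercept = regression_line(a, b)
print(f"r = {r:.3f}, line: B = {slope:.2f} * A + {intercept:.2f}")
```

A positive r means B tends to rise with A, and a negative r means B tends to fall as A rises. The closer r is to -1 or +1, the more tightly the points cluster around the line.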
WARNING: Strong correlation does not mean causation!
[Figures: two scatter plots of B against A.]
- Strong Negative Correlation: as A increases, B decreases; as A decreases, B increases.
- Strong Positive Correlation: as A increases, B increases; as A decreases, B decreases.
Weak or No Correlation
[Figures: three scatter plots of B against A.]
- Weak Negative Correlation: the area enveloping the points gets wider than for a strong correlation.
- Weak Positive Correlation: the area enveloping the points gets wider than for a strong correlation.
- No Correlation: no apparent relation between A and B.
Interpret Beyond the Data
Interpolation
- Relies on reading the graph correctly (skill ID1)
Extrapolation
- Requires the assumption that any observed trends will continue
- Relies on reading the graph correctly (skill ID1)
- The scale(s) of the graph may have to be extended
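The difference between the two kinds of estimate can be sketched in code. This is a minimal illustration, assuming the trend is approximated by a least-squares straight line; on the exam the same estimates are read off a graph by eye. The helper names and data values are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate(xs, ys, x_new):
    """Estimate y at x_new and say which kind of estimate it is."""
    slope, intercept = fit_line(xs, ys)
    # Inside the observed range: interpolation. Outside it: extrapolation,
    # which is only valid if the observed trend continues.
    inside = min(xs) <= x_new <= max(xs)
    kind = "interpolation" if inside else "extrapolation"
    return slope * x_new + intercept, kind

# Hypothetical observations, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [10, 12, 14, 16, 18]

print(estimate(xs, ys, 2.5))  # x lies within 1..5
print(estimate(xs, ys, 7))    # x lies beyond 5
```

The extrapolated value is produced by the same line as the interpolated one. What changes is the assumption behind it: nothing in the data confirms that the trend holds past the last observed point.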