Section 1 Topic 31 Summarising metric data: Median, IQR, and boxplots
Section 1 Topic 32 Summarising metric data: Median, IQR & Box Plots Can we describe a distribution with just one or two numbers? What is the median, how is it calculated and what does it tell us? What is the interquartile range, how is it calculated and what does it tell us? What is a five number summary? What is a box plot and why is it useful?
Section 1 Topic 33 Will less than the whole picture do? Summary statistics. Measures of centre: median, mean. Measures of spread: range, interquartile range, standard deviation.
Section 1 Topic 34 Median. First, order the data set numerically. 50% of values are higher than or equal to the median; 50% are lower than or equal to the median. Location of median = (n+1)/2 = (5+1)/2 = 3, i.e. the 3rd observation. Notes p.97
For an odd number of data values, the median will be one of the data values: Median = 4. For an even number of data values, the median may not coincide with an actual data value: Location of median = (4+1)/2 = 5/2 = 2.5, i.e. midway between the 2nd and 3rd observations, so Median = 4.5.
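The location rule above can be sketched in a few lines of Python. The two data sets are hypothetical, chosen only so that the results match the slide values (Median = 4 for n = 5, Median = 4.5 for n = 4); the function names are illustrative.

```python
def median_location(n):
    """1-based position of the median in the ordered data: (n + 1) / 2."""
    return (n + 1) / 2


def median(values):
    """Median of a non-empty list: order the data, then take the middle."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the median is the single middle observation.
        return ordered[mid]
    # Even n: the location falls between two observations, so average them.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [9, 2, 4, 6, 3]   # hypothetical; ordered: 2 3 4 6 9
even_data = [7, 3, 5, 4]     # hypothetical; ordered: 3 4 5 7

print(median_location(len(odd_data)), median(odd_data))    # 3.0 4
print(median_location(len(even_data)), median(even_data))  # 2.5 4.5
```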
Section 1 Topic 36 Limitations of the range: it depends on only the two extreme values (range = maximum - minimum). In the slide's example data sets, Range = 7.
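To illustrate the limitation, here is a minimal sketch with two hypothetical data sets (not the ones from the slide) that share a range of 7 even though their values are spread very differently.

```python
def data_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)


# Hypothetical data: both sets run from 1 to 8.
clustered = [1, 4, 4, 5, 5, 8]   # most values sit near the middle
spread_out = [1, 1, 2, 7, 8, 8]  # most values sit near the extremes

print(data_range(clustered))   # 7
print(data_range(spread_out))  # 7
```

The range cannot tell these two sets apart, because it ignores everything between the minimum and the maximum.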
Section 1 Topic 37 Interquartile range. Quartiles are the points that divide a distribution into quarters: Q1 (25%), Q2 (50%, the median), Q3 (75%). The interquartile range (IQR) is defined to be the spread of the middle 50% of data values, so that IQR = Q3 - Q1. Notes p.99
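A small sketch of the quartile and IQR calculation, using `statistics.quantiles` from the Python standard library (Python 3.8+). Note that several quartile conventions exist; the library's default "exclusive" method is assumed here and may differ slightly from the method in the course notes. The data set is hypothetical.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q2, Q3); Q2 is the median."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # default method: 'exclusive'
    return q1, q2, q3


def iqr(values):
    """Interquartile range: IQR = Q3 - Q1."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


data = [2, 4, 5, 7, 8, 10, 13]  # hypothetical data set, n = 7

print(quartiles(data))  # (4.0, 7.0, 10.0)
print(iqr(data))        # 6.0
```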
Section 1 Topic 38 Why is the IQR more useful than the range? The IQR describes the middle 50% of observations. The upper 25% and lower 25% of observations are discarded. The IQR is generally not affected by outliers.
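The difference in robustness can be seen by replacing one value with an extreme one. This is a sketch on hypothetical data, again assuming the default quartile method of `statistics.quantiles`.

```python
import statistics


def spread_summary(values):
    """Return (range, IQR) for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return max(values) - min(values), q3 - q1


original = [2, 4, 5, 7, 8, 10, 13]       # hypothetical data
with_outlier = [2, 4, 5, 7, 8, 10, 130]  # same data, largest value made extreme

print(spread_summary(original))      # (11, 6.0)
print(spread_summary(with_outlier))  # (128, 6.0)
```

The range jumps from 11 to 128, while the IQR stays at 6 because the extreme value lies in the discarded upper 25%.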
Section 1 Topic 39 Picturing quartiles with histogram Notes p.97
Section 1 Topic 310 Five number summary: minimum value, Q1, median, Q3, maximum value
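The five number summary can be collected in one small function. This is a sketch on hypothetical data; the quartiles come from `statistics.quantiles` with its default method, which is an assumption rather than the course's prescribed method.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, med, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, med, q3, max(values)


data = [2, 4, 5, 7, 8, 10, 13]  # hypothetical data set

print(five_number_summary(data))  # (2, 4.0, 7.0, 10.0, 13)
```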
Section 1 Topic 311 The Boxplot Graphical representation of five number summary Notes p.98
Section 1 Topic 312 Constructing a Boxplot Notes p.99
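As a rough illustration of how the five number summary maps onto a boxplot, the sketch below draws a one-line text version: `|` marks the minimum and maximum, `-` the whiskers, `=` the box from Q1 to Q3, and `M` the median. The function name, the character choices and the data are all hypothetical, and the sketch assumes the data are not all equal.

```python
import statistics


def ascii_boxplot(values, width=41):
    """Draw a one-line text boxplot scaled to `width` characters."""
    lo, hi = min(values), max(values)
    q1, med, q3 = statistics.quantiles(values, n=4)

    def col(x):
        # Map a data value onto a column index between 0 and width - 1.
        return round((x - lo) / (hi - lo) * (width - 1))

    line = [" "] * width
    for c in range(col(lo), col(hi) + 1):
        line[c] = "-"              # whiskers span minimum to maximum
    for c in range(col(q1), col(q3) + 1):
        line[c] = "="              # box spans Q1 to Q3
    line[col(lo)] = "|"            # minimum
    line[col(hi)] = "|"            # maximum
    line[col(med)] = "M"           # median
    return "".join(line)


data = [2, 4, 5, 7, 8, 10, 13]  # hypothetical data set
print(ascii_boxplot(data))
```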
Section 1 Topic 313 *Exercise 4 Notes p.103
Section 1 Topic 314 Relating a boxplot to the shape of the distribution: Symmetric. Notes p.104
Section 1 Topic 315 Positively skewed distributions
Section 1 Topic 316 Negatively skewed distributions
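One rough way to read shape from a boxplot is to compare the two halves of the box: a longer upper half (Q3 - median) suggests positive skew, and a longer lower half (median - Q1) suggests negative skew. The sketch below applies this heuristic to hypothetical data; it is an illustration only, not a formal test of skewness, and the function name is made up for this example.

```python
import statistics


def box_shape(values):
    """Classify shape by comparing the two halves of the box."""
    q1, med, q3 = statistics.quantiles(values, n=4)
    lower_half = med - q1
    upper_half = q3 - med
    if upper_half > lower_half:
        return "positively skewed"
    if upper_half < lower_half:
        return "negatively skewed"
    # Exact equality is rare in real data; in practice, judge "roughly equal" by eye.
    return "symmetric"


print(box_shape([1, 2, 3, 4, 5, 6, 7]))        # symmetric
print(box_shape([1, 2, 3, 4, 8, 12, 20]))      # positively skewed
print(box_shape([1, 9, 13, 17, 18, 19, 20]))   # negatively skewed
```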
Section 1 Topic 317 Boxplot with outliers. Possible outliers are defined as any values outside the interval (Q1 - 1.5 × IQR, Q3 + 1.5 × IQR). We say "possible" because the point may just be part of the tail of the distribution, but we may not have enough data to be sure. Notes p.101
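The outlier rule can be sketched as follows, assuming the conventional multiplier of 1.5 for the fences and the default quartile method of `statistics.quantiles`. The data set and function names are hypothetical.

```python
import statistics


def outlier_fences(values, k=1.5):
    """Return (lower fence, upper fence) = (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def possible_outliers(values, k=1.5):
    """Values lying outside the fences; flagged as possible outliers only."""
    lower, upper = outlier_fences(values, k)
    return [x for x in values if x < lower or x > upper]


data = [2, 4, 5, 7, 8, 10, 30]  # hypothetical data set

print(outlier_fences(data))     # (-5.0, 19.0)
print(possible_outliers(data))  # [30]
```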
Section 1 Topic 318 Boxplot with outliers (figure labels: Min, Q1, M, Q3, Max)
Section 1 Topic 319 *Exercise 5 Notes p.107