Measuring Variation – The Five-Number Summary Lecture 18 Sec. 5.3.1 – 5.3.3 Mon, Feb 19, 2007
The Five-Number Summary A five-number summary of a sample or population consists of five numbers that divide the sample or population into four equal parts. These numbers are called the quartiles. 0th Quartile = minimum. 1st Quartile = Q1. 2nd Quartile = median. 3rd Quartile = Q3. 4th Quartile = maximum.
Example If the distribution were uniform from 0 to 10, what would be the five-number summary? [Figure: uniform distribution over a horizontal axis marked 1 through 10. The median splits it 50% / 50%; Q1, the median, and Q3 split it into four parts of 25% each.]
Example Where would the median and quartiles be in the following non-uniform distribution? [Figure: a non-uniform distribution over a horizontal axis marked 1 through 7.]
Example For the uniform distribution from 0 to 10, the five-number summary is Minimum = 0, Q1 = 2.5, Median = 5, Q3 = 7.5, Maximum = 10. If the distribution is not uniform, or if we are dealing with a list of numbers, the answer is less obvious.
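The uniform case can be checked with a few lines of Python. This is only a sketch: the function name `uniform_five_number_summary` is made up for illustration, and it assumes that the quartiles of a uniform distribution on [a, b] sit at 0%, 25%, 50%, 75%, and 100% of the way from a to b.

```python
def uniform_five_number_summary(a, b):
    """Five-number summary of a uniform distribution on [a, b] (illustrative helper)."""
    # For a uniform distribution, the fraction q of the area lies to the left of a + q * (b - a).
    return [a + q * (b - a) for q in (0, 0.25, 0.5, 0.75, 1)]


print(uniform_five_number_summary(0, 10))  # [0, 2.5, 5.0, 7.5, 10]
```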
Quartiles – TI-83’s Method To find the quartiles, first sort the data and find the median (2nd quartile). Then the 1st quartile is the “median” of all the numbers that are listed before the median. The 3rd quartile is the “median” of all the numbers that are listed after the median.
Example Find the quartiles of the sample 5, 8, 10, 15, 17, 19, 20, 24, 25, 30, 32. Min = 5, Q1 = 10, Median = 19, Q3 = 25, Max = 32.
Example Find the quartiles of the sample 5, 8, 10, 15, 17, 19, 20, 24, 25, 30, 32, 33. Min = 5, Q1 = 12.5, Median = 19.5, Q3 = 27.5, Max = 33.
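The median-of-halves method can be sketched in Python as follows. The function name `median_of_halves_summary` is hypothetical, and the sketch assumes that when the sample size is odd, the middle value itself is left out of both halves, as in the 11-value example.

```python
from statistics import median


def median_of_halves_summary(data):
    """Five-number summary: Q1 and Q3 are the medians of the values before and after the median."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # values listed before the median
    upper = xs[half + n % 2:]    # values listed after the median (skips the middle value when n is odd)
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])


print(median_of_halves_summary([5, 8, 10, 15, 17, 19, 20, 24, 25, 30, 32]))
# (5, 10, 19, 25, 32)
print(median_of_halves_summary([5, 8, 10, 15, 17, 19, 20, 24, 25, 30, 32, 33]))
# (5, 12.5, 19.5, 27.5, 33)
```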
Percentiles – Textbook’s Method The pth percentile – A value that separates the lower p% of a sample or population from the upper (100 – p)%. p% or more of the values fall at or below the pth percentile, and (100 – p)% or more of the values fall at or above the pth percentile.
Percentiles – Textbook’s Method Find the 25th percentile of the following sample: 5, 8, 10, 15, 17, 19, 20, 24, 25, 30, 32.
Percentiles – Textbook’s Method
Value   % at or below   % at or above
5       9%              100%            Min
8       18%             91%
10      27%             82%             Q1
15      36%             73%
17      45%             64%
19      55%             55%             Median
20      64%             45%
24      73%             36%
25      82%             27%             Q3
30      91%             18%
32      100%            9%              Max
The 25th percentile is 10: it is the only value with at least 25% of the sample at or below it and at least 75% at or above it.
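The textbook definition can be applied mechanically. The sketch below is one possible reading of it, not the textbook's own procedure: the hypothetical function `textbook_percentiles` returns every sample value that has at least p% of the sample at or below it and at least (100 – p)% at or above it. Counts are compared as integers to avoid rounding trouble.

```python
def textbook_percentiles(data, p):
    """Sample values that satisfy both conditions in the definition of the pth percentile."""
    xs = sorted(data)
    n = len(xs)
    hits = []
    for v in sorted(set(xs)):
        at_or_below = sum(1 for x in xs if x <= v)
        at_or_above = sum(1 for x in xs if x >= v)
        # at_or_below / n >= p / 100  and  at_or_above / n >= (100 - p) / 100
        if 100 * at_or_below >= p * n and 100 * at_or_above >= (100 - p) * n:
            hits.append(v)
    return hits


sample = [5, 8, 10, 15, 17, 19, 20, 24, 25, 30, 32]
print(textbook_percentiles(sample, 25))  # [10]
print(textbook_percentiles(sample, 50))  # [19]
print(textbook_percentiles(sample, 75))  # [25]
```

For some samples and values of p, two adjacent values can both qualify. How to combine them (for example, by averaging) is a convention that this sketch leaves open.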
Percentiles – Excel’s Formula To find the position, or rank, of the pth percentile in a sorted sample of size n, compute the value r = 1 + (p/100)(n – 1).
Excel’s Percentile Formula This gives the position (r = rank) of the pth percentile. Round r to the nearest whole number. The number in that position is the pth percentile.
Excel’s Percentile Formula Special case: If r is a “half-integer,” for example 10.5, then take the average of the numbers in the two positions on either side of r (here, positions 10 and 11), just as we did for the median when n was even. Microsoft Excel interpolates whenever r is not a whole number. Therefore, because we round, our answers may differ from Excel’s.
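To see how interpolation can differ from rounding, here is a rough Python sketch of linear interpolation at the rank r = 1 + (p/100)(n – 1). The function name `percentile_interpolated` is hypothetical, and the sketch is meant to imitate the interpolating behavior described above, not to reproduce Excel exactly in every case.

```python
import math


def percentile_interpolated(data, p):
    """pth percentile by linear interpolation at rank r = 1 + (p/100)(n - 1)."""
    xs = sorted(data)
    r = 1 + (p / 100) * (len(xs) - 1)
    lo = math.floor(r)           # position just at or below r (1-based)
    frac = r - lo                # how far r lies toward the next position
    if lo >= len(xs):            # r is the last position; nothing to interpolate toward
        return xs[-1]
    return xs[lo - 1] + frac * (xs[lo] - xs[lo - 1])


sample = [5, 8, 10, 15, 17, 19, 20, 24, 25, 30, 32]
print(percentile_interpolated(sample, 25))  # 12.5
print(percentile_interpolated(sample, 33))  # about 15.6
```

For p = 33 the rank is r = 4.3. Rounding r gives position 4, whose value is 15, while interpolation gives a value about 30% of the way from 15 to 17.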
Example Use Excel’s formula to find a 5-number summary of 5, 8, 10, 15, 17, 19, 20, 24, 25, 30, 32. FiveNumberSummary.xls
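The rank formula with rounding can be sketched in Python. The helper names are hypothetical, and the sketch assumes the rank formula r = 1 + (p/100)(n – 1), with the half-integer special case handled by averaging the two neighboring values. Exact fractions are used so that a half-integer rank is detected reliably.

```python
from fractions import Fraction
import math


def percentile_rank_position(p, n):
    """Rank r = 1 + (p/100)(n - 1) of the pth percentile, as an exact fraction."""
    return 1 + Fraction(p) / 100 * (n - 1)


def percentile_rounded(data, p):
    """pth percentile using the rank formula with rounding."""
    xs = sorted(data)
    r = percentile_rank_position(p, len(xs))
    if r.denominator == 2:                      # half-integer rank such as 3.5
        lo = math.floor(r)                      # position just below r
        return (xs[lo - 1] + xs[lo]) / 2        # average the two neighbors
    nearest = math.floor(r + Fraction(1, 2))    # round r to the nearest whole number
    return xs[nearest - 1]


sample = [5, 8, 10, 15, 17, 19, 20, 24, 25, 30, 32]
print([percentile_rounded(sample, p) for p in (0, 25, 50, 75, 100)])
# [5, 12.5, 19, 24.5, 32]
```

Here Q1 = 12.5 and Q3 = 24.5, which differ from the values 10 and 25 that the median-of-halves method gives for the same sample.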
The Principle Excel’s formula is based on the gaps between the numbers, not the numbers themselves.
Example Find the quartiles for the sample 5, 20, 30, 45, 60, 80, 100, 140, 175, 200, 240.
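Excel’s Percentile Formula The formula may be reversed to find the percentile rank of a number, given its position, or rank, in the sample. The formula is p = 100(r – 1)/(n – 1), where r is the position of the number in the sorted sample and n is the sample size.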
Example In the sample 22, 28, 31, 40, 42, 56, 78, 88, 97 what percentile rank is associated with 40?
Example For the sample 5, 20, 30, 45, 60, 80, 100, 140, 175, 200, 240, what is the percentile rank of 45?
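The reversed formula is easy to try out in Python. This sketch assumes the reversed formula p = 100(r – 1)/(n – 1), where r is the 1-based position of the value in the sorted sample. The function name `percentile_rank` is hypothetical, and using the first occurrence of a repeated value is an assumption of this sketch.

```python
def percentile_rank(data, value):
    """Percentile rank p = 100(r - 1)/(n - 1) of a value that appears in the sample."""
    xs = sorted(data)
    r = xs.index(value) + 1      # 1-based position; first occurrence if the value repeats
    return 100 * (r - 1) / (len(xs) - 1)


print(percentile_rank([22, 28, 31, 40, 42, 56, 78, 88, 97], 40))  # 37.5
```

With rounding, 40 is at about the 38th percentile of that sample.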
The Interquartile Range The interquartile range (IQR) is the difference between Q3 and Q1. The IQR is a commonly used measure of spread, or variability. Like the median, it is not affected by extreme outliers.
Example The IQR of 22, 28, 31, 40, 42, 56, 78, 88, 97 is IQR = Q3 – Q1 = 78 – 31 = 47. Find the IQR for the sample 5, 20, 30, 45, 60, 80, 100, 140, 175, 200, 240.
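A short Python sketch of the IQR computation, which also illustrates why the IQR is not affected by an extreme outlier. The helper names are hypothetical. The quartile positions are assumed to come from the rank formula r = 1 + (p/100)(n – 1), with rounding and with averaging at half-integer ranks.

```python
from fractions import Fraction
import math


def value_at_rank(xs, r):
    """Value at position r of the sorted list xs, averaging the neighbors when r is a half-integer."""
    if r.denominator == 2:
        lo = math.floor(r)
        return (xs[lo - 1] + xs[lo]) / 2
    return xs[math.floor(r + Fraction(1, 2)) - 1]


def iqr(data):
    """Interquartile range Q3 - Q1."""
    xs = sorted(data)
    n = len(xs)
    q1 = value_at_rank(xs, 1 + Fraction(25, 100) * (n - 1))
    q3 = value_at_rank(xs, 1 + Fraction(75, 100) * (n - 1))
    return q3 - q1


print(iqr([22, 28, 31, 40, 42, 56, 78, 88, 97]))    # 47
print(iqr([22, 28, 31, 40, 42, 56, 78, 88, 9700]))  # 47, unchanged by the outlier
```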
Two Homework Problems (1) For the % on-time-arrival data (p. 252), use the formula, with rounding, to find the 10th percentile, the 43rd percentile, the 69th percentile, and the 95th percentile. (2) Use the formula to find the percentile ranks, with rounding, of the following % on-time arrivals: 76.0, 81.1, 85.8, 90.3.