Lecture 16 Sec – Mon, Oct 2, 2006 Measuring Variation – The Five-Number Summary
The Five-Number Summary A five-number summary of a sample or population consists of five numbers that divide the sample or population into four equal parts. These numbers are called the quartiles. 0th quartile = minimum. 1st quartile = Q1. 2nd quartile = median. 3rd quartile = Q3. 4th quartile = maximum.
Example If the distribution were uniform from 0 to 10, what would be the five-number summary?
Example The five-number summary is Minimum = 0, Q1 = 2.5, Median = 5, Q3 = 7.5, Maximum = 10. If the distribution is not uniform, or if we are dealing with a list of numbers, the answer will not be so clear.
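For a uniform distribution the quartiles follow by simple proportion: a fraction q of the distribution lies below a + q(b – a). A minimal Python sketch of this example follows; the function name uniform_quantile is illustrative and not from the lecture.

```python
def uniform_quantile(q, a, b):
    """Value below which a fraction q of a uniform(a, b) distribution lies."""
    return a + q * (b - a)

# Five-number summary for the uniform distribution from 0 to 10:
# the 0%, 25%, 50%, 75%, and 100% points.
summary = [uniform_quantile(q, 0, 10) for q in (0.0, 0.25, 0.5, 0.75, 1.0)]
print(summary)  # [0.0, 2.5, 5.0, 7.5, 10.0]
```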
Percentiles – Textbook’s Definition The pth percentile is a value that separates the lower p% of a sample or population from the upper (100 – p)%: p% or more of the values fall at or below the pth percentile, and (100 – p)% or more of the values fall at or above the pth percentile.
Percentiles – Textbook’s Definition Find the 25th percentile of the following sample: 22, 28, 31, 40, 42, 56, 78, 88, 97.
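The two conditions in the textbook's definition can be checked directly by counting. The sketch below is one way to do that in Python; the function name meets_textbook_definition is a hypothetical helper, and only values that appear in the sample are tested as candidates.

```python
def meets_textbook_definition(data, value, p):
    """True if p% or more of data is at or below value
    and (100 - p)% or more of data is at or above value."""
    n = len(data)
    at_or_below = sum(1 for x in data if x <= value)
    at_or_above = sum(1 for x in data if x >= value)
    # Compare counts against percentages of n without dividing,
    # so no rounding error can creep in.
    return 100 * at_or_below >= p * n and 100 * at_or_above >= (100 - p) * n

sample = [22, 28, 31, 40, 42, 56, 78, 88, 97]
candidates = [x for x in sample if meets_textbook_definition(sample, x, 25)]
print(candidates)  # [31]
```

For 31, three of the nine values (33.3%) are at or below it and seven of the nine (77.8%) are at or above it, so both conditions hold.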
Percentiles – Excel’s Formula To find the position, or rank, of the pth percentile, compute the value r = 1 + (p/100)(n – 1), where n is the number of values in the sample.
Excel’s Percentile Formula This gives the position (r = rank) of the pth percentile. Round r to the nearest whole number. The number in that position is the pth percentile.
Excel’s Percentile Formula Special case: If r is a “half-integer,” for example 10.5, then take the average of the numbers in the two neighboring positions, r – 0.5 and r + 0.5 (here, positions 10 and 11), just as we did for the median when n was even. Microsoft Excel will interpolate whenever r is not a whole number. Therefore, by rounding, our answers may differ from Excel’s.
Example Find the 25th percentile of 22, 28, 31, 40, 42, 56, 78, 88, 97. p = 25 and n = 9. Compute r = 1 + (25/100)(9 – 1) = 3. The 25th percentile is the 3rd number, i.e., 31.
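The rounding procedure can be sketched in Python as follows. This is an illustration only: the function name percentile_by_rank is made up here, and the rounding is written out by hand because Python's built-in round() sends half-integers to the nearest even number, which is not the rule used in this lecture.

```python
import math

def percentile_by_rank(data, p):
    """pth percentile using r = 1 + (p/100)(n - 1) with rounding."""
    values = sorted(data)
    n = len(values)
    r = 1 + (p / 100) * (n - 1)      # 1-based position of the percentile
    lower = math.floor(r)
    frac = r - lower
    if math.isclose(frac, 0.5):
        # Half-integer rank: average the two neighboring positions.
        return (values[lower - 1] + values[lower]) / 2
    nearest = lower if frac < 0.5 else lower + 1
    return values[nearest - 1]       # convert 1-based rank to 0-based index

sample = [22, 28, 31, 40, 42, 56, 78, 88, 97]
print(percentile_by_rank(sample, 25))  # 31
```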
The Principle Excel’s formula is based on the gaps between the numbers, not the numbers themselves. [Figure: positions labeled Min, Q1, Med, Q3, Max]
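One way to read the principle: n numbers have n – 1 gaps between them, and the pth percentile sits p% of the way through those gaps. In the sample of 9 numbers used earlier there are 8 gaps, and 25% of 8 is 2 gaps past the minimum, which lands on the 3rd number. The Python sketch below assumes that Excel's interpolation is linear between the two neighboring values; the function name interpolated_percentile is illustrative.

```python
import math

def interpolated_percentile(data, p):
    """pth percentile by linear interpolation at rank r = 1 + (p/100)(n - 1)."""
    values = sorted(data)
    n = len(values)
    r = 1 + (p / 100) * (n - 1)   # p% of the way through the n - 1 gaps
    lower = math.floor(r)
    frac = r - lower              # how far into the next gap the rank falls
    if frac == 0 or lower >= n:
        return values[min(lower, n) - 1]
    return values[lower - 1] + frac * (values[lower] - values[lower - 1])

sample = [22, 28, 31, 40, 42, 56, 78, 88, 97]
# r = 1.8, so the result lies 80% of the way from 22 to 28.
print(round(interpolated_percentile(sample, 10), 2))  # 26.8
```

Rounding r = 1.8 to 2 would give 28 instead, which is the kind of difference from Excel that rounding can produce.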
Example Find the quartiles for the sample 5, 20, 30, 45, 60, 80, 100, 140, 175, 200, 240.
Excel’s Percentile Formula The formula may be reversed to find the percentile percentage of a number, given its position, or rank, in the sample. The formula is p = 100(r – 1)/(n – 1).
Example In the sample 22, 28, 31, 40, 42, 56, 78, 88, 97, what percentile percentage is associated with 40? n = 9 and r = 4. Compute p = 100(4 – 1)/(9 – 1) = 37.5. Therefore, 40 is the 37.5th percentile.
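The reversed calculation is short enough to sketch in Python. The function name percentile_percentage is illustrative, and the sketch assumes the value actually occurs in the sample (if it is repeated, the first occurrence determines the rank).

```python
def percentile_percentage(data, value):
    """Percentile percentage p = 100(r - 1)/(n - 1) for a value in the sample."""
    values = sorted(data)
    n = len(values)
    r = values.index(value) + 1   # 1-based rank of the value
    return 100 * (r - 1) / (n - 1)

sample = [22, 28, 31, 40, 42, 56, 78, 88, 97]
print(percentile_percentage(sample, 40))  # 37.5
```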
Example For the sample 5, 20, 30, 45, 60, 80, 100, 140, 175, 200, 240, what is the percentile percentage of 45?
The Interquartile Range The interquartile range (IQR) is the difference between Q3 and Q1. The IQR is a commonly used measure of spread, or variability. Like the median, it is not affected by extreme outliers.
Example The IQR of 22, 28, 31, 40, 42, 56, 78, 88, 97 is IQR = Q3 – Q1 = 78 – 31 = 47. Find the IQR for the sample 5, 20, 30, 45, 60, 80, 100, 140, 175, 200, 240.
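As a cross-check, the IQR can be computed with Python's standard library. Note the substitution: statistics.quantiles with method="inclusive" interpolates between neighboring values rather than rounding the rank, so it is not the lecture's procedure. It agrees with the rounding rule for the first sample because the quartile ranks there are whole numbers; for other sample sizes the two can differ. The function name iqr is illustrative.

```python
from statistics import quantiles

def iqr(data):
    """Interquartile range Q3 - Q1, using interpolated (inclusive) quartiles."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    return q3 - q1

sample = [22, 28, 31, 40, 42, 56, 78, 88, 97]
print(iqr(sample))  # 47.0
```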
TI-83 – Five-Number Summary Follow the procedure used to find the mean and the median. Scroll down the display to find the minimum, Q1, the median, Q3, and the maximum.
Example Use the TI-83 to find Q1 and Q3 for the sample 5, 20, 30, 45, 60, 80, 100, 140, 175, 200, 240.
Homework (2 problems – 10 points each) For the % on-time-arrival data (p. 252), use the formula, with rounding, to find the 10th percentile, the 43rd percentile, the 69th percentile, and the 95th percentile. Use the formula to find the percentile percentages, with rounding, of the following % on-time arrivals: 76.0, 81.1, 85.8, 90.3.