Summary Statistics: Mean, Median, Standard Deviation, and More “Seek simplicity and then distrust it.” (Dr. Monticino)


1 Summary Statistics: Mean, Median, Standard Deviation, and More “Seek simplicity and then distrust it.” (Dr. Monticino)

2 Assignment Sheet
- Read Chapter 4
- Homework #3: Due Wednesday, Feb. 9th
  - Chapter 4
    - Exercise set A: 1-6, 8, 9
    - Exercise set C: 1, 2, 3
    - Exercise set D: 1-4, 8
    - Exercise set E: 4, 5, 7, 8, 11, 12
- Quiz #2 will be over Chapter 2
- Quiz #3 on basic summary statistic calculations: mean, median, standard deviation, IQR, SD units
- If you’d like a copy of the notes, email me

3 Overview
- Measures of central tendency
  - Mean (average)
  - Median
  - Outliers
- Measures of dispersion
  - Standard deviation
    - Standard deviation units
  - Range
  - IQR
- Review and applications

4 Central Tendency
- Measures of central tendency (mean and median) are useful for obtaining a single-number summary of a data set
  - Mean is the arithmetic average
  - Median is a value such that at least 50% of the data is less than or equal to it and at least 50% is greater than or equal to it

5 Example
- Calculate the mean and median for the following data sets
  - 37, 44, 55, 78, 100, 111, 125, 151, 161
  - 37, 44, 55, 69, 90, 120, 125, 152, 157, 161
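
A minimal sketch of this calculation, using Python's standard `statistics` module. It assumes the run-together digits on the slide split into the values listed in the code; if the original slide grouped them differently, the results change.

```python
from statistics import mean, median

# Assumption: the run-together digits on the slide split into these values.
set_a = [37, 44, 55, 78, 100, 111, 125, 151, 161]
set_b = [37, 44, 55, 69, 90, 120, 125, 152, 157, 161]

for name, data in (("A", set_a), ("B", set_b)):
    # median() averages the two middle values when the count is even.
    print(name, "mean =", round(mean(data), 2), "median =", median(data))
```

Set A has an odd number of values, so its median is the single middle value; set B has an even number, so its median is the average of the two middle values.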

6 Outliers and Robustness
- Mean can be sensitive to outliers in a data set
  - Not robust to data collection errors or a single unusual measurement
  - Blind calculation can give misleading results
- Example data (figure not shown): mean = 170.35, median = 151
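
The sensitivity described here can be illustrated with a small sketch. The numbers below are hypothetical and are not the data plotted on the slide; they only show how one bad entry moves the mean while leaving the median alone.

```python
from statistics import mean, median

# Hypothetical measurements (not the data plotted on the slide).
clean = [148, 149, 150, 151, 152]
# Same data with one recording error: 152 entered as 1520.
with_error = [148, 149, 150, 151, 1520]

print(mean(clean), median(clean))            # both 150
print(mean(with_error), median(with_error))  # mean jumps, median stays 150
```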

7 Outliers and Robustness
- Always a good idea to plot data in the order that it was collected
  - Spot outliers
  - Identify possible data collection errors
- Example data (figure not shown): mean without outliers = 150.14, median without outliers = 149

8 Outliers and Robustness
- Median can be a more robust measure of central tendency than mean
  - Life expectancy
    - U.S. males: mean = 80.1, median = 83
    - U.S. females: mean = 84.3, median = 87
  - Household income
    - Mean = $51,855, median = $38,885
    - 0.3% account for 12% of income
  - Net worth
    - Mean = $282,500, median = $71,600

9 Which Central Tendency Measure?
- Calculate mean, median and mode
- Plot data
- Create histogram to inspect mode(s)
- Do not delete data points
  - If you analyze data without outliers, report and explain the outliers
- Many statistical studies involve studying the difference between population means
  - Reporting the mean may be dictated by the objective of the study

10 Which Central Tendency Measure?
- If data is
  - Unimodal
  - Fairly symmetric
  - Mean is approximately equal to median
- Then mean is a reasonable measure of central tendency

11 Which Central Tendency Measure?
- If data is
  - Unimodal
  - Asymmetric
- Then report both median and mean
- Difference between mean and median indicates asymmetry
  - Median will usually be the more reasonable summary of central tendency

12 Which Central Tendency Measure?
- If data is
  - Not unimodal
- Then report modes and, cautiously, mean and median
  - Analyze data for differences in groups around the modes

13 Limitations of Central Tendency
- Any single-number summary may not adequately represent the data and may hide differences between data sets
  - Example

14 Measures of Dispersion
- Including an additional statistic, a measure of dispersion, can help distinguish between data sets that have similar central tendencies
  - Range: max - min
  - Standard deviation: root mean square difference from the mean
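
A minimal sketch of the two dispersion measures as defined on this slide. The helper name `rms_sd` and both data sets are hypothetical, chosen so that the means match while the spreads differ. Note that this definition divides by n; many software packages default to dividing by n - 1 instead (in Python, `statistics.stdev`), which gives a slightly larger value.

```python
from math import sqrt
from statistics import mean, pstdev

def rms_sd(data):
    """SD as the root mean square of the deviations from the mean (divide by n)."""
    m = mean(data)
    return sqrt(sum((x - m) ** 2 for x in data) / len(data))

# Hypothetical data sets with the same mean (100) but different spread.
tight = [98, 99, 100, 101, 102]
wide = [60, 80, 100, 120, 140]

for data in (tight, wide):
    # Prints the mean, the range (max - min), and the SD.
    print(mean(data), max(data) - min(data), round(rms_sd(data), 2))
```

`rms_sd` agrees with `statistics.pstdev`, the standard library's population standard deviation.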

15 Measures of Dispersion
- Examples
  - Range

16 Measures of Dispersion
- Examples
  - Standard deviation (μ = 100)

17 Measures of Dispersion
- Both range and standard deviation can be sensitive to outliers
  - However, many data sets can be characterized by mean and SD
  - If the values of the data set are distributed in an approximately bell shape, then ~68% of the data will be within 1 SD unit of the mean, ~95% will be within 2 SD units, and nearly all will be within 3 SD units
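
The percentages quoted above can be checked against an exact normal (bell-shaped) curve. This is a sketch using `statistics.NormalDist` from the Python standard library; the helper name `share_within` is introduced here for illustration. Real data that is only approximately bell-shaped will match these figures only roughly.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

def share_within(k):
    """Fraction of a normal distribution within k SD units of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```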

18 Measures of Dispersion
- Example
  - Suppose a data set has mean = 35 and SD = 7
  - How many SD units away from the mean is 42?
  - How many SD units away from the mean is 38?
  - How many SD units away from the mean is 30?
  - Assuming a bell-shaped distribution, ~95% of the data is between what two values?
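
One way to work through these questions is sketched below. The function name `sd_units` is hypothetical; the mean and SD are the values given on the slide, and the 95% interval uses the 2 SD unit rule of thumb from the previous slide.

```python
def sd_units(value, mean, sd):
    """Signed distance from the mean, measured in SD units."""
    return (value - mean) / sd

MEAN, SD = 35, 7  # values given on the slide

for value in (42, 38, 30):
    print(value, round(sd_units(value, MEAN, SD), 2))

# Roughly 95% of bell-shaped data lies within 2 SD units of the mean.
low, high = MEAN - 2 * SD, MEAN + 2 * SD
print(low, high)
```

A negative result means the value lies below the mean.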

19 Measures of Dispersion
- A robust measure of dispersion is the interquartile range
  - Q1: value such that 25% of the data is less than it and 75% is greater
  - Q3: value such that 75% of the data is less than it and 25% is greater
  - IQR = Q3 - Q1

20 Example
- Calculate range, standard deviation and interquartile range for the following data sets
  - 1, 98, 99, 100, 100, 100, 102, 102, 104, 107
  - 95, 98, 99, 100, 100, 100, 102, 102, 104, 107
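
A sketch of this exercise, under two assumptions that the transcript does not confirm: that the run-together digits split into the values shown in the code (the first set differing from the second only in its smallest value), and that quartiles follow the default "exclusive" method of Python's `statistics.quantiles`. The helper name `summarize` is hypothetical.

```python
from statistics import pstdev, quantiles

# Assumption: the run-together digits on the slide split into these values.
set_a = [1, 98, 99, 100, 100, 100, 102, 102, 104, 107]
set_b = [95, 98, 99, 100, 100, 100, 102, 102, 104, 107]

def summarize(data):
    # Default "exclusive" quartile method; other conventions exist.
    q1, _, q3 = quantiles(data, n=4)
    return {
        "range": max(data) - min(data),
        "sd": pstdev(data),  # divides by n (root mean square deviation)
        "iqr": q3 - q1,
    }

for name, data in (("A", set_a), ("B", set_b)):
    print(name, summarize(data))
```

Under these assumptions the single low value inflates the range and SD of the first set, while the IQR is the same for both sets, which is the sense in which the IQR is robust.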

21 Assignment, Discussion, Evaluation
- Read Chapter 4
- Discussion problems
  - Chapter 4
    - Exercise set A: 1-6, 8, 9
    - Exercise set C: 1, 2, 3
    - Exercise set D: 1-4, 8
    - Exercise set E: 4, 5, 7, 8, 11, 12
- Quiz #3 on basic summary statistic calculations: mean, median, standard deviation, IQR, SD units

22 Review of Definitions
- Measures of central tendency
  - Mean (average): $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
  - Median
    - If odd number of data points, “middle” value
    - If even number of data points, average of two “middle” values
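
The median rule above can be written out directly. This is a minimal sketch, not code from the course; in practice `statistics.median` in the Python standard library does the same thing.

```python
def median(values):
    """Median by the slide's rule: middle value, or average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: the middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the middle pair

print(median([5, 1, 3]))     # 3
print(median([4, 1, 3, 2]))  # 2.5
```

The data must be sorted first; the "middle" refers to position in the ordered list, not the order of collection.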

23 Questions and Examples
- Can the mean be larger than the median? Can the median be larger than the mean?
  - Give examples
- Can the mean be a negative number? Can the median?
- The average height of three men is 69 inches. Two other men, of heights 73 and 70 inches, enter the room. What is the average height of all five men?
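
For the height question, one way to reason is that an average of 69 inches over three men fixes their total height at 3 × 69 = 207 inches, even though the individual heights are unknown. A worked version:

```latex
\[
\bar{x}_{\text{new}} = \frac{3 \times 69 + 73 + 70}{5} = \frac{350}{5} = 70 \text{ inches}
\]
```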

24 Questions and Examples
- The average of a data set is 30.
  - A value of 8 is added to each element in the data set. What is the new average?
  - Each element of the data set is increased by 5%. What is the new average?
- Suppose that the data consists of only 1’s and 0’s
  - What does the average represent?
    - Application: an experiment is performed and only two outcomes can occur
    - Label one type of outcome 1 and the other 0
- For the data set 31, 45, 72, 86, 62, 78, 50, find the median, Q1 (25th percentile) and Q3 (75th percentile)
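
For the quartile question, the answer depends on which quartile convention is used, and the transcript does not say which one the course follows. The sketch below shows two conventions available in Python's `statistics.quantiles`; treat either result as one possible answer rather than the expected one.

```python
from statistics import median, quantiles

data = [31, 45, 72, 86, 62, 78, 50]  # data set from the slide

print(sorted(data), median(data))
# Each call returns [Q1, median, Q3].
print(quantiles(data, n=4, method="exclusive"))
print(quantiles(data, n=4, method="inclusive"))
```

For this data set the "exclusive" method gives the same Q1 and Q3 as taking the median of the lower and upper halves with the overall median left out, a rule often taught in introductory courses.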

25 Review of Definitions
- Measures of dispersion
  - Standard deviation = $\sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$, where $\bar{x}$ is the mean
  - Range = max - min
  - IQR = Q3 - Q1

26 Questions and Examples
- Can the SD be negative? Can the range? Can the IQR?
- Can the SD equal 0?
- For the data set 3, 1, 5, 2, 1, 6, find the SD, range and IQR
- The average weight for U.S. men is 175 lbs and the standard deviation is 20 lbs
  - If a man weighs 190 lbs, how many standard deviation units away from the mean weight is he?
  - Assuming a normal (bell-shaped) distribution for weight, 95% of U.S. men weigh between what two values?
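
A worked version of the SD and range for the data set 3, 1, 5, 2, 1, 6, assuming the root mean square definition of SD given earlier in the deck (dividing by n rather than n - 1). The IQR is left out because it depends on the quartile convention.

```latex
\[
\bar{x} = \frac{3 + 1 + 5 + 2 + 1 + 6}{6} = \frac{18}{6} = 3
\]
\[
\text{SD} = \sqrt{\frac{0^2 + (-2)^2 + 2^2 + (-1)^2 + (-2)^2 + 3^2}{6}}
          = \sqrt{\frac{22}{6}} \approx 1.91
\]
\[
\text{Range} = 6 - 1 = 5
\]
```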

27 Questions and Examples
- The average of a data set is 23 and the standard deviation is 5
  - A value of 8 is added to each element in the data set. What is the new standard deviation?
  - Each element of the data set is increased by 5%. What is the new standard deviation?
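
The effect of shifting and rescaling can be checked numerically. The data below is hypothetical (the slide gives only the mean and SD, not the values); the same pattern holds for any data set. Adding a constant moves the mean by that constant and leaves the SD unchanged, while multiplying by 1.05 multiplies both the mean and the SD by 1.05. For the slide's figures that means the SD stays at 5 in the first case and becomes 5.25 in the second.

```python
from statistics import mean, pstdev

# Hypothetical data; any data set shows the same pattern.
data = [18, 20, 23, 25, 29]

shifted = [x + 8 for x in data]     # add 8 to every element
scaled = [x * 1.05 for x in data]   # increase every element by 5%

print(mean(data), pstdev(data))
print(mean(shifted), pstdev(shifted))  # mean + 8, SD unchanged
print(mean(scaled), pstdev(scaled))    # mean and SD both multiplied by 1.05
```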

