Summary Statistics: Mean, Median, Standard Deviation, and More “Seek simplicity and then distrust it.” (Dr. Monticino)

Assignment Sheet
- Read Chapter 4
- Homework #3: due Wednesday, Feb. 9th
  - Chapter 4
    - exercise set A: 1-6, 8, 9
    - exercise set C: 1, 2, 3
    - exercise set D: 1-4, 8,
    - exercise set E: 4, 5, 7, 8, 11, 12
- Quiz #2 will be over Chapter 2
- Quiz #3 on basic summary statistic calculations: mean, median, standard deviation, IQR, SD units
- If you’d like a copy of notes - me

Overview
- Measures of central tendency
  - Mean (average)
  - Median
  - Outliers
- Measures of dispersion
  - Standard deviation
    - Standard deviation units
  - Range
  - IQR
- Review and applications

Central Tendency
- Measures of central tendency (mean and median) are useful for obtaining a single-number summary of a data set
  - Mean is the arithmetic average
  - Median is a value such that at least 50% of the data is less than or equal to it and at least 50% is greater than or equal to it
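
As a minimal sketch of these two definitions, the snippet below computes the mean and median with Python's standard `statistics` module. The data values are hypothetical and are not taken from the slides.

```python
from statistics import mean, median

# Hypothetical data sets (not from the slides).
odd_data = [3, 1, 4, 1, 5]        # odd number of points
even_data = [3, 1, 4, 1, 5, 9]    # even number of points

odd_mean = mean(odd_data)         # sum of values divided by their count
odd_median = median(odd_data)     # middle value of the sorted data
even_mean = mean(even_data)
even_median = median(even_data)   # average of the two middle values

print(odd_mean, odd_median)
print(even_mean, even_median)
```

With an even number of points there is no single middle value, so `median` averages the two middle ones.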

Example
- Calculate the mean and median for the following data sets

Outliers and Robustness
- Mean can be sensitive to outliers in a data set
  - Not robust to data collection errors or a single unusual measurement
  - Blind calculation can give misleading results
[Figure: plotted data set; mean = (value not recovered), median = 151]
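
The sensitivity of the mean can be seen in a small sketch. The measurements below are hypothetical (the slide's own data did not survive), and the last value of the second list mimics a data-entry error.

```python
from statistics import mean, median

# Hypothetical measurements, not the slide's data.
clean = [148, 149, 150, 151, 152]
with_outlier = [148, 149, 150, 151, 1520]   # 152 mistyped as 1520

clean_mean, clean_median = mean(clean), median(clean)
outlier_mean, outlier_median = mean(with_outlier), median(with_outlier)

print(clean_mean, clean_median)       # both sit at the centre of the data
print(outlier_mean, outlier_median)   # mean is pulled far up, median is not
```

A single bad value moves the mean a long way, while the median stays where most of the data is.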

Outliers and Robustness
- Always a good idea to plot data in the order that it was collected
  - Spot outliers
  - Identify possible data collection errors
[Figure: plotted data set; mean without outliers = (value not recovered), median without outliers = 149]

Outliers and Robustness
- Median can be a more robust measure of central tendency than mean
  - Life expectancy
    - U.S. males: mean = 80.1, median = 83
    - U.S. females: mean = 84.3, median = 87
  - Household income
    - Mean = $51,855, median = $38,885
    - 0.3% account for 12% of income
  - Net worth
    - Mean = $282,500, median = $71,600

Which Central Tendency Measure?
- Calculate mean, median and mode
- Plot data
- Create histogram to inspect mode(s)
- Do not delete data points
  - If you analyze the data without outliers, report and explain the outliers
- Many statistical studies involve studying the difference between population means
  - Reporting the mean may be dictated by the objective of the study

Which Central Tendency Measure?
- If data is
  - Unimodal
  - Fairly symmetric
  - Mean is approximately equal to median
  - Then mean is a reasonable measure of central tendency

Which Central Tendency Measure?
- If data is
  - Unimodal
  - Asymmetric
  - Then report both median and mean
- Difference between mean and median indicates asymmetry
  - Median will usually be the more reasonable summary of central tendency
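
To illustrate how the gap between mean and median signals asymmetry, here is a small sketch using hypothetical incomes (in thousands of dollars) with one long right tail. The numbers are made up for illustration.

```python
from statistics import mean, median

# Hypothetical incomes in thousands of dollars; one very large value
# creates a long right tail.
incomes = [20, 25, 30, 35, 40, 45, 250]

income_mean = mean(incomes)
income_median = median(incomes)
gap = income_mean - income_median   # positive gap suggests a right skew

print(round(income_mean, 2), income_median, round(gap, 2))
```

Here the median describes a typical value better than the mean, which matches the advice to report both for asymmetric data.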

Which Central Tendency Measure?
- If data is
  - Not unimodal
  - Then report modes and, cautiously, mean and median
  - Analyze data for differences in groups around the modes

Limitations of Central Tendency
- Any single-number summary may not adequately represent the data and may hide differences between data sets
  - Example

Measures of Dispersion
- Including an additional statistic, a measure of dispersion, can help distinguish between data sets that have similar central tendencies
  - Range: max - min
  - Standard deviation: root mean square difference from the mean
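
The sketch below applies both definitions to two hypothetical data sets that share the same mean and median but differ in spread. The helper names `data_range` and `sd` are illustrative, and `sd` follows the root-mean-square definition given above (dividing by n, not n - 1).

```python
from math import sqrt
from statistics import mean, median

def data_range(values):
    """Range: max - min."""
    return max(values) - min(values)

def sd(values):
    """Root mean square difference from the mean (divides by n)."""
    m = mean(values)
    return sqrt(sum((x - m) ** 2 for x in values) / len(values))

# Hypothetical data sets with the same mean and median (50).
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, values in (("tight", tight), ("wide", wide)):
    print(name, mean(values), median(values),
          data_range(values), round(sd(values), 2))
```

The central tendency is identical, yet both range and SD clearly separate the two sets.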

Measures of Dispersion
- Examples
  - Range

Measures of Dispersion
- Examples
  - Standard deviation
  - μ = 100

Measures of Dispersion
- Both range and standard deviation can be sensitive to outliers
  - However, many data sets can be characterized by mean and SD
  - If the values of the data set are distributed in an approximately bell shape, then
    - ~68% of the data will be within 1 SD unit of the mean, ~95% will be within 2 SD units and nearly all will be within 3 SD units
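
These percentages can be checked against an exact normal curve. The sketch below assumes a perfectly normal distribution (real data is only approximately bell-shaped) and uses `statistics.NormalDist` from the standard library. The helper name `fraction_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution within k SD units of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(fraction_within(k), 4))
```

The results come out near 68%, 95% and 99.7%, which is where the rule of thumb comes from.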

Measures of Dispersion
- Example
  - Suppose a data set has mean = 35 and SD = 7
  - How many SD units away from the mean is 42?
  - How many SD units away from the mean is 38?
  - How many SD units away from the mean is 30?
  - Assuming a bell-shaped distribution, ~95% are between what two values?
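
One way to work this example is sketched below. The helper `sd_units` is a hypothetical name. It measures the signed distance from the mean in SD units, so a negative result means the value lies below the mean.

```python
def sd_units(value, data_mean, data_sd):
    """Signed distance of value from the mean, measured in SD units."""
    return (value - data_mean) / data_sd

data_mean, data_sd = 35, 7

for value in (42, 38, 30):
    print(value, round(sd_units(value, data_mean, data_sd), 2))

# Under the bell-shape assumption, ~95% of the data lies within 2 SD units.
low = data_mean - 2 * data_sd
high = data_mean + 2 * data_sd
print(low, high)
```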

Measures of Dispersion
- A robust measure of dispersion is the interquartile range
  - Q1: value such that 25% of the data is less than it and 75% is greater than it
  - Q3: value such that 75% of the data is less than it and 25% is greater than it
  - IQR = Q3 - Q1
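
The robustness of the IQR can be sketched as follows. The data is hypothetical, and the quartiles come from `statistics.quantiles` with its default method. Quartile conventions differ between textbooks and software, so other tools may give slightly different values.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range, Q3 - Q1, using statistics.quantiles defaults."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

def data_range(values):
    return max(values) - min(values)

# Hypothetical data; the second list exaggerates the largest value.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 40]
extreme = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 400]

print(data_range(data), iqr(data))
print(data_range(extreme), iqr(extreme))
```

The range jumps when the largest value is exaggerated, while the IQR does not change.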

Example
- Calculate range, standard deviation and interquartile range for the following data sets

Assignment, Discussion, Evaluation
- Read Chapter 4
- Discussion problems
  - Chapter 4
    - exercise set A: 1-6, 8, 9
    - exercise set C: 1, 2, 3
    - exercise set D: 1-4, 8,
    - exercise set E: 4, 5, 7, 8, 11, 12
- Quiz #3 on basic summary statistic calculations: mean, median, standard deviation, IQR, SD units

Review of Definitions
- Measures of central tendency
  - Mean (average): $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ for data points $x_1, \dots, x_n$
  - Median
    - If odd number of data points, “middle” value
    - If even number of data points, average of two “middle” values

Questions and Examples
- Can the mean be larger than the median? Can the median be larger than the mean?
  - Give examples
- Can the mean be a negative number? Can the median?
- The average height of three men is 69 inches. Two other men, of heights 73 and 70 inches, enter the room. What is the average height of all five men?
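
For the height question, the key step is that a mean times its count recovers the total. A minimal sketch, with `combined_mean` as a hypothetical helper name:

```python
def combined_mean(old_mean, old_count, new_values):
    """Mean of a group after new values join it."""
    total = old_mean * old_count + sum(new_values)   # recover the old total
    return total / (old_count + len(new_values))

# Three men averaging 69 inches, joined by men of 73 and 70 inches.
five_men_mean = combined_mean(69, 3, [73, 70])
print(five_men_mean)   # (3 * 69 + 73 + 70) / 5
```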

Questions and Examples
- The average of a data set is 30.
  - A value of 8 is added to each element in the data set. What is the new average?
  - Each element of the data set is increased by 5%. What is the new average?
- Suppose that data consists of only 1’s and 0’s
  - What does the average represent?
    - Application: an experiment is performed and only two outcomes can occur
    - Label one type of outcome 1 and the other 0
- For the data set 31, 45, 72, 86, 62, 78, 50, find the median, Q1 (25th percentile) and Q3 (75th percentile)
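
These questions can be explored numerically. In the sketch below, `base` and `outcomes` are hypothetical data chosen for illustration, and only the seven-value data set comes from the slide. Quartiles use the default method of `statistics.quantiles`, which is one of several conventions.

```python
from statistics import mean, median, quantiles

# Hypothetical data set with mean 30.
base = [20, 30, 40]
shifted_mean = mean(x + 8 for x in base)      # adding 8 shifts the mean by 8
scaled_mean = mean(x * 1.05 for x in base)    # a 5% increase scales the mean by 1.05

# Hypothetical 0/1 outcomes: the mean is the proportion of 1's.
outcomes = [1, 0, 1, 1, 0]
proportion_of_ones = mean(outcomes)

# Data set from the slide.
values = [31, 45, 72, 86, 62, 78, 50]
q1, q2, q3 = quantiles(values, n=4)

print(shifted_mean, scaled_mean, proportion_of_ones)
print(median(values), q1, q3)
```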

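Review of Definitions
- Measures of dispersion
  - Standard deviation = $\sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$, where $\bar{x}$ is the mean of the data points $x_1, \dots, x_n$
  - Range = max - min
  - IQR = Q3 - Q1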

Questions and Examples
- Can the SD be negative? Can the range? Can the IQR?
- Can the SD equal 0?
- For the data set 3, 1, 5, 2, 1, 6 find the SD, range and IQR
- The average weight for U.S. men is 175 lbs and the standard deviation is 20 lbs
  - If a man weighs 190 lbs, how many standard deviation units away from the mean weight is he?
  - Assuming a normal (bell-shaped) distribution for weight, ninety-five percent of U.S. men weigh between what two values?
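
A possible way to check answers to the last two questions is sketched below. Two assumptions are made here: the SD divides by n (`statistics.pstdev`), and the quartiles are taken as medians of the lower and upper halves of the sorted data. That is one common convention, and the hypothetical helper `quartiles_by_halves` may disagree with other software.

```python
from statistics import median, pstdev

def quartiles_by_halves(values):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered[len(ordered) - half:])

data = [3, 1, 5, 2, 1, 6]
sd = pstdev(data)                     # root mean square deviation from the mean
data_range = max(data) - min(data)
q1, q3 = quartiles_by_halves(data)
iqr = q3 - q1

# Weight example: mean 175 lbs, SD 20 lbs.
weight_mean, weight_sd = 175, 20
units_for_190 = (190 - weight_mean) / weight_sd
low = weight_mean - 2 * weight_sd     # ~95% lie within 2 SD units
high = weight_mean + 2 * weight_sd

print(round(sd, 3), data_range, iqr)
print(units_for_190, low, high)
```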

Questions and Examples
- The average of a data set is 23 and the standard deviation is 5
  - A value of 8 is added to each element in the data set. What is the new standard deviation?
  - Each element of the data set is increased by 5%. What is the new standard deviation?
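
The effect of shifting and scaling can be checked on a hypothetical data set constructed to have mean 23 and SD 5. The sketch assumes the divide-by-n standard deviation (`statistics.pstdev`).

```python
from statistics import mean, pstdev

# Hypothetical data with mean 23 and SD 5 (every value is 5 away from 23).
data = [18, 28, 18, 28]

shifted = [x + 8 for x in data]       # add 8 to each element
scaled = [x * 1.05 for x in data]     # increase each element by 5%

original_sd = pstdev(data)
shifted_sd = pstdev(shifted)          # a shift does not change the spread
scaled_sd = pstdev(scaled)            # scaling multiplies the SD by 1.05

print(mean(data), original_sd)
print(shifted_sd, round(scaled_sd, 2))
```

Adding a constant moves every value and the mean together, so the deviations from the mean are unchanged. Multiplying by a constant stretches the deviations by that same factor.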