Describing Quantitative Data with Numbers


Describing Quantitative Data with Numbers 08.28.2017

Section 1.3

Mean, median, and mode: what is the difference?

The Mean. Note: the x-bar notation (x̄) applies only to the mean of a sample, not the mean of a population (written μ). However, the calculation is the same in both cases.

Let’s Try It: I randomly select 4 AP Stats students. The four students in that sample have the following GPAs: 3.595, 4.095, 3.214, and 3.524. What is the mean of these data? 3.607. Now let’s remove the 4.095. What happens to the mean? It is now 3.444, a fairly large change (a drop of 0.163). What does this tell us about the mean as a way to measure the center of the data?
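
The arithmetic on this slide can be checked with a minimal Python sketch. The variable names are illustrative and not part of the original slides; only the four GPAs come from the deck.

```python
from statistics import mean

# GPAs of the four sampled students (from the slide)
gpas = [3.595, 4.095, 3.214, 3.524]

mean_all = mean(gpas)

# Drop the largest observation and recompute
gpas_trimmed = [g for g in gpas if g != 4.095]
mean_trimmed = mean(gpas_trimmed)

print(round(mean_all, 3))                 # mean of all four GPAs
print(round(mean_trimmed, 3))             # mean after removing 4.095
print(round(mean_all - mean_trimmed, 3))  # how far the mean moved
```

Removing a single high value shifts the mean noticeably, which is the point the slide is making about the mean's sensitivity to extreme observations.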

An Alternative: The Median

Let’s Try It: Same data: 3.595, 4.095, 3.214, and 3.524. What is the median? 3.5595 (the average of the two middle values, 3.524 and 3.595). Now remove the 4.095 observation again. The median is now 3.524 (a change of 0.0355). What does this tell us about the median as a measure of center?
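
The same comparison for the median, as a short Python sketch (names are illustrative, not from the slides):

```python
from statistics import median

# Same four GPAs as in the mean example
gpas = [3.595, 4.095, 3.214, 3.524]

# With an even number of values, the median is the average of the two middle ones
median_all = median(gpas)

# With 4.095 removed, three values remain and the median is the middle one
median_trimmed = median([g for g in gpas if g != 4.095])

print(median_all, median_trimmed)
print(round(median_all - median_trimmed, 4))  # how far the median moved
```

The median moves far less than the mean did when the same observation is removed, which is why it is described as more resistant to extreme values.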

Mean or Median? It depends. When describing a distribution, the median is often more useful. For some calculations, the mean might be more appropriate: taxes, income, and measures that are per capita.

What about the mode? It is the least often used measure (except on standardized tests). It is simply the most common value for a variable. So in our 4-observation dataset of GPAs, the mode is not very exciting, because all the values are different; technically we would have 4 modes. But if we take the ages (instead of the GPAs) of those 4 students, they are (not in order): 16, 17, 17, 17. What is the mode?
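
A small Python sketch of both cases on this slide. It assumes Python 3.8 or later, where `statistics.multimode` is available; the variable names are illustrative.

```python
from statistics import multimode

gpas = [3.595, 4.095, 3.214, 3.524]
ages = [16, 17, 17, 17]

# multimode returns every value tied for most frequent.
# All four GPAs occur once, so all four come back as "modes".
gpa_modes = multimode(gpas)

# One age occurs three times, so there is a single mode.
age_modes = multimode(ages)

print(gpa_modes)
print(age_modes)
```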

Beyond the Center: In practice, we often care about much more than just the center of the data. The average temperature is the same in San Francisco as in Springfield (MO), despite very different temperatures in the two cities. What does the mean/median fail to capture? Variability. It can be measured in terms of the range. Any problems with using the range to describe variability?

Variability. The range: its weakness is that it depends on the minimum and maximum values, which could be a problem, particularly if they are outliers. Interquartile range (IQR): looks at the range of the middle half (50%) of the data. The 1st quartile is the point that separates the bottom quarter of the data from the second-from-the-bottom quarter. The 2nd quartile is the median. The 3rd quartile is the point that separates the top quarter of the data from the second-from-the-top quarter.

Back to the Tennis Serves: 124.5, 122.1, 120.3, 119.7, 118.7, 116.5, 115.6, 114.5, 114, 113.9, 113.7, 112.6, 112.4, 112.3, 112.2, 110.5, 109.4, 108.3, 107.3, 103.1, 101.9. Find the mean: 113.5. Find the median: 113.7. Find the 1st quartile: 109.95. Find the 3rd quartile: 117.6. So the IQR is 117.6 - 109.95 = 7.65.
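
These values can be reproduced with a short Python sketch. It assumes the "median of each half" convention for quartiles (the overall median is left out of both halves when the count is odd), which matches the numbers on the slide; other software may use a different convention and give slightly different quartiles. The function name `quartiles` is illustrative.

```python
from statistics import mean, median

# Tennis serve speeds from the slide
serves = [124.5, 122.1, 120.3, 119.7, 118.7, 116.5, 115.6, 114.5, 114,
          113.9, 113.7, 112.6, 112.4, 112.3, 112.2, 110.5, 109.4, 108.3,
          107.3, 103.1, 101.9]

def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, skip the middle value so it belongs to neither half
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

serve_mean = mean(serves)
q1, med, q3 = quartiles(serves)
iqr = q3 - q1

print(round(serve_mean, 1), med, round(q1, 2), round(q3, 2), round(iqr, 2))
```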

Defining Outliers

Were there any outliers? Our IQR was 7.65, and 1.5 * 7.65 = 11.475. On the high end, an observation would have to be more than 11.475 ABOVE the 3rd quartile (117.6): 117.6 + 11.475 = 129.075. On the low end, an observation would have to be more than 11.475 BELOW the 1st quartile (109.95): 109.95 - 11.475 = 98.475. Were there any outliers?
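
The 1.5 × IQR check can be sketched in Python as follows. The quartiles are taken directly from the slides rather than recomputed, and the names `lower_fence` and `upper_fence` are illustrative.

```python
# Tennis serve speeds from the slides
serves = [124.5, 122.1, 120.3, 119.7, 118.7, 116.5, 115.6, 114.5, 114,
          113.9, 113.7, 112.6, 112.4, 112.3, 112.2, 110.5, 109.4, 108.3,
          107.3, 103.1, 101.9]

# Quartiles as reported on the slides
q1, q3 = 109.95, 117.6
iqr = q3 - q1

# An observation is flagged if it lies more than 1.5 * IQR beyond a quartile
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in serves if x < lower_fence or x > upper_fence]

print(round(lower_fence, 3), round(upper_fence, 3))
print(outliers)  # an empty list means no serve is flagged
```

Since the slowest serve is 101.9 and the fastest is 124.5, every observation falls inside the two cutoffs.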

5-number summary: So, let’s do it using the tennis serves: Min = 101.9, Q1 = 109.95, Med = 113.7, Q3 = 117.6, Max = 124.5.

Boxplots: AP Statistics Height. Min = 60, Q1 = 64, Med = 67, Q3 = 70, Max = 77.

Boxplots: In our example, the median is exactly in between the 1st and 3rd quartiles. This does not always happen. Similarly, you’ll notice that one whisker is longer than the other. This is totally normal. What does that tell us about the skewness of our data?
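
One way to reason about the whiskers without drawing the plot is to compare the distances in the five-number summary. This is a rough sketch using the height summary from the previous slide; it assumes the whiskers run all the way to the minimum and maximum, which holds here because neither lies more than 1.5 × IQR beyond a quartile.

```python
# Five-number summary for AP Statistics heights (from the slides)
height = {"min": 60, "q1": 64, "median": 67, "q3": 70, "max": 77}

# Lengths of the two halves of the box
lower_box = height["median"] - height["q1"]
upper_box = height["q3"] - height["median"]

# Lengths of the two whiskers
lower_whisker = height["q1"] - height["min"]
upper_whisker = height["max"] - height["q3"]

print(lower_box, upper_box)          # equal: the median sits mid-box
print(lower_whisker, upper_whisker)  # unequal: one tail stretches further
```

A longer upper whisker suggests the distribution is stretched toward higher values (skewed right), although a boxplot alone gives only a rough indication of shape.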

Standard Deviation: The most common way of measuring the spread of a distribution. It essentially measures how far, on average, the values in the distribution are from the mean, so the mean is important here. If you have reason to think the mean is not ideal, the standard deviation might not be ideal either.

Standard Deviation: On the AP test, you will be given the formula for standard deviation. You do not need to memorize it, but you do need to understand what the formula means.
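
The formula itself did not survive in this transcript. For reference, the usual sample standard deviation formula is shown below; this is an assumption about what the slide displayed, chosen because it agrees with the GPA result reported later in the deck.

```latex
s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```

Read from the inside out: take each value's distance from the mean, square it, add the squares, divide by one less than the number of observations, and take the square root.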

Let’s Try It: Back to our GPA example: find the standard deviation of the following GPAs: 3.595, 4.095, 3.214, and 3.524. Remember, we calculated the mean as 3.607. Standard deviation = 0.365.
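
A quick Python check of this result. The sketch assumes the sample standard deviation (dividing by n - 1), which is what reproduces the 0.365 on the slide; the population version (dividing by n) is computed alongside only for contrast.

```python
from statistics import pstdev, stdev

gpas = [3.595, 4.095, 3.214, 3.524]

# Sample standard deviation: divides the sum of squared deviations by n - 1
sample_sd = stdev(gpas)

# Population standard deviation: divides by n, so it comes out smaller
population_sd = pstdev(gpas)

print(round(sample_sd, 3))
print(round(population_sd, 3))
```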