1.2 Describing Distributions with Numbers Is the mean a good measure of center? Ex. Roger Maris’s yearly homerun production: 8 13 14 16 23 26 28 33 39.

Presentation transcript:


Mean/Median (Centers) Both measure center in different ways, and both are useful. Use the median if you want a “typical” number. Mean = “arithmetic average value.” The mean and median of a symmetric distribution are close together. If a distribution is exactly symmetric, mean = median. In a skewed distribution, the mean is farther out in the long tail than the median.
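A minimal sketch of the last point, using the nine home run counts from the opening example. The extra value of 61 is an assumption added here only to create a long right tail; it is not part of the list shown on the slide.

```python
from statistics import mean, median

# Home run counts from the opening example.
home_runs = [8, 13, 14, 16, 23, 26, 28, 33, 39]

# Hypothetical extra season, assumed only to create a long right tail.
with_extreme = home_runs + [61]

for label, data in [("original", home_runs), ("with extreme value", with_extreme)]:
    print(f"{label}: mean = {mean(data):.2f}, median = {median(data):.2f}")
```

With the extreme value included, the mean moves by almost 4 home runs while the median moves by 1.5, so the mean is pulled farther toward the tail.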

Measures of Spread Range = largest observation – smallest observation in a list. What’s the problem with this? Better measure of spread: quartiles. Measures of spread covered: range, quartiles, five-number summary, variance, standard deviation.

Male/Female Surgeons (# of hysterectomies performed) Put in ascending order (male doctors), an odd number of observations: Min, Q1, M, Q3, Max. Put in ascending order (female doctors), an even number of observations: Min, Q1, M = 18.5, Q3, Max.
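The odd/even distinction above can be sketched in code. This is a minimal sketch that assumes the median-of-halves convention for quartiles (for an odd count, the median itself is left out of both halves); other quartile conventions exist and can give slightly different values. The function name `five_number_summary` is hypothetical, and the example reuses the home run counts from the opening example because the surgeons' data are not reproduced in this transcript.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                 # observations below the median's position
    upper = data[half + 1:] if n % 2 else data[half:]   # skip the median itself when n is odd
    return data[0], median(lower), median(data), median(upper), data[-1]

# Odd number of observations (n = 9).
home_runs = [8, 13, 14, 16, 23, 26, 28, 33, 39]
print(five_number_summary(home_runs))
```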

Boxplots You can instantly see that female doctors perform fewer hysterectomies than male doctors. Also, there is less variation among female doctors.

Notes on boxplots Best used for side-by-side comparisons of more than 1 distribution. Less detail than histograms or stem plots. Always include the numerical scale...\Simulations\Hotdog Data.xls

Travel Times to Work #1 How long does it take you to get from home to school? Here are the travel times from home to work in minutes for 15 workers in North Carolina, chosen at random by the Census Bureau:

The distribution… Describe it. Is the longest travel time (60 minutes) an outlier? How many of the travel times are larger than the mean? If you leave out the largest time, how does that change the mean? The mean in this example is nonresistant because it is sensitive to the influence of extreme observations. The mean is the arithmetic average, but it may not be a “typical” number!

Travel Times to Work #2 Travel times to work in New York State are (on the average) longer than in North Carolina. Here are the travel times in minutes of 20 randomly chosen New York workers:

Interquartile Range (IQR) Measures the spread of the middle ½ of the data: IQR = Q3 – Q1. An observation is an outlier if it is:
- Less than Q1 – 1.5(IQR), or
- Greater than Q3 + 1.5(IQR)
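The 1.5 × IQR rule can be sketched as follows. The function names are hypothetical, and the quartiles are passed in rather than computed so that the sketch does not commit to one quartile convention. The example values Q1 = 13.5 and Q3 = 30.5 are the quartiles of the nine home run counts in the opening example under the median-of-halves convention.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs of the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """Return the observations that fall outside the fences."""
    low, high = outlier_fences(q1, q3)
    return [x for x in values if x < low or x > high]

home_runs = [8, 13, 14, 16, 23, 26, 28, 33, 39]
q1, q3 = 13.5, 30.5  # quartiles assumed from the median-of-halves convention

print(outlier_fences(q1, q3))            # lower and upper cutoffs
print(find_outliers(home_runs, q1, q3))  # observations flagged as outliers
```

Here IQR = 17, so the cutoffs are -12 and 56 and none of the nine observations is flagged.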

Looking at the spread…
- Quartiles show the spread of the middle ½ of the data.
- The spacing of the quartiles and extremes about the median gives an indication of the symmetry or skewness of the distribution.
Symmetric distributions: the 1st and 3rd quartiles are equally distant from the median. Right-skewed distributions: the 3rd quartile will be farther above the median than the 1st quartile is below it.

Is there a difference between the number of programmed telephone numbers in girls’ cell phones and the number of programmed numbers in boys’ cell phones? Do you think there is a difference? If so, in what direction?
1) Count the number of programmed telephone numbers in your cell phone and write the total on a piece of paper.
2) Make a back-to-back stemplot of this information, then draw boxplots. When you test for outliers, how many do you find for males and how many do you find for females using the 1.5 × IQR test?
3) Find the five-number summary for each group. Compare the two distributions (SOCS!).
4) It is important in any study that you have “data integrity” (the data is reported accurately and truthfully). Do you think this is the case here? Do you see any suspicious observations? Can you think of any reason someone may make up a response or stretch the truth? If you DO see a difference between the two groups, can you suggest a possible reason for this difference?
5) Do you think a study of cell phone programmed numbers for a sophomore algebra class would yield similar results? Why or why not?

Spring ’09 Student Data Girls: Boys:

Standard Deviation: A measure of spread Standard deviation looks at how far observations are from their mean. It’s the natural measure of spread for the Normal distribution. We like s instead of s-squared (variance) since the units of measurement are easier to work with (original scale). The variance s-squared is the average of the squares of the deviations of the observations from their mean, dividing by n – 1 rather than n; s is the square root of the variance.
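In symbols, the definition can be written as below. The notation is an assumption, since the slide gives the definition only in words: x_i denotes the i-th observation, x-bar the sample mean, and n the number of observations.

```latex
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2,
\qquad
s = \sqrt{s^2}
```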

S, like the mean, is strongly influenced by extreme observations. A few outliers can make s very large. Skewed distributions with a few observations in the single long tail = large s, so s is not very helpful in this case. As the observations become more spread out about the mean, s gets larger.

Mean vs. Median, Standard Deviation vs. 5-Number Summary The mean and standard deviation are more common than the median and the five-number summary as measures of center and spread. No single # describes the spread well. Remember: A graph gives the best overall picture of a distribution. ALWAYS PLOT YOUR DATA! The choice of mean/median depends upon the shape of the distribution.
- When dealing with a skewed distribution, use the median and the five-number summary.
- When dealing with reasonably symmetric distributions, use the mean and standard deviation.

The variance and standard deviation are… LARGE if observations are widely spread about the mean SMALL if observations are close to the mean

Degrees of Freedom (n-1) Definition: the number of independent pieces of information that are included in your measurement. Calculated from the size of the sample. They are a measure of the amount of information from the sample data that has been used up. Every time a statistic is calculated from a sample, one degree of freedom is used up. If the mean of 4 numbers is 250, we have (4 – 1) = 3 degrees of freedom. Why? ____ ____ ____ ____ mean = 250 If we freely choose numbers for the first 3 blanks, the 4th number HAS to be a certain number in order to obtain the mean of 250.
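A short sketch of the fill-in-the-blanks argument. The three freely chosen values and the function name `forced_last_value` are hypothetical, picked only for illustration; any three numbers would do.

```python
def forced_last_value(free_values, target_mean):
    """Return the one remaining value that makes the overall mean equal target_mean."""
    n = len(free_values) + 1           # total count, including the forced value
    return n * target_mean - sum(free_values)

free_choices = [200, 300, 260]         # hypothetical free choices for the first 3 blanks
fourth = forced_last_value(free_choices, 250)

print(fourth)                           # the only value the 4th blank can take
print(sum(free_choices + [fourth]) / 4) # confirms that the mean is 250
```

Because the total must be 4 × 250 = 1000, the fourth value is fixed at 1000 minus the sum of the first three, which leaves only 3 values free to vary.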

A person’s metabolic rate is the rate at which the body consumes energy. Metabolic rate is important in studies of weight gain, dieting, and exercise. Here are the metabolic rates of 7 men who took part in a study of dieting: Find the mean. Column 1: Observations (x). Column 2: Deviations. Column 3: Squared deviations. (TI-83: STAT/CALC/1-Var Stats L1 after entering the list into L1)

You do! (By Hand) Let X = What is the variance and standard deviation?

You do! (using 1-Var Stats) During the years of the Great Depression, the weekly average hours worked in manufacturing jobs were 45, 43, 41, 39, 39, 35, 37, 40, 39, 36, and 37. What are the variance and standard deviation?
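One way to check a calculator answer is to build the three columns (observations, deviations, squared deviations) in code. This is a sketch using Python's standard `statistics` module as the cross-check in place of the TI-83; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev, variance

hours = [45, 43, 41, 39, 39, 35, 37, 40, 39, 36, 37]

n = len(hours)
x_bar = sum(hours) / n                       # mean
deviations = [x - x_bar for x in hours]      # column 2: x - mean
squared = [d ** 2 for d in deviations]       # column 3: (x - mean)^2

s_squared = sum(squared) / (n - 1)           # variance: divide by n - 1
s = sqrt(s_squared)                          # standard deviation

print(f"mean = {x_bar:.2f}, variance = {s_squared:.2f}, s = {s:.2f}")
print(f"library check: variance = {variance(hours):.2f}, s = {stdev(hours):.2f}")
```

The deviations sum to zero (up to rounding), which is a quick sanity check on column 2. For these data the variance is about 8.96 and s is about 2.99 hours.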

Miami Heat Salaries
1) Suppose that each member receives a $100,000 bonus. How will this affect the center, shape, and spread?
2) Suppose that each player is offered a 10% increase in base salary. What happens to the center and spread?

Player        Salary
Shaq          27.7
Eddie Jones   13.46
Wade          2.83
Jones         2.5
Doleac        2.4
Butler        1.2
Wright        1.15
Woods         1.13
Laettner      1.10
Smith         1.10
Anderson      .87
Dooling       .75
Wang          .75
Haslem        .62
Mourning      .33
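Both questions can be explored numerically. The sketch below assumes the salaries are listed in millions of dollars, so a $100,000 bonus is 0.1 in those units; the slide does not state the units, so treat that as an assumption. The helper name `summarize` is hypothetical.

```python
from statistics import mean, median, stdev

# Salaries from the table above, assumed to be in millions of dollars.
salaries = [27.7, 13.46, 2.83, 2.5, 2.4, 1.2, 1.15, 1.13,
            1.10, 1.10, 0.87, 0.75, 0.75, 0.62, 0.33]

def summarize(data):
    """Return (mean, median, standard deviation) of the data."""
    return mean(data), median(data), stdev(data)

with_bonus = [x + 0.1 for x in salaries]    # add a constant to every salary
with_raise = [x * 1.10 for x in salaries]   # multiply every salary by a constant

for label, data in [("original", salaries),
                    ("+ 0.1 bonus", with_bonus),
                    ("x 1.10 raise", with_raise)]:
    m, med, s = summarize(data)
    print(f"{label:>13}: mean = {m:.3f}, median = {med:.3f}, s = {s:.3f}")
```

Adding a constant shifts the mean and median by that constant and leaves the standard deviation and shape unchanged. Multiplying by a constant multiplies the mean, median, and standard deviation by that constant.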