3.1 - 1 Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1.

Slides:



Advertisements
Similar presentations
Statistics It is the science of planning studies and experiments, obtaining sample data, and then organizing, summarizing, analyzing, interpreting data,
Advertisements

Slide 1 Copyright © 2004 Pearson Education, Inc..
Slide Slide 1 Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley. Created by Tom Wegleitner, Centreville, Virginia Section 3-1.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Lecture Slides Elementary Statistics Eleventh Edition and the Triola Statistics Series by.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Lecture Slides Elementary Statistics Eleventh Edition and the Triola Statistics Series by.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Lecture Slides Elementary Statistics Eleventh Edition and the Triola Statistics Series by.
Measures of Relative Standing and Boxplots
3-2 Descriptive Statistics Inferential Statistics
3-3 Measures of Variation. Definition The range of a set of data values is the difference between the maximum data value and the minimum data value. Range.
1 Descriptive Statistics Frequency Tables Visual Displays Measures of Center.
Descriptive Statistics  Summarizing, Simplifying  Useful for comprehending data, and thus making meaningful interpretations, particularly in medium to.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
AP Statistics Chapters 0 & 1 Review. Variables fall into two main categories: A categorical, or qualitative, variable places an individual into one of.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
Copyright © 2004 Pearson Education, Inc.
Lecture Slides Elementary Statistics Twelfth Edition
Slide 1 Lecture 4: Measures of Variation Given a stem –and-leaf plot Be able to find »Mean ( * * )/10=46.7 »Median (50+51)/2=50.5 »mode.
Statistics Workshop Tutorial 3
Chapter 3 Statistics for Describing, Exploring, and Comparing Data
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Created by Tom Wegleitner, Centreville, Virginia Section 3-1 Review and.
1 Measure of Center  Measure of Center the value at the center or middle of a data set 1.Mean 2.Median 3.Mode 4.Midrange (rarely used)
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Lecture Slides Elementary Statistics Eleventh Edition and the Triola.
© Copyright McGraw-Hill CHAPTER 3 Data Description.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
Copyright (C) 2002 Houghton Mifflin Company. All rights reserved. 1 Understandable Statistics Seventh Edition By Brase and Brase Prepared by: Lynn Smith.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
Slide Slide 1 Section 3-3 Measures of Variation. Slide Slide 2 Key Concept Because this section introduces the concept of variation, which is something.
Review Measures of central tendency
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Elementary Statistics Eleventh Edition Chapter 3.
Copyright © 2004 Pearson Education, Inc.. Chapter 2 Descriptive Statistics Describe, Explore, and Compare Data 2-1 Overview 2-2 Frequency Distributions.
Created by Tom Wegleitner, Centreville, Virginia Section 2-4 Measures of Center.
Probabilistic & Statistical Techniques Eng. Tamer Eshtawi First Semester Eng. Tamer Eshtawi First Semester
1 Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Measures of Center.
1 Measure of Center  Measure of Center the value at the center or middle of a data set 1.Mean 2.Median 3.Mode 4.Midrange (rarely used)
© Copyright McGraw-Hill CHAPTER 3 Data Description.
1 Descriptive Statistics 2-1 Overview 2-2 Summarizing Data with Frequency Tables 2-3 Pictures of Data 2-4 Measures of Center 2-5 Measures of Variation.
1 Measures of Center. 2 Measure of Center  Measure of Center the value at the center or middle of a data set 1.Mean 2.Median 3.Mode 4.Midrange (rarely.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Lecture Slides Elementary Statistics Eleventh Edition and the Triola Statistics Series by.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Lecture Slides Elementary Statistics Eleventh Edition and the Triola Statistics Series by.
Chapter 2 Descriptive Statistics
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Lecture Slides Elementary Statistics Eleventh Edition and the Triola Statistics Series by.
Section 3-4 Measures of Relative Standing and Boxplots.
Honors Statistics Chapter 3 Measures of Variation.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Section 3-3 Measures of Variation.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Lecture Slides Elementary Statistics Eleventh Edition and the Triola Statistics Series by.
3-1 Review and Preview 3-2 Measures of Center 3-3 Measures of Variation 3-4 Measures of Relative Standing and Boxplots.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Measures of Center.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Lecture Slides Elementary Statistics Eleventh Edition and the Triola Statistics Series by.
Slide 1 Copyright © 2004 Pearson Education, Inc.  Descriptive Statistics summarize or describe the important characteristics of a known set of population.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Lecture Slides Elementary Statistics Eleventh Edition and the Triola.
Section 3.1 & 3.2 Preview & Measures of Center. Important Statistics Mean, median, standard deviation, variance Understanding and Interpreting important.
Relative Standing and Boxplots
Section 3.3 Measures of Variation.
Elementary Statistics
Elementary Statistics
CHAPTER 3 Data Description 9/17/2018 Kasturiarachi.
Midrange (rarely used)
Lecture Slides Essentials of Statistics 5th Edition
Chapter 3 Statistics for Describing, Exploring, and Comparing Data
Lecture Slides Elementary Statistics Twelfth Edition
Lecture Slides Elementary Statistics Twelfth Edition
Elementary Statistics
Chapter 3 Statistics for Describing, Exploring, and Comparing Data
Lecture Slides Elementary Statistics Twelfth Edition
Lecture Slides Elementary Statistics Twelfth Edition
Overview Created by Tom Wegleitner, Centreville, Virginia
Lecture Slides Elementary Statistics Eleventh Edition
Lecture Slides Essentials of Statistics 5th Edition
Lecture Slides Elementary Statistics Eleventh Edition
Chapter 2 Describing, Exploring, and Comparing Data
Presentation transcript:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Review and Preview 3-2 Measures of Center 3-3 Measures of Variation 3-4 Measures of Relative Standing and Boxplots

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.  Descriptive Statistics In this chapter we’ll learn to summarize or describe the important characteristics of a known set of data  Inferential Statistics In later chapters we’ll learn to use sample data to make inferences or generalizations about a population Preview

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Section 3-2 Measures of Center

Measure of Center  Measure of Center the value at the center or middle of a data set  The three common measures of center are the mean, the median, and the mode.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.  Mean the measure of center obtained by adding the data values and then dividing the total by the number of values What most people call an average also called the arithmetic mean.

Notation  Greek letter sigma used to denote the sum of a set of values. x is the variable usually used to represent the data values. n represents the number of data values in a sample.

Example of summation If there are n data values that are denoted as: Then:

Example of summation data Then: 21,25,32,48,53,62,62,64

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Sample Mean x = n  x x is pronounced ‘x-bar’ and denotes the mean of a set of sample values x

Example of Sample Mean data Then: 21,25,32,48,53,62,62,64

Notation µ Greek letter mu used to denote the population mean N represents the number of data values in a population.

Population Mean N µ =  x x Note: here x represents the data values in the population

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.  Advantages  Is relatively reliable: means of samples drawn from the same population don’t vary as much as other measures of center  Takes every data value into account Mean

Mean  Disadvantage Is sensitive to every data value, one extreme value can affect it dramatically; is not a resistant measure of center Example: 21,25,32,48,53,62,62,64 → 21,25,32,48,53,62,62,300 →

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Median  Median the measure of center which is the middle value when the original data values are arranged in order of increasing (or decreasing) magnitude  often denoted by x (pronounced ‘x-tilde’) ~  is not affected by an extreme value - is a resistant measure of the center

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Finding the Median 1.If the number of data values is odd, the median is the number located in the exact middle of the list. 2.If the number of data values is even, the median is found by computing the mean of the two middle numbers. First sort the values (arrange them in order), the follow one of these

Example of Median 6 data values: Sorted data: (even number of values – no exact middle)

Example of Median 7 data values: Sorted data:

 Mode the value that occurs with the greatest frequency  Data set can have one, more than one, or no mode

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Mode Mode is the only measure of central tendency that can be used with nominal data Bimodal two data values occur with the same greatest frequency Multimodalmore than two data values occur with the same greatest frequency No Modeno data value is repeated

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. a) b) c) Mode - Examples  Mode is 1.10  Bimodal - 27 & 55  No Mode

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.  Midrange the value midway between the maximum and minimum data values in the original data set (average of highest and lowest data values) Definition = maximum data value + minimum data value 2

Example of Midrange data values:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.  Sensitive to extremes because it uses only the maximum and minimum values, so rarely used Midrange  Redeeming Features (1)very easy to compute (2) reinforces that there are several ways to define the center (3)avoids confusion with median

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Best Measure of Center

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Carry one more decimal place than is present in the original set of values. Round-off Rule for Measures of Center

Example of Round-off Rule data values:

Example problem 12, page 95 These data values represent weight gain or loss in kg for a simple random sample (SRS) of18 college freshman (negative data values indicate weight loss) Do these values support the legend that college students gain 15 pounds (6.8 kg) during their freshman year? Explain

Sample Mean Example problem 12, page 95 Median

Example problem 12, page 95 Mode is -2 Midrange

Example problem 12, page 95 All of the measures of center are below 6.8 kg (15 pounds) Based on measures of center, these data values do not support the idea that college students gain 15 pounds (6.8 kg) during their freshman year

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Think about the method used to collect the sample data. See example 7, page 89 Critical Thinking Think about whether the results are reasonable. See example 6, page 89

First, enter the list of data values Then select 2nd STAT (LIST) and arrow right to MATH option 3:mean( or 4: median( and input the desired list Mean/Median with Graphing Calculator

Example of Computing the Mean Using Calculator Sorted amounts of Strontium-90 (in millibecquerels) in a simple random sample of baby teeth obtained from Philadelphia residents born after 1979 Note: this data is related to Three Mile Island nuclear power plant Accident in x = n  x x = 149.2

Example of Computing the Mean Using Calculator Median is 150

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Beyond the Basics of Measures of Center Part 2

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Assume that all sample values in each class are equal to the class midpoint. Mean from a Frequency Distribution

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. use class midpoints for variable x Mean from a Frequency Distribution

Example of Computing the Mean from a Frequency Distribution Heights (inches) of 25 Women

Example (cont.) Frequency distribution of heights of women HEIGHTFREQUENCY Class midpoints are: 59.5, 61.5, 63.5, 65.5, 67.5, 69.5, 71.5

Example (cont.) HEIGHTFREQUENCY

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Weighted Mean x = w  (w x)  When data values are assigned different weights, we can compute a weighted mean.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example: Weighted Mean In this class, homework/quizz average is weighted 10%, 2 exams are weighted 60%, and final exam is weighted 30%. Suppose a student makes homework/quiz average 87, exam scores of 80 and 92, and final exam score 85. Compute the weighted average.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example: Weighted Mean Weighted average is:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.  Symmetric distribution of data is symmetric if the left half of its histogram is roughly a mirror image of its right half  Skewed distribution of data is skewed if it is not symmetric and extends more to one side than the other Skewed and Symmetric

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.  Skewed to the left (also called negatively skewed) have a longer left tail, mean and median are to the left of the mode  Skewed to the right (also called positively skewed) have a longer right tail, mean and median are to the right of the mode Skewed Left or Right

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. The mean and median cannot always be used to identify the shape of the distribution. Shape of the Distribution

Shape of the Distribution Consider the data: Mean: Median: The distribution is skewed and the mean is “pulled” in the direction of the outliers 40 and 45

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Distribution Skewed Left Outliers that are much less than the median

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Symmetric Distribution No outliers

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Distribution Skewed Right Outliers that are much more than the median

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Recap In this section we have discussed:  Types of measures of center Mean Median Mode  Mean from a frequency distribution  Weighted means  Best measures of center  Skewness

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Section 3-3 Measures of Variation

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Key Concept Discuss characteristics of variation, in particular, measures of variation, such as standard deviation, for analyzing data. Make understanding and interpreting the standard deviation a priority.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Basics Concepts of Measures of Variation Part 1

Why is it important to understand variation? A measure of the center by itself can be misleading Example: Two nations with the same median family income are very different if one has extremes of wealth and poverty and the other has little variation among families (see the following table).

Example of variation Data Set A Data Set B 50,00010,000 60,00020,000 70,00070,000 80,000120,000 90,000130,000 MEAN70,00070,000 MEDIAN70,00070,000 Data set B has more variation about the mean

Histograms: example of variation Data set B has more variation about the mean (Target).

How do we quantify variation?

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Definition The range of a set of data values is the difference between the maximum data value and the minimum data value. Range = (maximum value) – (minimum value)

Range = = 24 Example of range. Data:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Range (cont.) This shows that the range is very sensitive to extreme values; therefore not as useful as other measures of variation. Ignoring the outlier of 6 in the previous data set gives data Range = = 5

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Definition The standard deviation of a set of sample values, denoted by s, is a measure of variation of values about the mean.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Sample Standard Deviation Formula  ( x – x ) 2 n – 1 s =s =

Steps to calculate the sample standard deviation 1.Calculate the sample mean 2.Find the squared deviations from the sample mean for each sample data value: 3.Add the squared deviations 4.Divide the sum in step 3 by n-1 5.Take the square root of the quotient in step 4

Example: Standard Deviation Given the data set: 8, 5, 12, 8, 9, 15, 21, 16, 3 Find the standard deviation

Example: Standard Deviation Find the mean

Data Value Squared Deviations From the Mean

Example: Standard Deviation Add the squared deviations (last column in the table above)

Divide the sum by 9-1=8: Take the square root: Example: Standard Deviation

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Round-Off Rule for Measures of Variation When rounding the value of a measure of variation, carry one more decimal place than is present in the original set of data. Round only the final answer, not values in the middle of a calculation.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Sample Standard Deviation (Shortcut Formula) n ( n – 1) s =s = n  x 2 ) – (  x ) 2

Example: Standard Deviation Page 110, problem 10 Data: The range is = 3 Determine the standard deviation using the previous formula

Example: Standard Deviation We need to find each the following:

Data Table (25 data values) TOTALS:

Example: Standard Deviation Thus:

Example: Standard Deviation And:

Example: Standard Deviation Page 110, problem 10 Do the results make sense? NO: although we can determine the standard deviation, the results make no sense because the data is actually categorical (describing phenotype) and do not represent counts. Note that if we renumber the categories, the standard deviation would change but the results would not.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Standard Deviation - Important Properties  The standard deviation is a measure of variation of all values from the mean.  The value of the standard deviation s is never negative and usually not zero.  The value of the standard deviation s can increase dramatically with the inclusion of one or more outliers (data values far away from all others).  The units of the standard deviation s are the same as the units of the original data values.

Calculator Example Data from problem on page 111, problem 20 is strontium- 90 data from previous class examples

Calculator Example Amounts of Strontium-90 (in millibecquerels) in a simple random sample of baby teeth obtained from Philadelphia residents born after 1979 Note: this data is related to Three Mile Island nuclear power plant Accident in Find standard deviation DIRECTIONS:

To compute standard deviation: 1) Enter the list of data values 2) Select 2nd STAT (LIST) and arrow right to MATH option 7:stdDev( 3) Input the data by choosing the desired list Previous example: s = 15.0 millibecquerels Calculator Example

Calculator Example Press STAT. Arrow to the right to CALC. Now choose option #1: 1-Var Stats. When 1-Var Stats appears on the home screen, tell the calculator the name of the list containing the data

Calculator Example Strontium-90 data, 1-Var Stats Displays: ETC.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Population Standard Deviation 2  ( x – µ ) N  = This formula is similar to the previous formula, but instead, the population mean and population size are used.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education  Population variance:  2 - Square of the population standard deviation  Variance  The variance of a set of values is a measure of variation equal to the square of the standard deviation.  Sample variance: s 2 - Square of the sample standard deviation s

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Unbiased Estimator The sample variance s 2 is an unbiased estimator of the population variance  2, which means values of s 2 tend to target the value of  2 instead of systematically tending to overestimate or underestimate  2.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Variance  Unlike standard deviation, the units of variance do not match the units of the original data set, they are the square of the units in the original data set.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Variance - Notation s = sample standard deviation s 2 = sample variance  = population standard deviation  2 = population variance

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Beyond the Basics of Measures of Variation Part 2

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Range Rule of Thumb for many data sets, the vast majority (such as 95%) of sample values lie within two standard deviations of the mean.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Range Rule of Thumb usual values in a data set are those that are “typical” and not too extreme rough estimates of the minimum and maximum “usual” sample values can be computed using the “range rule of thumb”

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Range Rule of Thumb minimum “usual” value = (mean) – 2  (standard deviation)

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Range Rule of Thumb maximum “usual” value = (mean) + 2  (standard deviation)

Range Rule of Thumb  Usual values fall between the maximum and minimum usual values:  Otherwise the value is unusual

Example: Standard Deviation Page 110, problem 12 Data (kg):

Example (cont.) Thus:

Example (cont.)

Example (cont.) Minimum usual value Maximum usual value

Example (cont.) Because 6.8 kg falls between the minimum usual value and the maximum usual value, it is not considered “unusual”

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Range Rule of Thumb for Estimating Standard Deviation s To roughly estimate the standard deviation from a collection of known sample data use where range = (maximum value) – (minimum value)

Example: Standard Deviation Page 110, problem 12 Data (kg): Range rule of thumb:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Properties of the Standard Deviation Measures the variation among data values Values close together have a small standard deviation, but values with much more variation have a larger standard deviation Has the same units of measurement as the original data

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Properties of the Standard Deviation For many data sets, a value is unusual if it differs from the mean by more than two standard deviations Compare standard deviations of two different data sets only if the they use the same scale and units, and they have means that are approximately the same

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Empirical (or ) Rule For data sets having a distribution that is approximately bell shaped, the following properties apply:  About 68% of all values fall within 1 standard deviation of the mean.  About 95% of all values fall within 2 standard deviations of the mean.  About 99.7% of all values fall within 3 standard deviations of the mean.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education The Empirical Rule

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education The Empirical Rule

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education The Empirical Rule

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Example: The Empirical Rule Page 113, problem 34 One standard deviation from the mean is between: 68% of the data are between volts and volts

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Example: The Empirical Rule Page 113, problem 34 Two standard deviations from the mean is between: 95% of the data are between volts and volts

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Example: The Empirical Rule Page 113, problem 34 Three standard deviations from the mean is between: 99.7% of the data are between volts and volts

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Example: The Empirical Rule Page 113, problem 34 a) 95% of the data are between volts and volts b) 99.7% of the data are between volts and volts

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Chebyshev’s Theorem The proportion (or fraction) of any set of data lying within K standard deviations of the mean is always at least 1–1/K 2, where K is any positive number greater than 1.  For K = 2, at least 3/4 (or 75%) of all values lie within 2 standard deviations of the mean.  For K = 3, at least 8/9 (or 89%) of all values lie within 3 standard deviations of the mean.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Example: Chebyshev’s Theorem Page 113, problem 36 By Chebyshev’s Theorem there must be within three standard deviations of the mean

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Example: Chebyshev’s Theorem Page 113, problem 36 The minimum voltage amount within 3 standard deviations of the mean is and the maximum amount is

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Comparing Variation in Different Samples It’s a good practice to compare two sample standard deviations with s only when the sample means are approximately the same. In general, it is better to use the coefficient of variation when comparing any two standard deviations.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Coefficient of Variation The coefficient of variation (or CV) for a set of nonnegative sample or population data, expressed as a percent, describes the standard deviation relative to the mean. SamplePopulation s x CV =  100%  CV =   100%

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Example: Comparing Variation Page 112, problem 24 Waiting times for Jefferson Valley:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Example: Comparing Variation Page 112, problem 24 Waiting times for Providence:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Example: Comparing Variation Page 112, problem 24 CV for Jefferson Valley wait times much smaller than CV for Providence wait times implies that variation for Jefferson Valley is much smaller than that for Providence

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education  ( x – x ) 2 n – 1 s =s = Formula for Standard Deviation of a Sample Why do we use n-1 instead of n?

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Rationale for using n – 1 versus n There are only n – 1 independent values. With a given mean, only n – 1 values can be freely assigned any number before the last value is determined (see homework problem from 3-2 number 35).

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Rationale for using n – 1 versus n Dividing by n – 1 yields better results than dividing by n. It causes s 2 to target  2 whereas division by n causes s 2 to underestimate  2.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.Copyright © 2010 Pearson Education Recap In this section we have looked at:  Range  Standard deviation of a sample and population  Variance of a sample and population  Coefficient of variation (CV)  Range rule of thumb  Empirical distribution  Chebyshev’s theorem

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Section 3-4 Measures of Relative Standing and Boxplots

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Key Concepts This section introduces measures of relative standing, which are numbers showing the location of data values relative to the other values within a data set. Measures of relative standing can be used to compare values from different data sets, or to compare values within the same data set.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Key Concepts The most important concept is the z score. We will also discuss percentiles and quartiles, as well as a new statistical graph called the boxplot.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Basics of z Scores, Percentiles, Quartiles, and Boxplots Part 1

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.  z Score (or standardized value) the number of standard deviations that a given value x is above or below the mean Z score

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Sample Population x – µ z =  Round z scores to 2 decimal places Measures of Position z Score z = x – x s

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Interpreting Z Scores Whenever a value is less than the mean, its corresponding z score is negative Ordinary values: –2 ≤ z score ≤ 2 Unusual Values: z score 2

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Z score Problem 6 on page 127 Given mean and standard deviation: a) Difference between Hoffman’s age and mean age is

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Z score Problem 6 on page 127 Note: book answer is absolute value of the difference between Hoffman’s age and mean age which is

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Z score Problem 6 on page 127 b) How many standard deviations is that? Here take the ratio of the absolute value of the difference and the standard deviation:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Z score Problem 6 on page 127 c) Convert Hoffman’s age to a z score Here take the ratio of the difference and the standard deviation:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Z score Problem 6 on page 127 d) Is Hoffman’s age usual or unusual? “Usual” means z score is between - 2 and 2 and since is between - 2 and 2, Hoffman’s age is usual.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Z score Problem 10 on page 127 Given mean and standard deviation for heights of women:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Z score Problem 10 on page 127 Z scores for height requirements for women in army Minimum height requirement z score

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Z score Problem 10 on page 127 Z scores for height requirements for women in army Maximum height requirement z score

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Z score Problem 10 on page 127 Minimum height requirement z score is less than -2 so it is considered unusual Maximum height requirement z score is greater than 2 so it is considered unusual

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Z score Problem 13 on page 128 Score on SAT test is 1190 with mean of scores on SAT test 1518 and standard deviation 325 has z score of

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Z score Problem 13 on page 128 Score on ACT test is 26 with mean of scores on ACT test 21 and standard deviation 4.8 has z score of

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Z score Problem 13 on page 128 Comparing z scores the ACT z score was better than the SAT z score which means the ACT score was higher above the mean than the SAT score (relative to the standard deviation) so the ACT score is better.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Percentiles are measures of location. There are 99 percentiles denoted P 1, P 2,... P 99, which divide a set of data into 100 groups with about 1% of the values in each group.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Finding the Percentile of a Data Value Percentile of value x =

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Finding Percentile Problem 16 on page 128 Data (NFL superbowl total points): Find the percentile corresponding to 65

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Finding Percentile Problem 16 on page 128 Percentile of value x =

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. n total number of values in the data set k percentile being used L locator that gives the position of a value P k k th percentile L = n k 100 Notation Converting from the kth Percentile to the Corresponding Data Value

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Converting from the kth Percentile to the Corresponding Data Value

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Finding Data Value Given Its Percentile Problem 22 on page 128 Data (NFL superbowl total points): Find the data value at P 80

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Finding Data Value Given Its Percentile Problem 22 on page 128 For 80 th percentile, the k value in the locator formula is 80 If L is not a whole number, then round up to next whole number: P 80

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Finding Data Value Given Its Quartile The 20 th score is 61: Thus:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Quartiles  Q 1 (First Quartile) separates the bottom 25% of sorted values from the top 75%.  Q 2 (Second Quartile) same as the median; separates the bottom 50% of sorted values from the top 50%.  Q 3 (Third Quartile) separates the bottom 75% of sorted values from the top 25%. Are measures of location, denoted Q 1, Q 2, and Q 3, which divide a set of data into four groups with about 25% of the values in each group.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Quartiles The first quartile is the 25 th percentile The second quartile is the 50 th percentile The third quartile is the 75 th percentile

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Finding Data Value Given Its Quartile Problem 20 on page 128 Data (NFL superbowl total points): Find the data value at

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Finding Data Value Given Its Quartile Convert quartile to percentile: For 25 th percentile, the k value in the locator formula is 25 P 25

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Finding Data Value Given Its Quartile If L is a whole number, then round up to next whole number, find the percentile by adding the Lth value and the next value then dividing by 2 Sixth value is 41 and seventh value is 43:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Q 1, Q 2, Q 3 divide ranked scores into four equal parts Quartiles 25% Q3Q3 Q2Q2 Q1Q1 (minimum)(maximum) (median)

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.  Interquartile Range (or IQR): Q 3 – Q 1  Percentile Range: P 90 – P 10  Semi-interquartile Range: 2 Q 3 – Q 1  Midquartile: 2 Q 3 + Q 1 Some Other Statistics

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.  For a set of data, the 5-number summary consists of the minimum value; the first quartile Q 1 ; the median (or second quartile Q 2 ); the third quartile, Q 3 ; and the maximum value. 5-Number Summary

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved.  A boxplot (or box-and-whisker- diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, Q 1 ; the median; and the third quartile, Q 3. Boxplot

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Boxplots Boxplot of Movie Budget Amounts (See example 7 and 8 on page 121)

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Boxplots - Normal Distribution Normal Distribution: Heights from a Simple Random Sample of Women

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Boxplots - Skewed Distribution Skewed Distribution: Salaries (in thousands of dollars) of NCAA Football Coaches

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Five Number Summary and Boxplots Problem 30 on page 128 uses 10 data values from Strontium-90 data ordered as follows:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Five Number Summary and Boxplots Min data value is 128 First quartile

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Five Number Summary and Boxplots Second quartile

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Five Number Summary and Boxplots Third quartile

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Five Number Summary and Boxplots Max data value is 172 Five Number Summary Boxplot?

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example of Five Number Summary Boxplot

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Calculator: Five Number Summary Five Number Summary (from website mathbits.com) Enter the data in a list:

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Calculator: Five Number Summary Go to STAT - CALC and choose 1-Var Stats

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Calculator: Five Number Summary On the HOME screen, when 1-Var Stats appears, type the list containing the data.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Calculator: Five Number Summary Arrow down to the five number summary (last five items in the list)

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Calculator: Boxplot CLEAR out the graphs under y = (or turn them off). Enter the data into the calculator lists. (choose STAT, #1 EDIT and type in entries)

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Calculator: Boxplot CLEAR out the graphs under y = (or turn them off). Enter the data into the calculator lists. (choose STAT, #1 EDIT and type in entries)

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Calculator: Boxplot Press 2nd STATPLOT and choose #1 PLOT 1. Be sure the plot is ON, the second box-and- whisker icon is highlighted, and that the list you will be using is indicated next to Xlist.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Calculator: Boxplot To see the box-and-whisker plot, press ZOOM and #9 ZoomStat. Press the TRACE key to see on-screen data about the box-and-whisker plot.

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Outliers and Modified Boxplots OMIT THIS PART OF 3-4 Part 2

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Recap In this section we have discussed:  z Scores  z Scores and unusual values  Quartiles  Percentiles  Converting a percentile to corresponding data values  Other statistics  5-number summary

Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Putting It All Together Always consider certain key factors:  Context of the data  Source of the data  Measures of Center  Sampling Method  Measures of Variation  Outliers  Practical Implications  Changing patterns over time  Conclusions  Distribution