McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved.
4-2 Chapter 4: Descriptive Statistics. Chapter Contents
4.1 Numerical Description
4.2 Measures of Center
4.3 Measures of Variability
4.4 Standardized Data
4.5 Percentiles, Quartiles, and Box Plots
4.6 Correlation and Covariance
4.7 Grouped Data
4.8 Skewness and Kurtosis
4-3 Chapter 4: Descriptive Statistics. Chapter Learning Objectives
LO4-1: Explain the concepts of center, variability, and shape.
LO4-2: Use Excel to obtain descriptive statistics and visual displays.
LO4-3: Calculate and interpret common measures of center.
LO4-4: Calculate and interpret common measures of variability.
LO4-5: Transform a data set into standardized values.
LO4-6: Apply the Empirical Rule and recognize outliers.
LO4-7: Calculate quartiles and other percentiles.
LO4-8: Make and interpret box plots.
LO4-9: Calculate and interpret a correlation coefficient and covariance.
LO4-10: Calculate the mean and standard deviation from grouped data.
LO4-11: Assess skewness and kurtosis in a sample.
4-4 4.1 Numerical Description. LO4-1: Explain the concepts of center, variability, and shape. Three key characteristics of numerical data: center, variability, and shape.
4-5 4.1 Numerical Description. LO4-2: Use Excel to obtain descriptive statistics and visual displays. Excel displays for Table 4.3 [figure not reproduced].
4-6 4.2 Measures of Center. LO4-3: Calculate and interpret common measures of center.
4-7 4.2 Measures of Center: Shape. LO4-1: Explain the concepts of center, variability, and shape. Compare the mean and median, or look at the histogram, to determine the degree of skewness.
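The mean-versus-median comparison can be sketched in a few lines of Python. This is only a rule-of-thumb illustration: the function name `skew_direction`, the `tol` parameter, and the sample values are hypothetical and do not come from the slides.

```python
from statistics import mean, median


def skew_direction(data, tol=0.0):
    """Rough shape hint from comparing the mean with the median.

    tol is an assumed tolerance: differences no larger than tol
    are treated as "approximately symmetric".
    """
    m, md = mean(data), median(data)
    if m - md > tol:
        return "skewed right"   # mean pulled above the median
    if md - m > tol:
        return "skewed left"    # mean pulled below the median
    return "approximately symmetric"


# Made-up sample with one large value in the right tail.
sample = [2, 3, 3, 4, 4, 5, 20]
print(mean(sample), median(sample), skew_direction(sample))
```

A histogram remains the more reliable check, since the mean and median can be close even when the distribution is not symmetric.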
4-8 4.3 Measures of Variability. LO4-4: Calculate and interpret common measures of variability. Variation is the “spread” of data points about the center of the distribution in a sample. Consider the following measures of variability: [table of measures not reproduced].
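As an illustration of how spread can be quantified, the sketch below computes four widely used sample measures: range, variance, standard deviation, and coefficient of variation. Which measures the slide actually listed is an assumption here, and `variability_summary` is a hypothetical helper name.

```python
from statistics import mean, stdev, variance


def variability_summary(data):
    """Common sample measures of spread (n - 1 in the denominator)."""
    xbar = mean(data)
    s = stdev(data)
    return {
        "range": max(data) - min(data),
        "variance": variance(data),
        "std_dev": s,
        # Coefficient of variation; only meaningful for a nonzero mean.
        "cv_percent": 100 * s / xbar,
    }


# Made-up sample for illustration.
print(variability_summary([2, 4, 4, 4, 5, 5, 7, 9]))
```

The coefficient of variation is unit-free, which makes it convenient for comparing spread across data sets measured on different scales.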
4-9 4.4 Standardized Data: Chebyshev’s Theorem. For any population with mean $\mu$ and standard deviation $\sigma$, the percentage of observations that lie within $k$ standard deviations of the mean (for $k > 1$) must be at least $100\,[1 - 1/k^2]$. For $k = 2$ standard deviations, $100\,[1 - 1/2^2] = 75\%$, so at least 75.0% will lie within $\mu \pm 2\sigma$. For $k = 3$ standard deviations, $100\,[1 - 1/3^2] = 88.9\%$, so at least 88.9% will lie within $\mu \pm 3\sigma$. Although applicable to any data set, these limits tend to be too wide to be useful.
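The bound is easy to check numerically. The sketch below computes the Chebyshev minimum and, for comparison, the actual share of a data set that falls within k standard deviations; the function names and the sample values are hypothetical, and the data set is treated as a whole population.

```python
from statistics import mean, pstdev


def chebyshev_min_percent(k):
    """Minimum percentage within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 100 * (1 - 1 / k**2)


def share_within(data, k):
    """Actual percentage of values within mu +/- k*sigma,
    treating data as a complete population."""
    mu, sigma = mean(data), pstdev(data)
    inside = sum(1 for x in data if abs(x - mu) <= k * sigma)
    return 100 * inside / len(data)


# Made-up population with one extreme value.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(chebyshev_min_percent(2), share_within(values, 2))
```

Even with the extreme value, the observed share stays above the 75% floor, which illustrates why the bound is safe but usually loose.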
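4-10 4.4 Standardized Data: The Empirical Rule. LO4-6: Apply the Empirical Rule and recognize outliers. For data from a normal distribution, about 68% of observations lie within $\mu \pm 1\sigma$, about 95% within $\mu \pm 2\sigma$, and about 99.7% within $\mu \pm 3\sigma$. Unusual observations are those that lie beyond $\mu \pm 2\sigma$. Outliers are observations that lie beyond $\mu \pm 3\sigma$.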
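The two cutoffs translate directly into a small classifier. This is a minimal sketch: `classify` is a hypothetical helper, the mean and standard deviation are supplied by the caller, and "beyond" is read as strictly farther than the cutoff.

```python
def classify(x, mu, sigma):
    """Label an observation by its distance from the mean.

    beyond mu +/- 3*sigma -> "outlier"
    beyond mu +/- 2*sigma -> "unusual"
    otherwise             -> "typical"
    """
    distance = abs(x - mu) / sigma  # distance in standard deviations
    if distance > 3:
        return "outlier"
    if distance > 2:
        return "unusual"
    return "typical"


# Made-up example: mean 70, standard deviation 10.
for value in (85, 45, 101):
    print(value, classify(value, mu=70, sigma=10))
```

With sample data, the sample mean and sample standard deviation would typically stand in for the population values.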
4-11 4.4 Standardized Data: Defining a Standardized Variable. LO4-5: Transform a data set into standardized values. A standardized variable (Z) redefines each observation in terms of the number of standard deviations from the mean. Standardization formula for a population: $z_i = (x_i - \mu)/\sigma$. Standardization formula for a sample: $z_i = (x_i - \bar{x})/s$. A negative z value means the observation is below the mean. A positive z value means the observation is above the mean.
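A short sketch of standardizing a sample follows. It uses the sample mean and the sample standard deviation (n - 1 denominator); the function name `standardize` and the data values are hypothetical.

```python
from statistics import mean, stdev


def standardize(data):
    """Return the z-score of each observation in a sample."""
    xbar, s = mean(data), stdev(data)
    return [(x - xbar) / s for x in data]


# Made-up sample for illustration.
z_scores = standardize([2, 4, 4, 4, 5, 5, 7, 9])
print([round(z, 2) for z in z_scores])
```

After standardization the values have mean 0 and standard deviation 1, so observations from different scales can be compared directly.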
4-12 4.5 Percentiles, Quartiles, and Box Plots: Percentiles. LO4-7: Calculate quartiles and other percentiles. Percentiles divide the data into 100 groups. For example, if you score in the 83rd percentile on a standardized test, 83% of the test-takers scored below you. Deciles divide the data into 10 groups. Quintiles divide the data into 5 groups. Quartiles divide the data into 4 groups.
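Quartiles and other percentiles can be computed with Python's standard library, as sketched below. Software packages use different interpolation rules, so the "exclusive" method chosen here is an assumption and may not match the textbook's or Excel's results exactly; `quartiles` and `percentile` are hypothetical wrapper names.

```python
from statistics import quantiles


def quartiles(data):
    """Return (Q1, Q2, Q3) using the 'exclusive' interpolation method."""
    q1, q2, q3 = quantiles(data, n=4, method="exclusive")
    return q1, q2, q3


def percentile(data, p):
    """Return the p-th percentile for an integer p from 1 to 99."""
    if not 1 <= p <= 99:
        raise ValueError("p must be an integer from 1 to 99")
    # quantiles(n=100) returns the 99 cut points P1..P99.
    return quantiles(data, n=100, method="exclusive")[p - 1]


# Made-up sample for illustration.
scores = [1, 2, 3, 4, 5, 6, 7]
print(quartiles(scores), percentile(scores, 50))
```

The second quartile and the 50th percentile both coincide with the median, which is a quick sanity check on any quartile routine.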
4-13 4.5 Percentiles, Quartiles, and Box Plots. LO4-8: Make and interpret box plots. A box plot is a useful tool of exploratory data analysis (EDA). It is also called a box-and-whisker plot. It is based on a five-number summary: $x_{\min}, Q_1, Q_2, Q_3, x_{\max}$. A box plot shows central tendency, dispersion, and shape. Fences and Unusual Data Values: values outside the inner fences are unusual, while those outside the outer fences are outliers.
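The five-number summary and the fences can be sketched as follows. The fence definitions are not preserved in this transcript, so the code assumes the common convention of 1.5 and 3 times the interquartile range beyond the quartiles; the function names and the quartile method are likewise assumptions.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return (x_min, Q1, Q2, Q3, x_max)."""
    q1, q2, q3 = quantiles(data, n=4, method="exclusive")
    return min(data), q1, q2, q3, max(data)


def fences(q1, q3):
    """Inner and outer fences, ASSUMING the usual 1.5*IQR and 3*IQR rule."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)
    return inner, outer


def classify_by_fences(x, q1, q3):
    """Label a value as typical, unusual, or outlier relative to the fences."""
    (inner_lo, inner_hi), (outer_lo, outer_hi) = fences(q1, q3)
    if x < outer_lo or x > outer_hi:
        return "outlier"
    if x < inner_lo or x > inner_hi:
        return "unusual"
    return "typical"


# Made-up sample for illustration.
x_min, q1, q2, q3, x_max = five_number_summary([1, 2, 3, 4, 5, 6, 7])
print((x_min, q1, q2, q3, x_max), fences(q1, q3))
```

Because the fences depend only on the quartiles, they are less sensitive to extreme values than cutoffs based on the mean and standard deviation.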
4-14 4.6 Correlation and Covariance: Correlation Coefficient. LO4-9: Calculate and interpret a correlation coefficient and covariance. The sample correlation coefficient r is a statistic that describes the degree of linearity between paired observations on two quantitative variables X and Y. Note: $-1 \le r \le +1$. The covariance of two random variables X and Y (denoted $\sigma_{XY}$) measures the degree to which the values of X and Y change together. Population: $\sigma_{XY} = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu_X)(y_i - \mu_Y)$. Sample: $s_{XY} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$.
4-15 4.6 Correlation and Covariance: Covariance. LO4-9: Calculate and interpret a correlation coefficient and covariance. A correlation coefficient is the covariance divided by the product of the standard deviations of X and Y. For a population, $\rho = \sigma_{XY}/(\sigma_X \sigma_Y)$; for a sample, $r = s_{XY}/(s_X s_Y)$.
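The relationship between covariance and correlation can be sketched for sample data as below. The helper names and the data are hypothetical, and the n - 1 denominator reflects the usual sample definitions.

```python
from statistics import mean, stdev


def sample_covariance(x, y):
    """Sample covariance of paired observations (n - 1 denominator)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of observations")
    xbar, ybar = mean(x), mean(y)
    cross = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    return cross / (len(x) - 1)


def sample_correlation(x, y):
    """Covariance divided by the product of the sample standard deviations."""
    return sample_covariance(x, y) / (stdev(x) * stdev(y))


# Made-up paired data with a perfect positive linear relationship.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
print(sample_covariance(x, y), sample_correlation(x, y))
```

Dividing by the standard deviations removes the units, which is why the correlation is bounded by -1 and +1 while the covariance is not.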
4-16 4.7 Grouped Data: Group Mean and Standard Deviation. LO4-10: Calculate the mean and standard deviation from grouped data.
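The slide's formulas are not preserved in this transcript. A minimal sketch under the usual approach follows: each observation in a class is treated as if it sat at the class midpoint, and the sample standard deviation uses n - 1. The function name and the frequency table are hypothetical.

```python
from math import sqrt


def grouped_mean_sd(midpoints, freqs):
    """Approximate mean and sample standard deviation from a frequency table.

    midpoints: class midpoints m_j
    freqs:     class frequencies f_j
    """
    n = sum(freqs)
    xbar = sum(f * m for m, f in zip(midpoints, freqs)) / n
    ss = sum(f * (m - xbar) ** 2 for m, f in zip(midpoints, freqs))
    return xbar, sqrt(ss / (n - 1))


# Made-up frequency table: classes 0-10, 10-20, 20-30.
print(grouped_mean_sd([5, 15, 25], [2, 6, 2]))
```

The results are approximations, since the actual values within each class are unknown; they get closer to the ungrouped statistics as the classes get narrower.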
4-17 4.8 Skewness and Kurtosis: Skewness. LO4-11: Assess skewness and kurtosis in a sample.
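The slide's skewness formula is not preserved here. As one possibility, the sketch below implements the adjusted sample skewness coefficient that Excel's SKEW function reports; whether the chapter uses exactly this form is an assumption, and `sample_skewness` is a hypothetical name.

```python
from statistics import mean, stdev


def sample_skewness(data):
    """Adjusted sample skewness: n/((n-1)(n-2)) * sum(((x - xbar)/s)**3)."""
    n = len(data)
    if n < 3:
        raise ValueError("at least three observations are required")
    xbar, s = mean(data), stdev(data)
    cubes = sum(((x - xbar) / s) ** 3 for x in data)
    return n / ((n - 1) * (n - 2)) * cubes


# Made-up samples: one symmetric, one with a long right tail.
print(sample_skewness([1, 2, 3, 4, 5]))
print(sample_skewness([2, 3, 3, 4, 4, 5, 20]))
```

A value near zero suggests symmetry, a positive value a longer right tail, and a negative value a longer left tail. With small samples the coefficient varies a lot, so it is best read alongside a histogram.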
4-18 4.8 Skewness and Kurtosis: Kurtosis. LO4-11: Assess skewness and kurtosis in a sample.
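The slide's kurtosis formula is likewise not preserved. The sketch below implements the sample excess kurtosis that Excel's KURT function reports, which is zero-centered so that a normal distribution scores about 0; treating this as the chapter's definition is an assumption, and `sample_excess_kurtosis` is a hypothetical name.

```python
from statistics import mean, stdev


def sample_excess_kurtosis(data):
    """Sample excess kurtosis with small-sample adjustment.

    n(n+1)/((n-1)(n-2)(n-3)) * sum(((x - xbar)/s)**4)
        - 3(n-1)**2 / ((n-2)(n-3))
    """
    n = len(data)
    if n < 4:
        raise ValueError("at least four observations are required")
    xbar, s = mean(data), stdev(data)
    fourths = sum(((x - xbar) / s) ** 4 for x in data)
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return lead * fourths - correction


# Made-up samples: evenly spread values, then one extreme value.
print(sample_excess_kurtosis([1, 2, 3, 4, 5]))
print(sample_excess_kurtosis([0, 0, 0, 0, 0, 0, 0, 0, 0, 10]))
```

Negative values point to a flatter distribution with lighter tails than the normal (platykurtic), and positive values to a more peaked distribution with heavier tails (leptokurtic).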