Copyright © 2014, 2011 Pearson Education, Inc. Chapter 4: Describing Numerical Data

Presentation transcript:

Chapter 4: Describing Numerical Data

Summaries of Numerical Variables: Can 500 different songs fit on the iPod Shuffle?
- To answer this question, we must understand the typical size of a song and the variation of song sizes around that typical size
- We can do this using summary statistics

Summaries of Numerical Variables: A Subset of the Data

Summaries of Numerical Variables: The Median
- Value in the middle of a sorted list of numerical values (a typical value)
- Half of the values fall below the median; half fall above
- It is the 50th percentile

Summaries of Numerical Variables: Common Percentiles
- Lower quartile = 25th percentile
- Upper quartile = 75th percentile
- One quarter of the values fall below the lower quartile and one quarter fall above the upper quartile

Summaries of Numerical Variables: The Interquartile Range (IQR)
IQR = 75th percentile − 25th percentile
- A measure of variation based on quartiles
- Used to accompany the median

Summaries of Numerical Variables: The Range
Range = Maximum − Minimum
- Maximum value = 100th percentile
- Minimum value = 0th percentile
- Another measure of variation; not preferred because it is based on extreme values

Summaries of Numerical Variables: The Five Number Summary
- Minimum
- Lower quartile
- Median
- Upper quartile
- Maximum

Summaries of Numerical Variables: The Five Number Summary for Song Sizes
- Minimum = MB
- Lower quartile = 2.85 MB
- Median = MB
- Upper quartile = 4.32 MB
- Maximum = MB

Summaries of Numerical Variables: Summary Statistics for Song Sizes
- Median = MB
- IQR = 4.32 MB − 2.85 MB = 1.47 MB
- Range = MB − MB = MB
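
As a rough illustration of how the five-number summary, IQR, and range fit together, the Python sketch below works on a small made-up list of song sizes. The data values, the helper name `five_number_summary`, and the choice of the `inclusive` quartile method are assumptions for illustration and are not taken from the slides. Quartile conventions differ between software packages, so other tools may report slightly different quartiles for the same data.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    # quantiles(..., n=4) returns the three cut points that split the data
    # into quarters: the 25th, 50th, and 75th percentiles.
    lower_q, median, upper_q = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], lower_q, median, upper_q, ordered[-1]


# Hypothetical song sizes in MB (not the data set used in the slides).
song_sizes_mb = [2.1, 2.9, 3.2, 3.5, 3.6, 4.0, 4.3, 5.1, 9.8]

minimum, lower_q, median, upper_q, maximum = five_number_summary(song_sizes_mb)
iqr = upper_q - lower_q            # variation based on the quartiles
value_range = maximum - minimum    # variation based on the two extremes

print("five-number summary:", minimum, lower_q, median, upper_q, maximum)
print("IQR:", round(iqr, 2), "range:", round(value_range, 2))
```

In this made-up list the single large value (9.8 MB) stretches the range far more than it stretches the IQR, which is the reason the slides prefer the IQR as a companion to the median.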

Summaries of Numerical Variables: The Mean (Average)
- Arithmetic average: divide the sum of the values by the number of values (another typical value)
- The symbol y represents the variable of interest
- The symbol ȳ, read “y bar,” represents the mean

Summaries of Numerical Variables: The Mean (Average)
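
Written out, the verbal definition of the mean (the sum of the values divided by the number of values) corresponds to the following expression. The symbol n, for the number of values, and the subscripted values are notation assumed here.

```latex
\bar{y} = \frac{y_1 + y_2 + \cdots + y_n}{n} = \frac{1}{n}\sum_{i=1}^{n} y_i
```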

Summaries of Numerical Variables: The Variance (s²)
- A measure of variation based on the mean
- How far a value is from the mean is known as its deviation; the variance is the average of the squared deviations

Summaries of Numerical Variables: The Variance
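
In symbols, one common form of this definition is the sample variance, which divides the sum of squared deviations by n − 1 rather than n. Whether the slide's own formula uses n − 1 or n is an assumption here, as is the symbol n for the number of values. The standard deviation s is included for reference because it is simply the square root of the variance.

```latex
s^2 = \frac{(y_1 - \bar{y})^2 + (y_2 - \bar{y})^2 + \cdots + (y_n - \bar{y})^2}{n - 1},
\qquad
s = \sqrt{s^2}
```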

Summaries of Numerical Variables: The Standard Deviation (SD)
- The square root of the variance
- A measure of variability in the original units of the data (the variance results in squared units)

Summaries of Numerical Variables: Summary Statistics for Song Sizes
- Mean = MB
- Variance = MB²
- SD = MB
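
The steps from mean to deviations to variance to standard deviation can be sketched in a few lines of Python. The data below are made up for illustration, and dividing by n − 1 (the sample variance, which is also what Python's `statistics.variance` computes) is an assumption about the convention intended by the slides.

```python
import math
import statistics

# Hypothetical song sizes in MB (not the data set used in the slides).
song_sizes_mb = [2.0, 3.0, 3.5, 4.0, 7.5]

n = len(song_sizes_mb)
mean = sum(song_sizes_mb) / n                        # "y bar"
deviations = [y - mean for y in song_sizes_mb]       # distance of each value from the mean
variance = sum(d ** 2 for d in deviations) / (n - 1) # squared units (MB²)
sd = math.sqrt(variance)                             # back in the original units (MB)

print(f"mean = {mean} MB, variance = {variance} MB^2, SD = {sd:.3f} MB")

# Cross-check against the standard library's sample statistics.
assert math.isclose(variance, statistics.variance(song_sizes_mb))
assert math.isclose(sd, statistics.stdev(song_sizes_mb))
```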

4M Example 4.1: MAKING M&M’s (Motivation)
How many M&M’s are needed to fill a bag labeled to weigh 1.6 ounces?

4M Example 4.1: MAKING M&M’s (Method)
Data are weights of 72 plain chocolate M&M’s taken from several packages. To get a measure of the amount of variation relative to the typical size, we use the ratio of the standard deviation to the mean (known as the coefficient of variation).

4M Example 4.1: MAKING M&M’s (Mechanics)
- Mean weight = 0.86 g
- SD = 0.04 g
- Coefficient of variation c_v = 0.04 g / 0.86 g ≈ 0.047

4M Example 4.1: MAKING M&M’s (Message)
Since the SD is quite small compared to the mean (with a c_v of about 5%), the results suggest that 53 pieces are usually enough to fill a bag. A bag labeled 1.6 ounces weighs about 45.36 grams. Since there is little variability around the typical weight of an M&M, we can calculate the number of pieces needed to fill a 1.6-ounce bag as 45.36/0.86 ≈ 52.7, which rounds up to 53.
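
The arithmetic in this example can be checked with a short Python sketch. The mean, SD, and label weight come from the slides; the conversion factor of 28.35 grams per ounce and the variable names are assumptions added for illustration (the factor is consistent with the 45.36 g figure used above).

```python
import math

GRAMS_PER_OUNCE = 28.35   # assumed conversion factor; 1.6 oz * 28.35 = 45.36 g

mean_weight_g = 0.86      # mean weight of one M&M, from the slides
sd_weight_g = 0.04        # standard deviation, from the slides

# Coefficient of variation: SD relative to the mean (unitless).
coefficient_of_variation = sd_weight_g / mean_weight_g

# Number of typical pieces needed to reach the labeled weight, rounded up
# because a bag cannot contain a fraction of a piece.
bag_weight_g = 1.6 * GRAMS_PER_OUNCE
pieces_needed = math.ceil(bag_weight_g / mean_weight_g)

print(f"c_v = {coefficient_of_variation:.3f}")
print(f"bag weight = {bag_weight_g:.2f} g, pieces needed = {pieces_needed}")
```

Rounding up with `math.ceil` is a design choice in this sketch: it reflects filling the bag to at least the labeled weight when every piece weighs exactly the mean.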

Histograms
- Plot the distribution of a numerical variable by showing counts of values occurring within adjacent intervals
- Similar to bar charts but designed for continuous quantitative data (bar charts are only appropriate for discrete categories)

Histograms: Histogram of Song Sizes

Histograms: Histogram of Song Sizes
- Indicates a few very long songs (outliers)
- The graph devotes more than half of its area to showing fewer than 1% of the songs
- White space rule: a graph that is mostly white space can be improved by changing the interval of the plot to focus on the data rather than the white space

Histograms: Histogram of Song Sizes
- Using intervals of different lengths yields different histograms
- Narrow intervals expose details that wider intervals smooth over
- Most software packages choose an interval length automatically

Histograms: Histograms of Song Sizes – Different Intervals
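
To see how the interval length changes a histogram, the sketch below counts a small made-up list of song sizes in adjacent intervals of two different widths and prints a text histogram for each. The data, the helper name `histogram_counts`, and the widths of 1 MB and 2 MB are illustrative assumptions. The helper expects a non-empty list.

```python
from collections import Counter


def histogram_counts(values, width, start=0.0):
    """Count values in adjacent intervals [start + k*width, start + (k+1)*width)."""
    counts = Counter(int((v - start) // width) for v in values)
    # Include empty intervals between the first and last occupied ones so
    # that gaps in the data stay visible.
    return {
        (start + k * width, start + (k + 1) * width): counts[k]
        for k in range(min(counts), max(counts) + 1)
    }


# Hypothetical song sizes in MB (not the data set used in the slides).
song_sizes_mb = [2.1, 2.9, 3.2, 3.5, 3.6, 4.0, 4.3, 5.1, 9.8]

narrow = histogram_counts(song_sizes_mb, width=1.0)
wide = histogram_counts(song_sizes_mb, width=2.0)

for label, table in (("width 1 MB", narrow), ("width 2 MB", wide)):
    print(label)
    for (low, high), count in table.items():
        print(f"  {low:4.1f} to {high:4.1f} | {'*' * count}")
```

With the narrow intervals the peak between 3 and 4 MB and the gap before the largest value are both visible; the wider intervals merge the peak with its neighbors.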

Boxplots: Graph of the Five Number Summary

Boxplots: Combining Boxplots with Histograms
- Boxplots locate the median and quartiles and highlight outliers
- The median splits the area of the histogram in half (unlike the mean, it is resistant, or robust, to the effects of outliers)

Boxplots: Boxplot with Histogram of Song Sizes

Shape of a Distribution: Modes
- A mode is the position of an isolated peak in a histogram
- A histogram with one peak is unimodal; with two, bimodal; with three or more, multimodal
- A flat histogram with all bars about the same height is uniform

Shape of a Distribution: Symmetry and Skewness
- A distribution is symmetric if the two sides of its histogram are mirror images
- A distribution is skewed if one tail of the histogram stretches out farther than the other

Shape of a Distribution: Distribution of Song Sizes
- The mode lies between 3 and 4 MB
- The distribution is right skewed (the right tail stretches out farther than the left tail)

4M Example 4.2: EXECUTIVE COMPENSATION (Motivation)
What can we say about the salaries of CEOs in 2010?

4M Example 4.2: EXECUTIVE COMPENSATION (Method)
Data consist of salaries for 1,766 CEOs, reported in thousands of dollars (obtained from Compustat).

4M Example 4.2: EXECUTIVE COMPENSATION (Mechanics)

4M Example 4.2: EXECUTIVE COMPENSATION (Message)
The salaries of CEOs in 2010 range from less than $100,000 into the millions. The distribution is right skewed. The median is $725,000, with the middle half of salaries falling between $520,000 and $970,000. A few exceed $3,000,000.

Shape of a Distribution: Bell-Shaped Distributions and the Empirical Rule
- A bell-shaped distribution is symmetric and unimodal
- The empirical rule uses the standard deviation to describe how data with a bell-shaped distribution cluster around the mean

Shape of a Distribution: The Empirical Rule
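
The empirical rule is usually stated as follows: for bell-shaped data, about 68% of values lie within one standard deviation of the mean, about 95% within two, and about 99.7% within three. That wording is the standard statement of the rule rather than a quotation from this slide. The sketch below checks it against simulated bell-shaped data; the mean of 50, SD of 10, sample size, and seed are arbitrary assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Simulated bell-shaped data (normal distribution); parameters are arbitrary.
data = [random.gauss(50.0, 10.0) for _ in range(10_000)]

mean = statistics.mean(data)
sd = statistics.stdev(data)


def share_within(k):
    """Fraction of values within k standard deviations of the mean."""
    return sum(abs(y - mean) <= k * sd for y in data) / len(data)


for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.3f}")
```

The printed shares should land close to the three percentages in the rule, though not exactly on them, since the data are a random sample.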

Shape of a Distribution: Standardizing
- Converting data to z-scores
- z-scores measure the distance from the mean in standard deviations
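
Standardizing amounts to subtracting the mean from each value and dividing by the standard deviation, z = (y − ȳ) / s. A minimal Python sketch follows, using made-up data and the sample standard deviation (an assumed convention); the helper name `z_scores` is illustrative.

```python
import statistics


def z_scores(values):
    """Convert each value to its distance from the mean in standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (divides by n - 1)
    return [(y - mean) / sd for y in values]


# Hypothetical song sizes in MB (not the data set used in the slides).
song_sizes_mb = [2.0, 3.0, 3.5, 4.0, 7.5]

z = z_scores(song_sizes_mb)
for size, score in zip(song_sizes_mb, z):
    print(f"{size:4.1f} MB -> z = {score:+.2f}")
```

A value equal to the mean gets a z-score of 0, and the sign shows whether a value lies above or below the mean. After standardizing, the z-scores themselves have mean 0 and standard deviation 1.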

Epilog: Can 500 different songs fit on the iPod Shuffle?
Because of variation, not every collection of 500 songs will fit. The longest 500 songs won’t fit. However, based on the typical song size, the amount of variation in song sizes, and the shape of the distribution, we can say that most collections of 500 songs will fit!

Best Practices
- Be sure that data are numerical when using histograms and summaries such as the mean and standard deviation.
- Summarize the distribution of a numerical variable with a graph.
- Choose interval widths appropriate to the data when preparing a histogram.

Best Practices (Continued)
- Scale your plots to show data, not empty space.
- Anticipate what you will see in a histogram.
- Label clearly.
- Check for gaps.

Pitfalls
- Do not use the methods of this chapter for categorical variables.
- Do not assume that all numerical data have a bell-shaped distribution.
- Do not ignore the presence of outliers.

Pitfalls (Continued)
- Do not remove outliers unless you have a good reason.
- Do not forget to take the square root of a variance to obtain the standard deviation.