Chapter 12: Describing Distributions with Numbers We create graphs to give us a picture of the data. We also need numbers to summarize the center and spread.

1 Chapter 12: Describing Distributions with Numbers. We create graphs to give us a picture of the data. We also need numbers to summarize the center and spread of a distribution. Two types of descriptive statistics are used for categorical variables: 1) counts (frequencies) and 2) rates or proportions (relative frequencies). Many statistics are available to summarize quantitative variables.

2 Home Runs in Baseball. Question: Who is the best home run hitter ever in major league baseball? Players with high numbers of home runs in a season: Babe Ruth, Roger Maris, Mark McGwire, Sammy Sosa, Barry Bonds.

3 Median and Quartiles. The median (M) is the midpoint of a distribution when the observations are arranged in increasing order: the number such that half the observations are smaller and the other half are larger (p. 219). To find it, list the data in order from smallest to largest. If n is odd, the median is the middle value. If n is even, the median is the mean of the middle two values.

4 M for Sosa and Maris. Calculate M for Sosa’s home runs per season (8 seasons, to 1999). Data: 15, 10, 33, 25, 36, 40, 36, 66. Calculate M for Maris’s home runs per season (11 seasons). Data: 14, 28, 16, 39, 61, 33, 23, 26, 13, 9, 5.
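The median rule from the previous slide can be checked on both data sets with a few lines of Python. This is a minimal sketch; the function name `median` and the variable names `sosa` and `maris` are chosen here for illustration and are not from the slides.

```python
def median(values):
    """Median by the slide's rule: sort the data, then take the middle value
    (odd n) or the mean of the two middle values (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: the middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: mean of middle two


# Home runs per season, as listed on the slide.
sosa = [15, 10, 33, 25, 36, 40, 36, 66]
maris = [14, 28, 16, 39, 61, 33, 23, 26, 13, 9, 5]

print(median(sosa))   # 34.5 (mean of 33 and 36)
print(median(maris))  # 23   (6th of 11 ordered values)
```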

5 Percentiles. The (p×100)th percentile is the value of a variable such that p×100% of the values are below it and (1-p)×100% of the values are above it, where 0 < p < 1. For the 35th percentile, p = 0.35. Where have you seen percentiles before?

6 Quartiles. First Quartile (Q1): The value such that 25% of the data values lie below Q1 and 75% of the data values lie above Q1 (25th percentile). Third Quartile (Q3): The value such that 75% of the data values lie below Q3 and 25% of the data values lie above Q3 (75th percentile). The median is the second quartile, Q2 (50th percentile).

7 Calculating percentiles: Let n be the number of data values. Order the n values from smallest to largest. Calculate the product, n×p.
–If the product is not an integer (0, 1, 2, 3, …), then round it up to the next integer and take the corresponding ordered value.
–If the product is an integer, say k, then average the kth and (k+1)-st ordered values.
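The percentile rule on this slide can be sketched in Python as follows. The function name `percentile` is hypothetical, and converting p through `Fraction(str(p))` is an implementation choice made here so that products such as 8 × 0.75 are tested for "integer or not" exactly. Statistical software often uses other conventions (for example, linear interpolation), so results from other tools may differ slightly.

```python
import math
from fractions import Fraction


def percentile(values, p):
    """(p*100)th percentile by the slide's rule, for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must satisfy 0 < p < 1")
    ordered = sorted(values)            # smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("percentile of an empty data set is undefined")
    product = Fraction(str(p)) * n      # exact n*p, avoids float rounding
    if product.denominator == 1:
        k = int(product)                # integer k: average kth and (k+1)-st
        return (ordered[k - 1] + ordered[k]) / 2
    k = math.ceil(product)              # otherwise round up, take kth value
    return ordered[k - 1]


# Home runs per season, as listed on the earlier slide.
sosa = [15, 10, 33, 25, 36, 40, 36, 66]
maris = [14, 28, 16, 39, 61, 33, 23, 26, 13, 9, 5]

print(percentile(sosa, 0.25))   # 20.0 (8 * 0.25 = 2, average 2nd and 3rd)
print(percentile(maris, 0.25))  # 13   (11 * 0.25 = 2.75, round up to 3rd)
print(percentile(maris, 0.35))  # 14   (11 * 0.35 = 3.85, round up to 4th)
```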

8 5-Number Summary. The 5-number summary of a data set consists of the following descriptive statistics (p. 221): Minimum, First Quartile (Q1), Median, Third Quartile (Q3), Maximum. Give the 5-number summaries for Sosa and Maris’s home runs.
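As a worked check, the 5-number summaries can be computed with a short Python sketch. It assumes the quartiles are found with the n×p percentile rule from the "Calculating percentiles" slide; the names `percentile` and `five_number_summary` are illustrative, not from the slides.

```python
import math
from fractions import Fraction


def percentile(values, p):
    """(p*100)th percentile using the n*p rule from the slides."""
    ordered = sorted(values)
    product = Fraction(str(p)) * len(ordered)   # exact n*p
    if product.denominator == 1:                # integer k: average two values
        k = int(product)
        return (ordered[k - 1] + ordered[k]) / 2
    return ordered[math.ceil(product) - 1]      # otherwise round up


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    if not values:
        raise ValueError("summary of an empty data set is undefined")
    return (
        min(values),
        percentile(values, 0.25),
        percentile(values, 0.50),
        percentile(values, 0.75),
        max(values),
    )


sosa = [15, 10, 33, 25, 36, 40, 36, 66]
maris = [14, 28, 16, 39, 61, 33, 23, 26, 13, 9, 5]

print(five_number_summary(sosa))   # (10, 20.0, 34.5, 38.0, 66)
print(five_number_summary(maris))  # (5, 13, 23, 33, 61)
```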

9 Boxplot. A boxplot is a graphical representation of the 5-number summary (p. 221). A central box spans the quartiles (Q1 to Q3). Inter-quartile Range = IQR = Q3 - Q1. A line in the box marks the median. Lines (whiskers) extend from the box to the minimum and maximum observations.

10 Constructing Boxplots
1) Compute the 5-number summary.
2) Draw a vertical line at Q1 and another at Q3.
3) Draw two horizontal lines to complete the box.
4) Draw a vertical line at the median.
5) Draw “whiskers” to the extremes (Min and Max).
Draw boxplots for Sosa and Maris’s home runs.
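The construction steps can be mimicked with a rough text rendering, which is handy for comparing two boxplots on a common axis without plotting software. This is only a sketch: the function `text_boxplot`, the axis limits 0 to 70, and the 60-character width are assumptions made here. The summaries are hard-coded, assuming the quartiles are computed with the n×p percentile rule from the earlier slide.

```python
def text_boxplot(summary, lo, hi, width=60):
    """Render one horizontal boxplot as a text row on the axis [lo, hi]."""
    minimum, q1, med, q3, maximum = summary
    if not (lo <= minimum and maximum <= hi and lo < hi):
        raise ValueError("axis limits must contain the data")

    def col(x):
        # Map a data value to a character column on the shared axis.
        return round((x - lo) / (hi - lo) * (width - 1))

    row = [" "] * width
    for c in range(col(minimum), col(maximum) + 1):
        row[c] = "-"                  # whiskers out to Min and Max
    for c in range(col(q1), col(q3) + 1):
        row[c] = "="                  # horizontal lines completing the box
    row[col(minimum)] = "|"
    row[col(maximum)] = "|"
    row[col(q1)] = "["                # vertical line at Q1
    row[col(q3)] = "]"                # vertical line at Q3
    row[col(med)] = "M"               # vertical line at the median
    return "".join(row)


# 5-number summaries (Min, Q1, M, Q3, Max) under the assumption stated above.
summaries = {
    "Sosa": (10, 20, 34.5, 38, 66),
    "Maris": (5, 13, 23, 33, 61),
}

for name, summary in summaries.items():
    iqr = summary[3] - summary[1]     # IQR = Q3 - Q1
    print(f"{name:<6}{text_boxplot(summary, lo=0, hi=70)}  IQR={iqr}")
```

Drawing both players on the same axis is what makes the comparison meaningful: the boxes and whiskers can be compared position by position.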

