Chapter 2 Describing distributions with numbers

Chapter Outline 1. Measuring center: the mean 2. Measuring center: the median 3. Comparing the mean and the median 4. Measuring spread: the quartiles 5. The five-number summary and boxplots 6. Measuring spread: the standard deviation 7. Choosing measures of center and spread

Measuring center: the mean Notation: $\bar{x}$. The mean is simply the ordinary arithmetic average. Suppose that we have n observations (n is the data size, the number of individuals). The observations are denoted $x_1, x_2, x_3, \ldots, x_n$.

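Measuring center: the mean How to get the mean $\bar{x}$? Add the n observations and divide by n: $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum x_i$. Example 2.1 (P.33)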
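The calculation of the mean can be sketched in a few lines of Python. This is only an illustration: the function name `mean` and the data in `sample` are assumptions made here and are not the data of Example 2.1.

```python
def mean(observations):
    """Return the arithmetic average: the sum of the observations divided by n."""
    n = len(observations)
    if n == 0:
        raise ValueError("mean requires at least one observation")
    return sum(observations) / n


# Hypothetical data, chosen for illustration (not from Example 2.1).
sample = [2, 4, 6, 8]
x_bar = mean(sample)
print(x_bar)  # 5.0
```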

Measuring center: the median Notation: M. The median M is the midpoint of a distribution: half the observations are smaller than M and the other half are larger than M.

Measuring center: the median How to find M? –1. Sort all observations in increasing order (This step is important!!!) –2. If n is odd, the center observation, at position (n+1)/2 in the ordered list, is M. If n is even, the average of the two center values is M. Note that (n+1)/2 is the location of the median in the ordered list, not the median value.

Measuring center: the median Examples Case 1. 11, 21, 13, 24, 15, 26, 17 Case 2. 11, 21, 13, 24, 15, 26 Example 2.2, 2.3 (P.35)
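The two cases on this slide can be worked through with a short Python sketch of the sort-then-pick-the-center rule. The function name `median` and the variable names are choices made for this illustration.

```python
def median(observations):
    """Median by the slide's rule: sort first, then take the center value(s)."""
    ordered = sorted(observations)  # step 1: sort in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one observation")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the center observation, at position (n + 1) / 2 counting from 1
        return ordered[mid]
    # n even: the average of the two center values
    return (ordered[mid - 1] + ordered[mid]) / 2


case_1 = [11, 21, 13, 24, 15, 26, 17]  # n = 7 (odd)
case_2 = [11, 21, 13, 24, 15, 26]      # n = 6 (even)

print(median(case_1))  # 17
print(median(case_2))  # 18.0
```

In Case 1 the sorted list is 11, 13, 15, 17, 21, 24, 26 and the fourth value is the median. In Case 2 the two center values of 11, 13, 15, 21, 24, 26 are 15 and 21.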

Mean vs. Median The median is more resistant than the mean. The mean and median of a symmetric distribution are close together. If the distribution is exactly symmetric, the mean and median are exactly the same. In a skewed distribution, the mean is farther out in the long tail than is the median. Example: 1, 2, 3, 4, 5, 6, 10000
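A quick check of the slide's example, using Python's standard `statistics` module. The comparison list `symmetric` (with 7 in place of 10000) is an assumption added here to show what the single extreme value does.

```python
import statistics

skewed = [1, 2, 3, 4, 5, 6, 10000]  # the slide's example
symmetric = [1, 2, 3, 4, 5, 6, 7]   # hypothetical comparison without the extreme value

skewed_mean = statistics.mean(skewed)
skewed_median = statistics.median(skewed)
symmetric_mean = statistics.mean(symmetric)
symmetric_median = statistics.median(symmetric)

# One extreme observation drags the mean far into the long tail,
# while the median does not move at all.
print(round(skewed_mean, 1), skewed_median)  # 1431.6 4
print(symmetric_mean, symmetric_median)      # 4 4
```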

Inference: Strongly skewed distributions are usually reported with the median rather than the mean.

Measuring Spread: The Quartiles The quartiles mark out the middle half of the distribution.

Calculating the Quartiles: –Step 1. Arrange the observations in increasing order and locate the median M in the ordered list of observations. –Step 2. The first quartile Q1 is the median of the observations whose position in the ordered list is to the left of the location of the overall median. –Step 3. The third quartile Q3 is the median of the observations whose position in the ordered list is to the right of the location of the overall median.
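The three steps can be sketched in Python as follows. The function names are assumptions for this illustration, and the data are borrowed from the median examples earlier in the chapter. Note that statistical software often uses slightly different quartile rules, so library results may not match this hand method exactly.

```python
def median_of_sorted(ordered):
    """Median of a list that is already sorted in increasing order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(observations):
    """Return (Q1, M, Q3) using the median-of-each-half rule."""
    ordered = sorted(observations)  # step 1: sort and locate the median
    n = len(ordered)
    if n < 2:
        raise ValueError("quartiles require at least two observations")
    half = n // 2
    # Observations to the left of the median's location.
    lower = ordered[:half]
    # Observations to the right; when n is odd the median itself is left out.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median_of_sorted(lower), median_of_sorted(ordered), median_of_sorted(upper)


odd_case = [11, 21, 13, 24, 15, 26, 17]
even_case = [11, 21, 13, 24, 15, 26]

print(quartiles(odd_case))   # (13, 17, 24)
print(quartiles(even_case))  # (13, 18.0, 24)
```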

Measuring spread: the quartiles Example 2.4 (P. 37) Example 2.5 (P. 38) Note: (1) It is important to sort data first before we try to find quartiles! (2) Quartiles are resistant.

The five-number summary and boxplots The five-number summary: Minimum, Q1, M, Q3, Maximum. A boxplot is a graph of the five-number summary. Boxplots are most useful for side-by-side comparison of several distributions.

Boxplot 1. A boxplot is a graph of the five-number summary 2. A central box spans the quartiles 3. A line in the box marks the median 4. Lines extend from the box out to the minimum and maximum 5. Range = maximum - minimum
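The numbers a boxplot draws can be computed with a small Python sketch. The function name `five_number_summary` and the data set `scores` are hypothetical, introduced only for this illustration; the quartiles follow the median-of-each-half rule from this chapter.

```python
import statistics


def five_number_summary(observations):
    """Return (minimum, Q1, M, Q3, maximum)."""
    ordered = sorted(observations)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]      # left of the median's location
    upper = ordered[n - half:]  # right of the median's location
    return (
        ordered[0],
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
        ordered[-1],
    )


# Hypothetical data for illustration.
scores = [3, 7, 8, 5, 12, 14, 21, 13, 18]
minimum, q1, m, q3, maximum = five_number_summary(scores)
data_range = maximum - minimum

print(minimum, q1, m, q3, maximum)  # 3 6.0 12 16.0 21
print(data_range)                   # 18
```

The central box of the boxplot would span 6.0 to 16.0, with the median line at 12 and the lines reaching out to 3 and 21.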

The five-number summary and boxplots Figure 2.2 (P.39): side-by-side boxplots comparing the distributions of earnings for two levels of education.

The five-number summary and boxplots

Inference: A boxplot also gives an indication of the symmetry or skewness of a distribution. In a symmetric distribution, Q1 and Q3 are equally distant from the median, but in a right-skewed distribution the third quartile is farther above the median than the first quartile is below it.

Measuring spread: the standard deviation The standard deviation says how far the observations are from their mean. The variance $s^2$ of a set of observations is an average of the squares of the deviations of the observations from their mean. Notation: $s^2$ for variance and $s$ for standard deviation, with $s^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$ and $s = \sqrt{s^2}$.

Why (n-1)? Because the sum of the deviations always equals 0, knowing (n-1) of them determines the last one. Only (n-1) of the squared deviations can vary freely, so we average by dividing the total by (n-1). The number (n-1) is called the degrees of freedom of the variance or standard deviation.
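The degrees-of-freedom argument can be checked numerically. In this sketch the data set is hypothetical and chosen so that the arithmetic is exact; the variable names are illustrative.

```python
# Hypothetical data for illustration.
data = [1, 2, 3, 4, 5]
n = len(data)
x_bar = sum(data) / n  # 3.0

# Deviations of each observation from the mean.
deviations = [x - x_bar for x in data]
total = sum(deviations)

# Because the deviations sum to 0, the last one is fixed by the other n - 1.
last_from_others = -sum(deviations[:-1])

print(deviations)                        # [-2.0, -1.0, 0.0, 1.0, 2.0]
print(total)                             # 0.0
print(last_from_others, deviations[-1])  # 2.0 2.0
```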

Measuring spread: the standard deviation To find the variance and the standard deviation –1. Find the mean of the data set –2. Subtract the mean from each observation (we call the result a deviation) –3. Square each deviation –4. Sum all the squares –5. Divide the sum of squares by n-1, where n is the number of observations. This gives the variance –6. The standard deviation is the positive square root of the variance.
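The six steps translate directly into Python. The data below are hypothetical (they are not the data of Example 2.6), and the variable names are illustrative. The result is compared against `statistics.variance` and `statistics.stdev`, which also divide by n-1.

```python
import math
import statistics

# Hypothetical data for illustration (not from Example 2.6).
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

x_bar = sum(data) / n                   # step 1: the mean (5.0)
deviations = [x - x_bar for x in data]  # step 2: deviations from the mean
squares = [d ** 2 for d in deviations]  # step 3: squared deviations
sum_of_squares = sum(squares)           # step 4: sum of squares (32.0)
variance = sum_of_squares / (n - 1)     # step 5: divide by n - 1
std_dev = math.sqrt(variance)           # step 6: positive square root

print(round(variance, 3))  # 4.571
print(round(std_dev, 3))   # 2.138

# The standard library's sample variance and standard deviation agree.
assert math.isclose(variance, statistics.variance(data))
assert math.isclose(std_dev, statistics.stdev(data))
```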

Measuring spread: the standard deviation Example 2.6 (P.42)

Properties of $s^2$ and $s$: s measures spread about the mean and should be used only when the mean is chosen as the measure of center. s ≥ 0, and s = 0 only when there is no spread, that is, when all observations have the same value. s is not resistant.
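Two of these properties are easy to see numerically with Python's `statistics.stdev`. The lists `constant` and `base` are hypothetical; `with_outlier` reuses the 1, 2, 3, 4, 5, 6, 10000 example from the mean-versus-median slide.

```python
import statistics

constant = [5, 5, 5, 5]                   # hypothetical: no spread at all
base = [1, 2, 3, 4, 5, 6, 7]              # hypothetical: no extreme value
with_outlier = [1, 2, 3, 4, 5, 6, 10000]  # one extreme observation

s_constant = statistics.stdev(constant)
s_base = statistics.stdev(base)
s_outlier = statistics.stdev(with_outlier)

print(s_constant)           # 0.0: s = 0 when every observation is the same
print(round(s_base, 2))     # 2.16
print(s_outlier > s_base)   # True: a single outlier inflates s, so s is not resistant
```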

Choosing measures of center and spread With a skewed distribution or a distribution with extreme outliers, the five-number summary is better. With a symmetric distribution (without outliers), the mean and standard deviation are better.