Chapter 4. In Chapter 3, we used stemplots to look at shape, central location, and spread of a distribution. In this chapter we use numerical summaries.

Presentation transcript:

Chapter 4, slide 1.

Chapter 4, slide 2. In Chapter 3, we used stemplots to look at the shape, central location, and spread of a distribution. In this chapter we use numerical summaries to look at central location and spread.

Chapter 4, slide 3. Summary Statistics
Central location statistics
– Mean
– Median
– Mode
Spread statistics
– Range
– Interquartile range (IQR)
– Variance and standard deviation
Shape statistics exist but are seldom used in practice (not covered)

Chapter 4, slide 4. Notation
n ≡ sample size
X ≡ variable (e.g., ages of subjects)
x_i ≡ value for individual i
Σ ≡ sum all values (“capital sigma”)
Example: Let X = AGE (n = 10): x_1 = 21, x_2 = 42, …, x_10 = 52
Σ x_i = x_1 + x_2 + … + x_10 = 21 + 42 + … + 52 = 290

Chapter 4, slide 5. Central Location: Sample Mean
The sample mean is the most common measure of central location: x-bar = (Σ x_i) / n
For the data on the previous slide: x-bar = 290 / 10 = 29.0
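
The notation and the mean calculation can be sketched in a few lines of Python. Only x_1 = 21, x_2 = 42, x_10 = 52, and the total of 290 come from the slides; the other seven values in `ages` are assumptions chosen to reproduce that total.

```python
# Hypothetical AGE values: only the first two values, the last value, and the
# sum (290) appear on the slides; the rest are assumed for illustration.
ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]

n = len(ages)        # sample size
total = sum(ages)    # "capital sigma": sum of all values
x_bar = total / n    # sample mean

print(n, total, x_bar)  # 10 290 29.0
```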

Chapter 4, slide 6. Example: Sample Mean
The mean is the gravitational center of a batch of numbers

Chapter 4, slide 7. Gravitational Center
A skew tips the distribution, causing the mean to shift toward the tail

Chapter 4, slide 8. Uses of the Sample Mean
Predicts the value of an observation drawn at random from the sample
Predicts the value of an observation drawn at random from the population
Predicts the population mean µ

Chapter 4, slide 9. Population Mean
Same operation as the sample mean, applied to the entire population (N ≡ population size): µ = (Σ x_i) / N
Not readily (never?) available in practice, but conceptually important

Chapter 4, slide 10. Central Location: Median
The median is the value with a depth of (n + 1) / 2 in the ordered data
When n is odd → the median is the middle value
When n is even → average the two middle values
Example: n = 10, so Depth(M) = (10 + 1) / 2 = 5.5; the median falls between the 5th and 6th ordered values, 27 and 28
Average the adjacent values: M = (27 + 28) / 2 = 27.5
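
A minimal sketch of the depth rule follows. The function name `median_by_depth` is made up for this illustration, and the `ages` values are assumed: they were chosen so that the two middle values are 27 and 28, as on the slide.

```python
def median_by_depth(values):
    """Return (depth, median) using the (n + 1) / 2 depth rule."""
    ordered = sorted(values)            # values must be ordered first
    n = len(ordered)
    depth = (n + 1) / 2
    if n % 2 == 1:
        # Odd n: the depth is a whole number and points at the middle value.
        return depth, ordered[int(depth) - 1]
    # Even n: the depth ends in .5, so average the two adjacent values.
    lower, upper = ordered[n // 2 - 1], ordered[n // 2]
    return depth, (lower + upper) / 2


# Hypothetical AGE data (n = 10), assumed for illustration.
ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]
print(median_by_depth(ages))       # (5.5, 27.5)
print(median_by_depth([3, 1, 2]))  # (2.0, 2)
```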

Chapter 4, slide 11. More Examples
Example A: Median = 4
Example B: Median = 5 (average of 4 and 6)
Example C: Median ≠ 2 (values must be ordered first)

Chapter 4, slide 12. The Median is Robust
This data set has x-bar = 1636
The same data set with a data entry error has x-bar = 2743
The median is 1614 in both instances
The median is resistant to skews and outliers
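
The same effect can be reproduced with a small made-up data set. The numbers below are not the slide's data; they are hypothetical values in which a single data entry error (19 typed as 190) moves the mean but leaves the median unchanged.

```python
from statistics import mean, median

# Hypothetical values, not taken from the slides.
clean = [12, 14, 15, 17, 19]
with_error = [12, 14, 15, 17, 190]   # last value mistyped

print(mean(clean), median(clean))            # 15.4 15
print(mean(with_error), median(with_error))  # 49.6 15
```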

Chapter 4, slide 13. Mode
The mode is the most frequent value in the data set
This data set has a mode of 7: {4, 7, 7, 7, 8, 8, 9}
This data set has no mode: {4, 6, 7, 8} (each point appears once)
The mode is useful only in large data sets with repeating values
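
One way to compute the mode so that it matches the slide's convention (no mode when every value appears once) is sketched below. The helper name `modes` is an assumption made for this example; it returns a list because a data set can have more than one most-frequent value.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] if no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []                      # each point appears once: no mode
    return sorted(v for v, c in counts.items() if c == top)


print(modes([4, 7, 7, 7, 8, 8, 9]))  # [7]
print(modes([4, 6, 7, 8]))           # []
```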

Chapter 4, slide 14. Comparison of Mean, Median, Mode
The mean gets “pulled” by the tail:
mean = median → symmetrical
mean > median → positive skew
mean < median → negative skew
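
The comparison can be written as a small rule-of-thumb check. The function name `skew_direction` and the three data sets are hypothetical, chosen only to show each case.

```python
from statistics import mean, median


def skew_direction(values):
    """Classify shape by comparing mean and median (a rough rule of thumb)."""
    m, md = mean(values), median(values)
    if m > md:
        return "positive skew"
    if m < md:
        return "negative skew"
    return "symmetrical"


# Hypothetical data sets, one per case.
print(skew_direction([1, 2, 3, 4, 5]))               # symmetrical
print(skew_direction([1, 2, 2, 3, 3, 3, 4, 20]))     # positive skew
print(skew_direction([1, 17, 18, 18, 19, 19, 20]))   # negative skew
```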

Chapter 4, slide 15. Spread ≡ extent to which data vary around the middle point
Back-to-back stemplot, particulates in air (μg/m³), stems ×10:
Site 1 |  | Site 2
     2 |2 |
     8 |2 |
     2 |3 | 234
    86 |3 | 6689
     2 |4 | 0
       |4 |
       |5 |
       |5 |
       |6 |
     8 |6 |
Sites have similar central locations: medians = 36 and 38, respectively
Site 1 exhibits much greater spread (visually)

Chapter 4, slide 16. Spread: Range
Range = maximum – minimum
Illustrative example:
Site 1 range = 68 – 22 = 46
Site 2 range = 40 – 32 = 8
The sample range is not a good measure of spread: it tends to underestimate the population range
Always supplement the range with at least one additional measure of spread
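
The range calculation for the two sites is sketched below. The values are read off the back-to-back stemplot; the Site 1 reading is an assumption where the transcript is garbled, constrained only by the stated minimum (22) and maximum (68).

```python
def sample_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)


# Values as read from the stemplot (Site 1 reading is assumed).
site1 = [22, 28, 32, 36, 38, 42, 68]
site2 = [32, 33, 34, 36, 36, 38, 39, 40]

print(sample_range(site1))  # 46
print(sample_range(site2))  # 8
```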

Chapter 4, slide 17. Spread: Interquartile Range
Quartile 1 (Q1): marks the bottom quarter of the data = middle of the lower half of the data set
Quartile 3 (Q3): marks the top quarter of the data = middle of the top half of the data set
Interquartile Range (IQR) = Q3 – Q1; covers the middle 50% of the distribution
Example: Q1 = 21, Q3 = 42, and IQR = 42 – 21 = 21

Chapter 4, slide 18. Five-Point Summary
Q0 (the minimum)
Q1 (25th percentile)
Q2 (median)
Q3 (75th percentile)
Q4 (the maximum)
Example: 5-point summary = 5, 21, 27.5, 42, 52

Chapter 4, slide 19. Quartiles: Tukey’s Hinges
Data: metabolic rates (cal/day), n = 7
When n is odd, include the median in both “halves”
Q1 = 1449.5 (median of the bottom half)
Q3 = 1729 (median of the top half)
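
Tukey's hinges and the five-point summary can be sketched as follows. The function name `five_point_summary` and both data sets are assumptions: the values were chosen to agree with the summaries quoted on the slides (5, 21, 27.5, 42, 52 for the n = 10 example, and a median of 1614 with Q3 = 1729 for the seven metabolic rates). Note that this hinge rule can differ from the quartile methods used by statistical software.

```python
from statistics import median


def five_point_summary(values):
    """Return (min, Q1, median, Q3, max) using Tukey's hinges."""
    x = sorted(values)
    n = len(x)
    half = (n + 1) // 2          # for odd n the median sits in both halves
    lower, upper = x[:half], x[n - half:]
    return (x[0], median(lower), median(x), median(upper), x[-1])


# Hypothetical data consistent with the slide summaries.
ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]
rates = [1362, 1439, 1460, 1614, 1666, 1792, 1867]

print(five_point_summary(ages))   # (5, 21, 27.5, 42, 52)
print(five_point_summary(rates))  # (1362, 1449.5, 1614, 1729.0, 1867)
```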

Chapter 4, slide 20. §4.6 Boxplots
1. Draw a box from Q1 to Q3
2. Draw a line for the median
3. Calculate fences:
   Fence_Lower = Q1 – 1.5(IQR)
   Fence_Upper = Q3 + 1.5(IQR)
4. Do not draw the fences
5. Any values outside the fences (outside values) are plotted separately
6. Determine the most extreme values still inside the fences (inside values)
7. Draw whiskers from the quartiles to the inside values
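
The numerical part of the boxplot procedure (hinges, fences, outside values, whisker ends) can be sketched like this. The function name `boxplot_stats` and the two data sets are assumptions; the data were chosen to agree with the five-point summaries given in Examples 1 and 2 of this chapter.

```python
from statistics import median


def boxplot_stats(values):
    """Compute the numbers needed to draw a boxplot by hand."""
    x = sorted(values)
    n = len(x)
    half = (n + 1) // 2                      # Tukey's hinges
    q1, q3 = median(x[:half]), median(x[n - half:])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in x if lower_fence <= v <= upper_fence]
    outside = [v for v in x if v < lower_fence or v > upper_fence]
    return {
        "fences": (lower_fence, upper_fence),
        "outside": outside,                        # plotted as separate points
        "whiskers": (min(inside), max(inside)),    # inside values
    }


# Hypothetical data consistent with Examples 1 and 2.
example1 = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]
example2 = [3, 21, 22, 24, 25, 26, 28, 29, 31, 51]

print(boxplot_stats(example1))
print(boxplot_stats(example2))
```

With these assumed values, the first call gives fences of –10.5 and 73.5 with no outside values, and the second gives fences of 11.5 and 39.5 with outside values 3 and 51.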

Chapter 4, slide 21. Example 1: Boxplot
1. 5-point summary: {5, 21, 27.5, 42, 52}; box from 21 to 42 with IQR = 42 – 21 = 21
2. F_U = Q3 + 1.5(IQR) = 42 + (1.5)(21) = 73.5
   F_L = Q1 – 1.5(IQR) = 21 – (1.5)(21) = –10.5
3. No values above the upper fence; no values below the lower fence
4. Upper inside value = 52; lower inside value = 5; draw whiskers
Boxplot labels: Upper inside = 52, Q3 = 42, Q2 = 27.5, Q1 = 21, Lower inside = 5

Chapter 4, slide 22. Example 2: Boxplot
1. 5-point summary: 3, 22, 25.5, 29, 51; hinges at 22 and 29
2. IQR = 29 – 22 = 7
   F_U = Q3 + 1.5(IQR) = 29 + (1.5)(7) = 39.5
   F_L = Q1 – 1.5(IQR) = 22 – (1.5)(7) = 11.5
3. One upper outside value (51); one lower outside value (3)
4. Upper inside value is 31; lower inside value is 21; draw whiskers

Chapter 4, slide 23. Example 3: Boxplot
Seven metabolic rates (cal/day)
1. 5-point summary: 1362, 1449.5, 1614, 1729, 1867
2. IQR = 1729 – 1449.5 = 279.5
   F_U = Q3 + 1.5(IQR) = 1729 + 1.5(279.5) = 2148.25
   F_L = Q1 – 1.5(IQR) = 1449.5 – 1.5(279.5) = 1030.25
3. None outside
4. Inside values: 1867 and 1362

Chapter 4, slide 24. Boxplots: Interpretation
Central location: position of the median and the box (IQR)
Spread: hinge-spread (IQR), whisker spread, range
Shape: symmetry of the median within the box and of the box within the whiskers, tail length (kurtosis), outside values

Chapter 4, slide 25. Spread: Standard Deviation
The standard deviation is the most popular measure of spread
σ ≡ population standard deviation
s ≡ sample standard deviation
Based on deviations around the mean

Chapter 4, slide 26. Deviations
Deviation = distance from the mean = x_i – x-bar
This example (x-bar = 36) shows a deviation of −3 for the data point 33
It shows a deviation of 4 for the data point 40

Chapter 4, slide 27. “Sum of squares”
Observation   Deviation        Squared deviation
36            36 − 36 = 0      0² = 0
38            38 − 36 = 2      2² = 4
39            39 − 36 = 3      3² = 9
40            40 − 36 = 4      4² = 16
36            36 − 36 = 0      0² = 0
34            34 − 36 = −2     (−2)² = 4
33            33 − 36 = −3     (−3)² = 9
32            32 − 36 = −4     (−4)² = 16
SUMS          0                SS = 58

Chapter 4, slide 28. Sum of Squares (SS), Variance (s²), Standard Deviation (s)
SS = Σ (x_i – x-bar)²
s² = SS / (n – 1)
s = √(s²)

Chapter 4, slide 29. Variance & Standard Deviation
Sample variance: s² = SS / (n – 1)
Standard deviation: s = √(s²)
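
The chain from deviations to standard deviation can be traced step by step. The eight values below are the Site 2 readings from the stemplot earlier in the chapter; treating them as the data behind the sum-of-squares slide is an assumption, supported by the fact that they give a mean of 36 and SS = 58.

```python
import math
from statistics import stdev, variance

# Site 2 particulate readings as read from the stemplot (assumed to be the
# data behind the "Sum of squares" slide).
x = [36, 38, 39, 40, 36, 34, 33, 32]

n = len(x)
x_bar = sum(x) / n                         # 36.0
deviations = [xi - x_bar for xi in x]      # these always sum to 0
ss = sum(d ** 2 for d in deviations)       # sum of squares: 58.0
s2 = ss / (n - 1)                          # sample variance
s = math.sqrt(s2)                          # sample standard deviation

print(x_bar, sum(deviations), ss)
print(round(s2, 3), round(s, 3))

# The standard library gives the same results.
assert math.isclose(s2, variance(x))
assert math.isclose(s, stdev(x))
```

Dividing by n – 1 rather than n is what makes this the sample variance; `statistics.pvariance` would divide by n instead.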

Chapter 4, slide 30. Interpretation of Sample Standard Deviation s
Measure of spread
Estimator of the population standard deviation σ
68–95–99.7 rule (Normal distributions)
Chebychev’s rule (all distributions)

Chapter 4, slide 31. 68–95–99.7 Rule
Applies to Normal distributions only!
68% of values within μ ± σ
95% within μ ± 2σ
99.7% within μ ± 3σ
Example: Normal distribution with μ = 30 and σ = 10:
68% of values in 30 ± 10 = 20 to 40
95% in 30 ± (2)(10) = 10 to 50
99.7% in 30 ± (3)(10) = 0 to 60
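
The three percentages can be checked against the Normal distribution itself. This sketch uses `statistics.NormalDist` from the Python standard library with the slide's example values μ = 30 and σ = 10; the variable names are made up for this illustration.

```python
from statistics import NormalDist

mu, sigma = 30, 10
dist = NormalDist(mu=mu, sigma=sigma)

coverage = {}
for k in (1, 2, 3):
    low, high = mu - k * sigma, mu + k * sigma
    # Probability of falling between mu - k*sigma and mu + k*sigma.
    coverage[k] = dist.cdf(high) - dist.cdf(low)
    print(f"{low} to {high}: {coverage[k]:.4f}")
```

The exact probabilities are roughly 0.6827, 0.9545, and 0.9973, so the rule's 68%, 95%, and 99.7% are rounded values.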

Chapter 4, slide 32. Chebychev’s Rule
Applies to all distributions
At least 75% of the values lie within μ ± 2σ
Example: A distribution with μ = 30 and σ = 10 has at least 75% of the values within 30 ± (2)(10) = 30 ± 20 = 10 to 50
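
The slide states the rule for two standard deviations only. The general form of Chebychev's inequality, which is not on the slide, says that at least 1 – 1/k² of the values lie within k standard deviations of the mean for any k > 1. The sketch below checks the k = 2 case on a made-up, strongly skewed data set treated as a whole population; the function name and the data are hypothetical.

```python
from statistics import mean, pstdev


def chebyshev_bound(k):
    """Minimum fraction of values within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2


# Hypothetical skewed data, treated as a complete population.
data = [1, 1, 2, 2, 3, 3, 4, 5, 8, 40]
mu, sigma = mean(data), pstdev(data)

within = [v for v in data if mu - 2 * sigma <= v <= mu + 2 * sigma]
fraction = len(within) / len(data)

print(chebyshev_bound(2))  # 0.75
print(fraction)            # 0.9
```

Even with one extreme value, 90% of this data set falls within two standard deviations, which is consistent with the guaranteed minimum of 75%.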

Chapter 4, slide 33. Rounding
There is no single rule for rounding
The number of significant digits should reflect the precision of the measurement
Use judgment and “be kind to your reader”
Rough guide: carry at least four significant digits during calculations and round as the final step

Chapter 4, slide 34. Choosing Summary Statistics
Always report a measure of central location, a measure of spread, and the sample size
Symmetrical distributions → report the mean and standard deviation
Asymmetrical distributions → report the 5-point summary (or the median and IQR)

Chapter 4, slide 35. Software and Calculators
Use ‘em