Copyright © 2011 Pearson Education, Inc. Alternative Approaches to Inference Chapter 17.

Presentation transcript:

Copyright © 2011 Pearson Education, Inc. Alternative Approaches to Inference Chapter 17

17.1 A Confidence Interval for the Median
An auto insurance company is thinking about compensating agents by comparing the number of claims they produce to a standard. Annual claims average near $3,200 with a median claim of $2,000.
• Claims are highly skewed.
• Use nonparametric methods that don't rely on a normal sampling distribution.

17.1 A Confidence Interval for the Median: Distribution of Sample of Claims (n = 42)
For this sample, the average claim is $3,632 with s = $4,254. The median claim is $2,456.

17.1 A Confidence Interval for the Median: Is the Sample Mean Compatible with µ = $3,200?
• To answer this question, construct a 95% confidence interval for µ.
• This interval is $3,632 ± 2.02 × $4,254/√42 = [$2,306 to $4,958].
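The arithmetic behind this interval can be checked with a short sketch. It assumes the usual t-interval form, mean ± t × s/√n, with n = 42 taken from the preceding slide and the critical value 2.02 taken from this one; the function name `t_interval` is illustrative, not from the slides.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Return (low, high) for mean ± t_crit * sd / sqrt(n)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Summary statistics quoted on the slides: mean $3,632, s = $4,254, n = 42.
low, high = t_interval(mean=3632, sd=4254, n=42, t_crit=2.02)
print(f"95% t-interval: ${low:,.0f} to ${high:,.0f}")
```

The interval contains $3,200, but the arithmetic alone says nothing about whether the conditions for a t-interval hold for skewed data.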

17.1 A Confidence Interval for the Median: Is the Sample Mean Compatible with µ = $3,200?
• The national average of $3,200 lies within the 95% confidence t-interval for the mean.
• BUT the sample does not satisfy the sample size condition necessary to use the t-interval.
• When the conditions are not met, the t-interval is unreliable and its coverage is unknown.

17.1 A Confidence Interval for the Median: Nonparametric Statistics
• Avoid making assumptions about the shape of the population.
• Often rely on sorting the data.
• Suited to parameters such as the population median θ (theta).

17.1 A Confidence Interval for the Median: Nonparametric Statistics
• For the claims data, which are highly skewed to the right, θ < µ.
• If the population distribution is symmetric, then θ = µ.

17.1 A Confidence Interval for the Median: Nonparametric Confidence Interval
• The first step in finding a confidence interval for θ is to sort the observed data in ascending order; the sorted values are known as order statistics.
• Order statistics are denoted X(1) < X(2) < … < X(n).

17.1 A Confidence Interval for the Median: Nonparametric Confidence Interval
• If the data are an SRS from a population with median θ, then we know:
  1. The probability that a random draw from the population is less than or equal to θ is ½.
  2. The observations in the random sample are independent.

17.1 A Confidence Interval for the Median: Nonparametric Confidence Interval
• Use the binomial distribution to determine the probability that the population median lies between each pair of adjacent ordered observations.
• To form the confidence interval for θ, combine several segments to achieve the desired coverage.

17.1 A Confidence Interval for the Median: Nonparametric Confidence Interval
• In general, we can't construct a confidence interval for θ whose coverage is exactly 95%.
• The 94.6% confidence interval for the median claim is [$1,217 to $3,168].
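A minimal sketch of the binomial calculation follows. It assumes a continuous population, so ties have probability zero and the number of observations below θ is Binomial(n, ½). The function name `median_ci_coverage` and the sample size n = 10 are illustrative only; the slides do not say which order statistics give the 94.6% interval, so no attempt is made to reproduce that figure.

```python
from math import comb

def median_ci_coverage(n, i, j):
    """Coverage of [X(i), X(j)] as a confidence interval for the median.

    With B ~ Binomial(n, 1/2) counting observations below the median,
    the interval covers the median exactly when i <= B <= j - 1.
    """
    return sum(comb(n, k) for k in range(i, j)) / 2**n

# Hypothetical sample size, chosen only to show how coarse the choices are.
n = 10
for i in range(1, n // 2 + 1):
    j = n + 1 - i  # symmetric pair of order statistics
    print(f"[X({i}), X({j})]: coverage = {median_ci_coverage(n, i, j):.4f}")
```

Only a handful of coverage values are attainable for a given n, which is why a nominal level such as 95% usually cannot be hit exactly.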

17.1 A Confidence Interval for the Median: Parametric versus Nonparametric
• Limitations of nonparametric methods:
  1. Coverage is limited to certain values determined by sums of binomial probabilities (it is difficult to obtain exactly 95% coverage).
  2. The median is not equal to the mean if the population distribution is skewed. This prevents us from obtaining estimates for the total (total = nµ).

17.2 Transformations: Transform Data into Symmetric Distributions
Taking base 10 logs of the claims data results in a more symmetric distribution.

17.2 Transformations: Transform Data into Symmetric Distributions
Taking base 10 logs of the claims data results in data that could be from a normal distribution.

17.2 Transformations: Transform Data into Symmetric Distributions
• If y = log10 x, the 95% confidence t-interval for µ_y, computed from the mean and standard deviation of the logs, is [3.16 to 3.47].
• If we convert back to the original scale of dollars, this interval resembles that for the median rather than that for the mean.
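Converting the endpoints back to dollars only requires undoing the base 10 log. The sketch below uses the interval [3.16 to 3.47] quoted on the slide; the variable names are illustrative.

```python
# Endpoints of the 95% t-interval on the log10 scale (from the slide).
low_log, high_log = 3.16, 3.47

# Undo the base 10 log: x = 10 ** y.
low_dollars = 10 ** low_log
high_dollars = 10 ** high_log

print(f"Back-transformed interval: ${low_dollars:,.0f} to ${high_dollars:,.0f}")
```

The result, roughly $1,445 to $2,951, falls inside the nonparametric interval for the median ([$1,217 to $3,168]) and barely overlaps the t-interval for the mean ([$2,306 to $4,958]).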

17.3 Prediction Intervals
• Prediction interval: an interval that holds a future draw from the population with a chosen probability.
• For the auto insurance example, a prediction interval anticipates the size of the next claim, allowing for the random variation associated with an individual.

17.3 Prediction Intervals: For a Normal Population
The 100(1 – α)% prediction interval for an independent draw from a normal population is x̄ ± t(α/2, n–1) × s × √(1 + 1/n), where x̄ and s estimate µ and σ.
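As a sketch, the standard prediction interval for a normal population, x̄ ± t × s × √(1 + 1/n), can be computed as below. The function name and the numbers in the usage example (x̄ = 100, s = 15, n = 25) are hypothetical and are not from the slides; the critical value 2.064 is the 97.5th percentile of the t distribution with 24 degrees of freedom and is passed in directly rather than computed.

```python
import math

def normal_prediction_interval(xbar, s, n, t_crit):
    """Prediction interval for one new draw from a normal population.

    The factor sqrt(1 + 1/n) combines the variation of the new
    observation (the 1) with the uncertainty in xbar (the 1/n).
    """
    half_width = t_crit * s * math.sqrt(1 + 1 / n)
    return xbar - half_width, xbar + half_width

# Hypothetical data summary, for illustration only.
low, high = normal_prediction_interval(xbar=100, s=15, n=25, t_crit=2.064)
print(f"95% prediction interval: {low:.1f} to {high:.1f}")
```

Unlike a confidence interval for µ, this interval does not shrink toward zero width as n grows; its half-width approaches t × s.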

17.3 Prediction Intervals: Nonparametric Prediction Interval
• Relies on the properties of order statistics:
  P(X(i) ≤ X ≤ X(i+1)) = 1/(n + 1)
  P(X ≤ X(1)) = 1/(n + 1)
  P(X(n) ≤ X) = 1/(n + 1)

17.3 Prediction Intervals: Nonparametric Prediction Interval
• Combine segments to get the desired coverage.
• P(X(2) ≤ X ≤ X(41)) = P($255 ≤ X ≤ $17,305) = (41 – 2)/43 ≈ 0.91
• There is a 91% chance that the next claim is between $255 and $17,305.
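Because each of the n + 1 segments between order statistics carries probability 1/(n + 1), the coverage of the interval from X(i) to X(j) is (j – i)/(n + 1). A minimal sketch, with an illustrative function name, applied to the claims sample (n = 42):

```python
from fractions import Fraction

def prediction_coverage(n, i, j):
    """Probability that a new draw falls between X(i) and X(j).

    The interval spans j - i of the n + 1 equally likely segments.
    """
    return Fraction(j - i, n + 1)

# Claims data: n = 42, interval from X(2) to X(41).
coverage = prediction_coverage(n=42, i=2, j=41)
print(f"coverage = {coverage} = {float(coverage):.3f}")
```

Using `Fraction` keeps the result exact (39/43) before rounding to the 91% quoted on the slide.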

4M Example 17.1: EXECUTIVE SALARIES
Motivation: Fees earned by an executive placement service are 5% of the starting annual total compensation package. How much can the firm expect to earn by placing a current client as a CEO in the telecom industry?

4M Example 17.1: EXECUTIVE SALARIES
Method: Obtain data (n = 23 CEOs from the telecom industry).

4M Example 17.1: EXECUTIVE SALARIES
Method: The distribution of total compensation for CEOs in the telecom industry is not normal. Construct a nonparametric prediction interval for the client's anticipated total compensation package.

4M Example 17.1: EXECUTIVE SALARIES
Mechanics: Sort the data.

4M Example 17.1: EXECUTIVE SALARIES
Mechanics: The interval x(3) to x(21) is $743,801 to $29,863,393 and is a 75% prediction interval.

4M Example 17.1: EXECUTIVE SALARIES
Message: The compensation package of three out of four placements in this industry is predicted to be in the range from about $750,000 to $30,000,000. The implied fee ranges from $37,500 to $1,500,000.
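The numbers in this example can be traced with a short sketch. It assumes the same (j – i)/(n + 1) coverage rule used for nonparametric prediction intervals and the 5% fee stated in the motivation; the variable names are illustrative.

```python
FEE_RATE = 0.05  # fee is 5% of the total compensation package

# n = 23 CEOs; the interval runs from x(3) to x(21).
n, i, j = 23, 3, 21
coverage = (j - i) / (n + 1)  # 18 of the 24 segments

# Endpoints quoted on the slides, and the rounded versions used in the message.
exact_pay = (743_801, 29_863_393)
rounded_pay = (750_000, 30_000_000)

exact_fees = tuple(FEE_RATE * pay for pay in exact_pay)
rounded_fees = tuple(FEE_RATE * pay for pay in rounded_pay)

print(f"coverage = {coverage:.2f}")
print(f"fees from exact endpoints:   ${exact_fees[0]:,.0f} to ${exact_fees[1]:,.0f}")
print(f"fees from rounded endpoints: ${rounded_fees[0]:,.0f} to ${rounded_fees[1]:,.0f}")
```

The fee range in the message comes from the rounded endpoints; the unrounded endpoints give slightly smaller fees.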

17.4 Proportions Based on Small Samples: Wilson's Interval for a Proportion
• An adjustment that moves the sampling distribution of p̂ closer to ½ and away from the troublesome boundaries at 0 and 1.
• Add four artificial cases (2 successes and 2 failures) to create an adjusted proportion.

17.4 Proportions Based on Small Samples: Wilson's Interval for a Proportion
Add 2 successes and 2 failures to the data and define p̃ = (# of successes + 2)/(n + 4), with ñ = n + 4. The z-interval is p̃ ± z(α/2) × √(p̃(1 – p̃)/ñ).

4M Example 17.2: DRUG TESTING
Motivation: A company is developing a drug to prolong time before a relapse of cancer. The drug must cut the rate of relapse in half. To test this drug, the company first needs to know the current time to relapse.

4M Example 17.2: DRUG TESTING
Method: Data are collected for 19 patients who were observed for 24 months. Doctors found a relapse in 9 of the 19 patients. While the SRS condition is satisfied, the sample size condition is not. Use Wilson's interval for a proportion.

4M Example 17.2: DRUG TESTING
Mechanics: By adding two successes and two failures, we have p̃ = (9 + 2)/(19 + 4) = 11/23 ≈ 0.478 and ñ = 23. The interval is 0.478 ± 1.96 × √(0.478 × 0.522/23) = [0.27 to 0.68].
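The adjusted interval can be sketched in a few lines. The code assumes the "add 2 successes and 2 failures" rule described for Wilson's interval in this chapter, followed by the usual z-interval on the adjusted proportion; the function name is illustrative, and the z value is taken from Python's `statistics.NormalDist` instead of a table.

```python
from math import sqrt
from statistics import NormalDist

def plus_four_interval(successes, n, confidence=0.95):
    """z-interval for a proportion after adding 2 successes and 2 failures."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    n_adj = n + 4
    p_adj = (successes + 2) / n_adj
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half_width, p_adj + half_width

# Drug testing example: 9 relapses among 19 patients.
low, high = plus_four_interval(successes=9, n=19)
print(f"95% interval: {low:.2f} to {high:.2f}")
```

With 9 relapses in 19 patients, the adjusted proportion is 11/23, only slightly closer to ½ than the raw 9/19; the adjustment matters more when the observed proportion is near 0 or 1.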

4M Example 17.2: DRUG TESTING
Message: We are 95% confident that the proportion of patients with this cancer who relapse within 24 months is between 27% and 68%. In order to cut this proportion in half, the drug will have to reduce this rate to somewhere between 13% and 34%.

Best Practices
• Check the assumptions carefully when dealing with small samples.
• Consider a nonparametric alternative if you suspect non-normal data.
• Use the adjustment procedure for proportions from small samples.
• Verify that your data are an SRS.

Pitfalls
• Avoid assuming that populations are normally distributed in order to use a t-interval for the mean.
• Do not use confidence intervals based on normality just because they are narrower than a nonparametric interval.
• Do not think that you can prove normality using a normal quantile plot.

Pitfalls (Continued)
• Do not rely on software to know which procedure to use.
• Do not use a confidence interval when you need a prediction interval.