Use of Chebyshev’s Theorem to Determine Confidence Intervals


1 Use of Chebyshev’s Theorem to Determine Confidence Intervals
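Use when the sample is a relatively large share of a small population (n > 0.05N) and the population is not normally distributed. Solve 1 − 1/k² = C for k, where C is the confidence level; the interval is then the sample mean plus or minus k standard errors.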

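2 Find an 80% and a 90% confidence interval for a population of 500
with a sample size n = 100, a sample mean x̄ = 45, and a standard deviation s = 12. Use Chebyshev’s Theorem.
Check: 100 > 0.05(500) = 25.
Standard error: s/√n = 12/√100 = 1.2.
80%: 1 − 1/k² = 0.8, so k = √5 ≈ 2.23 and the interval is 45 ± 2.23(1.2) = [42.324, 47.676].
90%: 1 − 1/k² = 0.9, so k = √10 ≈ 3.16 and the interval is 45 ± 3.16(1.2) = [41.208, 48.792].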
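The Chebyshev calculation can be sketched in a few lines of Python. This is only an illustration under the slide's own setup (standard error taken as s/√n, no finite-population correction); the function name `chebyshev_interval` is made up for this example. Because the code keeps k unrounded, its limits can differ in the third decimal from values computed with k cut to 2.23 or 3.16.

```python
import math

def chebyshev_interval(mean, s, n, confidence):
    """Return (k, lower, upper) for a Chebyshev-based interval.

    k solves 1 - 1/k^2 = confidence; the standard error is s / sqrt(n),
    matching the slide (assumption: no finite-population correction).
    """
    k = 1.0 / math.sqrt(1.0 - confidence)
    se = s / math.sqrt(n)
    return k, mean - k * se, mean + k * se

for c in (0.80, 0.90):
    k, lo, hi = chebyshev_interval(45, 12, 100, c)
    print(f"{c:.0%}: k = {k:.3f}, interval = [{lo:.3f}, {hi:.3f}]")
```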

3 Confidence Intervals for Population Proportions
Assumptions
1. The sample is a simple random sample.
2. The conditions for the binomial distribution are satisfied.
3. The normal distribution can be used to approximate the distribution of sample proportions because np ≥ 5 and n(1 − p) ≥ 5 are both satisfied.
(page 330 of text)

4 Notation for Proportions
π = population proportion
p = x/n = sample proportion of x successes in a sample of size n

6 Definition Point Estimate
The sample proportion p is the best point estimate of the population proportion π. (page 331 of text)

8 Standard Error
$$SE = \sqrt{\frac{p(1-p)}{n}}$$
Ask students where the z score will be found (Table A-2, as in previous problems). Instructors should ask students to practice the margin of error (E) computation with their calculators. Grouping symbols (parentheses) will be needed with the factors under the radical.

9 Confidence Interval for Population Proportion
$$p \pm z_{\alpha/2} \cdot SE$$
Once the margin of error E = z(α/2) · SE has been computed, the confidence interval is found as for previous intervals: the sample statistic minus E gives the lower limit and the sample statistic plus E gives the upper limit.

10 Confidence Interval for Population Proportion
A poll of 100 students found that 60 prefer Mrs. Peloquin as a teacher compared to Mr. Roesler. Find 95% and 85% confidence intervals for the proportion of students who prefer Mrs. Peloquin.
p = 0.6 (need at least 80; see page 340), 1 − p = 0.4
SE = √(0.6 × 0.4 / 100) ≈ 0.049
95%: 0.6 ± 1.96 × 0.049, giving [0.504, 0.696]
85%: 0.6 ± 1.44 × 0.049, giving [0.529, 0.671]
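As a rough check on the poll example, the interval can be computed with Python's standard library. The function name `proportion_interval` is hypothetical, and the z value comes from `statistics.NormalDist` rather than Table A-2. Because the code keeps SE unrounded (about 0.0490), its limits can differ slightly in the third decimal from values computed with a rounded or truncated SE.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, confidence):
    """Normal-approximation confidence interval for a proportion."""
    p = x / n                                  # sample proportion
    se = sqrt(p * (1 - p) / n)                 # standard error
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical z value
    me = z * se                                # margin of error
    return p - me, p + me

for level in (0.95, 0.85):
    lo, hi = proportion_interval(60, 100, level)
    print(f"{level:.0%}: [{lo:.3f}, {hi:.3f}]")
```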

11 Round-Off Rule for Confidence Interval Estimates of p
Round the confidence interval limits to three significant digits. This rounding rule is similar to that of Chapter 3, when probabilities were given in decimal form. (page 332 of text)

12 Determining Sample Size
$$ME = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$
Solve for n by algebra. Isolating n requires squaring the formula to remove the square root:
$$n = \frac{(z_{\alpha/2})^2\, p(1-p)}{ME^2}$$
(page 334 of text)
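For readers who want the intermediate algebra, the isolation of n can be written out step by step. The symbols are the ones used on these slides (ME for margin of error, p for the sample proportion); nothing new is assumed.

```latex
\begin{align*}
ME &= z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \\
ME^2 &= (z_{\alpha/2})^2\,\frac{p(1-p)}{n} && \text{(square both sides)} \\
n \cdot ME^2 &= (z_{\alpha/2})^2\, p(1-p) && \text{(multiply by } n\text{)} \\
n &= \frac{(z_{\alpha/2})^2\, p(1-p)}{ME^2} && \text{(divide by } ME^2\text{)}
\end{align*}
```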

16 Sample Size for Estimating Proportion p
When an estimate of p is known:
$$n = \frac{(z_{\alpha/2})^2\, p(1-p)}{ME^2}$$
When no estimate of p is known:
$$n = \frac{(z_{\alpha/2})^2\,(0.25)}{ME^2}$$
The reason for the change to Formula 6-6 can be demonstrated by the areas of rectangles. See the author’s margin note on page 334. With q = 1 − p, the product pq is the area of a rectangle with sides p and q. The table in the next two slides (where p and q increase and decrease toward the highest possible value of pq) demonstrates the maximum area of such a rectangle, 0.25 at p = q = 0.5, hence the use of 0.25 for pq.
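The value 0.25 can also be confirmed algebraically, without the rectangle picture. The short derivation below completes the square; it is offered as a supplementary check rather than as the textbook's own argument.

```latex
\begin{align*}
p(1-p) &= p - p^2 \\
       &= \tfrac{1}{4} - \left(p - \tfrac{1}{2}\right)^2 \\
       &\le \tfrac{1}{4}, \quad \text{with equality only when } p = \tfrac{1}{2}.
\end{align*}
```

Using 0.25 therefore gives the largest sample size the formula can require for a given margin of error and confidence level.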

17 Example: We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using . Assuming that we want 90% confidence in our results, how many households must we survey? A 1997 study indicates 16.9% of U.S. households used . (Example on page 335 of text, part (a).)
$$n = \frac{(z_{\alpha/2})^2\, p(1-p)}{ME^2} = \frac{(1.645)^2 (0.169)(0.831)}{0.04^2} \approx 237.52$$
Rounding up gives 238 households. To be 90% confident that our sample percentage is within four percentage points of the true percentage for all households, we should randomly select and survey 238 households.

21 Example: We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using . Assuming that we want 90% confidence in our results, how many households must we survey? There is no prior information suggesting a possible value for the sample percentage. (Example on page 335 of text, part (b).)
$$n = \frac{(z_{\alpha/2})^2\,(0.25)}{ME^2} = \frac{(1.645)^2 (0.25)}{0.04^2} \approx 422.82$$
Rounding up gives 423 households. With no prior information, we need a larger sample to achieve the same results with 90% confidence and an error of no more than four percentage points. Part (a) had previous knowledge of a sample percentage and required 238 households. Part (b) had no previous knowledge and required 423 households, a substantial increase. This increase was necessary to maintain the requirements of 90% confidence and an error of no more than four percentage points. A lower degree of confidence or a higher margin of error allowance would have produced a smaller sample size with no previous sample information.
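Both parts of the household example can be reproduced with a small Python sketch. The function name `sample_size_for_proportion` and its parameters are invented for illustration; the z value is taken from `statistics.NormalDist` (about 1.645 for 90% confidence), and the result is always rounded up because a fractional household cannot be surveyed.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, p_estimate=None):
    """Sample size needed to estimate a proportion within `margin`.

    If no prior estimate of p is available, p(1 - p) is replaced by
    its maximum possible value, 0.25.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    pq = 0.25 if p_estimate is None else p_estimate * (1 - p_estimate)
    return ceil(z * z * pq / margin ** 2)   # always round up

print(sample_size_for_proportion(0.04, 0.90, p_estimate=0.169))  # part (a)
print(sample_size_for_proportion(0.04, 0.90))                    # part (b)
```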

26 Small Samples Assumptions
If
1) n ≤ 30
2) The sample is a simple random sample.
3) The sample is from a normally distributed population.
Case 1 (σ is known): largely unrealistic; use z-scores.
Case 2 (σ is unknown): use the Student t distribution.
Requirement 3 is a loose requirement, which can be met if the population has only one mode and is basically symmetric. Smaller samples will have means that are likely to vary more. The greater variation is accounted for by the t distribution. Students usually like hearing about the history of how the Student t distribution got its name. See Note to Instructor in margin for other ideas.

27 Student t Distribution
If the distribution of a population is essentially normal, then the distribution of
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
is essentially a Student t distribution for all samples of size n. It is used to find critical values denoted by t(α/2). The Student t distribution is usually referred to as the t distribution.

29 Degrees of Freedom (df)
Definition: Degrees of freedom (df) corresponds to the number of sample values that can vary after certain restrictions have been imposed on all data values. (page 314 of text)
df = n − 1 in this section. Explain that even though this is the same formula for degrees of freedom as in Chapter 6, other statistical procedures will have different formulas.
Example: n = 10 values are required to have a mean of x̄ = 80. Nine of the values can be any number, but the tenth must then be one specific number, so df = 10 − 1 = 9.
Students usually relate to this quite well if they think of it as a group of exams during a semester and what would be needed on the final exam in order to make a certain average (assuming each exam carried the same weight).
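The exam analogy can be made concrete with a tiny Python sketch. The nine scores below are made-up numbers chosen only for illustration, and `required_last_value` is a hypothetical helper name. Once nine values and the target mean are fixed, the tenth value has no freedom left.

```python
def required_last_value(free_values, target_mean):
    """Value the final observation must take so the overall mean hits target_mean."""
    n = len(free_values) + 1                  # total number of observations
    return target_mean * n - sum(free_values)

# Nine hypothetical exam scores: these can be "any number".
scores = [72, 85, 90, 68, 77, 81, 94, 79, 83]

# The tenth score is then forced: df = 10 - 1 = 9.
final_exam = required_last_value(scores, 80)
print(final_exam)
print(sum(scores + [final_exam]) / 10)
```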

32 Table A-3 t Distribution
One tail     .005     .01      .025     .05      .10      .25
Two tails    .01      .02      .05      .10      .20      .50
Degrees of freedom
1            63.657   31.821   12.706   6.314    3.078    1.000
2            9.925    6.965    4.303    2.920    1.886    .816
3            5.841    4.541    3.182    2.353    1.638    .765
4            4.604    3.747    2.776    2.132    1.533    .741
5            4.032    3.365    2.571    2.015    1.476    .727
6            3.707    3.143    2.447    1.943    1.440    .718
7            3.500    2.998    2.365    1.895    1.415    .711
8            3.355    2.896    2.306    1.860    1.397    .706
9            3.250    2.821    2.262    1.833    1.383    .703
10           3.169    2.764    2.228    1.812    1.372    .700
11           3.106    2.718    2.201    1.796    1.363    .697
12           3.054    2.681    2.179    1.782    1.356    .696
13           3.012    2.650    2.160    1.771    1.350    .694
14           2.977    2.625    2.145    1.761    1.345    .692
15           2.947    2.602    2.132    1.753    1.341    .691
16           2.921    2.584    2.120    1.746    1.337    .690
17           2.898    2.567    2.110    1.740    1.333    .689
18           2.878    2.552    2.101    1.734    1.330    .688
19           2.861    2.540    2.093    1.729    1.328    .688
20           2.845    2.528    2.086    1.725    1.325    .687
21           2.831    2.518    2.080    1.721    1.323    .686
22           2.819    2.508    2.074    1.717    1.321    .686
23           2.807    2.500    2.069    1.714    1.320    .685
24           2.797    2.492    2.064    1.711    1.318    .685
25           2.787    2.485    2.060    1.708    1.316    .684
26           2.779    2.479    2.056    1.706    1.315    .684
27           2.771    2.473    2.052    1.703    1.314    .684
28           2.763    2.467    2.048    1.701    1.313    .683
29           2.756    2.462    2.045    1.699    1.311    .683
Large (z)    2.575    2.327    1.960    1.645    1.282    .675
Emphasize the difference in how Table A-3 is set up versus that of Table A-2. In Table A-3, the critical values are found in the body of the table. In Table A-2, the critical values (z scores) are found in the left column and across the top row. Table A-3 is set up for only certain common values of α, a limitation of the table.

33 Given a variable that has a t-distribution with the specified
degrees of freedom, what percentage of the time will it be in the indicated region?
a. 10 df, between −1.81 and 1.81: 90%
b. 10 df, between −2.23 and 2.23: 95%
c. 24 df, between −2.06 and 2.06: 95%
d. 24 df, between −2.80 and 2.80: 99%
e. 24 df, outside the interval from −2.80 to 2.80: 1%
f. 24 df, to the right of 2.80: 0.5%
g. 10 df, to the left of −1.81: 5%
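Percentages like these are normally read from Table A-3, but they can also be approximated numerically. The sketch below integrates the Student t density with Simpson's rule using only the standard library; the helper names `t_pdf` and `t_prob_between` are invented for this example, and the step count is an arbitrary choice that is more than fine enough for two-decimal answers.

```python
import math

def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_prob_between(a, b, df, steps=2000):
    """Approximate P(a < T < b) with Simpson's rule (steps must be even)."""
    h = (b - a) / steps
    total = t_pdf(a, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(a + i * h, df)
    return total * h / 3

# Central area for 10 df between -2.23 and 2.23 (close to 95%).
print(round(t_prob_between(-2.23, 2.23, 10), 3))

# Right-tail area for 24 df beyond 2.80, using symmetry about 0 (close to 0.5%).
print(round(0.5 - t_prob_between(0, 2.80, 24), 4))
```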

34 What are the appropriate t critical values for each of the confidence intervals?
a. 95% confidence, n = 17: 2.12
b. 90% confidence, n = 12: 1.80
c. 99% confidence, n = 14: 3.01
d. 90% confidence, n = 25: 1.71
e. 90% confidence, n = 13: 1.78
f. 95% confidence, n = 10: 2.262

35 Margin of Error ME for Estimate of µ
Based on an Unknown σ and a Small Simple Random Sample from a Normally Distributed Population
$$ME = t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
where t(α/2) has n − 1 degrees of freedom.

36 Confidence Interval for the Estimate of µ
Based on an Unknown σ and a Small Simple Random Sample from a Normally Distributed Population
$$\bar{x} - ME < \mu < \bar{x} + ME, \quad \text{where } ME = t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

38 Important Properties of the Student t Distribution
1. The Student t distribution is different for different sample sizes (see Figure 6-5 for the cases n = 3 and n = 12).
2. The Student t distribution has the same general symmetric bell shape as the normal distribution, but it reflects the greater variability (with wider distributions) that is expected with small samples.
3. The Student t distribution has a mean of t = 0 (just as the standard normal distribution has a mean of z = 0).
4. The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the standard normal distribution, which has σ = 1).
5. As the sample size n gets larger, the Student t distribution gets closer to the normal distribution. For values of n > 30, the differences are so small that we can use the critical z values instead of developing a much larger table of critical t values. (The values in the bottom row of Table A-3 are equal to the corresponding critical z values from the standard normal distribution.)
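Property 4 can be made quantitative with a standard result that the slides do not state: for df > 2, the Student t distribution has the standard deviation shown below. The worked value for n = 12 is included only as an illustration.

```latex
\[
\sigma_t = \sqrt{\frac{df}{df - 2}} \quad (df > 2),
\qquad \text{e.g. } n = 12,\ df = 11:\ \sigma_t = \sqrt{\tfrac{11}{9}} \approx 1.106 .
\]
```

As df grows, the ratio df/(df − 2) approaches 1, which is consistent with property 5.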

39 Student t Distributions for n = 3 and n = 12
[Figure: the standard normal distribution shown with Student t distributions for n = 3 and n = 12.] Student t distributions have the same general shape and symmetry as the standard normal distribution, but reflect the greater variability that is expected with small samples.

40 Example: A study of 12 Dodge Vipers involved in collisions resulted in repairs averaging $26,227 and a standard deviation of $15,873. Find the 95% interval estimate of µ, the mean repair cost for all Dodge Vipers involved in collisions. (The 12 cars’ distribution appears to be bell-shaped.) Same as exercise #15 on page 321.
x̄ = 26,227, s = 15,873, α = 0.05, α/2 = 0.025

43 Example: For the 12 Dodge Vipers, x̄ = 26,227, s = 15,873, α = 0.05, and α/2 = 0.025. With n = 12 there are 11 degrees of freedom, so Table A-3 gives t(α/2) = 2.201.
$$ME = t_{\alpha/2}\,\frac{s}{\sqrt{n}} = \frac{(2.201)(15{,}873)}{\sqrt{12}} \approx 10{,}085.3$$
$$\bar{x} - ME < \mu < \bar{x} + ME$$
$$26{,}227 - 10{,}085.3 < \mu < 26{,}227 + 10{,}085.3$$
$$\$16{,}141.7 < \mu < \$36{,}312.3$$
We are 95% confident that this interval contains the mean cost of repairing a Dodge Viper involved in a collision.
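The Viper calculation can be checked with a short Python sketch. The critical value 2.201 is passed in by hand from Table A-3 (df = 11, two tails 0.05), since the standard library has no t quantile function; the name `t_interval` is made up for this example.

```python
from math import sqrt

def t_interval(mean, s, n, t_crit):
    """Return (margin of error, lower limit, upper limit) for a t-based interval."""
    me = t_crit * s / sqrt(n)
    return me, mean - me, mean + me

# Values from the Dodge Viper example; 2.201 is read from Table A-3.
me, lo, hi = t_interval(26227, 15873, 12, 2.201)
print(f"ME = {me:.1f}, interval = ({lo:.1f}, {hi:.1f})")
```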

48 One Sided Confidence Intervals
A one-sided confidence interval is a confidence interval that establishes either a likely minimum or a likely maximum value for µ, but not both. [Figure labels: likely maximum; likely minimum.]
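The slide gives no formula for a one-sided bound, so the following is a sketch under a common assumption: the bound uses the one-tail critical value t(α) in place of t(α/2), with the same standard error s/√n. The function name `one_sided_bound` is hypothetical, and the Dodge Viper figures are reused purely as sample inputs, with 1.796 read from Table A-3 (df = 11, one tail 0.05).

```python
from math import sqrt

def one_sided_bound(mean, s, n, t_one_tail, side="upper"):
    """Likely maximum (side='upper') or likely minimum (side='lower') for the mean."""
    me = t_one_tail * s / sqrt(n)
    return mean + me if side == "upper" else mean - me

# Assumed inputs: Viper sample statistics and the one-tail 0.05 value for 11 df.
likely_max = one_sided_bound(26227, 15873, 12, 1.796, side="upper")
likely_min = one_sided_bound(26227, 15873, 12, 1.796, side="lower")
print(f"likely maximum = {likely_max:.1f}")
print(f"likely minimum = {likely_min:.1f}")
```

Each bound is reported on its own with 95% confidence; the two together do not form a 95% two-sided interval.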
