1
Sampling and Sampling Distributions
Statistics for Business and Economics, Chapter 6: Sampling and Sampling Distributions. Note: in the 7th edition, Chapter 6 is titled “Distribution of Sample Statistics”. Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
2
Populations and Samples
A population is the set of all items or individuals of interest. Examples: all likely voters in the next election; all parts produced today; all sales receipts for November.
A sample is a subset of the population. Examples: 1000 voters selected at random for interview; a few parts selected for destructive testing; random receipts selected for audit.
3
Why Sample?
Less time consuming than a census. Less costly to administer than a census. It is possible to obtain statistical results of sufficiently high precision based on samples.
4
Inferential Statistics
Making statements about a population by examining sample results: sample statistics (known) → inference → population parameters (unknown, but can be estimated from sample evidence). Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall
5
Simple Random Samples
Every object in the population has an equal chance of being selected. Objects are selected independently. Samples can be obtained from a table of random numbers or from computer random number generators. A simple random sample is the ideal against which other sampling methods are compared.
6
Sampling Distributions
A sampling distribution is the distribution of all possible values of a statistic (for example, the sample mean) for samples of a given size selected from a population. Definition: the probability distribution of $\bar{x}$ is called its sampling distribution. It lists the various values that $\bar{x}$ can assume and the probability of each value of $\bar{x}$. In general, the probability distribution of a sample statistic is called its sampling distribution.
7
Chapter Outline
Sampling Distributions: Sampling Distribution of Sample Mean; Sampling Distribution of Sample Proportion; Sampling Distribution of Sample Variance.
8
Sampling Distribution
Suppose we assign the letters A, B, C, D, and E to the scores of the five students so that A = 70, B = 78, C = 80, D = 80, E = 95. Then the 10 possible samples of three scores each are ABC, ABD, ABE, ACD, ACE, ADE, BCD, BCE, BDE, CDE. Prem Mann, Introductory Statistics, 7/E Copyright © 2010 John Wiley & Sons. All rights reserved
9
Table 7.3 All Possible Samples and Their Means When the Sample Size Is 3
10
Table 7.4 Frequency and Relative Frequency Distributions of $\bar{x}$ When the Sample Size Is 3
11
Table 7.5 Sampling Distribution of $\bar{x}$ When the Sample Size Is 3
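The contents of Tables 7.3 to 7.5 did not survive in this transcript, but they follow directly from the five scores A = 70, B = 78, C = 80, D = 80, E = 95. The sketch below is one way to rebuild them; the variable names are illustrative and not taken from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Scores of the five students, as given on the slide.
scores = {"A": 70, "B": 78, "C": 80, "D": 80, "E": 95}

# Table 7.3: every sample of three scores (without replacement) and its mean.
samples = list(combinations(scores, 3))
means = {"".join(s): Fraction(sum(scores[k] for k in s), 3) for s in samples}

# Table 7.4: frequency of each distinct sample mean.
freq = Counter(means.values())

# Table 7.5: sampling distribution of x-bar (relative frequencies as probabilities).
sampling_dist = {m: Fraction(f, len(samples)) for m, f in sorted(freq.items())}

for name, m in means.items():
    print(f"{name}: x-bar = {float(m):.2f}")
for m, p in sampling_dist.items():
    print(f"x-bar = {float(m):.2f}  f = {freq[m]}  P(x-bar) = {float(p):.2f}")
```

Exact fractions are used so that samples with equal means (for example ABC and ABD) are grouped together without floating-point rounding issues.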
12
Developing a Sampling Distribution
Assume there is a population of size N = 4. The random variable X is the age of the individuals. Values of X: 18, 20, 22, 24 (years), for individuals A, B, C, and D.
13
Developing a Sampling Distribution
(continued) Summary measures for the population distribution: $\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$ and $\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$. [Figure: P(x) = .25 for each of A, B, C, and D (uniform distribution).]
14
Developing a Sampling Distribution
(continued) Now consider all possible samples of size n = 2. Sampling with replacement gives 16 possible samples and therefore 16 sample means.
15
Developing a Sampling Distribution
(continued) Sampling distribution of all sample means. [Figure: distribution of the 16 sample means, $P(\bar{X})$ against $\bar{X}$; it is no longer uniform.]
16
Developing a Sampling Distribution
(continued) Summary measures of this sampling distribution: $E(\bar{X}) = \frac{\sum \bar{X}_i}{16} = 21 = \mu$ and $\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu)^2}{16}} = 1.58$, where the sums run over the 16 sample means.
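The table of 16 samples and the resulting summary measures are missing from this transcript. The following sketch rebuilds them from the population of ages 18, 20, 22, 24 under sampling with replacement; the names used are illustrative only.

```python
from collections import Counter
from itertools import product
from math import sqrt

ages = [18, 20, 22, 24]          # population values of X
N = len(ages)

# Population summary measures (divide by N, not N - 1).
mu = sum(ages) / N
sigma = sqrt(sum((x - mu) ** 2 for x in ages) / N)

# All ordered samples of size n = 2 drawn with replacement: 4 * 4 = 16.
n = 2
samples = list(product(ages, repeat=n))
sample_means = [sum(s) / n for s in samples]

# Sampling distribution of the sample mean.
dist = {m: c / len(samples) for m, c in sorted(Counter(sample_means).items())}

# Summary measures of the sampling distribution.
mu_xbar = sum(sample_means) / len(sample_means)
sigma_xbar = sqrt(sum((m - mu_xbar) ** 2 for m in sample_means) / len(sample_means))

print(dist)
print(mu, round(sigma, 3), mu_xbar, round(sigma_xbar, 2))
```

The mean of the sample means equals the population mean, and their standard deviation equals the population standard deviation divided by the square root of n.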
17
MEAN AND STANDARD DEVIATION OF $\bar{x}$
Definition: The mean and standard deviation of the sampling distribution of $\bar{x}$ are called the mean and standard deviation of $\bar{x}$ and are denoted by $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}$, respectively.
18
Expected Value of Sample Mean
Let $X_1, X_2, \ldots, X_n$ represent a random sample from a population. The sample mean of these observations is defined as $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$. Its expected value is the population mean: $E(\bar{X}) = \mu$.
19
Standard Error of the Mean
The standard deviation of the sampling distribution of a statistic is referred to as the standard error of that quantity. Different samples of the same size from the same population will yield different sample means. A measure of the variability in the mean from sample to sample is given by the standard error of the mean: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$. Note that the standard error of the mean decreases as the sample size increases.
20
If the Population is Normal
If a population is normal with mean μ and standard deviation σ, the sampling distribution of $\bar{X}$ is also normally distributed, with $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$.
21
Z-value for Sampling Distribution of the Mean
Z-value for the sampling distribution of $\bar{X}$: $Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$, where $\bar{X}$ = sample mean, μ = population mean, σ = population standard deviation, and n = sample size.
22
Sampling Distribution Properties
$\mu_{\bar{X}} = \mu$ (i.e., $\bar{X}$ is unbiased). [Figure: a normal population distribution and the normal sampling distribution of $\bar{X}$, which has the same mean.]
23
Sampling Distribution Properties
(continued) For sampling with replacement: when we sample with replacement, the sample values are independent. Practically, this means that what we get on the first draw doesn't affect what we get on the second. As n increases, $\sigma_{\bar{X}}$ decreases. If the sample size n equals the population size N and sampling is without replacement, then the variance of the sample mean, $\sigma_{\bar{X}}^2$, is zero. [Figure: sampling distributions of $\bar{X}$ for a larger and a smaller sample size.]
24
Example 6.2: “Executive Salary Distributions”
Suppose that the annual percentage salary increases for the chief executive officers of all midsize corporations are normally distributed with mean 12.2% and standard deviation 3.6%. A random sample of nine observations is obtained from this population and the sample mean computed. What is the probability that the sample mean will be less than 10%?
25
Example 6.3 “Spark Plug Life”
A spark plug manufacturer claims that the lives of its plugs are normally distributed with mean 36,000 miles and standard deviation 4000 miles. A random sample of 16 plugs had an average life of 34,500 miles. If the manufacturer’s claim is correct, what is the probability of finding a sample mean of 34,500 or less?
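The slides state Examples 6.2 and 6.3 without worked solutions. Both reduce to standardizing the sample mean with the standard error σ/√n, so a short sketch can evaluate them; the helper name `prob_mean_below` is hypothetical, and `statistics.NormalDist` from the Python standard library stands in for the normal table.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_below(x_bar, mu, sigma, n):
    """Return (z, P(sample mean <= x_bar)) for a normal population."""
    standard_error = sigma / sqrt(n)       # sigma of the sample mean
    z = (x_bar - mu) / standard_error      # standardized sample mean
    return z, NormalDist().cdf(z)

# Example 6.2: mu = 12.2%, sigma = 3.6%, n = 9, P(sample mean < 10%).
z_salary, p_salary = prob_mean_below(10.0, 12.2, 3.6, 9)

# Example 6.3: mu = 36,000 miles, sigma = 4,000 miles, n = 16, P(sample mean <= 34,500).
z_plug, p_plug = prob_mean_below(34_500, 36_000, 4_000, 16)

print(f"Example 6.2: z = {z_salary:.2f}, probability = {p_salary:.4f}")
print(f"Example 6.3: z = {z_plug:.2f}, probability = {p_plug:.4f}")
```

Values read from a printed normal table may differ slightly in the last digit because the table rounds z to two decimals.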
26
If the Population is not Normal
We can apply the Central Limit Theorem: even if the population is not normal, sample means from the population will be approximately normal as long as the sample size is large enough. Properties of the sampling distribution: $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$.
27
Sampling From a Population That Is Not Normally Distributed
Central Limit Theorem: According to the central limit theorem, for a large sample size, the sampling distribution of $\bar{x}$ is approximately normal, irrespective of the shape of the population distribution. The mean and standard deviation of the sampling distribution of $\bar{x}$ are $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. The sample size is usually considered to be large if n ≥ 30.
28
Central Limit Theorem
As the sample size gets large enough (n↑), the sampling distribution becomes almost normal regardless of the shape of the population. The central limit theorem is basic to the concept of statistical inference because it permits us to draw conclusions about the population based strictly on sample data.
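The central limit theorem can be illustrated with a small simulation. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (a strongly right-skewed choice made only for illustration, not taken from the slides) and compares sample means for n = 2 and n = 36. For a normal distribution, about 68.3% of values fall within one standard deviation of the mean.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so repeated runs use the same draws

def simulate_sample_means(n, reps=10_000):
    """Means of `reps` samples of size n from an exponential population (mean 1, sd 1)."""
    return [sum(random.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

def summarize(means, n):
    """Return (mean, standard deviation, share within one standard error of 1)."""
    center = sum(means) / len(means)
    spread = sqrt(sum((m - center) ** 2 for m in means) / len(means))
    standard_error = 1.0 / sqrt(n)          # sigma / sqrt(n) with sigma = 1
    within = sum(abs(m - 1.0) <= standard_error for m in means) / len(means)
    return center, spread, within

results = {n: summarize(simulate_sample_means(n), n) for n in (2, 36)}
for n, (center, spread, within) in results.items():
    print(f"n = {n:2d}: mean = {center:.3f}, sd = {spread:.3f}, "
          f"theory sd = {1 / sqrt(n):.3f}, share within 1 SE = {within:.3f}")
```

With n = 36 the share within one standard error should sit close to the normal value, while with n = 2 the skewness of the population is still visible.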
29
If the Population is not Normal
(continued) Sampling distribution properties. Central tendency: $\mu_{\bar{X}} = \mu$. Variation: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$. [Figure: a non-normal population distribution and the sampling distribution of $\bar{X}$, which becomes normal as n increases; curves are shown for a larger and a smaller sample size.]
30
Example 1
Suppose a population has mean μ = 8 and standard deviation σ = 3. Suppose a random sample of size n = 36 is selected. What is the probability that the sample mean is between 7.8 and 8.2?
31
Example 1 Solution
(continued) Even if the population is not normally distributed, the central limit theorem can be used (n > 25), so the sampling distribution of $\bar{X}$ is approximately normal, with mean $\mu_{\bar{X}} = 8$ and standard deviation $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$.
32
Example 1 Solution
(continued) $P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < Z < \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.3108$. [Figure: the population distribution is sampled to give the sampling distribution of $\bar{X}$, which is standardized to the standard normal distribution of Z.]
33
Example 2
According to Sallie Mae surveys and Credit Bureau data, college students carried an average of $3173 in credit card debt. Suppose the probability distribution of the current credit card debts for all college students in the United States is unknown, but its mean is $3173 and its standard deviation is $750. Let $\bar{x}$ be the mean credit card debt of a random sample of 400 U.S. college students. (a) What is the probability that the mean of the current credit card debts for this sample is within $70 of the population mean? That is, what is $P(3103 \le \bar{x} \le 3243)$? (b) What is the probability that the mean of the current credit card debts for this sample is lower than the population mean by $50 or more? That is, what is $P(\bar{x} \le 3123)$?
34
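Example 2: Solution
μ = $3173 and σ = $750, so $\mu_{\bar{x}} = \mu = 3173$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{750}{\sqrt{400}} = 37.50$ (in dollars). The shape of the probability distribution of the population is unknown. However, the sampling distribution of $\bar{x}$ is approximately normal because the sample is large (n > 25).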
35
Example 2: Solution
(a) $P(3103 \le \bar{x} \le 3243) = P(-1.87 \le z \le 1.87) = .9693 - .0307 = .9386$. Therefore, the probability that the mean of the current credit card debts for this sample is within $70 of the population mean is approximately .9386.
37
Example 2: Solution
(b) For $\bar{x} = 3123$: $z = \frac{3123 - 3173}{750/\sqrt{400}} = -1.33$, so $P(\bar{x} \le 3123) = P(z \le -1.33) = .0918$. Therefore, the probability that the mean of the current credit card debts for this sample is lower than the population mean by $50 or more is approximately .0918.
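Both parts of Example 2 can be checked numerically. This is a minimal sketch in which `prob_mean_between` is a hypothetical helper and `statistics.NormalDist` replaces the normal table. It keeps z unrounded, so the results differ slightly from the table-based .9386 and .0918.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_between(low, high, mu, sigma, n):
    """P(low <= sample mean <= high) under the CLT normal approximation."""
    standard_error = sigma / sqrt(n)
    std_normal = NormalDist()               # mean 0, standard deviation 1
    z_low = (low - mu) / standard_error
    z_high = (high - mu) / standard_error
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

mu, sigma, n = 3173, 750, 400

# (a) sample mean within $70 of the population mean
p_within_70 = prob_mean_between(mu - 70, mu + 70, mu, sigma, n)

# (b) sample mean at least $50 below the population mean
p_50_below = NormalDist().cdf((mu - 50 - mu) / (sigma / sqrt(n)))

print(f"(a) {p_within_70:.4f}   (b) {p_50_below:.4f}")
```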
39
Acceptance Intervals
Goal: determine a range within which sample means are likely to occur, given a population mean and variance. Acceptance intervals are widely used for quality control monitoring of various production and service processes. By the Central Limit Theorem, we know that the distribution of $\bar{X}$ is approximately normal if n is large enough, with mean μ and standard deviation $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$. Let $z_{\alpha/2}$ be the z-value that leaves area α/2 in the upper tail of the normal distribution (i.e., the interval $-z_{\alpha/2}$ to $z_{\alpha/2}$ encloses probability 1 – α). Then $\mu \pm z_{\alpha/2}\,\sigma_{\bar{X}}$ is the interval that includes $\bar{X}$ with probability 1 – α.
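The acceptance interval can be sketched in a few lines. The function name `acceptance_interval` is hypothetical, and the numbers in the usage lines reuse the population from Example 1 (μ = 8, σ = 3, n = 36) purely as an illustration, with α = 0.05 as an assumed choice.

```python
from math import sqrt
from statistics import NormalDist

def acceptance_interval(mu, sigma, n, alpha=0.05):
    """Interval mu +/- z_{alpha/2} * sigma / sqrt(n) that contains the sample mean
    with probability 1 - alpha (normal or large-sample case)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # leaves area alpha/2 in the upper tail
    half_width = z * sigma / sqrt(n)
    return mu - half_width, mu + half_width

low, high = acceptance_interval(mu=8, sigma=3, n=36, alpha=0.05)
print(f"95% acceptance interval for the sample mean: ({low:.2f}, {high:.2f})")
```

A sample mean falling outside such an interval would suggest that the process mean or variance has shifted from the assumed values.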
40
Sampling Distributions of Sample Proportions
Sampling Distribution of Sample Mean; Sampling Distribution of Sample Proportion; Sampling Distribution of Sample Variance.
41
Population Proportions, P
P = the proportion of the population having some characteristic. The sample proportion ($\hat{p}$) provides an estimate of P: $\hat{p} = \frac{X}{n}$ = (number of items in the sample having the characteristic of interest) / (sample size), with $0 \le \hat{p} \le 1$. The count X has a binomial distribution, so the distribution of $\hat{p}$ can be approximated by a normal distribution when nP(1 – P) > 9.
42
Sampling Distribution of $\hat{p}$
Normal approximation. Properties: $E(\hat{p}) = P$ and $\sigma_{\hat{p}} = \sqrt{\frac{P(1 - P)}{n}}$ (where P = population proportion). [Figure: sampling distribution of $\hat{p}$.]
43
Z-Value for Proportions
Standardize $\hat{p}$ to a Z value with the formula: $Z = \frac{\hat{p} - P}{\sigma_{\hat{p}}} = \frac{\hat{p} - P}{\sqrt{P(1 - P)/n}}$.
44
Example 6.8 Business Course Selection (Probability of Sample Proportion)
It has been estimated that 43% of business graduates believe that a course in business ethics is important. Find the probability that more than one-half of a random sample of 80 business graduates have this belief.
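Example 6.8 is stated without a solution in this transcript. A minimal sketch of the calculation, using the normal approximation for the sample proportion with P = 0.43 and n = 80, might look like this (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

P, n = 0.43, 80                       # population proportion and sample size

# Rule of thumb from the slides: the normal approximation needs nP(1 - P) > 9.
normal_ok = n * P * (1 - P) > 9

sigma_p_hat = sqrt(P * (1 - P) / n)   # standard error of the sample proportion
z = (0.50 - P) / sigma_p_hat          # standardize p-hat = 0.50

# P(p-hat > 0.50) is the upper-tail area beyond z.
prob_more_than_half = 1 - NormalDist().cdf(z)

print(f"nP(1-P) > 9: {normal_ok}, z = {z:.2f}, probability = {prob_more_than_half:.4f}")
```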
45
Example 3
If the true proportion of voters who support Proposition A is P = .4, what is the probability that a sample of size 200 yields a sample proportion between .40 and .45? That is, if P = .4 and n = 200, what is $P(.40 \le \hat{p} \le .45)$?
46
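Example 3 Solution
(continued) If P = .4 and n = 200, what is $P(.40 \le \hat{p} \le .45)$? Find $\sigma_{\hat{p}}$: $\sigma_{\hat{p}} = \sqrt{\frac{P(1 - P)}{n}} = \sqrt{\frac{.4(1 - .4)}{200}} = .03464$. Convert to standard normal: $P(.40 \le \hat{p} \le .45) = P\left(\frac{.40 - .40}{.03464} \le Z \le \frac{.45 - .40}{.03464}\right) = P(0 \le Z \le 1.44)$.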
47
Example 3 Solution
(continued) If P = .4 and n = 200, what is $P(.40 \le \hat{p} \le .45)$? Use the standard normal table: $P(0 \le Z \le 1.44) = .4251$. [Figure: the sampling distribution of $\hat{p}$ between .40 and .45 is standardized to the standard normal distribution between 0 and 1.44, with area .4251.]
48
Example 4
Maureen Webster, who is running for mayor in a large city, claims that she is favored by 53% of all eligible voters of that city. Assume that this claim is true. What is the probability that in a random sample of 400 registered voters taken from this city, less than 49% will favor Maureen Webster? That is, what is $P(\hat{p} < .49)$?
49
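Example 4: Solution
n = 400, p = .53, and q = 1 – p = 1 – .53 = .47, so $\mu_{\hat{p}} = p = .53$ and $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(.53)(.47)}{400}} \approx .02495$. Then $P(\hat{p} < .49) = P(z < -1.60) = .0548$. Hence, the probability that less than 49% of the voters in a random sample of 400 will favor Maureen Webster is approximately .0548.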
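Examples 3 and 4 use the same standardization of a sample proportion, so one small helper covers both. The name `proportion_z` is hypothetical. Because z is not rounded to two decimals here, the results differ slightly from the table-based .4251 and .0548.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(p_hat, P, n):
    """Z value of a sample proportion p_hat when the population proportion is P."""
    return (p_hat - P) / sqrt(P * (1 - P) / n)

std_normal = NormalDist()

# Example 3: P = .4, n = 200, probability that p-hat lies between .40 and .45.
p_example3 = (std_normal.cdf(proportion_z(0.45, 0.40, 200))
              - std_normal.cdf(proportion_z(0.40, 0.40, 200)))

# Example 4: p = .53, n = 400, probability that p-hat is below .49.
z_example4 = proportion_z(0.49, 0.53, 400)
p_example4 = std_normal.cdf(z_example4)

print(f"Example 3: {p_example3:.4f}")
print(f"Example 4: z = {z_example4:.2f}, probability = {p_example4:.4f}")
```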
50
Sampling Distributions of Sample Variance
Sampling Distribution of Sample Mean; Sampling Distribution of Sample Proportion; Sampling Distribution of Sample Variance.
51
Sample Variance
Let $x_1, x_2, \ldots, x_n$ be a random sample from a population. The sample variance is $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$. The square root of the sample variance is called the sample standard deviation. The sample variance is different for different random samples from the same population.
52
Sampling Distribution of Sample Variances
The sampling distribution of $s^2$ has mean $\sigma^2$: $E(s^2) = \sigma^2$. If the population distribution is normal, then $\mathrm{Var}(s^2) = \frac{2\sigma^4}{n-1}$. If the population distribution is normal, then $\frac{(n-1)s^2}{\sigma^2}$ has a $\chi^2$ distribution with n – 1 degrees of freedom.
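The statement that $s^2$ has mean $\sigma^2$ can be checked exactly on a small case. The sketch below reuses the four-age population (18, 20, 22, 24) and all 16 samples of size 2 drawn with replacement from the earlier slides; this check does not need the normality assumption, which matters only for the variance and chi-square results.

```python
from itertools import product
from statistics import pvariance, variance

ages = [18, 20, 22, 24]

# Population variance sigma^2 (divides by N).
population_variance = pvariance(ages)

# Sample variance s^2 (divides by n - 1) for each of the 16 equally likely samples.
samples = list(product(ages, repeat=2))
sample_variances = [variance(s) for s in samples]

# Average of s^2 over the whole sampling distribution.
mean_of_s2 = sum(sample_variances) / len(sample_variances)

print(population_variance, mean_of_s2)
```

Dividing by n – 1 rather than n is what makes the average of $s^2$ match the population variance here.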
53
THE CHI-SQUARE DISTRIBUTION
Definition: The chi-square distribution has only one parameter, called the degrees of freedom (df). The shape of a chi-square distribution curve is skewed to the right for small df and becomes symmetric for large df. The entire chi-square distribution curve lies to the right of the vertical axis. The chi-square distribution assumes nonnegative values only, since variances are all positive, and these values are denoted by the symbol χ² (read as “chi-square”). $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$ is chi-square distributed with (n – 1) degrees of freedom. [Figure: three chi-square distribution curves (probability density functions) for 2, 7, and 12 df.]
54
The Chi-square Distribution
The chi-square distribution is a family of distributions, depending on degrees of freedom: d.f. = n – 1. The chi-square family of distributions is used in applied statistical analysis because it provides a link between the sample and the population variances. The chi-square distribution with n – 1 degrees of freedom is the distribution of the sum of squares of n – 1 independent standard normal random variables. The preceding chi-square distribution and the resulting computed probabilities for various values of $s^2$ require that the population distribution be normal. Thus the assumption of an underlying normal distribution is more important for determining probabilities of sample variances than it is for determining probabilities of sample means. Textbook Table 7 contains chi-square probabilities.
55
Degrees of Freedom (df)
Idea: the number of observations that are free to vary after the sample mean has been calculated. Example: suppose the mean of 3 numbers is 8.0. Let X1 = 7 and X2 = 8. What is X3? If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary). Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2 (2 values can be any numbers, but the third is not free to vary for a given mean).
56
Example 11-1
Find the value of χ² for 7 degrees of freedom and an area of .10 in the right tail of the chi-square distribution curve.
57
Table 11.1 χ² for df = 7 and .10 Area in the Right Tail
58
Chi-square Example 5
A commercial freezer must hold a selected temperature with little variation. Specifications call for a standard deviation of no more than 4 degrees (a variance of 16 degrees²). A sample of 14 freezers is to be tested. What is the upper limit (K) for the sample variance such that the probability of exceeding this limit, given that the population standard deviation is 4, is less than 0.05?
59
Finding the Chi-square Value
$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$ is chi-square distributed with (n – 1) = 13 degrees of freedom. Use the chi-square distribution with area 0.05 in the upper tail (Table 7a): $\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.). [Figure: chi-square density with upper-tail probability α = .05 to the right of $\chi^2_{13} = 22.36$.]
60
Chi-square Example
(continued) $\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.). So: $P(s^2 > K) = P\left(\frac{(n-1)s^2}{16} > \chi^2_{13}\right) = 0.05$, or $\frac{(n-1)K}{16} = 22.36$ (where n = 14), so $K = \frac{(22.36)(16)}{14 - 1} = 27.52$. If $s^2$ from the sample of size n = 14 is greater than 27.52, there is strong evidence to suggest the population variance exceeds 16.
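The table lookups in Example 11-1 and in the freezer example can be reproduced without a printed table. The sketch below is one possible approach using only the Python standard library: it evaluates the chi-square CDF through the series for the regularized lower incomplete gamma function and then finds the upper-tail critical value by bisection. The function names are hypothetical, and a statistics package would normally be used instead.

```python
import math

def chi2_cdf(x, df):
    """P(chi-square with df degrees of freedom <= x), via the incomplete gamma series."""
    if x <= 0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    term = 1.0 / a                     # k = 0 term of sum t^k / (a (a+1) ... (a+k))
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= t / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(t) - t - math.lgamma(a))

def chi2_upper_critical(df, alpha):
    """Value with area alpha to its right under the chi-square curve (bisection)."""
    low, high = 0.0, 1.0
    while 1.0 - chi2_cdf(high, df) > alpha:   # expand until the tail area is small enough
        high *= 2.0
    for _ in range(200):
        mid = (low + high) / 2.0
        if 1.0 - chi2_cdf(mid, df) > alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

# Example 11-1: 7 degrees of freedom, area .10 in the right tail.
crit_7 = chi2_upper_critical(7, 0.10)

# Freezer example: n = 14, sigma^2 = 16, alpha = .05.
n, sigma_sq = 14, 16
crit_13 = chi2_upper_critical(n - 1, 0.05)
upper_limit_K = crit_13 * sigma_sq / (n - 1)

print(f"chi-square(7, .10) = {crit_7:.3f}")
print(f"chi-square(13, .05) = {crit_13:.2f}, K = {upper_limit_K:.2f}")
```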
61
Example 6.9
George Sampson is responsible for quality assurance at Integrated Electronics. He has asked you to establish a quality monitoring process for the manufacture of control device A. The variability of the electrical resistance, measured in ohms, is critical for this device. Manufacturing standards specify a standard deviation of 3.6 and a normal distribution. The monitoring process requires that a random sample of n = 6 observations be obtained from the population of devices and the sample variance be computed. Determine an upper limit for the sample variance such that the probability of exceeding this limit, given a population standard deviation of 3.6, is less than 0.05.
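The slide gives no solution for Example 6.9. A worked sketch that follows the same steps as the freezer example is shown below. It takes the tabulated upper-tail value $\chi^2_{5,\,0.05} = 11.07$ for n – 1 = 5 degrees of freedom and $\sigma^2 = 3.6^2 = 12.96$.

```latex
\begin{align*}
P(s^2 > K) &= P\!\left(\frac{(n-1)s^2}{\sigma^2} > \chi^2_{5,\,0.05}\right) = 0.05,
  \qquad \chi^2_{5,\,0.05} = 11.07 \\
\frac{(n-1)K}{\sigma^2} &= 11.07 \\
K &= \frac{(11.07)(12.96)}{6 - 1} \approx 28.69
\end{align*}
```

Under these assumptions, a sample variance above about 28.69 from a sample of six devices would be evidence that the population variance exceeds the standard of 12.96.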