1
Determination of Sample Size
In almost all research situations the researcher is interested in the question: How large should the sample be?
2
Answer: It depends on how accurate you want the answer to be.
Accuracy is specified by two quantities: the magnitude of the error bound, and the level of confidence.
3
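Confidence Intervals for the mean of a Normal Population, $\mu$
Let $t_1 = \bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n}$ and $t_2 = \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$. Then $t_1$ to $t_2$ is a $(1-\alpha)100\% = P100\%$ confidence interval for $\mu$.
$B = z_{\alpha/2}\,\sigma/\sqrt{n}$ is the $(1-\alpha)100\%$ Error Bound for $\mu$.
The accuracy of a confidence interval is specified by setting the magnitude of $B$ (the Error Bound) and the level of confidence $(1-\alpha)100\%$.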
4
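Error Bound: $B = z_{\alpha/2}\,\sigma/\sqrt{n}$
If we have specified the level of confidence, then the value of $z_{\alpha/2}$ is known. If we have specified the magnitude of $B$, it is also known.
Solving for $n$ we get: $n = z_{\alpha/2}^2\,\sigma^2/B^2 \approx z_{\alpha/2}^2\,(\sigma^*)^2/B^2$, where $\sigma^*$ is some estimate of $\sigma$.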
5
Summarizing: The sample size that will estimate $\mu$ with an Error Bound $B$ and level of confidence $P = 1-\alpha$ is: $n = z_{\alpha/2}^2\,(\sigma^*)^2/B^2$
where:
$B$ is the desired Error Bound,
$z_{\alpha/2}$ is the $\alpha/2$ critical value for the standard normal distribution,
$\sigma^*$ is some preliminary estimate of $\sigma$.
6
Notes:
$n$ increases as $B$, the desired Error Bound, decreases. A larger sample size is required for a higher level of accuracy.
$n$ increases as the level of confidence, $(1-\alpha)$, increases, because $z_{\alpha/2}$ increases as $\alpha/2$ gets closer to zero. A larger sample size is required for a higher level of confidence.
$n$ increases as the standard deviation, $\sigma$, of the population increases. If the population is more variable, then a larger sample size is required.
7
Summary: The sample size $n$ depends on:
the desired level of accuracy,
the desired level of confidence,
the variability of the population.
8
Example
Suppose that one is interested in estimating the average number of grams of fat ($\mu$) in one kilogram of lean beef hamburger. This will be estimated by randomly selecting one-kilogram samples, then measuring the fat content of each sample.
Preliminary estimates indicate that $\mu$ and $\sigma$ are approximately 220 and 40 respectively.
I want the study to estimate $\mu$ with an error bound of 5 and a level of confidence of 95% (i.e. $\alpha = 0.05$ and $z_{\alpha/2} = z_{0.025} = 1.960$).
9
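Solution
$n = z_{\alpha/2}^2\,(\sigma^*)^2/B^2 = (1.960)^2 (40)^2 / (5)^2 = 245.86$, which is rounded up to 246.
Hence $n = 246$ one-kilogram samples are required to estimate $\mu$ within $B = 5$ g with a 95% level of confidence.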
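The calculation can be checked with a few lines of Python. This is only a sketch: the function name `sample_size_for_mean` and its parameter names are illustrative and not from the slides, and the critical value is computed with the standard library's `NormalDist` rather than read from a table.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma_star, error_bound, confidence=0.95):
    """Smallest n for which z_{alpha/2} * sigma_star / sqrt(n) <= error_bound."""
    alpha = 1.0 - confidence
    # Upper alpha/2 critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # n = z^2 * (sigma*)^2 / B^2, rounded up to a whole number of observations.
    return math.ceil((z * sigma_star / error_bound) ** 2)


# Hamburger example: sigma* = 40, B = 5, 95% confidence.
n_beef = sample_size_for_mean(sigma_star=40, error_bound=5, confidence=0.95)
print(n_beef)  # 246
```

Halving the error bound to 2.5 roughly quadruples the required sample size, since $n$ is proportional to $1/B^2$.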
10
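Confidence Intervals for a Bernoulli probability, $p$
Let $t_1 = \hat{p} - z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ and $t_2 = \hat{p} + z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$, where $\hat{p}$ is the sample proportion. Then $t_1$ to $t_2$ is a $(1-\alpha)100\% = P100\%$ confidence interval for $p$.
$B = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ is the $(1-\alpha)100\%$ Error Bound for $p$.
The accuracy of a confidence interval is specified by setting the magnitude of $B$ (the Error Bound) and the level of confidence $(1-\alpha)100\%$.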
11
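Error Bound: $B = z_{\alpha/2}\sqrt{p(1-p)/n}$
If we have specified the level of confidence, then the value of $z_{\alpha/2}$ is known. If we have specified the magnitude of $B$, it is also known.
Solving for $n$ we get: $n = z_{\alpha/2}^2\,p^*(1-p^*)/B^2$, where $p^*$ is some estimate of $p$.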
12
Summarizing: The sample size that will estimate $p$ with an Error Bound $B$ and level of confidence $P = 1-\alpha$ is: $n = z_{\alpha/2}^2\,p^*(1-p^*)/B^2$
where:
$B$ is the desired Error Bound,
$z_{\alpha/2}$ is the $\alpha/2$ critical value for the standard normal distribution,
$p^*$ is some preliminary estimate of $p$.
If no estimate for $p$ is available, use $p^* = 0.50$. One can easily check that the maximum sample size required occurs when $p^* = 0.50$.
13
The maximum sample size $n$ occurs when $p^* = 0.50$, because $p^*(1-p^*)$ is largest there.
14
Example
Suppose that I want to conduct a survey to estimate $p$ = the proportion of voters who favour a downtown location for a casino.
I know that the approximate value of $p$ is $p^* = 0.50$. This is also a good choice for $p^*$ if one has no preliminary estimate of its value.
I want the survey to estimate $p$ with an error bound $B = 0.01$ (1 percentage point).
I want the level of confidence to be 95% (i.e. $\alpha = 0.05$ and $z_{\alpha/2} = z_{0.025} = 1.960$).
Then $n = z_{\alpha/2}^2\,p^*(1-p^*)/B^2 = (1.960)^2 (0.50)(0.50) / (0.01)^2 = 9604$.
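The survey calculation, and the claim that $p^* = 0.50$ gives the largest sample size, can be checked with a short Python sketch. The function name `sample_size_for_proportion` and the grid of trial values are illustrative assumptions, not part of the slides.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(error_bound, confidence=0.95, p_star=0.5):
    """Smallest n for which z_{alpha/2} * sqrt(p*(1-p*)/n) <= error_bound."""
    alpha = 1.0 - confidence
    # Upper alpha/2 critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # n = z^2 * p*(1 - p*) / B^2, rounded up to a whole number of respondents.
    return math.ceil(z ** 2 * p_star * (1.0 - p_star) / error_bound ** 2)


# Casino survey: B = 0.01, 95% confidence, p* = 0.50.
n_survey = sample_size_for_proportion(error_bound=0.01)
print(n_survey)  # 9604

# Required n over a grid of preliminary estimates p* = 0.1, 0.2, ..., 0.9.
sizes = {k / 10: sample_size_for_proportion(0.01, p_star=k / 10) for k in range(1, 10)}
print(max(sizes, key=sizes.get))  # 0.5
```

Because $p^* = 0.50$ is the worst case, using it when nothing is known about $p$ gives a sample size that is large enough whatever the true proportion turns out to be.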
15
A general method for constructing confidence limits
16
Definition: A statistic $t$ is called a pivotal statistic (for determining confidence limits for the parameter $\phi$) if:
1. The distribution of $t$ is completely known (it does not depend on any unknown parameters).
2. The only unknown parameter that $t$ depends on is $\phi$ (the parameter being estimated).
3. $t$ depends on the data $x_1, \dots, x_n$ only through the sufficient statistics $S_1, \dots, S_q$.
17
Examples of pivotal statistics
Estimating $\mu$, the mean of a Normal population
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ is a pivotal statistic if $\sigma$ is known.
It has a known distribution, N(0,1).
It only depends on the unknown parameter $\mu$.
It depends on the data through the sufficient statistic $\bar{x}$.
18
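Estimating $p$, the Bernoulli probability
$z = \dfrac{\hat{p} - p}{\sqrt{\hat{p}(1-\hat{p})/n}}$ is a pivotal statistic.
It has a known distribution, N(0,1) (approximately, for large $n$).
It only depends on the unknown parameter $p$.
It depends on the data through the sufficient statistic $\hat{p}$, the sample proportion.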
19
To construct confidence limits using a pivotal statistic
1. Construct a probability statement regarding the pivotal statistic $t$. This is possible because the distribution of $t$ is completely known.
2. Translate this statement into a confidence statement about the parameter $\phi$ (the parameter being estimated).
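What the resulting confidence statement means can be illustrated by simulation. The sketch below is an assumption-laden illustration rather than anything from the slides: it draws repeated samples from a Normal population with known $\sigma$, builds the interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ from each, and records how often the interval contains the true mean. The population values (220 and 40), the sample size, the number of trials, and the seed are all arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean


def z_interval(sample, sigma, confidence=0.95):
    """Limits x_bar -/+ z_{alpha/2} * sigma / sqrt(n), with sigma assumed known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sigma / math.sqrt(len(sample))
    centre = fmean(sample)
    return centre - half_width, centre + half_width


def coverage(mu, sigma, n, trials, seed=0):
    """Fraction of simulated samples whose interval contains the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = z_interval(sample, sigma)
        hits += lower <= mu <= upper
    return hits / trials


# Illustrative population values; the result should be close to 0.95.
estimated = coverage(mu=220.0, sigma=40.0, n=25, trials=4000)
print(round(estimated, 3))
```

The probability statement is about the random interval, not about $\mu$: over many repetitions, roughly 95% of the intervals constructed this way contain the fixed true value.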
20
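Estimating $\mu$, the mean of a Normal population
($\sigma^2$ known)
Pivotal Statistic: $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
Starting with $P\left(-z_{\alpha/2} \le \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$, after some manipulation we get $P\left(\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$.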
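The manipulation for the known-variance Normal case can be written out step by step. This is a sketch of the standard algebra, with $z = (\bar{x}-\mu)/(\sigma/\sqrt{n})$: multiply through by $\sigma/\sqrt{n} > 0$, subtract $\bar{x}$, then multiply by $-1$, which reverses the inequalities.

```latex
\begin{align*}
1 - \alpha
  &= P\left(-z_{\alpha/2} \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) \\
  &= P\left(-z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \bar{x} - \mu \le z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) \\
  &= P\left(-\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le -\mu \le -\bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) \\
  &= P\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)
\end{align*}
```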
21
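Estimating $p$, a Bernoulli probability
Pivotal Statistic: $z = \dfrac{\hat{p} - p}{\sqrt{\hat{p}(1-\hat{p})/n}}$
Starting with $P\left(-z_{\alpha/2} \le \dfrac{\hat{p} - p}{\sqrt{\hat{p}(1-\hat{p})/n}} \le z_{\alpha/2}\right) = 1 - \alpha$, after some manipulation we get $P\left(\hat{p} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\right) = 1 - \alpha$.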
22
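The t distribution
Estimating $\mu$, the mean of a Normal population ($\sigma^2$ unknown)
Let $x_1, \dots, x_n$ denote a sample from the normal distribution with mean $\mu$ and variance $\sigma^2$. Both $\mu$ and $\sigma^2$ are unknown.
Recall: $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ has a N(0,1) distribution.
Also: $U = \dfrac{(n-1)s^2}{\sigma^2} = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma^2}$ has a $\chi^2$-distribution with $n - 1$ degrees of freedom, and $z$ and $U$ are independent.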
23
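Recall also that if $z$ has a N(0,1) distribution, $U$ has a $\chi^2$-distribution with $\nu$ degrees of freedom, and $z$ and $U$ are independent, then $t = \dfrac{z}{\sqrt{U/\nu}}$ has a t-distribution with $\nu$ degrees of freedom.
Thus, since $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ is N(0,1) and $U = \dfrac{(n-1)s^2}{\sigma^2}$ is $\chi^2$ with $n - 1$ degrees of freedom, $t = \dfrac{z}{\sqrt{U/(n-1)}} = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ has a t-distribution with $n - 1$ degrees of freedom.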
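The simplification of the ratio is short but worth writing out. Taking $z = (\bar{x}-\mu)/(\sigma/\sqrt{n})$ and $U = (n-1)s^2/\sigma^2$ (the standard definitions, assumed here), the unknown $\sigma$ cancels:

```latex
\begin{align*}
t = \frac{z}{\sqrt{U/(n-1)}}
  &= \frac{(\bar{x} - \mu)/(\sigma/\sqrt{n})}{\sqrt{\dfrac{(n-1)s^2}{\sigma^2} \cdot \dfrac{1}{n-1}}} \\
  &= \frac{(\bar{x} - \mu)/(\sigma/\sqrt{n})}{s/\sigma} \\
  &= \frac{\bar{x} - \mu}{s/\sqrt{n}}
\end{align*}
```

This cancellation is what makes the ratio usable as a pivotal statistic when $\sigma^2$ is unknown: the only unknown parameter left is $\mu$.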
24
Thus we use $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ as the pivotal statistic.
It satisfies the conditions of a pivotal statistic:
it has a known distribution, the t-distribution with $n - 1$ df;
it only depends on the unknown parameter $\mu$;
it depends on the data through the sufficient statistics $\bar{x}$ and $s^2$.
25
Critical Values for the t-distribution with $\nu$ df
Definition: The $\alpha$-upper critical value for the t-distribution with $\nu$ df is the quantity $t_\alpha$ such that $P(t \ge t_\alpha) = \alpha$.
[Figure: t-distribution with $\nu$ df]
26
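Thus we use $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ as the pivotal statistic to set up confidence limits for $\mu$.
Starting with $P\left(-t_{\alpha/2} \le \dfrac{\bar{x} - \mu}{s/\sqrt{n}} \le t_{\alpha/2}\right) = 1 - \alpha$, where $t_{\alpha/2}$ is the critical value for $n - 1$ df.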
27
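Hence $\bar{x} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$ are $(1-\alpha)100\%$ confidence limits for $\mu$, where $t_{\alpha/2}$ is the critical value for the t-distribution with $n - 1$ df.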
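A minimal Python sketch of these limits follows. The six observations are made up purely for illustration (they are not the data from the slides), the function name is hypothetical, and the critical value $t_{0.025} = 2.571$ for 5 df is taken from a standard t table rather than computed, since the Python standard library has no t quantile function.

```python
import math
from statistics import fmean, stdev


def t_confidence_limits(sample, t_critical):
    """Limits x_bar -/+ t_{alpha/2} * s / sqrt(n) for the mean of a Normal sample.

    t_critical must be the alpha/2 upper critical value for n - 1 df.
    """
    n = len(sample)
    x_bar = fmean(sample)
    s = stdev(sample)  # sample standard deviation (divisor n - 1)
    half_width = t_critical * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Made-up sample of size n = 6 (NOT the slide's data).
made_up_sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
T_025_DF5 = 2.571  # tabulated t_{0.025} for 5 df

lower, upper = t_confidence_limits(made_up_sample, T_025_DF5)
print(round(lower, 2), round(upper, 2))  # 2.7 7.3
```

With only 6 observations the t critical value (2.571) is noticeably larger than the Normal value (1.960), so the interval is wider than a known-variance interval would be.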
28
Example
Let $x_1, x_2, x_3, x_4, x_5, x_6$ denote weight loss from a new diet for $n = 6$ cases.
[Table: the data and the summary statistics]
29
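95% Confidence Intervals (use $\alpha = 0.05$)
95% Confidence Limits: $\bar{x} \pm t_{0.025}\,\dfrac{s}{\sqrt{n}}$, with $n = 6$ and $t_{0.025} = 2.571$ for $n - 1 = 5$ df.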
30
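Confidence Limits for $\sigma^2$, the variance of a Normal population
Let $x_1, \dots, x_n$ denote a sample from the normal distribution with mean $\mu$ and variance $\sigma^2$. Both $\mu$ and $\sigma^2$ are unknown.
Recall: $U = \dfrac{(n-1)s^2}{\sigma^2} = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma^2}$ has a $\chi^2$-distribution with $n - 1$ df.
The statistic $U$ satisfies the conditions for a pivotal statistic for estimating $\sigma^2$.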
31
$U$ has a known distribution, the $\chi^2$-distribution with $n - 1$ df.
$U$ only depends on the unknown parameter $\sigma^2$.
$U$ depends on the data through the sufficient statistics $\bar{x}$ and $s^2$.
32
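Critical Values for the $\chi^2$-distribution with $\nu$ df
Definition: The $\alpha$-upper critical value for the $\chi^2$-distribution with $\nu$ df is the quantity $\chi^2_\alpha$ such that $P(U \ge \chi^2_\alpha) = \alpha$, where $U$ has the $\chi^2$-distribution with $\nu$ df.
[Figure: $\chi^2$-distribution with $\nu$ df]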
33
Note: $P(U \ge \chi^2_{\alpha/2}) = \alpha/2$ and $P(U \ge \chi^2_{1-\alpha/2}) = 1 - \alpha/2$.
34
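Confidence limits for $\sigma^2$ and $\sigma$.
$P\left(\chi^2_{1-\alpha/2} \le \dfrac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2}\right) = 1 - \alpha$
thus $P\left(\dfrac{1}{\chi^2_{\alpha/2}} \le \dfrac{\sigma^2}{(n-1)s^2} \le \dfrac{1}{\chi^2_{1-\alpha/2}}\right) = 1 - \alpha$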
35
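hence $P\left(\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right) = 1 - \alpha$
and $P\left(\sqrt{\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}}} \le \sigma \le \sqrt{\dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}\right) = 1 - \alpha$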
36
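hence $\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}}$ to $\dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$ is a $(1-\alpha)100\%$ confidence interval for $\sigma^2$,
and $\sqrt{\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}}}$ to $\sqrt{\dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}$ is a $(1-\alpha)100\%$ confidence interval for $\sigma$.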
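These limits can be sketched in Python as follows. As before, the six observations are made up for illustration (they are not the data from the slides) and the function name is hypothetical. The critical values 12.833 and 0.8312 are the tabulated upper 0.025 and 0.975 critical values of the $\chi^2$-distribution with 5 df, supplied by hand because the standard library has no $\chi^2$ quantile function.

```python
import math
from statistics import variance


def variance_confidence_limits(sample, chi2_upper, chi2_lower):
    """Limits (n-1)s^2/chi2_{alpha/2} and (n-1)s^2/chi2_{1-alpha/2} for sigma^2.

    chi2_upper is the alpha/2 upper critical value (the larger number),
    chi2_lower is the 1 - alpha/2 upper critical value (the smaller number),
    both for n - 1 df.
    """
    df = len(sample) - 1
    s2 = variance(sample)  # sample variance (divisor n - 1)
    return df * s2 / chi2_upper, df * s2 / chi2_lower


# Made-up sample of size n = 6 (NOT the slide's data).
made_up_sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
CHI2_025_DF5 = 12.833  # tabulated chi-square_{0.025} for 5 df
CHI2_975_DF5 = 0.8312  # tabulated chi-square_{0.975} for 5 df

var_lo, var_hi = variance_confidence_limits(made_up_sample, CHI2_025_DF5, CHI2_975_DF5)
# Limits for sigma are the square roots of the limits for sigma^2.
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)
print(round(var_lo, 2), round(var_hi, 2))  # 1.87 28.87
print(round(sd_lo, 2), round(sd_hi, 2))    # 1.37 5.37
```

The interval is far from symmetric about $s^2 = 4.8$, which reflects the skewness of the $\chi^2$-distribution at small degrees of freedom.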
37
Example
Let $x_1, x_2, x_3, x_4, x_5, x_6$ denote weight loss from a new diet for $n = 6$ cases.
[Table: the data and the summary statistics]
38
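$(1-\alpha)100\%$ confidence interval for $\sigma^2$.
Using $\alpha = 0.05$, with $n - 1 = 5$ df, $\chi^2_{0.025} = 12.833$ and $\chi^2_{0.975} = 0.8312$.
Thus the 95% confidence limits for $\sigma^2$ are: $\dfrac{5s^2}{12.833}$ to $\dfrac{5s^2}{0.8312}$.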
39
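$(1-\alpha)100\%$ confidence interval for $\sigma$.
Using $\alpha = 0.05$, with $n - 1 = 5$ df, $\chi^2_{0.025} = 12.833$ and $\chi^2_{0.975} = 0.8312$.
Thus the 95% confidence limits for $\sigma$ are: $\sqrt{\dfrac{5s^2}{12.833}}$ to $\sqrt{\dfrac{5s^2}{0.8312}}$.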