Determination of Sample Size


1 Determination of Sample Size
In almost all research situations the researcher is interested in the question: How large should the sample be?

2 Answer: It depends on how accurate you want the answer to be.
Accuracy is specified by: the magnitude of the error bound, and the level of confidence.

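3 Confidence Intervals for the mean of a Normal Population, μ
Let $t_1 = \bar{x} - z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$ and $t_2 = \bar{x} + z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$.
Then $t_1$ to $t_2$ is a (1 – α)100% = P100% confidence interval for μ.
$B = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$ = (1 – α)100% Error Bound for μ.
The accuracy of a confidence interval is specified by setting: the magnitude of B (the Error Bound), and the level of confidence (1 – α)100%.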

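4 Error Bound: $B = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
If we have specified the level of confidence then the value of $z_{\alpha/2}$ will be known. If we have specified the magnitude of B, it will also be known.
Solving for n we get: $n = \dfrac{z_{\alpha/2}^2\,\sigma^2}{B^2} \approx \dfrac{z_{\alpha/2}^2\,(\sigma^*)^2}{B^2}$
σ* is some estimate of σ.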

5 Summarizing: The sample size that will estimate μ with an Error Bound B and level of confidence P = 1 – α is:
$n = \dfrac{z_{\alpha/2}^2\,(\sigma^*)^2}{B^2}$
where:
B is the desired Error Bound,
$z_{\alpha/2}$ is the α/2 critical value for the standard normal distribution,
σ* is some preliminary estimate of σ.

6 Notes:
n increases as B, the desired Error Bound, decreases. A larger sample size is required for a higher level of accuracy.
n increases as the level of confidence, (1 – α), increases, because $z_{\alpha/2}$ increases as α/2 becomes closer to zero. A larger sample size is required for a higher level of confidence.
n increases as the standard deviation, σ, of the population increases. If the population is more variable then a larger sample size is required.

7 Summary: The sample size n depends on:
the desired level of accuracy,
the desired level of confidence,
the variability of the population.

8 Example
Suppose that one is interested in estimating the average number of grams of fat (μ) in one kilogram of lean beef hamburger.
This will be estimated by randomly selecting one-kilogram samples, then measuring the fat content for each sample.
Preliminary estimates indicate that μ and σ are approximately 220 and 40 respectively.
I want the study to estimate μ with an error bound of 5 and a level of confidence of 95% (i.e. α = 0.05 and $z_{\alpha/2} = z_{0.025} = 1.960$).

9 Solution
$n = \dfrac{z_{\alpha/2}^2\,(\sigma^*)^2}{B^2} = \dfrac{(1.960)^2 (40)^2}{5^2} = 245.86$
Hence n = 246 one-kilogram samples are required to estimate μ within B = 5 g with a 95% level of confidence.
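The arithmetic in this example can be checked with a short script. This is a minimal sketch, not part of the original slides: the function name `sample_size_mean` and its parameters are illustrative, and it rounds up to the next whole sample, which is the usual convention.

```python
from math import ceil
from statistics import NormalDist


def sample_size_mean(error_bound, sigma_guess, confidence=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= error_bound.

    sigma_guess plays the role of sigma*, the preliminary estimate of sigma.
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    return ceil((z * sigma_guess / error_bound) ** 2)


# Hamburger example: sigma* = 40, B = 5, 95% confidence.
n_beef = sample_size_mean(error_bound=5, sigma_guess=40)

# Halving B, or raising the confidence level, both increase n.
n_tighter = sample_size_mean(error_bound=2.5, sigma_guess=40)
n_more_confident = sample_size_mean(error_bound=5, sigma_guess=40, confidence=0.99)

print(n_beef, n_tighter, n_more_confident)
```

Because n is proportional to 1/B², halving the error bound roughly quadruples the required sample size.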

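10 Confidence Intervals for a Bernoulli probability, p
Let $t_1 = \hat{p} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ and $t_2 = \hat{p} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$, where $\hat{p}$ is the sample proportion.
Then $t_1$ to $t_2$ is a (1 – α)100% = P100% confidence interval for p.
$B = z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ = (1 – α)100% Error Bound for p.
The accuracy of a confidence interval is specified by setting: the magnitude of B (the Error Bound), and the level of confidence (1 – α)100%.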

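11 Error Bound: $B = z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}}$
If we have specified the level of confidence then the value of $z_{\alpha/2}$ will be known. If we have specified the magnitude of B, it will also be known.
Solving for n we get: $n = \dfrac{z_{\alpha/2}^2\,p^*(1-p^*)}{B^2}$, where p* is some preliminary estimate of p.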

12 Summarizing: The sample size that will estimate p with an Error Bound B and level of confidence P = 1 – α is:
$n = \dfrac{z_{\alpha/2}^2\,p^*(1-p^*)}{B^2}$
where:
B is the desired Error Bound,
$z_{\alpha/2}$ is the α/2 critical value for the standard normal distribution,
p* is some preliminary estimate of p.
If no estimate for p is available use p* = 0.50. One can easily check that the maximum sample size required occurs when p* = 0.50.

13 maximum sample size n occurs when p = 0.50.
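The claim about the maximum can be verified with a short calculus argument. The derivation below is a sketch added for clarity; it treats $z_{\alpha/2}$ and B as fixed and n as a function of the assumed value p.

```latex
\[
n(p) = \frac{z_{\alpha/2}^2\, p(1-p)}{B^2}, \qquad
\frac{dn}{dp} = \frac{z_{\alpha/2}^2}{B^2}\,(1-2p) = 0 \iff p = \tfrac{1}{2}, \qquad
\frac{d^2 n}{dp^2} = -\frac{2\,z_{\alpha/2}^2}{B^2} < 0,
\]
\[
n_{\max} = n\!\left(\tfrac{1}{2}\right) = \frac{z_{\alpha/2}^2}{4B^2}.
\]
```

So planning with p = 0.50 is conservative: it never understates the required sample size.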

14 Example
Suppose that I want to conduct a survey to estimate p = proportion of voters who favour a downtown location for a casino.
I know that the approximate value of p is p* = 0.50. This is also a good choice for p* if one has no preliminary estimate of its value.
I want the survey to estimate p with an error bound B = 0.01 (1 percentage point).
I want the level of confidence to be 95% (i.e. α = 0.05 and $z_{\alpha/2} = z_{0.025} = 1.960$).
Then $n = \dfrac{z_{\alpha/2}^2\,p^*(1-p^*)}{B^2} = \dfrac{(1.960)^2 (0.50)(0.50)}{(0.01)^2} = 9604$.
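The survey calculation can be sketched in a few lines. The function name `sample_size_proportion` is illustrative, and the default `p_guess=0.5` is an assumption: it is the conservative value the slides recommend when no preliminary estimate of p is available.

```python
from math import ceil
from statistics import NormalDist


def sample_size_proportion(error_bound, p_guess=0.5, confidence=0.95):
    """Smallest n with z_{alpha/2} * sqrt(p(1-p)/n) <= error_bound.

    p_guess plays the role of p*; 0.5 gives the largest (safest) n.
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    return ceil(z * z * p_guess * (1.0 - p_guess) / error_bound ** 2)


# Casino survey: B = 0.01 at 95% confidence, with the conservative p* = 0.5.
n_survey = sample_size_proportion(error_bound=0.01)

# A hypothetical sharper prior guess (p* = 0.2) needs fewer respondents.
n_if_p_is_20pct = sample_size_proportion(error_bound=0.01, p_guess=0.2)

print(n_survey, n_if_p_is_20pct)
```

An error bound of one percentage point is demanding: relaxing B to 0.03 cuts the requirement by roughly a factor of nine.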

15 A general method for constructing confidence limits

16 Definition: A statistic t is called a pivotal statistic (for determining confidence limits for the parameter φ) if:
1. The distribution of t is completely known (not dependent on any unknown parameters).
2. The only unknown parameter that the statistic t depends on is the parameter φ (the parameter being estimated).
3. The statistic t depends on the data x1, …, xn through the sufficient statistics S1, …, Sq.

17 Examples of pivotal statistics
Estimating μ, the mean of a Normal population:
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
A pivotal statistic if σ is known.
Has a known distribution, N(0,1).
Only depends on the unknown parameter μ (σ is assumed known).
Depends on the data through the sufficient statistic $\bar{x}$.

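18 Estimating p, the Bernoulli probability
$z = \dfrac{\hat{p} - p}{\sqrt{\hat{p}(1-\hat{p})/n}}$, where $\hat{p}$ is the sample proportion.
A pivotal statistic (approximately, for large n).
Has a known distribution, approximately N(0,1).
Only depends on the unknown parameter p.
Depends on the data through the sufficient statistic $\hat{p}$.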

19 To construct confidence limits using a pivotal statistic
1. Construct a probability statement regarding the pivotal statistic t. This is possible because the distribution of t is completely known.
2. Translate this statement into a confidence statement about the parameter φ (the parameter being estimated).

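20 Estimating μ, the mean of a Normal population (σ² known)
Pivotal Statistic: $z = \dfrac{\sqrt{n}\,(\bar{x} - \mu)}{\sigma}$
Starting with
$P\!\left[-z_{\alpha/2} \le \dfrac{\sqrt{n}\,(\bar{x} - \mu)}{\sigma} \le z_{\alpha/2}\right] = 1 - \alpha$
after some manipulation we get
$P\!\left[\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right] = 1 - \alpha$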
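The confidence statement can be checked empirically: if the interval is built this way over many repeated samples, about (1 – α)100% of the intervals should contain μ. The simulation below is a sketch under assumed values (μ = 220, σ = 40, n = 25, a fixed seed); none of these come from this slide.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def z_interval(sample, sigma, confidence=0.95):
    """Confidence limits x_bar -/+ z_{alpha/2} * sigma / sqrt(n), sigma known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sigma / sqrt(len(sample))
    centre = fmean(sample)
    return centre - half_width, centre + half_width


def coverage(mu, sigma, n, repetitions=4000, seed=1):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = z_interval(sample, sigma)
        hits += lower <= mu <= upper
    return hits / repetitions


observed = coverage(mu=220.0, sigma=40.0, n=25)
print(f"observed coverage: {observed:.3f}")  # should be close to 0.95
```

The probability statement is about the random interval, not about μ: μ is fixed, and the limits change from sample to sample.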

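21 Estimating p, a Bernoulli probability
Pivotal Statistic: $z = \dfrac{\hat{p} - p}{\sqrt{\hat{p}(1-\hat{p})/n}}$
Starting with
$P\!\left[-z_{\alpha/2} \le \dfrac{\hat{p} - p}{\sqrt{\hat{p}(1-\hat{p})/n}} \le z_{\alpha/2}\right] = 1 - \alpha$
after some manipulation we get
$P\!\left[\hat{p} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\right] = 1 - \alpha$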

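22 The t distribution: Estimating μ, the mean of a Normal population (σ² unknown)
Let x1, … , xn denote a sample from the normal distribution with mean μ and variance σ². Both μ and σ² are unknown.
Recall $z = \dfrac{\sqrt{n}\,(\bar{x} - \mu)}{\sigma}$ has a N(0,1) distribution.
Also $U = \dfrac{(n-1)s^2}{\sigma^2} = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma^2}$ has a χ² distribution with n – 1 df, and z and U are independent.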

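23 Recall also that if z has a N(0,1) distribution, U has a χ² distribution with ν df, and z and U are independent, then
$t = \dfrac{z}{\sqrt{U/\nu}}$
has a t-distribution with ν degrees of freedom.
Thus, since $z = \dfrac{\sqrt{n}\,(\bar{x} - \mu)}{\sigma}$ and $U = \dfrac{(n-1)s^2}{\sigma^2}$, then
$t = \dfrac{z}{\sqrt{U/(n-1)}} = \dfrac{\sqrt{n}\,(\bar{x} - \mu)}{s}$
has a t-distribution with n – 1 degrees of freedom.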

24 Thus we use $t = \dfrac{\sqrt{n}\,(\bar{x} - \mu)}{s}$ as the pivotal statistic
It satisfies the conditions of a pivotal statistic:
It has a known distribution, the t-distribution with n – 1 df.
It only depends on the unknown parameter μ.
It depends on the data through the sufficient statistics $\bar{x}$ and s.

25 Critical Values for the t-distribution with ν df
Definition: The α-upper critical value for the t-distribution with ν df is the quantity $t_\alpha$ such that $P[t > t_\alpha] = \alpha$.
[Figure: t-distribution with ν df]

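26 Thus we use $t = \dfrac{\sqrt{n}\,(\bar{x} - \mu)}{s}$ as the pivotal statistic to set up confidence limits for μ.
Starting with
$P\!\left[-t_{\alpha/2} \le \dfrac{\sqrt{n}\,(\bar{x} - \mu)}{s} \le t_{\alpha/2}\right] = 1 - \alpha$
we get
$P\!\left[\bar{x} - t_{\alpha/2}\dfrac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\dfrac{s}{\sqrt{n}}\right] = 1 - \alpha$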

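27 Hence $\bar{x} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$ are (1 – α)100% confidence limits for μ, where $t_{\alpha/2}$ is the α/2 upper critical value of the t-distribution with n – 1 df.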
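These limits can be computed without a statistics package. The sketch below uses only the Python standard library: it obtains the t critical value by integrating the t density numerically (Simpson's rule) and inverting by bisection, which is a convenience chosen here and not the slides' method. The data values are hypothetical placeholders, not the weight-loss data from the slides.

```python
from math import exp, lgamma, pi, sqrt
from statistics import fmean, stdev


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P[T <= x] for x >= 0, via Simpson's rule on [0, x] (steps is even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_upper_critical(alpha, df):
    """t_alpha with P[T > t_alpha] = alpha, for 0 < alpha < 0.5."""
    target = 1 - alpha
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it contains t_alpha
        hi *= 2
    for _ in range(50):  # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(sample, confidence=0.95):
    """Confidence limits x_bar -/+ t_{alpha/2} * s / sqrt(n), with n - 1 df."""
    n = len(sample)
    t_crit = t_upper_critical((1 - confidence) / 2, n - 1)
    half_width = t_crit * stdev(sample) / sqrt(n)  # stdev uses the n - 1 divisor
    centre = fmean(sample)
    return centre - half_width, centre + half_width


hypothetical_data = [2.0, 1.0, 1.4, 1.8, 0.9, 2.3]  # placeholder values, n = 6
lower, upper = t_interval(hypothetical_data)
print(f"t critical value (5 df): {t_upper_critical(0.025, 5):.3f}")
print(f"95% limits: {lower:.3f} to {upper:.3f}")
```

For n = 6 the critical value is about 2.571, noticeably larger than the normal value 1.960, so the interval is wider than the σ-known interval would be.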

28 Example Let x1, x2, x3 , x4, x5, x6 denote weight loss from a new diet for n = 6 cases. The Data: The summary statistics:

29 95% Confidence Intervals (use α = 0.05)
95% Confidence Limits: $\bar{x} \pm t_{0.025}\,\dfrac{s}{\sqrt{6}}$, where $t_{0.025} = 2.571$ for n – 1 = 5 df.

30 Confidence Limits for σ², the variance of a Normal population
Let x1, … , xn denote a sample from the normal distribution with mean μ and variance σ². Both μ and σ² are unknown.
Recall $U = \dfrac{(n-1)s^2}{\sigma^2} = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma^2}$ has a χ² distribution with n – 1 df.
The statistic U satisfies the conditions for a pivotal statistic for estimating σ².

31 U has a known distribution, the χ²-distribution with n – 1 df.
It only depends on the unknown parameter σ².
It depends on the data through the sufficient statistic s².

32 Critical Values for the χ²-distribution with ν df
Definition: The α-upper critical value for the χ²-distribution with ν df is the quantity $\chi^2_\alpha$ such that $P[U > \chi^2_\alpha] = \alpha$.
[Figure: χ²-distribution with ν df]

33 Note: $P[U > \chi^2_{\alpha/2}] = \alpha/2$ and $P[U < \chi^2_{1-\alpha/2}] = \alpha/2$

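34 Confidence limits for σ² and σ.
$P\!\left[\chi^2_{1-\alpha/2} \le U \le \chi^2_{\alpha/2}\right] = 1 - \alpha$
thus
$P\!\left[\chi^2_{1-\alpha/2} \le \dfrac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2}\right] = 1 - \alpha$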

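35 hence
$P\!\left[\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right] = 1 - \alpha$
and
$P\!\left[s\sqrt{\dfrac{n-1}{\chi^2_{\alpha/2}}} \le \sigma \le s\sqrt{\dfrac{n-1}{\chi^2_{1-\alpha/2}}}\right] = 1 - \alpha$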

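36 hence
$\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}}$ to $\dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$ is a (1 – α)100% confidence interval for σ²,
and
$s\sqrt{\dfrac{n-1}{\chi^2_{\alpha/2}}}$ to $s\sqrt{\dfrac{n-1}{\chi^2_{1-\alpha/2}}}$ is a (1 – α)100% confidence interval for σ.
The critical values are those of the χ²-distribution with n – 1 df.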
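The intervals for σ² and σ can be sketched with the standard library alone. The χ² distribution function is evaluated here through the power series for the regularized lower incomplete gamma function and inverted by bisection; that numerical route is an assumption of this sketch (adequate for small and moderate df), not something the slides prescribe. The data values are hypothetical placeholders.

```python
from math import exp, lgamma, log, sqrt
from statistics import variance


def chi2_cdf(x, df):
    """P[U <= x] for the chi-square distribution with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, half_x = df / 2, x / 2
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:  # series: sum of half_x^k / (a (a+1) ... (a+k))
        term *= half_x / (a + k)
        total += term
        k += 1
    return total * exp(a * log(half_x) - half_x - lgamma(a))


def chi2_upper_critical(alpha, df):
    """chi2_alpha with P[U > chi2_alpha] = alpha."""
    target = 1 - alpha
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < target:  # widen the bracket if needed
        hi *= 2
    for _ in range(60):  # bisection
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def variance_interval(sample, confidence=0.95):
    """Limits (n-1)s^2 / chi2_{alpha/2} to (n-1)s^2 / chi2_{1-alpha/2}."""
    df = len(sample) - 1
    alpha = 1 - confidence
    s2 = variance(sample)  # sample variance with the n - 1 divisor
    return (df * s2 / chi2_upper_critical(alpha / 2, df),
            df * s2 / chi2_upper_critical(1 - alpha / 2, df))


hypothetical_data = [2.0, 1.0, 1.4, 1.8, 0.9, 2.3]  # placeholder values, n = 6
var_lower, var_upper = variance_interval(hypothetical_data)
sd_lower, sd_upper = sqrt(var_lower), sqrt(var_upper)  # limits for sigma
print(f"95% limits for variance: {var_lower:.3f} to {var_upper:.3f}")
print(f"95% limits for std dev:  {sd_lower:.3f} to {sd_upper:.3f}")
```

Note that the larger critical value gives the lower limit, and that the interval is not symmetric about s² because the χ² distribution is skewed.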

37 Example Let x1, x2, x3 , x4, x5, x6 denote weight loss from a new diet for n = 6 cases. The Data: The summary statistics:

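38 (1 – α)100% confidence interval for σ².
Using α = 0.05 and n – 1 = 5 df: $\chi^2_{0.025} = 12.833$ and $\chi^2_{0.975} = 0.831$.
Thus the 95% confidence limits for σ² are: $\dfrac{5s^2}{12.833}$ to $\dfrac{5s^2}{0.831}$.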

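39 (1 – α)100% confidence interval for σ.
Using α = 0.05 and n – 1 = 5 df: $\chi^2_{0.025} = 12.833$ and $\chi^2_{0.975} = 0.831$.
Thus the 95% confidence limits for σ are: $s\sqrt{\dfrac{5}{12.833}}$ to $s\sqrt{\dfrac{5}{0.831}}$.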

