Choosing the Sample Size
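Confidence Interval for a Mean. Given: a random sample of size n from a Normal population with mean $\mu$, where the sample is at most 5% of the population ($n/N \le 0.05$). Result: a confidence interval for $\mu$ is given by $\bar{x} \pm t^* S/\sqrt{n}$, where $\bar{x}$ is the sample mean, S is the sample standard deviation, and t* is the appropriate critical value for the t distribution with (n – 1) DF. The quantity $E = t^* S/\sqrt{n}$ is the error margin.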
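Example: A radar gun was used to find the speed of 19 vehicles passing a checkpoint in a 30 mph zone near a school. The distribution of speeds was unimodal and symmetric, with sample mean $\bar{x}$ and sample standard deviation S of about 6.3 mph. With DF = 18, the critical value for 90% confidence is t* = 1.734, so the error margin for a 90% confidence interval is $E = t^* S/\sqrt{n} \approx 1.734 \times 6.3/\sqrt{19} \approx 2.5$ mph.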
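The error margin arithmetic for this example can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `error_margin` is made up for illustration, the critical value 1.734 is the standard t-table entry for 90% confidence with 18 DF, and S = 6.3 is the rounded standard deviation quoted for this study.

```python
import math

def error_margin(t_star, s, n):
    """Error margin E = t* . S / sqrt(n) for a confidence interval for a mean."""
    return t_star * s / math.sqrt(n)

# Assumptions (not computed here):
#   t* = 1.734 is the t-table value for 90% confidence and DF = 19 - 1 = 18.
#   S = 6.3 mph is the rounded sample standard deviation from the radar study.
T_STAR_90_DF18 = 1.734
S_RADAR = 6.3
N_RADAR = 19

E = error_margin(T_STAR_90_DF18, S_RADAR, N_RADAR)
print(round(E, 1))  # 2.5
```

The interval itself would be the sample mean plus or minus this value.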
Example: Suppose we want to estimate the mean speed of vehicles with 90% confidence and error margin no greater than 0.5 mph. Results of the prior study put the standard deviation around 6.3 mph. (There is no guarantee that S will be 6.3 for a future study. In fact, it won't. You can fudge this, but if you fudge small and the actual result is larger, you will miss your goal.)
Solution: Trial and Error. With n = 19, error margin = 2.5. To reduce the error margin by 5×, try increasing sample size 5×: 5(19) = 95… If n = 95 then DF = 94 and t* = 1.661. (Notice that z* = 1.645, quite close.) A 5× increase in n doesn't produce a 5× decrease in E, because E is proportional to $1/\sqrt{n}$. More like a square root of 5 = 2.24× decrease: 2.5/2.24 = 1.12.
Solution: Trial and Error. n = 95 gives about 1.07 for the error margin. We want to cut that in two. So: quadruple ($4 = 2^2$) the sample size. If n = 380 then DF = 379 and t* = 1.649. (Notice that z* = 1.645, very very close.) That gives an error margin of about 0.53, still slightly above the 0.5 target, so somewhat more than 380 (roughly 430) will do the trick.
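The trial-and-error search can be automated. The sketch below is an illustration under stated assumptions rather than the method used on the slides: it takes S = 6.3 as given, and because the Python standard library has no t distribution, it computes t* numerically (Simpson's rule for the CDF, bisection for the quantile). All function names are hypothetical, and the bisection assumes t* is below 50.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=400):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Two-sided critical value t*, found by bisection (assumes t* < 50)."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 50.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def error_margin(confidence, s, n):
    """E = t* . S / sqrt(n), with t* taken at n - 1 degrees of freedom."""
    return t_critical(confidence, n - 1) * s / math.sqrt(n)

def smallest_n(confidence, s, max_margin, start=2):
    """Step n upward from `start` until the error margin meets the target."""
    n = start
    while error_margin(confidence, s, n) > max_margin:
        n += 1
    return n

# The three sample sizes tried in the text, with S assumed to stay at 6.3.
for n in (19, 95, 380):
    print(n, round(error_margin(0.90, 6.3, n), 2))

# Continue the search from n = 380 until E <= 0.5.
n_needed = smallest_n(0.90, 6.3, 0.5, start=380)
print(n_needed)
```

The loop should reproduce margins of roughly 2.5, 1.07, and 0.53 for the three sample sizes tried above. Because E shrinks as n grows, stepping upward from 380 stops at the smallest sample size that meets the target.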
Solution: Via Formula. Since the t* value will be close to the z* value (unless n is very small), we can use z* to determine the sample size. Solving $E = z^* S/\sqrt{n}$ for n gives $n = (z^* S/E)^2$, rounded up to the next whole number.
Solution: Example. With confidence 90%, z* = 1.645. We're taking S = 6.3 (although this is only an educated guess). The desired error margin is E = 0.5 mph. Then $n = (1.645 \times 6.3/0.5)^2 \approx 429.6$, so sample at least n = 430 vehicles.
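The z-based sample size calculation is short enough to script. The following is a minimal sketch using only the Python standard library; the function name `sample_size_for_mean` is an illustrative choice, and S = 6.3 remains an educated guess rather than a known value. It uses the 90% confidence level from the stated goal.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(confidence, s_guess, max_margin):
    """Smallest n with z* . S / sqrt(n) <= max_margin, i.e. n = ceil((z* S / E)^2)."""
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return math.ceil((z_star * s_guess / max_margin) ** 2)

# 90% confidence, guessed S = 6.3 mph, target error margin 0.5 mph.
n = sample_size_for_mean(0.90, 6.3, 0.5)
print(n)  # 430
```

Rounding up rather than to the nearest integer matters here: rounding down would leave the error margin slightly above the target.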
What Do I Use for S? Previously collected data – even on a similar but not identical variable – is useful. The “Range Rule of Thumb” (S is roughly the range divided by 4) can be useful in generating an educated guess for the standard deviation. To ensure the error margin, guess conservatively (too high). If you are too conservative, you will oversample (costs money, wastes time). If the guess is too small (S actually turns out larger than what you guessed), you will not achieve the target for E. If the sample size you obtain is small (say less than 30), you may want to increase it by a bit to adjust for the difference between t* and z*.
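The trade-off between guessing S too high and too low can be made concrete. In this sketch the low and high speeds (18 and 46 mph) and the optimistic guess S = 5 are hypothetical numbers chosen only for illustration, and the function names are likewise made up. The range rule is taken as S ≈ range / 4.

```python
import math
from statistics import NormalDist

def range_rule_sd(low, high):
    """Range Rule of Thumb: guess S as (typical high - typical low) / 4."""
    return (high - low) / 4

def z_star(confidence):
    """Two-sided standard Normal critical value."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def planned_sample_size(confidence, s_guess, max_margin):
    """n = ceil((z* . S / E)^2) for a guessed S."""
    return math.ceil((z_star(confidence) * s_guess / max_margin) ** 2)

def achieved_margin(confidence, s_actual, n):
    """Approximate error margin actually obtained if S turns out to be s_actual."""
    return z_star(confidence) * s_actual / math.sqrt(n)

# Hypothetical: typical speeds run from about 18 to 46 mph.
s_from_range = range_rule_sd(18, 46)
n_conservative = planned_sample_size(0.90, s_from_range, 0.5)

# Hypothetical optimistic guess S = 5, while the real S turns out to be 6.3.
n_optimistic = planned_sample_size(0.90, 5.0, 0.5)
margin_if_low = achieved_margin(0.90, 6.3, n_optimistic)

print(s_from_range, n_conservative)
print(n_optimistic, round(margin_if_low, 2))
```

With these made-up inputs, the conservative guess calls for more vehicles than a guess of 6.3 would, while the optimistic guess yields a sample too small to hit the 0.5 mph target once S comes in at 6.3.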