Sampling and Sample Size Part 1 Cally Ardington
Course Overview: 1. What is Evaluation? 2. Outcomes, Impact, and Indicators 3. Why Randomise? 4. How to Randomise? 5. Sampling and Sample Size 6. Threats and Analysis 7. Project from Start to Finish 8. Cost Effectiveness and Scaling
Lecture Outline: Precision and accuracy; Statistical tools; Population and sampling distribution; Law of Large Numbers and Central Limit Theorem; Standard deviation and standard error
Which of these is more accurate? A. I B. II C. Don’t know [Figure: two panels labelled I and II]
Accuracy versus Precision [Figure: estimates plotted against the truth; labels: Precision (Sample Size), Accuracy (Randomization)]
This session’s question: How large does the sample need to be for you to be able to detect a given treatment effect? Randomization removes the bias (ensures accuracy), but it does not remove noise. We control precision with sample size.
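A minimal simulation sketch of the point above, under stated assumptions: the true effect (5 points), the baseline mean (50), the noise level (SD of 10), and the function name `simulate_estimates` are all hypothetical and not taken from the slides. Random assignment is represented by drawing both arms from the same baseline distribution. The estimates centre on the true effect at either sample size (accuracy), while their spread shrinks as the sample grows (precision).

```python
import random
import statistics

def simulate_estimates(n_per_arm, true_effect=5.0, noise_sd=10.0,
                       n_trials=1000, seed=0):
    """Return the estimated effect from each of many simulated randomized trials."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_trials):
        # Both arms share the same baseline; only the treated arm gets the effect.
        control = [rng.gauss(50.0, noise_sd) for _ in range(n_per_arm)]
        treated = [rng.gauss(50.0 + true_effect, noise_sd) for _ in range(n_per_arm)]
        estimates.append(statistics.fmean(treated) - statistics.fmean(control))
    return estimates

small = simulate_estimates(n_per_arm=20)    # hypothetical small study
large = simulate_estimates(n_per_arm=200)   # hypothetical large study

for label, est in (("n = 20 per arm", small), ("n = 200 per arm", large)):
    print(f"{label}: mean estimate = {statistics.fmean(est):.2f}, "
          f"spread (SD of estimates) = {statistics.stdev(est):.2f}")
```

Both sets of estimates should average close to 5, but the larger study's estimates cluster much more tightly around it.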
Population distribution
Take a random sample: Sampling distribution
We generally don’t have our population distribution, but we have our sampling distribution. What do we know about our sampling distribution? Two statistical laws help us here: (1) Central Limit Theorem (2) The Law of Large Numbers
(1) Central Limit Theorem: from the distribution of the population (Population Distribution) to the distribution of means from all random samples (Sampling Distribution)
Central Limit Theorem [Figure sequence: Draws 1 to 10 from the population, each marked with its mean test score]
Draw 10 random students, take the average, plot it: 10 times. Too few draws. No clear distribution around the population mean.
Draw 10 random students: 50 and 100 times. More sample means around the population mean. Still spread a good deal.
Draw 10 random students: 500 and 1000 times. Distribution now noticeably closer to normal. Starting to see peaks.
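The repeated-draw exercise can be sketched in a few lines. This is an illustration only: the population of test scores is made up (deliberately skewed, capped at 100), and the helper names `sample_means` and `text_histogram` are hypothetical. It draws 10 students, takes the average, and repeats this 10, 100, and 1000 times, printing a rough text histogram each time.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population of 10,000 test scores, deliberately skewed
# (an assumption for illustration; the slides do not give the data).
population = [min(100.0, rng.expovariate(1 / 30.0)) for _ in range(10_000)]
population_mean = statistics.fmean(population)

def sample_means(sample_size, n_draws):
    """Draw `n_draws` random samples and return the mean of each."""
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_draws)
    ]

def text_histogram(values, bin_width=5, max_bar=40):
    """Print a rough text histogram of `values`."""
    counts = {}
    for v in values:
        left_edge = int(v // bin_width) * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    tallest = max(counts.values())
    for left_edge in sorted(counts):
        bar = "#" * round(max_bar * counts[left_edge] / tallest)
        print(f"{left_edge:3d} {bar}")

for n_draws in (10, 100, 1000):
    means = sample_means(sample_size=10, n_draws=n_draws)
    print(f"{n_draws} draws (population mean = {population_mean:.1f})")
    text_histogram(means)
```

With only 10 draws the histogram is ragged; with 1000 draws a bell shape centred near the population mean should emerge, even though the population itself is skewed.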
This is a theoretical exercise. In reality we do not have multiple draws; we only have one draw. BUT we can control the number of people in that draw. This is what we refer to as SAMPLE SIZE. The previous example was based on a sample size of 10. What happens if we take a sample size of 50?
What happens to the sampling distribution if we draw samples of size 50 instead of 10, and take the mean (thousands of times)? A. We will approach a bell curve faster (than with a sample size of 10) B. The bell curve will be narrower C. Both A & B D. Neither. The underlying sampling distribution does not change.
(2) Law of Large Numbers [Figure: sampling distributions for N = 10 and N = 50]
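A rough way to compare N = 10 with N = 50 is to simulate both sampling distributions and measure their spread. The sketch below assumes a hypothetical population of uniformly distributed test scores and a hypothetical helper `sampling_distribution`; neither comes from the slides.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population of test scores between 0 and 100 (an assumption).
population = [rng.uniform(0, 100) for _ in range(10_000)]

def sampling_distribution(sample_size, n_draws=2000):
    """Means of `n_draws` random samples, each of `sample_size` students."""
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_draws)
    ]

spread_10 = statistics.stdev(sampling_distribution(10))
spread_50 = statistics.stdev(sampling_distribution(50))

print(f"N = 10: spread of sample means = {spread_10:.2f}")
print(f"N = 50: spread of sample means = {spread_50:.2f}")
print(f"ratio = {spread_10 / spread_50:.2f} (sqrt(5) is about 2.24)")
```

The sample means for N = 50 sit closer to the population mean than those for N = 10, and the ratio of the two spreads should come out near the square root of 5.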
Standard deviation/error: What’s the difference between the standard deviation and the standard error? The standard error = the standard deviation of the sampling distribution.
Variance and Standard Deviation
Standard Deviation
Standard Error
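The formulas on these three slides did not survive extraction. For reference, the standard textbook definitions are given here, which may differ in notation from the original slides. The symbols are assumptions: $x_i$ are the observations in a sample of size $N$, $\bar{x}$ is the sample mean, and $s$ is the sample standard deviation.

```latex
\[
  s^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2
  \qquad \text{(variance)}
\]
\[
  s = \sqrt{s^2}
  \qquad \text{(standard deviation)}
\]
\[
  SE = \frac{s}{\sqrt{N}}
  \qquad \text{(standard error of the mean)}
\]
```

The standard deviation describes the spread of individual observations, while the standard error describes the spread of the sample mean across repeated samples.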
Sample size ↑ x4, SE ↓ ½ [Figure: sample distribution (frequency) showing the population mean, standard deviation, and standard error]
Sample size ↑ x9, SE ↓ ?
Sample size ↑ x100, SE ↓ ? [Figure: sample distribution (frequency) showing the population mean, standard deviation, and standard error]
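The arithmetic behind these slides follows from the usual formula SE = SD / sqrt(N): multiplying the sample size by k multiplies the standard error by 1 / sqrt(k). A small sketch, with hypothetical function names, that works through the three cases:

```python
import math

def standard_error(sd, n):
    """Standard error of the mean for standard deviation `sd` and sample size `n`."""
    return sd / math.sqrt(n)

def se_shrink_factor(multiplier):
    """Factor applied to the SE when the sample size is multiplied by `multiplier`."""
    return 1 / math.sqrt(multiplier)

for k in (4, 9, 100):
    print(f"sample size x{k}: SE x{se_shrink_factor(k):.3f}")
# sample size x4: SE x0.500
# sample size x9: SE x0.333
# sample size x100: SE x0.100
```

Halving the standard error therefore takes four times the sample, and cutting it to a tenth takes a hundred times the sample.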