Determining Sample Size
Statistical Significance
What factors influence the probability of finding statistical significance?
- Alpha
- Sample size
- Amount of variability in the sample
- Magnitude of differences between groups/categories/intervals
Determining Sample Size
$$n = \left(\frac{t \cdot s}{E}\right)^2$$
Where
n = sample size
t = t score associated with the desired significance level
s = the estimated standard deviation
E = the amount of error that can be tolerated
Where to get s?
If we don't have population data, how do we know or even estimate s?
- One solution: take a small sample
- Not always practical
Sample Size for Proportion
$$n = \left(\frac{t \cdot s}{E}\right)^2$$
- For a proportion p, s = sqrt(p(1 - p)), so the largest s (s = .5) is associated with a proportion of .5
- Using .5 is thus a "prudent" assumption when choosing sample size
Example
How big should the NYS Housing and Community Renewal survey be?
- Want to be at least 90% confident
- Can tolerate a margin of error of plus or minus three percentage points
$$n = \left(\frac{t \cdot s}{E}\right)^2$$
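The arithmetic for this example can be sketched in Python. This is a minimal sketch under stated assumptions: it uses the standard normal z score as a large-sample stand-in for the t score, takes s = .5 (the "prudent" worst case for a proportion), and reads "90% confident" as a two-sided interval. The function name `required_sample_size` is illustrative and does not come from the slides.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(confidence: float, s: float, margin_of_error: float) -> int:
    """Return n = (t * s / E)^2, rounded up to the next whole respondent.

    Assumption: the t score is approximated by the standard normal z score,
    which is reasonable when the resulting sample is large.
    """
    # Two-sided interval: put half of (1 - confidence) in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * s / margin_of_error) ** 2)


# 90% confidence, worst-case s = 0.5, margin of error of 3 percentage points.
n = required_sample_size(confidence=0.90, s=0.5, margin_of_error=0.03)
print(n)  # 752
```

Under these assumptions the survey would need roughly 750 completed responses; a smaller assumed s or a wider margin of error lowers that number.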
POWER
IS MY COIN FAKE?
How many flips before you are confident the coin is fake?
Probability of getting that many heads in a row with a fair coin:

Number of heads   Probability
 1                0.5
 2                0.25
 3                0.125
 4                0.0625
 5                0.03125
 6                0.015625
 7                0.007813
 8                0.003906
 9                0.001953
10                0.000977
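The table can be reproduced, and the question answered for a chosen alpha, with a short Python sketch. It assumes every flip so far has come up heads and that "confident" means the probability under a fair coin has dropped below alpha; the function names are illustrative only.

```python
def prob_all_heads(flips: int, p_heads: float = 0.5) -> float:
    """Probability that a coin with P(heads) = p_heads lands heads on every flip."""
    return p_heads ** flips


def flips_needed(alpha: float = 0.05, p_heads: float = 0.5) -> int:
    """Smallest run of heads whose probability under the null is below alpha."""
    flips = 1
    while prob_all_heads(flips, p_heads) >= alpha:
        flips += 1
    return flips


# Reproduce the table for 1 to 10 consecutive heads.
for k in range(1, 11):
    print(k, round(prob_all_heads(k), 6))

print(flips_needed(alpha=0.05))  # 5
print(flips_needed(alpha=0.01))  # 7
```

With alpha at .05, five heads in a row is the first result unlikely enough to reject a fair coin; tightening alpha to .01 requires seven.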
Relationship between Power and Hypothesis Testing

                           Accept null hypothesis   Reject null hypothesis
Null hypothesis is true    Correct decision         Type I error (alpha, typically set to 5%)
Null hypothesis is false   Type II error            Correct decision

Power is defined as the probability of making the correct decision when the null hypothesis is false, that is, of rejecting it.
Requirements to estimate Power
- Type of test (e.g. two-sample independent t-test, one tail)
- Alpha
- Effect size of interest
- How much accuracy is desirable
- Sample size
- Standard deviation of sample
Requirements to estimate Power: Type of test
- Type of test (e.g. two-sample independent t-test, one tail)
- Given
Requirements to estimate Power: Alpha
- Prefer to avoid a Type I error (rejecting the null hypothesis although it is true): use a lower alpha (.01)
- Prefer to avoid a Type II error (accepting the null hypothesis although it is false): use a higher alpha (.05)
Requirements to estimate Power: Effect size of interest
- Determined by theory or intuition
- Are men heavier than women?
- What is an "important" difference? Two kilograms? Twenty kilograms?
Requirements to estimate Power: Effect size of interest
Cohen's d
$$d = \frac{M_t - M_c}{SD_{pooled}}$$
Where
M_t = mean of treatment or group 1
M_c = mean of control or group 2
$$SD_{pooled} = \sqrt{\frac{SD_t^2 (N_t - 1) + SD_c^2 (N_c - 1)}{N_c + N_t - 2}}$$
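The Cohen's d formula can be sketched in Python as follows. The group summaries passed in at the bottom are hypothetical numbers chosen only so the arithmetic is easy to check; they are not data from the class or the textbook, and the function names are illustrative.

```python
from math import sqrt


def pooled_sd(sd_t: float, n_t: int, sd_c: float, n_c: int) -> float:
    """Pooled standard deviation of two groups, weighted by degrees of freedom."""
    return sqrt((sd_t**2 * (n_t - 1) + sd_c**2 * (n_c - 1)) / (n_t + n_c - 2))


def cohen_d(mean_t: float, mean_c: float,
            sd_t: float, n_t: int, sd_c: float, n_c: int) -> float:
    """Difference between group means expressed in pooled standard deviation units."""
    return (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


# Hypothetical summaries: means 85 vs 80, both SDs 10, 30 cases per group.
d = cohen_d(mean_t=85, mean_c=80, sd_t=10, n_t=30, sd_c=10, n_c=30)
print(d)  # 0.5
```

Because both hypothetical groups have the same standard deviation, the pooled value is also 10 and the five-point gap works out to half a standard deviation.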
Requirements to estimate Power: Cohen's d
- Tells us how big a difference is substantively important
- Expresses the difference in standard deviation units
- Rules of thumb
  .2 small effect
  .5 moderate effect
  .8 large effect
- Consider using Cohen's d if you have no intuition about effect size or what is an important difference
Stata Examples
- Class data: Are men heavier than women?
- Problem 10.2 in the textbook

Problem 10.2: Captain Beaver is warned by Colonel Verleaf that if the mean efficiency rating for the 150 platoons under Verleaf's command falls below 80, Captain Beaver will be transferred to Minot Air Base (a base in the middle of nowhere). Beaver takes a sample of 20 platoons and finds the following: mean = 85; s = 13.5

Inputs for the power calculation:
- Null hypothesis: µ = 80
- Alternative hypothesis: µ = 85
- sd = 13.5
- n = 20
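For readers without Stata, the power for Problem 10.2 can be approximated in Python. This is a rough sketch, not the Stata procedure: it uses a normal approximation rather than the t distribution, ignores the finite population of 150 platoons, and assumes alpha = .05 because the slide does not state it. The slide also does not say whether the test is one- or two-tailed, so both are shown. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sample_mean(mu0: float, mu_alt: float, sd: float, n: int,
                          alpha: float = 0.05, two_sided: bool = True) -> float:
    """Approximate power of a one-sample test of the mean (normal approximation)."""
    std_normal = NormalDist()
    # Distance between the hypothesized means, in standard-error units.
    shift = abs(mu_alt - mu0) * sqrt(n) / sd
    if two_sided:
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        # Chance of landing in either rejection region when mu_alt is true.
        return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)
    z_crit = std_normal.inv_cdf(1 - alpha)
    return std_normal.cdf(shift - z_crit)


power_two_sided = power_one_sample_mean(80, 85, 13.5, 20)
power_one_sided = power_one_sample_mean(80, 85, 13.5, 20, two_sided=False)
print(round(power_two_sided, 2), round(power_one_sided, 2))  # 0.38 0.5
```

Either way, a sample of 20 gives at best about an even chance of detecting a five-point difference when the standard deviation is 13.5. A t-based calculation will come out somewhat lower than these figures.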
Problem 10.2
Stata Example: Power for Proportion
Problem 12.10 in the textbook: VISTA manager William suspects that 50% of his volunteers are over 65 years old. A survey of 16 volunteers reveals that 44% are over age 65. How much power does he have?
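The same question can be approximated in Python for Problem 12.10. This sketch assumes alpha = .05 (not stated on the slide) and uses the large-sample normal approximation for a one-sample proportion test, which is crude at n = 16, so the result should be read as a ballpark figure rather than the exact answer. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_one_proportion(p0: float, p_alt: float, n: int,
                         alpha: float = 0.05, two_sided: bool = True) -> float:
    """Approximate power of a one-sample proportion test (normal approximation)."""
    std_normal = NormalDist()
    sd_null = sqrt(p0 * (1 - p0))       # s under the null hypothesis
    sd_alt = sqrt(p_alt * (1 - p_alt))  # s under the alternative
    gap = abs(p_alt - p0) * sqrt(n)
    tail = alpha / 2 if two_sided else alpha
    z_crit = std_normal.inv_cdf(1 - tail)
    # Rejection region in the direction of the true difference.
    power = std_normal.cdf((gap - z_crit * sd_null) / sd_alt)
    if two_sided:
        # Small chance of rejecting in the opposite direction.
        power += std_normal.cdf((-gap - z_crit * sd_null) / sd_alt)
    return power


power_two_sided = power_one_proportion(0.50, 0.44, 16)
power_one_sided = power_one_proportion(0.50, 0.44, 16, two_sided=False)
print(round(power_two_sided, 3), round(power_one_sided, 3))
```

Under these assumptions the power is well below 20% for either test direction: with only 16 volunteers, a six-point difference from 50% is very unlikely to be detected.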
Problem 12.10
Sample size and power
Effect Size and Power
Are incomes higher in Mixed Income Developments?
NYSHCR survey of tenants
0 = not mixed income, 1 = mixed income