PSY 307 – Statistics for the Behavioral Sciences Chapter 11-12 – Confidence Intervals, Effect Size, Power
Point Estimates The best estimate of a population mean is the sample mean. When we use a sample to estimate parameters of the population, it is called a point estimate. How accurate is our point estimate? The sampling distribution of the mean is used to evaluate this.
Confidence Interval The range around the sample mean within which the true population mean is likely to be found. It consists of a range of values. The upper and lower values are the confidence limits. The range is determined by how confident you wish to be that the true mean falls between the values.
What is a Confidence Interval? A confidence interval for the mean is based on three elements: The value of the statistic (e.g., the sample mean, M). The standard error of the mean (σ_M). The desired level of confidence (e.g., 95% or 99%; for 95%, the critical z is 1.96). To calculate using z: M ± (z_conf)(σ_M)
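A minimal sketch of that formula in Python (variable names are illustrative; scipy is assumed to be available):

from scipy.stats import norm

def z_confidence_interval(mean, std_error, confidence=0.95):
    """Return (lower, upper) limits of a z-based confidence interval: M ± (z_conf)(σ_M)."""
    z_conf = norm.ppf(1 - (1 - confidence) / 2)  # e.g., 1.96 for 95%, about 2.58 for 99%
    margin = z_conf * std_error
    return mean - margin, mean + margin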
Levels of Confidence A 95% confidence interval means that if a series of confidence intervals were constructed around different sample means, about 95% of them would include the true population mean. With 99% confidence intervals, about 99% of them would include the true population mean.
Demos http://www.stat.sc.edu/~west/javahtml/ConfidenceInterval.html http://www.ruf.rice.edu/~lane/stat_sim/conf_interval/
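The behavior shown in those demos can also be reproduced with a short simulation (a sketch; numpy/scipy assumed, and the population values are arbitrary): draw many samples, build a 95% interval around each sample mean, and count how often the true mean is captured.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
pop_mean, pop_sd, n, reps = 100, 15, 25, 10_000
z95 = norm.ppf(0.975)                        # 1.96

samples = rng.normal(pop_mean, pop_sd, size=(reps, n))
means = samples.mean(axis=1)
se = pop_sd / np.sqrt(n)                     # standard error of the mean
captured = (np.abs(means - pop_mean) <= z95 * se).mean()
print(f"Proportion of 95% CIs containing the true mean: {captured:.3f}")  # ≈ .95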
Calculating Different Levels For 95%, use the critical z values that cut off 5% in the tails: 533 ± (1.96)(11) = 554.56 & 511.44, where M = 533 and σ_M = 11. For 99%, use the critical values that cut off 1% in the tails: 533 ± (2.58)(11) = 561.38 & 504.62.
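The same arithmetic, reusing the illustrative helper sketched earlier (the exact z for 99% is 2.576, so the limits differ slightly from the rounded 2.58 above):

m, se_m = 533, 11
print(z_confidence_interval(m, se_m, 0.95))  # ≈ (511.44, 554.56)
print(z_confidence_interval(m, se_m, 0.99))  # ≈ (504.67, 561.33); 2.58 rounding gives 504.62 & 561.38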
Sample Size Increasing the sample size decreases the variability of the sampling distribution of the mean: the standard error shrinks as σ_M = σ / √N.
Effect of Sample Size Because larger sample sizes produce a smaller standard error of the mean: The larger the sample size, the narrower and more precise the confidence interval will be. Sample size for a confidence interval, unlike a hypothesis test, can never be too large.
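A quick illustration of that point (a sketch; the population standard deviation of 15 is arbitrary): because σ_M = σ / √N, the 95% half-width shrinks as N grows.

import numpy as np
from scipy.stats import norm

pop_sd = 15
z95 = norm.ppf(0.975)
for n in (25, 100, 400, 1600):
    se = pop_sd / np.sqrt(n)                 # standard error of the mean
    print(f"N = {n:4d}  SE = {se:5.2f}  95% CI half-width = {z95 * se:5.2f}")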
Other Confidence Intervals Confidence intervals can be calculated for a variety of statistics, including r and variance. Later in the course we will calculate confidence intervals for t and for differences between means. Confidence intervals for percents or proportions frequently appear as the margin of error of a poll.
Effect Size Effect size is a measure of the difference between two populations. One population is the null population assumed by the null hypothesis. The other population is the population to which the sample belongs. For easy comparison, this difference is standardized (like a z score) by dividing it by the population standard deviation, σ.
Effect Size [Figure: two distributions with means X1 and X2; the distance between them is the effect size.]
A Significant Effect [Figure: the X1 and X2 distributions with the critical values marked.]
Calculating Effect Size Subtract the means and divide by the null population standard deviation: d = (μ1 − μ0) / σ, where μ1 is the mean of the population the sample belongs to and μ0 is the null-population mean. Interpreting Cohen’s d: Small = .20, Medium = .50, Large = .80
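A minimal sketch of that calculation (the numbers are illustrative, using the sample mean as the estimate of μ1):

def cohens_d(sample_mean, null_mean, pop_sd):
    """Cohen's d: difference between the means in population-SD units."""
    return (sample_mean - null_mean) / pop_sd

print(cohens_d(sample_mean=550, null_mean=500, pop_sd=100))  # 0.50, a "medium" effect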
Comparisons Across Studies The main value of calculating an effect size is when comparing across studies. Meta-analysis – a formal method for combining and analyzing the results of multiple studies. Sample sizes vary and affect significance in hypothesis tests, so test statistics (z, t, F) cannot be compared directly across studies.
Probabilities of Error Probability of a Type I error is α. Most of the time α = .05. When H0 is true, a correct decision (retaining H0) occurs .95 of the time (1 − .05 = .95). Probability of a Type II error is β. When there is a large effect, β is very small. When there is a small effect, β can be large, making a Type II error likely.
When there is no effect… [Figure: when there is no effect, the hypothesized and true distributions coincide; with α = .05, the 5% of sample means beyond the critical value (z = 1.65) produce Type I errors.]
Effect Size and Distribution Overlap Cohen’s d is a measure of effect size. The bigger the d, the bigger the difference in the means. http://www.bolderstats.com/gallery/normal/cohenD.html
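Under the common assumption of two normal distributions with equal standard deviations, the overlap can be expressed with the standard overlapping-coefficient formula 2·Φ(−d/2) (this formula is an addition here, not from the slides); a sketch:

from scipy.stats import norm

def overlap_coefficient(d):
    """Proportion of overlap between two equal-SD normal distributions whose means differ by d SDs."""
    return 2 * norm.cdf(-abs(d) / 2)

for d in (0.2, 0.5, 0.8):
    print(f"d = {d:.1f}  overlap ≈ {overlap_coefficient(d):.0%}")  # ≈ 92%, 80%, 69%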
Power The probability of producing a statistically significant result if the alternative hypothesis (H1) is true. The ability to detect an effect. Power = 1 − β (where β is the probability of making a Type II error).
Small Effects Have Low Power [Figure: the X1 and X2 distributions lie close together relative to the critical value.]
Large Effects Have More Power [Figure: the X1 and X2 distributions lie far apart, with the critical values marked.]
Calculating Power Most researchers use special purpose software or internet power calculators to determine power. This requires input of: Population mean, sample mean Population standard deviation Sample size Significance level, 1 or 2-tailed test http://www.stat.ubc.ca/~rollin/stats/ssize/n2.html
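For the one-sample z test used in these chapters, power can also be computed directly (a sketch under the assumptions of a known population standard deviation and a one-tailed test; the input values are illustrative):

from math import sqrt
from scipy.stats import norm

def power_one_sample_z(pop_mean, true_mean, pop_sd, n, alpha=0.05):
    """Power of a one-tailed, one-sample z test: P(sample mean exceeds the critical value | H1 is true)."""
    se = pop_sd / sqrt(n)
    critical = pop_mean + norm.ppf(1 - alpha) * se     # cutoff under H0
    return norm.sf(critical, loc=true_mean, scale=se)  # area beyond the cutoff under H1

print(power_one_sample_z(pop_mean=500, true_mean=510, pop_sd=100, n=100))  # ≈ 0.26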
Sample Power Graph 1
Sample Power Graph 2
How Power Changes with N WISE Demo http://wise.cgu.edu/powermod/exercise1b.asp
Effect of Larger Sample Size Larger samples produce sampling distributions with smaller standard deviations (smaller standard errors of the mean). Smaller standard errors mean less overlap between the two distributions.
β Decreases with Larger N’s Note: This is for an effect in the negative direction (H0 is the red curve on the right).
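Using the illustrative power function sketched earlier (note it assumes an effect in the positive direction, unlike the negative-direction figure described above), β = 1 − power can be tabulated for increasing N:

for n in (25, 50, 100, 200, 400):
    power = power_one_sample_z(pop_mean=500, true_mean=510, pop_sd=100, n=n)
    print(f"N = {n:3d}  power = {power:.2f}  beta = {1 - power:.2f}")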
Increasing Power Strengthen the effect by changing your manipulation (how the study is done). Decrease the population’s standard deviation by reducing noise and error (do the study well; use a within-subjects design). Increase the sample size. Change the significance level (α).