Exam Exam starts two weeks from today
Amusing Statistics Use what you know about normal distributions to evaluate this finding: The study, published in Pediatrics, the journal of the American Academy of Pediatrics, found that among the 4,508 students in Grades 5-8 who participated, 36 per cent reported excellent school performance, 38 per cent reported good performance, 20 per cent said they were average performers, and 7 per cent said they performed below average.
Review The Z-test is used to compare the mean of a sample to the mean of a population: $Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
Review The Z-score is normally distributed. Thus the probability of obtaining any given Z-score by random sampling is given by the Z table.
Review We can likewise determine critical values for Z such that we would reject the null hypothesis if our computed Z-score exceeds these values – For alpha = .05: Zcrit (one-tailed) = 1.645, Zcrit (two-tailed) = 1.96
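The review above can be sketched in a few lines of Python using only the standard library. This is a minimal sketch, not part of the original slides: the function name `z_test` and the example numbers (a sample of 36 with mean 105, compared to a population with mean 100 and standard deviation 15) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (z, one_tailed_p, two_tailed_p) for a one-sample Z-test."""
    standard_error = pop_sd / sqrt(n)              # sigma / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error  # the Z-score
    # Area under the standard normal curve beyond |z| (what the Z table gives)
    upper_tail = 1.0 - NormalDist().cdf(abs(z))
    return z, upper_tail, 2.0 * upper_tail

# Critical values for alpha = .05
z_crit_one_tailed = NormalDist().inv_cdf(0.95)   # about 1.645
z_crit_two_tailed = NormalDist().inv_cdf(0.975)  # about 1.96

# Hypothetical data, invented for this example
z, p_one, p_two = z_test(sample_mean=105, pop_mean=100, pop_sd=15, n=36)
significant_two_tailed = abs(z) > z_crit_two_tailed
```

Comparing `abs(z)` to the critical value and comparing the p-value to alpha are two ways of making the same decision.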
Confidence Intervals A related question you might ask: –Suppose you’ve measured a mean and computed a standard error of that mean –What is the range of values such that there is a 95% chance of the population mean falling within that range?
Confidence Intervals There is a 2.5% chance that the population mean is actually more than 1.96 standard errors above the observed mean
Confidence Intervals There is a 2.5% chance that the population mean is actually more than 1.96 standard errors below the observed mean
Confidence Intervals Thus there is a 95% chance that the true population mean falls within ±1.96 standard errors of a sample mean. Likewise, there is a 95% chance that the true population mean falls within ±1.96 standard deviations of a single measurement.
Confidence Intervals This is called the 95% confidence interval, and it is very useful. It works like significance bounds: if the 95% C.I. doesn’t include the mean of a population you’re comparing your sample to, then your sample is significantly different from that population.
Confidence Intervals Consider an example: You measure the concentration of mercury in your backyard to be 0.009 mg/kg. The concentration of mercury in the Earth’s crust is 0.007 mg/kg. Let’s pretend that, when measured at many sites around the globe, the standard deviation is known to be 0.002 mg/kg.
Confidence Intervals The 95% confidence interval for this mercury measurement is 0.009 ± (1.96 × 0.002) = 0.00508 to 0.01292 mg/kg
Confidence Intervals This interval includes 0.007 mg/kg which, it turns out, is the mean concentration found in the Earth’s crust in general. Thus you would conclude that your backyard isn’t artificially contaminated by mercury.
Confidence Intervals Imagine you take 25 samples from around Alberta and you found (using a standard error of 0.002/√25 = 0.0004): 0.009 ± (1.96 × 0.0004) = 0.008216 to 0.009784 mg/kg. This interval doesn’t include the 0.007 mg/kg value for the Earth’s crust, so you would conclude that Alberta has an artificially elevated amount of mercury in the soil.
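Both mercury intervals can be reproduced with a short Python sketch. The helper names `confidence_interval` and `contains` are hypothetical, chosen for this example; the numbers are the ones given in the slides (0.009 mg/kg observed, standard deviation 0.002 mg/kg, crust mean 0.007 mg/kg).

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(mean, sd, n=1, level=0.95):
    """Return (low, high) for a mean of n measurements with known sd."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    margin = z_crit * sd / sqrt(n)                    # z_crit * standard error
    return mean - margin, mean + margin

def contains(interval, value):
    """True if value lies inside the (low, high) interval."""
    low, high = interval
    return low <= value <= high

CRUST_MEAN = 0.007  # mg/kg, from the slides

backyard = confidence_interval(0.009, 0.002)       # one measurement
alberta = confidence_interval(0.009, 0.002, n=25)  # mean of 25 samples

backyard_differs = not contains(backyard, CRUST_MEAN)  # interval includes 0.007
alberta_differs = not contains(alberta, CRUST_MEAN)    # interval excludes 0.007
```

The observed value is the same in both cases; only the width of the interval changes, because the standard error shrinks with the square root of the sample size.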
Power If we perform a Z-test and determine that the difference between the mean of our sample and the mean of the population is unlikely to be due to chance (p < .05), we say that we have a significant result… but what if p is > .05?
Power What are the two reasons why p comes out greater than .05? – Your experiment lacked statistical power and you made a Type II error – The null hypothesis really is true
Power Two approaches: –The Hopelessly Jaded Grad Student Solution –The Wise and Well Adjusted Professor Procedure
Power 1. Hopelessly Jaded Grad Student Solution - conclude that your hypothesis was wrong and go directly to the grad student pub
Power - This is not the recommended course of action
Power 2. The Wise Professor Procedure - consider the several reasons why you might not have detected a significant effect
Power - recommended by wise professors the world over
Power Why might p be greater than .05? Recall that: $Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
Power Why might p be greater than .05? 1. Small effect size: $\bar{X}$ is quite close to the mean of the population – The effect doesn’t stand out from the variability in the data – You might be able to increase your effect size (e.g. with a larger dose or treatment)
Power Why might p be greater than .05? 2. Noisy data: $\sigma$, and therefore $\sigma_{\bar{X}}$, is quite large – A large denominator will swamp the small effect – Take greater care to reduce measurement errors and therefore $\sigma$
Power Why might p be greater than .05? 3. Sample size is too small: $\sigma_{\bar{X}}$ is quite large because $n$ is small – A large denominator will swamp the small effect – Run more subjects
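The three reasons correspond to the three quantities in the Z formula, which a small numeric sketch can make concrete. All of the numbers below are invented for illustration, and `z_statistic` is a hypothetical helper name.

```python
from math import sqrt

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """Z = (sample mean - population mean) / (sigma / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / sqrt(n))

Z_CRIT = 1.96  # two-tailed, alpha = .05

# Baseline: small effect, noisy data, modest sample size
baseline = z_statistic(102, 100, pop_sd=10, n=25)        # 2 / 2.0 = 1.0

# 1. Larger effect size: the numerator grows
bigger_effect = z_statistic(106, 100, pop_sd=10, n=25)   # 6 / 2.0 = 3.0

# 2. Less noise: sigma shrinks, so the denominator shrinks
less_noise = z_statistic(102, 100, pop_sd=4, n=25)       # 2 / 0.8 = 2.5

# 3. More subjects: n grows, so the denominator shrinks
more_subjects = z_statistic(102, 100, pop_sd=10, n=100)  # 2 / 1.0 = 2.0
```

Only the baseline falls short of the critical value; changing any one of the three quantities is enough to push Z past 1.96 in this made-up case.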
Power The solution in each case is more power. Power is like sensitivity: the ability to detect small effects in noisy data. It is the complement of the Type II error rate (power = 1 − β, where β is the probability of a Type II error). So that you know: there are equations for computing statistical power.
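The slides do not give the power equations, so the following is only a sketch of one standard form: the power of a two-tailed one-sample Z-test when the true mean is assumed known. The function name `z_test_power` and the example numbers are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(true_mean, null_mean, pop_sd, n, alpha=0.05):
    """Probability of rejecting the null when the true mean is true_mean."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # How far the true mean sits from the null mean, in standard errors
    shift = (true_mean - null_mean) / (pop_sd / sqrt(n))
    # Chance that Z lands beyond either critical value
    upper = 1.0 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower

# Invented example: true mean 102, null mean 100, sigma 10
power_n25 = z_test_power(102, 100, pop_sd=10, n=25)    # low power
power_n100 = z_test_power(102, 100, pop_sd=10, n=100)  # higher power
type_two_error_rate_n25 = 1.0 - power_n25              # beta
```

When the true mean equals the null mean, the function returns alpha, since any rejection in that case is a Type I error.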
Power An important point about power and the null hypothesis: –Failing to reject the null hypothesis DOES NOT PROVE it to be true!!!
Power Consider an example: – How to prove that smoking does not cause cancer: enroll 2 people who smoke infrequently and use an antique X-ray camera to look for cancer. Compare the mean cancer rate in your group (which will probably be zero) to the cancer rate in the population (which won’t be) with a Z-test.
Power Consider an example: – If p came out greater than .05, you still wouldn’t believe that smoking doesn’t cause cancer – You will, however, often encounter statements such as “The study failed to find…” misinterpreted as “The study proved no effect of…”
Experimental Design We’ve been using examples in which a single sample is compared to a population. Often we employ more sophisticated designs. What are some different ways you could run an experiment?
Experimental Design Compare one mean to some value –Often that value is zero Compare two means to each other
Experimental Design There are two general categories of designs for comparing two (or more) means with each other
Experimental Design 1. Repeated Measures - also called “within-subjects” comparison. The same subjects are given pre- and post- measurements, e.g. before and after taking a drug to lower blood pressure. Powerful because variability between subjects is factored out. Note that pre- and post- scores are linked - we say that they are dependent. Note also that you could have multiple tests.
Experimental Design 1. Problems with Repeated-Measures design: Temporal effect - subjects get better/worse over time. The act of measuring might preclude further measurement - e.g. measuring brain size via surgery. Practice effect - subjects improve with repeated exposure to a procedure.
Experimental Design 2. Between-Subjects Design. Subjects are randomly assigned to treatment groups - e.g. drug and placebo. Measurements are assumed to be statistically independent.
Experimental Design 2. Problems with Between-Subjects design: Can be less powerful because variability between two groups of different subjects can look like a treatment effect. Often needs more subjects.
Experimental Design We’ll need some statistical tests that can compare: –One sample mean to a fixed value –Two dependent sample means to each other (within-subject) –Two independent sample means to each other (between-subject)
Experimental Design The t-test can perform each of these functions. It also gets around a big problem with the Z-test…
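To see why a repeated-measures design is more powerful, it can help to compare the standard error of the same mean difference under each design. The sketch below uses hypothetical blood-pressure readings invented for this example, and the between-subjects standard error uses the usual textbook formula for two independent means, which is not given in these slides.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical blood-pressure readings (mmHg), invented for illustration.
pre = [150, 130, 160, 140, 170]
post = [145, 126, 154, 136, 164]
n = len(pre)

mean_drop = mean(pre) - mean(post)  # same under either design

# Within-subjects: work with each subject's own change, so stable
# differences between subjects cancel out.
changes = [before - after for before, after in zip(pre, post)]
se_within = stdev(changes) / sqrt(n)

# Between-subjects: pretend pre and post came from two unrelated groups,
# so differences between subjects count as noise.
se_between = sqrt(variance(pre) / n + variance(post) / n)
```

With these made-up numbers the mean drop is 5 mmHg in both cases, but the within-subjects standard error is roughly twenty times smaller, so the same effect stands out far more clearly.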
Problems with Z and what to do instead
The Z statistic The Z statistic (to compare with Zcrit): $Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$, where $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
The Z statistic What is the problem you will encounter in trying to use this statistic? Although you might have a guess about the population mean, you will almost certainly not know the population variance!
The Z statistic What to do? Could we estimate $\sigma^2$? What would we use and what would have to be the case for it to be useful? We could use our sample variance, $S^2$, to estimate the population variance.
Estimating Population Variance Just like there are many sample means (the sampling distribution of the mean), there are many $S^2$s. $\bar{X}$ tends to be near the value of $\mu$, but does $S^2$ tend to be near the value of $\sigma^2$? No. It is a biased estimator. It tends to be lower than $\sigma^2$.
Estimating Population Variance Why is $S^2$ biased? The sum of the deviation scores in your sample must equal zero regardless of where they came from in the population. This means that the deviations in your sample are somewhat more constrained than in the population. $S^2$ has relatively fewer degrees of freedom than the entire population.
Estimating Population Variance Specifically, $S^2$ has n - 1 degrees of freedom. So if we compute $S^2$ but use n - 1 instead of n in the denominator, we’ll get an unbiased estimator of $\sigma^2$.
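The bias can be checked with a quick simulation. This is a sketch under stated assumptions: samples of size 5 are drawn from a normal population with variance 1, and the function name, seed, and trial count are arbitrary choices made for this example.

```python
import random

def average_variance_estimates(n=5, trials=20000, seed=0):
    """Average the n-denominator and (n - 1)-denominator variance
    estimates over many samples from a population with variance 1."""
    rng = random.Random(seed)
    total_biased = 0.0
    total_unbiased = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        sum_of_squares = sum((x - sample_mean) ** 2 for x in sample)
        total_biased += sum_of_squares / n          # divides by n
        total_unbiased += sum_of_squares / (n - 1)  # divides by n - 1
    return total_biased / trials, total_unbiased / trials

biased_avg, unbiased_avg = average_variance_estimates()
```

With n = 5, the n-denominator average should settle near (n - 1)/n = 0.8 of the true variance, while the (n - 1)-denominator average should settle near 1.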
Estimating Population Variance Of course, if you’ve already computed $S^2$ using n in the denominator, you can multiply by n to recover the sum of squared deviations and then divide by n - 1.
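The conversion is a one-liner. In this sketch, `unbiased_from_biased` is a hypothetical helper name and the scores are made up; the check relies on Python's standard library, where `statistics.pvariance` divides by n and `statistics.variance` divides by n - 1.

```python
from statistics import pvariance, variance

def unbiased_from_biased(s_squared, n):
    """Convert a variance computed with n in the denominator
    into the estimate that uses n - 1."""
    sum_of_squares = s_squared * n  # recover the sum of squared deviations
    return sum_of_squares / (n - 1)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # invented data
s_squared = pvariance(scores)      # n in the denominator
converted = unbiased_from_biased(s_squared, len(scores))
direct = variance(scores)          # n - 1 in the denominator
```

Here `converted` and `direct` agree, which confirms that rescaling by n / (n - 1) is all the correction amounts to.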
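The t Statistic(s) Using an estimated $\sigma$, which we’ll call $\hat{\sigma}$, we can create an estimate of $\sigma_{\bar{X}}$, which we’ll call $\hat{\sigma}_{\bar{X}}$, where $\hat{\sigma}_{\bar{X}} = \frac{\hat{\sigma}}{\sqrt{n}}$
The t Statistic(s) Using $\hat{\sigma}_{\bar{X}}$ instead of $\sigma_{\bar{X}}$, we get a statistic that isn’t from a normal (Z) distribution - it is from a family of distributions called t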
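In code, the only change from the Z statistic is that the standard error is estimated from the sample itself, using n - 1 in the variance. The sketch below assumes five hypothetical mercury readings invented for this example; `one_sample_t` is an illustrative name, and `statistics.stdev` is used because it divides by n - 1.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, pop_mean):
    """t = (sample mean - population mean) / estimated standard error."""
    n = len(sample)
    estimated_se = stdev(sample) / sqrt(n)  # stdev divides by n - 1
    return (mean(sample) - pop_mean) / estimated_se

# Hypothetical mercury readings in mg/kg, invented for illustration
readings = [0.008, 0.010, 0.009, 0.011, 0.007]
t = one_sample_t(readings, pop_mean=0.007)
```

The resulting value has to be compared against a critical value from the t family rather than the Z table, since the estimated standard error adds its own sampling variability.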