P Values - part 4: The P value and ‘rules’
Robin Beaumont, 10/03/2012
With much help from Professor Geoff Cumming
Putting it all together
Review: summary so far
A P value is a conditional probability that considers a range of outcomes, shown as an area in a PDF graph.
The SEM formula allows us to predict the accuracy of our estimated mean across an infinite number of samples.
Taking another random sample will give us a different P value. How different? P values do not follow a normal distribution (the ‘dance of the p values’, Geoff Cumming), and how they are distributed depends upon whether the null hypothesis is actually true in reality.
Remember that we have assumed so far that the null hypothesis is true.
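The SEM claim above can be checked by simulation. The following is a minimal sketch, assuming a normal population with mean 120 and SD 35 (values borrowed from the drug example later in these slides) and 20,000 samples standing in for "an infinite number"; the seed and variable names are illustrative only.

```python
import math
import random
import statistics

random.seed(1)

MU, SIGMA, N = 120.0, 35.0, 15  # assumed population values, borrowed from the drug example
N_SAMPLES = 20_000              # stand-in for "an infinite number of samples"

# Draw many random samples of size N and record each sample mean.
sample_means = [
    statistics.fmean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(N_SAMPLES)
]

sem_formula = SIGMA / math.sqrt(N)              # SEM = sigma / sqrt(n)
sem_simulated = statistics.stdev(sample_means)  # observed spread of the sample means

print(f"SEM from formula:    {sem_formula:.2f}")
print(f"SEM from simulation: {sem_simulated:.2f}")
```

The two figures should agree closely, which is the sense in which the SEM formula "predicts" the accuracy of the estimated mean without actually taking repeated samples.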
Rules
[Figure: t density for the sample (s, x̄, n, t) with the shaded area giving the P value; second axis in original units]
Rule: if our result has a P value more extreme than our level of acceptability (our critical value), reject the parameter value we based the P value on, i.e. reject the null hypothesis.
Given that the sample was obtained from a population with a mean (i.e. parameter value) of 120, a sample (n = 15) with a t statistic this extreme, or one more extreme, will occur 1.8% of the time. This is less than one in twenty (i.e. P value < 0.05).
Mark the one-in-twenty value: this is the critical value for t (n = 15) at an alpha (α) level of 0.05.
Therefore we dismiss the possibility that our sample came from a population with a mean of 120.
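As a rough check on the 1.8% figure, the sketch below computes the t statistic and a two-sided P value using only the Python standard library. It assumes the example values given later in the slides (sample mean 96, SD 35, n = 15) and treats the SD of 35 as the sample SD; the helper names `t_pdf` and `two_sided_p` are illustrative, and the tail area is obtained by numerical integration rather than from a statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided P value: area in both tails beyond |t| (Simpson's rule)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -|t| and +|t|
    return max(0.0, 1 - central_area)

null_mean = 120.0   # parameter value under the null hypothesis
sample_mean = 96.0  # sample mean from the drug example
sample_sd = 35.0    # assumption: the SD of 35 is used as the sample SD
n = 15

sem = sample_sd / math.sqrt(n)
t_stat = (sample_mean - null_mean) / sem
p_value = two_sided_p(t_stat, df=n - 1)

print(f"t = {t_stat:.3f}, df = {n - 1}, two-sided P = {p_value:.4f}")
```

Under these assumptions the P value comes out a little under 0.02, in line with the 1.8% quoted above.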
Rules
Set a level of acceptability = critical value (CV): say one in twenty (1/20 = 0.05), or 1/100, or 1/1000, or ...
When the P value is in the critical region, the parameter value we based the P value on is considered untenable: reject it, i.e. reject the null hypothesis.
What value do we now give the parameter? What is the mean of the population if we have now ruled out one value?
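The rule can be written down in a few lines. This is only a sketch: the function name `decide` is hypothetical, and the P value of 0.018 is the 1.8% quoted for the drug example.

```python
def decide(p_value, alpha):
    """Apply the rule: reject the null parameter value when P < alpha."""
    return "reject null" if p_value < alpha else "do not reject null"

p_value = 0.018  # the 1.8% quoted for the drug example

# Try the three levels of acceptability mentioned on the slide.
for alpha in (1 / 20, 1 / 100, 1 / 1000):
    print(f"alpha = {alpha:<6} -> {decide(p_value, alpha)}")
```

The same result is rejected at the one-in-twenty level but not at the stricter levels, which shows that the decision depends on the level of acceptability chosen in advance.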
Why do we want to reject the null distribution?
It allows decision making and moves things forward.
If the sample did not come from the ‘null distribution’, this indicates there is some effect.
Example: a population mean of 120 for substance X indicates a propensity to a range of nasty diseases. We give subjects miracle drug Y, which is believed to reduce the level. Has this had any effect in our sample of 15?
If the probability of obtaining our t value (i.e. its associated P value) is below a certain critical threshold, we can say that our sample does not come from the null distribution and that there is an effect.
Fisher: we only know, and only consider, the model we have, i.e. the parameter we have used in our model. When we reject it, we accept that any value but that one can replace it.
H null: μ = 120 versus H alt: μ ≠ 120 [μ = population mean]
H null: T = 0 versus H alt: T ≠ 0 [T = t statistic]
Neyman and Pearson + Gosset, Bayesians (a single specified alternative):
H null: μ = 120 versus H alt: μ = XXX [μ = population mean]
H null: T = 0 versus H alt: T = xxx [T = t statistic]
View 1 - The infinite variety of alternative distributions - Fisher
How do we define the alternative hypothesis? Fisher: infinitely many alternative ‘red’ distributions.
Null distribution: t(df, ncp = 0). Alternative distributions: t(df, ncp).
Alternative distributions:
become flatter further away from the null;
become more asymmetrical further away from the null;
have the same shape as the null when they coincide with it.
Distance measures: effect size / non-centrality parameter.
View 2 - The single specified alternative - Neyman + Pearson
Population mean of 120 for substance X, SD = 35 units/100 ml. 15 random subjects take the miracle drug and have a mean of 96.
Take the distribution around the sample value, then work out the difference in SDs: the effect size d.
Non-centrality parameter = capital delta (Δ) = d × √n = d × √15.
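The arithmetic for this example can be sketched as follows, assuming d is the difference between the specified alternative (the sample mean of 96) and the null mean of 120, expressed in units of the SD of 35. The sign convention (alternative minus null) is an assumption; the slides may have used the absolute value.

```python
import math

null_mean = 120.0  # population mean under the null hypothesis
alt_mean = 96.0    # specified alternative, set at the observed sample mean
sd = 35.0          # SD in units/100 ml
n = 15

d = (alt_mean - null_mean) / sd  # difference expressed in SD units
ncp = d * math.sqrt(n)           # non-centrality parameter (capital delta)

print(f"d = {d:.3f}, ncp = {ncp:.3f}")  # prints: d = -0.686, ncp = -2.656
```

With the alternative placed at the sample mean, the non-centrality parameter has the same value as the observed t statistic.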
G*Power shows it all!
Population mean of 120 for substance X, SD = 35 units/100 ml. 15 random subjects take the miracle drug and have a mean of 96.
Red line = null distribution.
Red areas = critical regions = α (alpha) regions.
Blue line = specified alternative distribution.
Blue shaded area = β (beta) regions (far right: very small).
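Two correct and two incorrect decisions
[Figure: null distribution (mean = 120) and specified alternative distribution (mean = 96); α = the reject region; areas labelled as correct decisions and incorrect decisions]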
Correct decisions
Power = 1 - β (beta).
Power is good.
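Power for the drug example can be approximated by simulation rather than read off a G*Power plot. The sketch below assumes the specified alternative is true (population mean 96, SD 35, normally distributed), uses n = 15, and takes 2.145 as the two-sided 5% critical value of t with 14 degrees of freedom from standard tables; the number of simulated studies and the seed are arbitrary.

```python
import math
import random
import statistics

random.seed(2)

NULL_MEAN, ALT_MEAN, SD, N = 120.0, 96.0, 35.0, 15
T_CRIT = 2.145      # two-sided 5% critical value of t with 14 df (from t tables)
N_STUDIES = 10_000  # number of simulated studies (arbitrary choice)

rejections = 0
for _ in range(N_STUDIES):
    # Simulate one study in which the specified alternative is true.
    sample = [random.gauss(ALT_MEAN, SD) for _ in range(N)]
    sem = statistics.stdev(sample) / math.sqrt(N)
    t_stat = (statistics.fmean(sample) - NULL_MEAN) / sem
    if abs(t_stat) > T_CRIT:  # result falls in a critical region
        rejections += 1

power = rejections / N_STUDIES  # share of studies that correctly reject the null
beta = 1 - power                # share that fail to reject (incorrect decision)
print(f"power = {power:.2f}, beta = {beta:.2f}")
```

The proportion of simulated studies landing in the critical region is the estimated power; the remainder is β.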
More power is better, up to a certain point!
Insufficient power: unlikely to get a P value in the critical region.
Too much power: the P value is always in the critical region, but possibly for a trivial effect size.
Cumming: replication and P intervals
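The variability of P values under replication can be illustrated with a small simulation. This is a sketch under stated assumptions, not Cumming's own method or software: it repeats the drug study 1000 times with a true mean of 96 (an effect exists) and 1000 times with a true mean of 120 (the null is true), both with SD 35 and n = 15, and computes each two-sided P value by numerical integration of the t density. The function names are illustrative.

```python
import math
import random
import statistics

def two_sided_p(t, df, steps=400):
    """Two-sided P value for Student's t, by Simpson's rule on the density."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1 - 2 * total * h / 3)

def replicate(true_mean, reps, null_mean=120.0, sd=35.0, n=15):
    """P values from `reps` repeated studies drawn from Normal(true_mean, sd)."""
    p_values = []
    for _ in range(reps):
        sample = [random.gauss(true_mean, sd) for _ in range(n)]
        sem = statistics.stdev(sample) / math.sqrt(n)
        t_stat = (statistics.fmean(sample) - null_mean) / sem
        p_values.append(two_sided_p(t_stat, n - 1))
    return p_values

random.seed(3)
p_alt = replicate(true_mean=96.0, reps=1000)    # an effect really exists
p_null = replicate(true_mean=120.0, reps=1000)  # the null is really true

for label, ps in (("effect present", p_alt), ("null true", p_null)):
    below = sum(p < 0.05 for p in ps) / len(ps)
    print(f"{label}: first five P = {[round(p, 3) for p in ps[:5]]}")
    print(f"  min = {min(ps):.4f}, median = {statistics.median(ps):.3f}, "
          f"max = {max(ps):.3f}, share below 0.05 = {below:.2f}")
```

Even when the effect is real, identical replications give P values ranging from very small to clearly non-significant, and the pattern differs from the case where the null is true.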
Implications
Power and Cohen's effect size measure d reflect one another.
When power is high, the P value distribution does provide a measure of evidence for the specific alternative hypothesis.
Computer simulations: reality is different.
Replication is a vitally important research strategy (meta-analysis).
Alternative strategy? Power analysis during study design, and confidence intervals along with effect size measures when reporting results.
Always specify a specific alternative hypothesis.
Students' bloomers
"The p value did not indicate much statistic significance"
"Given that the population comes from one population"
"The p value is thus rejecting the null hypothesis and there is a statistical significance"
"Correlation = 0.25 (p<0.001) indicating that assuming that the data come from a bivariate normal distribution with a correlation of zero you would obtain a correlation of <"
"There is 95% chance that the relationship among the variables is not due to chance"