Lecture 11: Parametric hypothesis testing. The logic behind a statistical test: a statistical test compares the probabilities in favour of a hypothesis H1 with the respective probabilities of an appropriate null hypothesis H0. Type I error, type II error, power of a test: accepting the wrong hypothesis H1 is termed a type I error. Rejecting the correct hypothesis H1 is termed a type II error. The power of a test is the probability of not making a type II error.
Testing simple hypotheses. Karl Pearson tossed a coin many times to see whether real-world results deviate from the expected numbers of heads and tails. Does his result deviate from our expectation? Two approaches: the exact solution of the binomial and the normal approximation.
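The two approaches to the coin question can be sketched as follows. The counts used here (100 tosses, 60 heads) are hypothetical and are not Pearson's figures; the function names are illustrative only.

```python
import math
from statistics import NormalDist

def exact_two_sided(n, k):
    """Exact binomial probability (fair coin) of a count at least as far from n/2 as k."""
    dev = abs(k - n / 2)
    favourable = sum(math.comb(n, i) for i in range(n + 1) if abs(i - n / 2) >= dev)
    return favourable / 2 ** n

def normal_two_sided(n, k):
    """Normal approximation (no continuity correction) to the same probability."""
    mean, sd = n / 2, math.sqrt(n) / 2      # binomial mean and sd for p = 0.5
    z = abs(k - mean) / sd
    return 2 * (1 - NormalDist().cdf(z))

# Hypothetical example, not Pearson's data
n_tosses, n_heads = 100, 60
p_exact = exact_two_sided(n_tosses, n_heads)
p_normal = normal_two_sided(n_tosses, n_heads)
print(f"exact: {p_exact:.4f}  normal approximation: {p_normal:.4f}")
```

For small n the two values differ noticeably; with many tosses the normal approximation approaches the exact result.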
χ² test. Assume a sum of n squared Z-transformed variables. Each has an expected value (variance) of one; thus the expected value of χ² is n. The χ² distribution is a family of distributions of such sums, depending on the number of elements n. Observed values of χ² can be compared to predicted values, which allows statistical hypothesis testing. Pearson's coin example: probability of H0.
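The statement that the expected value of χ² equals n can be checked with a small simulation. This is only a sketch; the choice of n = 5, the number of repetitions and the random seed are arbitrary assumptions.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

def simulated_chi2(n):
    """Sum of n squared standard-normal (Z-transformed) values."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(n))

n = 5                                   # arbitrary number of elements
samples = [simulated_chi2(n) for _ in range(20000)]
mean_chi2 = sum(samples) / len(samples)
print(f"n = {n}, simulated mean of chi-square = {mean_chi2:.3f}")
```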
Predicted ratios: 9 times green, yellow seed; 3 times green, green seed; 3 times yellow, yellow seed; 1 time yellow, green seed. Does the observation confirm the prediction? The χ² test has k - 1 degrees of freedom, where k is the number of classes.
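A χ² test against a 9:3:3:1 prediction might look like the sketch below. The observed counts are illustrative and are not taken from the lecture; `chi2_sf` is a hand-written helper (an assumption of this sketch) that evaluates the upper-tail χ² probability for integer degrees of freedom, so that no statistics package is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution, integer df."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:                      # even df: finite Poisson-type sum
        term = total = math.exp(-half)
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    total = math.erfc(math.sqrt(half))   # odd df: start from df = 1
    term = math.sqrt(2 * x / math.pi) * math.exp(-half)
    for k in range(3, df + 1, 2):
        total += term
        term *= x / k
    return total

ratios = [9, 3, 3, 1]
observed = [315, 108, 101, 32]           # illustrative counts only
n = sum(observed)
expected = [n * r / sum(ratios) for r in ratios]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                   # k - 1 degrees of freedom
p_value = chi2_sf(chi2, df)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

A large p value means the observed counts are compatible with the predicted ratios.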
All statistical programs report the probability of obtaining the observed deviation (or a larger one) if the null hypothesis H0 is true.
Advice for applying a χ² test. Dealing with frequencies: χ² tests compare observations and expectations. Total numbers of observations and expectations must be equal. The absolute values should not be too small (as a rule the smallest expected value should be larger than 10). At small event numbers the Yates correction should be used. The classification of events must be unequivocal. χ² tests were found to be quite robust. That means they are conservative and rather favour H0, the hypothesis of no deviation. The applicability of the χ² test does not depend on the underlying distributions. They need not be normally or binomially distributed.
G-test or log likelihood test. χ² relies on absolute differences between observed and expected frequencies. However, it is also possible to take the quotient L = observed / expected as a measure of goodness of fit. G is approximately χ² distributed with k - 1 degrees of freedom.
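The G statistic is commonly computed as G = 2 Σ observed · ln(observed / expected). The sketch below compares it with χ² for the same data. The counts are hypothetical, and the standard formula is assumed here because the slide's own formula is not reproduced in the text; `chi2_sf` is again a hand-written helper.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution, integer df."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        term = total = math.exp(-half)
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2 * x / math.pi) * math.exp(-half)
    for k in range(3, df + 1, 2):
        total += term
        term *= x / k
    return total

observed = [18, 22, 30, 30]              # hypothetical frequencies
expected = [25, 25, 25, 25]              # equal expectation, same total
# G uses the quotient observed / expected, chi2 the squared difference
g_stat = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p_g, p_chi2 = chi2_sf(g_stat, df), chi2_sf(chi2, df)
print(f"G = {g_stat:.3f} (p = {p_g:.3f}), chi2 = {chi2:.3f} (p = {p_chi2:.3f})")
```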
A species-area relation is expected to follow a power function of the form S = 10A^z. Do the following data points (area, species number) confirm this expectation: A1 (1,12), A2 (2,18), A3 (4,14), A4 (8,30), A5 (16,35), A6 (32,38), A7 (64,33), A8 (128,35), A9 (256,56), A10 (512,70)? We try different tests. Both tests indicate that the regression line does not fit. The pattern is better seen in a double log plot. We have seven points above and three points below the regression line. Is there a systematic error?
Tests for systematic errors: the binomial test and the χ² test.
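For the seven points above and three points below the regression line, both tests can be sketched as follows. The null hypothesis assumed here is that a point falls above or below the line with equal probability; the variable names are illustrative.

```python
import math

above, below = 7, 3                      # points above / below the regression line
n = above + below

# Binomial test: probability of 7 or more points on one side out of 10
p_one_sided = sum(math.comb(n, k) for k in range(above, n + 1)) / 2 ** n  # 176/1024
p_two_sided = min(1.0, 2 * p_one_sided)

# Chi-square test with expectation 5 : 5 and 1 degree of freedom
expected = n / 2
chi2 = (above - expected) ** 2 / expected + (below - expected) ** 2 / expected
p_chi2 = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square for df = 1

print(f"binomial: one-sided {p_one_sided:.4f}, two-sided {p_two_sided:.4f}")
print(f"chi2 = {chi2:.2f}, p = {p_chi2:.4f}")
```

Neither probability falls below 0.05, so a 7 : 3 split alone does not demonstrate a systematic error.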
Now we try the best fit model. The G-test identified even the best fit model as having larger deviations than expected from a simple normal random sample model.
The best fit model: Kolmogorov-Smirnov test. Observation and expectation can be compared by a Kolmogorov-Smirnov test. The test compares the maximum cumulative deviation with that expected from a normal distribution. Both results are qualitatively identical but differ quantitatively, because the programs use different algorithms.
2x2 contingency table. 1000 Drosophila flies with normal and curled wings and two alleles A and B supposed to influence wing form. Do flies with allele A have curled wings more often than flies with allele B? A contingency table χ² test with n rows and m columns has (n-1) * (m-1) degrees of freedom. The 2x2 table has 1 degree of freedom. Predicted number of flies with allele A and curled wings: (row total * column total) / total number of flies.
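A 2x2 contingency test can be sketched as below. The cell counts are hypothetical (the lecture's table is not reproduced in the text); only the total of 1000 flies is taken from the example. Expected counts are row total times column total divided by the grand total, and no Yates correction is applied.

```python
import math

def chi2_2x2(table):
    """Chi-square test of independence for a 2x2 table (1 degree of freedom)."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    total = sum(row)
    expected = [[r * c / total for c in col] for r in row]
    chi2 = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(2) for j in range(2))
    p = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square for df = 1
    return chi2, p, expected

# Hypothetical counts: rows = allele A, allele B; columns = normal, curled wings
counts = [[300, 200],
          [350, 150]]
chi2, p_value, expected = chi2_2x2(counts)
print(f"expected allele A and curled: {expected[0][1]:.0f}")
print(f"chi2 = {chi2:.3f}, p = {p_value:.5f}")
```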
Relative abundance distributions. Figure labels: dominant species, intermediate species, rare species; the hollow curve; evenness. Abundance is the total number of individuals in a population. Density refers to the number of individuals per unit of measurement. The log-normal distribution.
The distribution of species abundance distributions across vertebrates and invertebrates. Three types of distributions: log-series, power function, lognormal. We compare 99 such distributions from all over the world. Row and column sums are identical due to our classification. We expect equal entries for each cell:
Do vertebrates and invertebrates differ in abundance distributions? But if we take the whole pattern we get Number of log-normal best fits only:
Bivariate comparisons of means. Student's t-test for equal sample sizes and similar variances. Welch t-test for unequal variances and sample sizes. F-test.
In a physiological experiment mean metabolic rates were measured. A first treatment gave mean = 100, variance = 45; a second treatment gave mean = 120, variance = 55. In the first case 30 animals and in the second case 50 animals were tested. Do means and variances differ? The t-test has N1 + N2 - 2 degrees of freedom. The probability level for the null hypothesis:
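The test statistics for this example can be computed from the summary values alone. The sketch below uses the usual pooled-variance form of Student's t and the Welch form with Welch-Satterthwaite degrees of freedom; these standard formulas are assumed, since the slide's own formulas are not reproduced in the text. The function names are illustrative.

```python
import math

def student_t(mean1, var1, n1, mean2, var2, n2):
    """Student's t with pooled variance; df = n1 + n2 - 2."""
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's t for unequal variances, with Welch-Satterthwaite df."""
    a, b = var1 / n1, var2 / n2
    t = (mean1 - mean2) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Values from the metabolism example: (mean, variance, N) per treatment
t_s, df_s = student_t(100, 45, 30, 120, 55, 50)
t_w, df_w = welch_t(100, 45, 30, 120, 55, 50)
print(f"Student: t = {t_s:.2f}, df = {df_s}")
print(f"Welch:   t = {t_w:.2f}, df = {df_w:.1f}")
```

Both |t| values are far beyond any tabulated critical value for these degrees of freedom, so the means differ clearly.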
The comparison of variances. Degrees of freedom: N - 1 for each sample. The probability for the null hypothesis of no difference, H0: 1 - 0.713 = 0.287 is the probability that the first variance (sample of 50) is larger than the second (sample of 30) by chance. This is the one-sided test. Past gives the probability for a two-sided test, 2 * 0.287, that one variance is either larger or smaller than the second.
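The variance comparison can be reproduced approximately without a statistics package by integrating the F density numerically. This is a rough sketch: the Simpson integration, the step count and the function names are choices of this example, and dedicated software would normally be used instead.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution (x = 0 handled for d1 > 2 only)."""
    if x <= 0:
        return 0.0
    log_pdf = (math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2) - math.lgamma(d2 / 2)
               + (d1 / 2) * math.log(d1 / d2) + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2))
    return math.exp(log_pdf)

def f_cdf(x, d1, d2, steps=2000):
    """Cumulative F probability by Simpson's rule on [0, x]; steps must be even."""
    h = x / steps
    total = f_pdf(0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return total * h / 3

# Larger variance (55, N = 50) over smaller variance (45, N = 30)
f_ratio = 55 / 45
cdf = f_cdf(f_ratio, 50 - 1, 30 - 1)
p_one_sided = 1 - cdf
p_two_sided = 2 * p_one_sided
print(f"F = {f_ratio:.3f}, one-sided p = {p_one_sided:.3f}, two-sided p = {p_two_sided:.3f}")
```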
Power analysis and effect size. In an experiment you estimated two means. Each time you took 20 replicates. Was this sample size large enough to confirm a difference between both means? We use the t-distribution with 19 degrees of freedom. You needed 15 replicates to confirm a difference at the 5% error level.
The t-test can be used to estimate the number of observations needed to detect a significant signal for a given effect size. In a physiological experiment we want to test whether a certain drug enhances short-term memory. How many persons should you test (with and without the treatment) to confirm a difference in memory of about 5%? We do not know the variances and assume a Poisson random sample. Hence σ² = μ. We do not know the degrees of freedom: we use a large number and get t:
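One way to turn this reasoning into a number is to rearrange the two-sample t statistic with equal group sizes, t = (μ1 - μ2) / sqrt((σ1² + σ2²) / n), for n. The sketch below does this under loudly stated assumptions: the memory scores of 100 and 105 are hypothetical, the Poisson assumption sets each variance equal to its mean, and the large-sample (normal) value is used for t. It may not match the exact formula on the slide.

```python
import math
from statistics import NormalDist

def replicates_needed(mean1, mean2, var1, var2, alpha=0.05):
    """Replicates per group so that the difference reaches the two-sided t threshold."""
    t = NormalDist().inv_cdf(1 - alpha / 2)   # t for very many degrees of freedom
    return math.ceil(t ** 2 * (var1 + var2) / (mean1 - mean2) ** 2)

# Hypothetical scores: control 100, treated 105 (a 5% difference)
control, treated = 100, 105
# Poisson assumption: variance equals the mean
n_per_group = replicates_needed(control, treated, control, treated)
print(f"about {n_per_group} persons per group")
```

Because the required n grows with the inverse square of the difference, halving the effect size roughly quadruples the number of persons needed.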
Homework and literature. Refresh: χ² test, Mendel rules, t-test, F-test, contingency table, G-test. Prepare for the next lecture: coefficient of correlation, maximum and minimum of functions, matrix multiplication, eigenvalue. Literature: Łomnicki: Statystyka dla biologów.