Permutation Tests
Hal Whitehead
BIOL4062/5062
- Introduction to permutation tests
- Exact and randomized permutation tests
- Permutation tests using standard statistics
- Mantel tests
- ANOSIM
Permutation Tests
Allow hypotheses to be tested when:
- Distributional properties of the test statistic under the null hypothesis are not known (e.g. measures of genetic distance)
- Distributional properties of the test statistic under the null hypothesis are complex
- Assumptions about the data necessary for standard tests or measures of uncertainty (e.g. normality) are not met
Good for small data sets
Permutation Tests
Useful when hypotheses can be phrased in terms of the order or allocation of data points:
- e.g. When dogs meet, the larger dog barks for longer
- e.g. Social relationships are stronger within same-sex pairs
Exact and Random Permutation Tests
Data => real test statistic
- Either: compute the statistic for all possible permutations of the data (“exact test”)
- Or: compute the statistic for, say, 1,000 random permutations (“random test”)
Permutation Tests
- Exact test: compare the real test statistic with the distribution of the statistics from all possible permutations
- Random test: compare the real test statistic with the distribution of the statistics from the random permutations
Permutation Tests
If only 3 of the 128 possible statistics are greater than or equal to the real statistic (exact test):
- Reject the null hypothesis that the allocation or ordering of units does not affect the statistic
- P = 3/128 = 0.023 (1-tailed test); P = 0.046 (2-tailed test)
If only 12 of 1,000 random statistics are greater than or equal to the real statistic (random test):
- P = 12/1000 = 0.012 (1-tailed test); P = 0.024 (2-tailed test)
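The P-values on this slide come from simple counting. As a sketch, write T for the real statistic, T*_m for the statistic from permutation m, and M for the number of permutations (these symbols are introduced here for illustration and are not in the original slides):

$$P_{\text{1-tailed}} = \frac{\#\{\, m : T^{*}_{m} \ge T \,\}}{M}, \qquad \frac{3}{128} \approx 0.023, \qquad \frac{12}{1000} = 0.012$$

The 2-tailed values quoted on the slide are the 1-tailed values doubled.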
Example: dogs
Null hypothesis: longer barking is unrelated to size ordering
Alternative hypothesis: the larger dog barks longer
Data: 7 dogs, ordered by size: A > B > C > D > E > F > G

Pair of dogs    Who barks longer?
AB              A, A, B, A
AF              F, A, A
BD              B, D, D
CF              C, C, C
EG              G
EF              F

Test statistic: number of times the larger dog barks longer, Y = 9
Example: dogs
Real size order: A > B > C > D > E > F > G, giving Y = 9
Random size order: G > B > A > C > F > D > E
The barking outcomes stay fixed; only the size order is permuted:

Pair of dogs    Who barks longer?    Times larger dog (random order) barks longer
AB              A, A, B, A           1
AF              F, A, A              2
BD              B, D, D              1
CF              C, C, C              3
EG              G                    1
EF              F                    1

Random statistic: number of times the larger dog barks longer, Y = 9
Example: dogs
Number of times the larger dog barks longer: Y = 9
- In the 5,040 exact permutations: Y ≥ 9 in 1,635; do not reject the null hypothesis (P = 0.324)
- In 1,000 random permutations: Y ≥ 9 in 332; do not reject the null hypothesis (P = 0.332)
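The dog example is small enough to run both kinds of test directly. The following is a minimal sketch, not the lecture's own code: the function names, the string encoding of the encounters, and the random seed are choices made here for illustration. It counts a permutation as at least as extreme when its statistic is greater than or equal to the real Y.

```python
import itertools
import random

# Size order from the example, largest dog first.
SIZE_ORDER = "ABCDEFG"

# For each pair that met: the dog that barked longer in each encounter.
ENCOUNTERS = {
    ("A", "B"): "AABA",
    ("A", "F"): "FAA",
    ("B", "D"): "BDD",
    ("C", "F"): "CCC",
    ("E", "G"): "G",
    ("E", "F"): "F",
}

def larger_barks_longer(order):
    """Count encounters in which the larger dog of the pair barked longer."""
    rank = {dog: i for i, dog in enumerate(order)}  # 0 = largest
    total = 0
    for (d1, d2), winners in ENCOUNTERS.items():
        larger = d1 if rank[d1] < rank[d2] else d2
        total += winners.count(larger)
    return total

def exact_test(order=SIZE_ORDER):
    """Enumerate every size ordering; return (Y, orderings with Y* >= Y, orderings)."""
    y_real = larger_barks_longer(order)
    stats = [larger_barks_longer(p) for p in itertools.permutations(order)]
    return y_real, sum(1 for s in stats if s >= y_real), len(stats)

def random_test(order=SIZE_ORDER, n_perm=1000, seed=0):
    """Estimate the 1-tailed P-value from n_perm random size orderings."""
    rng = random.Random(seed)
    y_real = larger_barks_longer(order)
    dogs = list(order)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(dogs)
        if larger_barks_longer(dogs) >= y_real:
            hits += 1
    return hits / n_perm

y, n_extreme, n_total = exact_test()
print(y, n_extreme, n_total, round(n_extreme / n_total, 3))  # 9 1635 5040 0.324
print(random_test())  # varies with the seed; should be close to the exact value
```

The exact test is feasible here because 7 dogs give only 7! = 5,040 orderings; with more units the random test becomes the practical choice.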
Permutation tests can be used with standard test statistics when the assumptions of the standard test are not valid
Example: contingency table with small sample sizes

Real data:
       A   B   C   D   E   F
I      0   0   1   2   2   0
II     0   0   4   3   0   0
III    0   0   0   1   0   1
IV     3   1   2   0   1   0
G = 25.18, df = 15, P = 0.047
But expected numbers are too small for a valid G-test

Random permutation (totals same):
       1   0   2   1   1   0
       1   0   3   1   2   0
       0   0   1   0   0   1
       1   1   1   4   1   0
G(r) = 15.82

For 10,000 random permutations: G(r) exceeded the real G in 304; P = 0.0304
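A permutation version of the G-test can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code behind the slide: each of the 21 observations is treated as a (row, column) pair, the column labels are shuffled so that row and column totals are preserved, and a permuted G(r) counts as extreme when it is at least as large as the real G. The function names and seed are hypothetical.

```python
import math
import random

# Real table from the example: rows I-IV, columns A-F.
TABLE = [
    [0, 0, 1, 2, 2, 0],
    [0, 0, 4, 3, 0, 0],
    [0, 0, 0, 1, 0, 1],
    [3, 1, 2, 0, 1, 0],
]

def g_statistic(table):
    """G = 2 * sum(O * ln(O / E)) over non-empty cells, with E from the margins."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    g = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:
                g += obs * math.log(obs * n / (row_tot[i] * col_tot[j]))
    return 2.0 * g

def permutation_p(table, n_perm=10000, seed=0):
    """Return (real G, fraction of permuted tables with G(r) >= G)."""
    rng = random.Random(seed)
    rows, cols = [], []
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            rows += [i] * obs  # one row label per observation
            cols += [j] * obs  # and the matching column label
    g_real = g_statistic(table)
    n_rows, n_cols = len(table), len(table[0])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(cols)  # re-pair row and column labels; totals are unchanged
        perm = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(rows, cols):
            perm[i][j] += 1
        if g_statistic(perm) >= g_real - 1e-9:
            hits += 1
    return g_real, hits / n_perm

g_real, p = permutation_p(TABLE)
print(round(g_real, 2), p)  # G is 25.18; p depends on the seed
```

The P-value printed will differ slightly from run to run and from the slide's 0.0304, since it depends on which random permutations are drawn.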
Comparing Association Matrices: Mantel Test
May help with problems of independence
2 association matrices, indexed by the same units:
- Evolution: genetic similarity and environmental similarity between populations
- Behaviour: gender similarity (1/0) and association index between individuals
- Population genetics: genetic similarity and geographic distance between populations
Comparing Association Matrices: Mantel Test
- Matrices can be binary (0/1)
- Matrix correlation coefficient: measures the similarity between the two association matrices
- The Mantel test tests the null hypothesis that there is no relationship between the associations shown in the two matrices
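Mantel Tests
Given two symmetric association matrices:

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1k} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2k} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3k} \\ \vdots & & & & \vdots \\ a_{k1} & a_{k2} & a_{k3} & \cdots & a_{kk} \end{pmatrix}, \qquad B = \begin{pmatrix} b_{11} & b_{12} & b_{13} & \cdots & b_{1k} \\ b_{21} & b_{22} & b_{23} & \cdots & b_{2k} \\ b_{31} & b_{32} & b_{33} & \cdots & b_{3k} \\ \vdots & & & & \vdots \\ b_{k1} & b_{k2} & b_{k3} & \cdots & b_{kk} \end{pmatrix}$$

Matrix correlation coefficient (r) is the correlation between:
$\{a_{21}, a_{31}, a_{32}, \ldots, a_{k1}, a_{k2}, a_{k3}, \ldots, a_{k,k-1}\}$ and
$\{b_{21}, b_{31}, b_{32}, \ldots, b_{k1}, b_{k2}, b_{k3}, \ldots, b_{k,k-1}\}$
[Cannot be tested using standard methods because of lack of independence]
- r = 1: maximal positive relationship
- r = 0: no relationship
- r = -1: maximal negative relationship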
Partial Mantel Tests
Are X and Y related, controlling for V?
e.g. Among populations of an organism: is genetic similarity related to morphological similarity, controlling for geographical distance?
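One widely used way to control for a third matrix is the ordinary partial correlation formula applied to the three pairwise matrix correlations. The slides do not say which formulation was used, so the following is offered only as a common choice, with r_XY, r_XV and r_YV denoting the matrix correlations between the respective pairs of matrices:

$$r_{XY \cdot V} = \frac{r_{XY} - r_{XV}\, r_{YV}}{\sqrt{\left(1 - r_{XV}^{2}\right)\left(1 - r_{YV}^{2}\right)}}$$

Its significance is then assessed by permutation, as for the simple Mantel test, because the matrix elements are not independent.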
Mantel Tests
The Mantel test uses the statistic:

$$Z = \sum_{i=1}^{k} \sum_{j=1}^{k} a_{ij}\, b_{ij}$$

- Z can be transformed into a variable W, approximately normal (mean 0, s.d. 1) under the null hypothesis (r = 0)
- Somewhat dubious at small k
Mantel Tests
Better to use a permutation test:
- Randomly permute the individuals in one matrix many times
- Each time calculate Z (the Zm’s)
- Compare the real Z with the Zm’s
- If the real Z is greater than 97.5% of the Zm’s, or less than 97.5% of the Zm’s, then the null hypothesis that r = 0 is rejected (2-tailed test at P = 0.05): there is a relationship between the variables
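The permutation procedure for the Mantel test can be sketched as below. The two matrices are hypothetical, built from made-up positions purely for illustration, and the function names are not from any package. Two assumptions are worth flagging: Z is summed over below-diagonal pairs only, so the diagonal is ignored, and permuting "individuals" means relabelling the rows and columns of one matrix with the same random order.

```python
import math
import random

def lower_triangle(m):
    """Return the below-diagonal elements a21, a31, a32, ... as a list."""
    return [m[i][j] for i in range(len(m)) for j in range(i)]

def matrix_correlation(a, b):
    """Pearson correlation between the below-diagonal elements of a and b."""
    x, y = lower_triangle(a), lower_triangle(b)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)

def mantel_z(a, b):
    """Z summed over below-diagonal pairs only (assumption: diagonal skipped)."""
    return sum(p * q for p, q in zip(lower_triangle(a), lower_triangle(b)))

def mantel_test(a, b, n_perm=1000, seed=0):
    """Return (r, fraction of Zm below the real Z, fraction of Zm above it)."""
    rng = random.Random(seed)
    k = len(a)
    z_real = mantel_z(a, b)
    order = list(range(k))
    below = above = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        # Permute the units of b: rows and columns are relabelled together.
        b_perm = [[b[order[i]][order[j]] for j in range(k)] for i in range(k)]
        z_m = mantel_z(a, b_perm)
        if z_m < z_real - 1e-9:
            below += 1
        elif z_m > z_real + 1e-9:
            above += 1
    return matrix_correlation(a, b), below / n_perm, above / n_perm

# Hypothetical data: two distance-like matrices from similar 1-D positions.
x = [0, 1, 2, 5, 6, 9]
y = [0.1, 0.9, 2.2, 4.8, 6.5, 8.7]
A = [[abs(p - q) for q in x] for p in x]
B = [[abs(p - q) for q in y] for p in y]

r, frac_below, frac_above = mantel_test(A, B)
print(round(r, 3), frac_below, frac_above)
# Reject r = 0 (2-tailed, P = 0.05) if frac_below > 0.975 or frac_above > 0.975.
```

Because relabelling the units leaves the set of off-diagonal values in each matrix unchanged, ranking permutations by Z gives the same order as ranking them by r, so either can serve as the test statistic.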
Mantel test: example
Do bottlenose whales associate with their kin?
Microsatellite-based estimate of kin relatedness versus association index:
- Matrix correlation r = -0.09
- Mantel test P = 0.83 (1,000 permutations)
They do not seem to associate preferentially with their kin
Mantel Test: Example
Coda repertoire of sperm whales

Repertoire similarity
         R1     R2     R3     R4     R5     R6     R7     R8
R1        1   0.62   0.34   0.03  -0.06      0   0.13  -0.06
R2     0.62      1   0.71   0.11  -0.18   0.13   0.11   0.13
R3     0.34   0.71      1   0.15  -0.07   0.37   0.13   0.02
R4     0.03   0.11   0.15      1   0.52   0.44   0.71   0.71
R5    -0.06  -0.18  -0.07   0.52      1   0.32   0.62   0.52
R6        0   0.13   0.37   0.44   0.32      1   0.33   0.34
R7     0.13   0.11   0.13   0.71   0.62   0.33      1   0.67
R8    -0.06   0.13   0.02   0.71   0.52   0.34   0.67      1

Group similarity
       R1  R2  R3  R4  R5  R6  R7  R8
R1      1   1   1   0   0   0   0   0
R2      1   1   1   0   0   0   0   0
R3      1   1   1   0   0   0   0   0
R4      0   0   0   1   1   1   0   0
R5      0   0   0   1   1   1   0   0
R6      0   0   0   1   1   1   0   0
R7      0   0   0   0   0   0   1   1
R8      0   0   0   0   0   0   1   1

Mantel test: Group vs Repertoire, P = 0.00
Groups seem to have distinct repertoires
Mantel Test: Example
Coda repertoire of sperm whales
Same repertoire similarity and group similarity matrices as in the preceding example, with groups nested within clans

Clan similarity
1 1 1 0 0 0 0 0
0 0 0 1 1 1 1 1

Partial Mantel test: Group vs Repertoire, controlling for clan, P = 0.69
Groups do not seem to have distinct repertoires within clans
ANOSIM
Analysis of Similarities (“R test”)
- Version of ANOVA for similarity or dissimilarity matrices
- Similarity/dissimilarity matrix with units grouped
- Closely related to the Mantel test in which one matrix indicates group membership
- Programme: PRIMER
Dissimilarity matrix with groups of units
Units: A B C D E
Dissimilarities: 0.2 0.4 0.7 0.6 0.5 0.1 0.8 0.3
Ranks: 2 4 8.5 6.5 5 1 10 3
Ranks
- Mean rank within groups: $r_W = 3.125$
- Mean rank between groups: $r_B = 7.083$
- ANOSIM statistic: $R = (r_B - r_W) / [n(n-1)/4] = 0.791$
ANOSIM statistic
$R = (r_B - r_W) / [n(n-1)/4]$
- R = 0 if high and low ranks are perfectly mixed between and within groups
- R = 1 or -1 for maximal differences between groups
- But is R statistically different from 0?
Testing ANOSIM statistic
- Permute the group assignments many times, and calculate the R*’s
- Compare them with the real R
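The ANOSIM calculation and its permutation test can be sketched as follows. The placement of dissimilarities on pairs below is hypothetical: the slides give the summary values but not a recoverable matrix layout, so this arrangement (groups {A, B, C} and {D, E}, with 0.6 and 0.7 each appearing twice to account for the tied ranks 6.5 and 8.5) was chosen here only because it reproduces the mean ranks 3.125 and 7.083. Function names are likewise illustrative.

```python
import itertools

UNITS = "ABCDE"

# Hypothetical arrangement of the dissimilarities (see the assumptions above).
D = {
    ("A", "B"): 0.2, ("A", "C"): 0.1, ("B", "C"): 0.3, ("D", "E"): 0.6,
    ("A", "D"): 0.4, ("A", "E"): 0.5, ("B", "D"): 0.6,
    ("B", "E"): 0.7, ("C", "D"): 0.7, ("C", "E"): 0.8,
}
GROUPS = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2}

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        first = ordered.index(v) + 1      # 1-based position of first occurrence
        count = ordered.count(v)
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def anosim_r(groups):
    """Return (mean within-group rank, mean between-group rank, R)."""
    pairs = list(D)
    ranks = dict(zip(pairs, average_ranks([D[p] for p in pairs])))
    within = [ranks[p] for p in pairs if groups[p[0]] == groups[p[1]]]
    between = [ranks[p] for p in pairs if groups[p[0]] != groups[p[1]]]
    r_w = sum(within) / len(within)
    r_b = sum(between) / len(between)
    n = len(groups)
    return r_w, r_b, (r_b - r_w) / (n * (n - 1) / 4)

def exact_p():
    """Try every way of choosing the two-unit group; count R* >= real R."""
    r_real = anosim_r(GROUPS)[2]
    extreme = total = 0
    for pair in itertools.combinations(UNITS, 2):
        groups = {u: (2 if u in pair else 1) for u in UNITS}
        total += 1
        if anosim_r(groups)[2] >= r_real - 1e-12:
            extreme += 1
    return extreme, total

r_w, r_b, r = anosim_r(GROUPS)
print(r_w, round(r_b, 3), round(r, 3))  # 3.125 7.083 0.792
print(exact_p())                        # (1, 10)
```

With only 5 units there are just 10 distinct group assignments, so the smallest achievable P-value is 0.1 no matter how large R is; this is one reason small designs can show a strong R that is still not significant.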
ANOSIM
- Can be done with more than 2 groups
- More complex designs:
  - Two-way
  - Nested designs
- Can be done without ranking
  - Then the absolute value of R has less meaning
  - Almost the same as the Mantel test
Issues with Permutation Tests
- Results of a permutation test strictly refer only to the data set, not to the wider population, unless the data were sampled at random
- How many permutations?
  - Depends on the test and the p-value
  - Tradeoff between accuracy and computer time
  - Usually 100-10,000 permutations
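The accuracy side of this tradeoff can be made concrete with a standard binomial argument, offered here as a rough guide rather than something stated in the slides. If the true P-value is P and N random permutations are drawn independently, the estimated P-value has standard error:

$$\mathrm{SE}(\hat{P}) = \sqrt{\frac{P(1-P)}{N}}$$

For P = 0.05 this gives about 0.022 with N = 100, 0.0069 with N = 1,000, and 0.0022 with N = 10,000, so more permutations matter most when the P-value lies close to the chosen significance level.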
Permutation Tests
Allow hypotheses to be tested when:
- Distributional properties of the test statistic are unknown
- Distributional properties of the test statistic are complex
- Usual assumptions are not met (though independence is still needed)
Good for small data sets
Can be used to check analytically-based tests
Mantel tests:
- Compare two or more association matrices
- May help deal with independence issues
ANOSIM (or Mantel tests) can do ANOVA-like analyses of similarity or dissimilarity matrices