Multivariate Statistical Methods
Measuring and Testing Multivariate Distances
by Jen-pei Liu, PhD
Division of Biometry, Department of Agronomy, National Taiwan University
and Division of Biostatistics and Bioinformatics, National Health Research Institutes
2019/2/4 Copyright by Jen-pei Liu, PhD
Measuring and Testing Multivariate Distances
- Introduction
- Distances between Individual Observations
- Distances between Populations and Samples
- Distances Based on Proportions
- Presence-absence Data
- The Mantel Randomization Test
- Summary
Introduction
- Multivariate problems in terms of distances
  - Between single observations
  - Between samples of observations
  - Between populations of observations
Introduction
- Mandible measurements of canine groups
  - Dogs, wolves, jackals, cuons, and dingos
- How far is one of these groups from the other groups?
- Two groups are close if their animals have similar mandible measurements
- Distance measures quantify how similar the measurements are
Introduction
- Different types of measurements
- Example of 16 colonies of a butterfly species
- Two sets of distances
  - Environmental
  - Genetic
- Relationship between these two sets of distances
Distances between Individual Observations
- N objects with p variables X1, …, Xp
- Object i: Xi1, Xi2, …, Xip
- Object j: Xj1, Xj2, …, Xjp
- Distance measures between object i and object j
- Graphical presentation of two or three variables
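One distance measure between object i and object j is the Euclidean distance, d_ij = sqrt(sum over k of (Xik - Xjk)^2), usually computed on standardized variables so that no single variable dominates. The following is a minimal Python sketch of that formula; the function names and the sample values are hypothetical and serve only as illustration.

```python
import math

def euclidean_distance(x_i, x_j):
    """Euclidean distance between two objects measured on the same p variables."""
    if len(x_i) != len(x_j):
        raise ValueError("both objects need the same number of variables")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))

def distance_matrix(rows):
    """Symmetric matrix of pairwise Euclidean distances between all objects."""
    n = len(rows)
    return [[euclidean_distance(rows[i], rows[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical standardized values for three objects on p = 2 variables.
objects = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
d = distance_matrix(objects)
print(d[0][1])  # 5.0
```

The matrix is symmetric with zeros on the diagonal, so only the n(n-1)/2 values above the diagonal carry information.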
Distances between Individual Observations
- Example: Dogs and Related Species
- Standardized variables X1, …, X6 for seven groups: Modern Dogs, Golden Jackals, Chinese Wolf, Indian Wolf, Cuon, Dingo, Prehistoric Dog
Distances between Individual Observations
- Euclidean distances between the seven canine groups
- Rows and columns of the distance matrix: Modern dog, Golden jackal, Chinese wolf, Indian wolf, Cuon, Dingo, Prehistoric dog
Distances between Populations and Samples
- Information about populations
  - Means
  - Variances
  - Covariances
- Measures
  - Between populations
    - Penrose distance
    - Mahalanobis distance
  - Between populations and samples
Distances between Populations and Samples
- Penrose distance
- Variables: X1, …, Xp
- The ith population means: μ1i, …, μpi
- The ith population variances: v1i, …, vpi
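The Penrose distance is usually written as P_ij = sum over k of (μki - μkj)^2 / (p Vk), where Vk is a variance for variable Xk taken as common to the populations (for example, a pooled estimate). That form agrees with the (4 x variance) denominators in the Egyptian skulls example, where p = 4. A minimal Python sketch under that assumption follows; the function name and the numbers are hypothetical.

```python
def penrose_distance(means_i, means_j, variances):
    """Penrose distance: sum_k (mu_ki - mu_kj)^2 / (p * V_k).

    `variances` holds one common (e.g. pooled) variance V_k per variable,
    which is an assumption of this sketch.
    """
    p = len(variances)
    if not (len(means_i) == len(means_j) == p):
        raise ValueError("means and variances must all have length p")
    return sum((mi - mj) ** 2 / (p * v)
               for mi, mj, v in zip(means_i, means_j, variances))

# Hypothetical means of two populations on p = 2 variables.
mu_1 = [10.0, 20.0]
mu_2 = [12.0, 17.0]
V = [4.0, 9.0]  # hypothetical common variances

P12 = penrose_distance(mu_1, mu_2, V)
print(P12)  # 4/(2*4) + 9/(2*9) = 1.0
```

Because each squared difference is divided only by its own variance, this measure ignores correlations between the variables.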
Distances between Populations and Samples
- The Mahalanobis distance takes the correlations between variables into consideration
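In its standard form the squared Mahalanobis distance between populations i and j is D_ij^2 = (μi - μj)' C^(-1) (μi - μj), where C is the covariance matrix assumed common to the populations. The pure-Python sketch below solves C w = (μi - μj) instead of forming the inverse explicitly; the helper names and the example numbers are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def mahalanobis_sq(mu_i, mu_j, cov):
    """Squared Mahalanobis distance (mu_i - mu_j)' C^{-1} (mu_i - mu_j)."""
    diff = [a - b for a, b in zip(mu_i, mu_j)]
    w = solve(cov, diff)  # w = C^{-1} diff
    return sum(d * wk for d, wk in zip(diff, w))

# Hypothetical means and covariance matrices for p = 2 variables.
mu_1 = [2.0, 0.0]
mu_2 = [0.0, 0.0]
cov_uncorrelated = [[1.0, 0.0], [0.0, 1.0]]
cov_correlated = [[1.0, 0.5], [0.5, 1.0]]

d2_plain = mahalanobis_sq(mu_1, mu_2, cov_uncorrelated)
d2_corr = mahalanobis_sq(mu_1, mu_2, cov_correlated)
print(d2_plain, d2_corr)
```

With uncorrelated unit-variance variables the result reduces to the squared Euclidean distance; the correlated case gives a different value for the same mean difference.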
Distances between Populations and Samples
- The Mahalanobis distance between a sample and a population
Distances between Populations and Samples
- Example: Distances between Egyptian skulls
- Penrose's distance between sample 1 and sample 2:
  P12 = ( )^2/(4 x 21.112) + ( )^2/(4 x 23.486) + ( )^2/(4 x 24.180) + ( )^2/(4 x 10.154) = 0.023
Distances between Populations and Samples
- The inverse of the sample covariance matrix
Distances between Populations and Samples
- The Mahalanobis distance between sample 1 and sample 2:
  D12^2 = ( )0.0483( ) + ( )0.0011( ) + … + ( )( )( ) + ( )0.1041( ) = 0.091
Distances between Populations and Samples
- Penrose distances between the five samples
- Rows and columns: (1) Early predynastic, (2) Late predynastic, (3) 12-13th dynastic, (4) Ptolemaic, (5) Roman
Distances between Populations and Samples
- Mahalanobis distances between the five samples
- Rows and columns: (1) Early predynastic, (2) Late predynastic, (3) 12-13th dynastic, (4) Ptolemaic, (5) Roman
Distances Based on Proportions
- Animals of a certain species might be classified into K genetic classes
  Class   Colony 1   Colony 2   Difference
  1       p1         q1         p1-q1
  2       p2         q2         p2-q2
  …       …          …          …
  K       pK         qK         pK-qK
Distances Based on Proportions
- Distance measures
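The slide names no specific measure, so the following is only an illustrative assumption: two measures often used for proportions are half the sum of absolute differences, d1 = (1/2) sum |pk - qk|, and one minus the cosine of the angle between the two proportion vectors, d2 = 1 - sum pk qk / sqrt(sum pk^2 * sum qk^2). The Python sketch below implements both; the function names and the proportions are hypothetical.

```python
import math

def half_abs_difference(p, q):
    """d1 = 0.5 * sum_k |p_k - q_k|; 0 for identical, 1 for disjoint proportions."""
    return 0.5 * sum(abs(pk - qk) for pk, qk in zip(p, q))

def one_minus_cosine(p, q):
    """d2 = 1 - sum p_k q_k / sqrt(sum p_k^2 * sum q_k^2)."""
    dot = sum(pk * qk for pk, qk in zip(p, q))
    norm = math.sqrt(sum(pk * pk for pk in p) * sum(qk * qk for qk in q))
    return 1.0 - dot / norm

# Hypothetical proportions of K = 3 genetic classes in two colonies.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]

d1 = half_abs_difference(p, q)
d2 = one_minus_cosine(p, q)
print(d1, d2)
```

Both measures lie between 0 and 1 when the inputs are valid proportions, which makes distances from different pairs of colonies directly comparable.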
Presence-absence Data
- Presence and absence of two species at 10 locations
- Table columns: Site, Species 1, Species 2
Presence-absence Data
               Species 2
  Species 1    Present   Absent   Total
  Present      a         b        a+b
  Absent       c         d        c+d
  Total        a+c       b+d      n
Presence-absence Data
- Measures of similarity (1 minus an index can serve as a distance)
  - Simple matching index: (a+d)/n
  - Ochiai index: a/[(a+b)(a+c)]^(1/2)
  - Dice-Sorensen index: 2a/(2a+b+c)
  - Jaccard index: a/(a+b+c)
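The four indices above follow directly from the counts a, b, c, d of the two-by-two table. The Python sketch below derives the counts from two presence-absence records and evaluates each index; the function names and the 10-site data are hypothetical.

```python
import math

def contingency_counts(species_1, species_2):
    """Counts (a, b, c, d): both present, only 1, only 2, both absent."""
    pairs = list(zip(species_1, species_2))
    a = sum(1 for s1, s2 in pairs if s1 and s2)
    b = sum(1 for s1, s2 in pairs if s1 and not s2)
    c = sum(1 for s1, s2 in pairs if not s1 and s2)
    d = sum(1 for s1, s2 in pairs if not s1 and not s2)
    return a, b, c, d

def similarity_indices(a, b, c, d):
    """Indices as listed on the slide; undefined if a species is never present."""
    n = a + b + c + d
    return {
        "simple_matching": (a + d) / n,
        "ochiai": a / math.sqrt((a + b) * (a + c)),
        "dice_sorensen": 2 * a / (2 * a + b + c),
        "jaccard": a / (a + b + c),
    }

# Hypothetical presence (1) / absence (0) records at 10 sites.
species_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
species_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]

counts = contingency_counts(species_1, species_2)
indices = similarity_indices(*counts)
print(counts)   # (3, 2, 1, 4)
print(indices)
```

Only the simple matching index uses d, so it is the only one of the four that treats joint absences as evidence of similarity.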
The Mantel Randomization Test
- Detection of space and time clustering of disease: whether cases of a disease that occur close in space also tend to be close in time
- Two 4x4 distance matrices of 4 objects
- Symmetric matrices
The Mantel Randomization Test
- Mantel test: whether the elements in M and E show some significant correlation
- Matching m12 with e12, m13 with e13, etc.
The Mantel Randomization Test
- M stays as it is
- A random order is chosen for the objects in E
- For example, the order 3, 2, 4, 1
The Mantel Randomization Test
- Time distances between the five samples, with rows and columns (1) to (5)
- Surviving period labels: B.C. (1), 2999 – 2000 B.C. (2), B.C. (3), A.D. (5)
The Mantel Randomization Test
- A total of 5! = 120 ways to re-order the five samples
- A total of 120 elements in the randomization distribution of the correlation
- The observed correlation is 0.954
- There are only two correlations greater than or equal to 0.954
- The p-value (1-sided) = 2/120 = 0.017
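The procedure above can be sketched in Python as a full enumeration: M is held fixed, the objects of E are re-ordered in every possible way, and the one-sided p-value is the share of correlations at least as large as the observed one. The function names and the two 4x4 matrices are hypothetical, and the Pearson correlation of the above-diagonal elements is assumed as the test statistic.

```python
import itertools
import math

def upper_triangle(matrix, order):
    """Elements above the diagonal, with the objects taken in `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel_exact(m, e):
    """Observed correlation, number of orders with r >= observed, total orders."""
    n = len(m)
    m_vals = upper_triangle(m, tuple(range(n)))  # M stays as it is
    observed = pearson(m_vals, upper_triangle(e, tuple(range(n))))
    perms = list(itertools.permutations(range(n)))
    # Small tolerance so the observed order always counts itself.
    count = sum(1 for perm in perms
                if pearson(m_vals, upper_triangle(e, perm)) >= observed - 1e-12)
    return observed, count, len(perms)

def line_distances(points):
    """Symmetric matrix of absolute differences between points on a line."""
    return [[abs(a - b) for b in points] for a in points]

# Hypothetical 4x4 distance matrices for 4 objects.
M = line_distances([0, 1, 3, 7])
E = line_distances([0, 2, 5, 11])

observed, count, total = mantel_exact(M, E)
p_value = count / total
print(observed, count, total, p_value)
```

Full enumeration is practical only for small n, since the number of orders grows as n!; for larger matrices a random sample of orders is the usual substitute.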
Summary
- Euclidean distances
- Penrose distance
- Mahalanobis distance
- Distance measures for proportions
- Mantel test for similarity between distance matrices