Published by Marcel Tindel. Modified over 9 years ago.
1
Wilcoxon's Rank-Sum Test (two independent samples), n1 + n2 ≤ 25: Same Distributions

Runs on the Labor data, Naïve Bayes accuracy in both samples. Repeated accuracy values are listed once; tied values share their average rank.

Naïve Bayes Acc (n1)   Rank     Naïve Bayes Acc (n2)   Rank
80.0                   1.0      84.2                   2.0
88.89                  4.5      85.0                   3.0
90.0                   7.0      88.89                  4.5
94.44                  8.0      89.47                  6.0
94.74                  10.0     94.74                  10.0
95.0                   12.5     100.0                  15.0
100.0                  15.0

                 n1       n2
Sample size      9        7
Mean             92.53    91.76
Rank sum (W)     80.5     55.5 (do not reject H0)

Critical values (Wilcoxon table), H0: mean(Acc1) = mean(Acc2)

Significance, test type    V
0.05, two-tailed           40
0.01, two-tailed           35
0.05, one-tailed           43
0.01, one-tailed           37

The rank sum of the smaller sample, W = 55.5, is above the 0.05 two-tailed critical value of 40, so H0 is not rejected.
2
Wilcoxon's Rank-Sum Test (two independent samples), n1 + n2 ≤ 25: Different Distributions

Runs on the Labor data, Naïve Bayes versus J48. Repeated accuracy values are listed once; tied values share their average rank.

Naïve Bayes Acc (n1)   Rank     J48 Acc (n2)   Rank
80.0                   3.5      65.0           1.0
88.89                  7.5      70.0           2.0
90.0                   9.5      80.0           3.5
94.44                  11.0     84.21          5.0
94.74                  12.5     85.0           6.0
95.0                   14.5     88.89          7.5
100.0                  16.0     90.0           9.5

                 n1       n2
Sample size      9        7
Mean             92.53    80.44
Rank sum (W)     101.5    34.5 (reject H0)

Critical values (Wilcoxon table), H0: mean(Acc1) = mean(Acc2)

Significance, test type    V
0.05, two-tailed           40
0.01, two-tailed           35
0.05, one-tailed           43
0.01, one-tailed           37

The rank sum of the smaller sample, W = 34.5, is below the 0.05 two-tailed critical value of 40, so H0 is rejected.
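The ranking step behind the two small-sample slides can be sketched in Python: pool both samples, give tied values the average of their positions, and sum the ranks per sample. The helper names (`average_ranks`, `rank_sums`) are illustrative and not from the slides. The slide lists each distinct Naïve Bayes value once, so the repeated 94.74 and 95.0 entries below are an assumption, chosen because they reproduce the listed ranks and the rank sums 101.5 and 34.5. The decision rule (the smaller sample's W at or below the table value) is likewise an assumption about how the slide reads the Wilcoxon table.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sums(sample1, sample2):
    """Rank the pooled data and return the rank sum of each sample."""
    ranks = average_ranks(list(sample1) + list(sample2))
    n1 = len(sample1)
    return sum(ranks[:n1]), sum(ranks[n1:])


# ASSUMPTION: 94.74 and 95.0 each occur twice in the Naive Bayes sample
# (the slide shows only distinct values for n1 = 9).
nb = [80.0, 88.89, 90.0, 94.44, 94.74, 94.74, 95.0, 95.0, 100.0]
j48 = [65.0, 70.0, 80.0, 84.21, 85.0, 88.89, 90.0]

w_nb, w_j48 = rank_sums(nb, j48)

critical_value = 40  # 0.05, two-tailed, taken from the slide's table
# ASSUMPTION: reject when the smaller sample's rank sum is at or below the table value.
reject_h0 = w_j48 <= critical_value
print(w_nb, w_j48, reject_h0)
```

Applying the same helpers to the "same distributions" slide only requires swapping in that slide's two samples.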
3
Wilcoxon's Rank-Sum Test (two independent samples), n1 + n2 > 25: Different Distributions

Adult data, 30 runs per classifier. Each entry is accuracy (rank); repeated values are listed once with their shared average rank.

n1: Naïve Bayes Acc (rank)
82.66 (1.0), 82.86 (2.0), 82.99 (3.0), 83.06 (4.0), 83.07 (5.0), 83.08 (6.0), 83.1 (7.0), 83.14 (8.0), 83.16 (9.0), 83.21 (10.0),
83.24 (11.0), 83.28 (12.0), 83.31 (13.0), 83.34 (14.0), 83.38 (15.0), 83.39 (16.0), 83.4 (17.0), 83.42 (18.0), 83.43 (19.5), 83.44 (21.0),
83.45 (22.0), 83.52 (23.0), 83.57 (24.0), 83.61 (25.0), 83.63 (26.0), 83.69 (27.0), 83.71 (28.0), 83.78 (29.0), 83.81 (30.0)

n2: J48 Acc (rank)
85.7 (31.0), 85.73 (32.0), 85.82 (33.0), 85.83 (34.0), 85.87 (35.0), 85.91 (36.5), 85.93 (38.0), 85.94 (39.0), 85.95 (40.0), 85.96 (41.0),
85.98 (42.0), 85.99 (43.0), 86.03 (44.5), 86.04 (46.5), 86.1 (48.5), 86.12 (50.5), 86.2 (52.0), 86.25 (53.0), 86.26 (54.0), 86.27 (55.0),
86.28 (56.0), 86.31 (57.0), 86.36 (58.0), 86.42 (59.0), 86.7 (60.0)

                 Naïve Bayes (n1)   J48 (n2)
Sample size      30                 30
Mean             83.339             86.072
Rank sum (W)     465.0              1365.0

Mean(W) = n1(n1 + n2 + 1)/2 = 915, STD(W) = sqrt(n1 n2 (n1 + n2 + 1)/12) = 67.6387
Z = (465 - 915) / 67.6387 = -6.653 < -1.96 (z at alpha = 0.05, two-tailed)
Reject H0: mean(Acc1) = mean(Acc2)
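The large-sample arithmetic on this slide can be checked with a short Python sketch. It uses the standard normal approximation for the rank sum without a tie correction, which is an assumption here, although it matches the STD(W) printed on the slide. The function name `rank_sum_z` is illustrative.

```python
import math


def rank_sum_z(w1, n1, n2):
    """Normal approximation for the rank sum W1 of sample 1 (no tie correction)."""
    mean_w = n1 * (n1 + n2 + 1) / 2
    std_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w1 - mean_w) / std_w
    return mean_w, std_w, z


# Values taken from the slide: W for Naive Bayes is 465 with n1 = n2 = 30.
mean_w, std_w, z = rank_sum_z(465.0, 30, 30)

# Two-tailed test at alpha = 0.05: reject when |z| exceeds 1.96.
reject_h0 = abs(z) > 1.96
print(mean_w, round(std_w, 4), round(z, 3), reject_h0)
```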
4
Wilcoxon's Matched Pairs Signed Ranks Test (for paired scores), n ≤ 50

Data example (zero differences are removed before ranking; tied |A-B| values share their average rank):

Pair   Classifier 1 (A)   Classifier 2 (B)   A-B    |A-B|   Rank(|A-B|)   Signed rank
1      78                 78                 0      0       remove        remove
2      24                 24                 0      0       remove        remove
3      64                 62                 +2     2       1             +1
4      45                 48                 -3     3       2             -2
5      64                 68                 -4     4       3.5           -3.5
6      52                 56                 -4     4       3.5           -3.5
7      30                 25                 +5     5       5             +5
8      50                 44                 +6     6       6             +6
9      64                 56                 +8     8       7             +7
10     50                 40                 +10    10      8.5           +8.5
11     78                 68                 +10    10      8.5           +8.5
12     22                 36                 -14    14      10            -10
13     84                 68                 +16    16      11            +11
14     40                 20                 +20    20      12            +12
15     90                 58                 +32    32      13            +13
16     72                 32                 +40    40      14            +14

Sum of signed ranks: W+ = +86, W- = -19
Select W = min(W+, |W-|) = 19 (reject H0)

Critical values (Wilcoxon table), H0: mean(signed_rank(|A-B|)) = 0

Significance, test type    V
0.05, two-tailed           40
0.01, two-tailed           35
0.05, one-tailed           43
0.01, one-tailed           37
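The signed-rank bookkeeping in this example can be reproduced with a minimal Python sketch: drop the zero differences, rank the remaining differences by absolute value with average ranks for ties, and sum the ranks separately for positive and negative differences. The function name `signed_rank_sums` is illustrative, and it returns W- as a magnitude (19) rather than the signed value (-19) shown on the slide.

```python
def signed_rank_sums(a, b):
    """Return (W+, W-, n) for paired scores, with W- as a positive magnitude."""
    # Zero differences carry no sign information and are removed.
    diffs = [x - y for x, y in zip(a, b) if x != y]
    by_abs = sorted(diffs, key=abs)
    w_plus = w_minus = 0.0
    i = 0
    while i < len(by_abs):
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < len(by_abs) and abs(by_abs[j + 1]) == abs(by_abs[i]):
            j += 1
        rank = (i + j) / 2 + 1  # average 1-based rank for the tied run
        for d in by_abs[i:j + 1]:
            if d > 0:
                w_plus += rank
            else:
                w_minus += rank
        i = j + 1
    return w_plus, w_minus, len(diffs)


# Scores copied from the slide.
a = [78, 24, 64, 45, 64, 52, 30, 50, 64, 50, 78, 22, 84, 40, 90, 72]
b = [78, 24, 62, 48, 68, 56, 25, 44, 56, 40, 68, 36, 68, 20, 58, 32]

w_plus, w_minus, n = signed_rank_sums(a, b)
w = min(w_plus, w_minus)  # test statistic compared against the table value
print(w_plus, w_minus, n, w)
```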
5
Wilcoxon's Matched Pairs Signed Ranks Test (for paired scores), n > 50

Randomly split the Adult data set 50/50 into training and testing sets, 100 times.
For each split, run Naïve Bayes and J48, record their accuracy values as a pair, and compute the difference in accuracy.
Determine the signed ranks of the differences as in the previous example (data omitted due to space constraints).
We get W+ = 0 and W- = 5050 (J48 always produces the higher accuracy), N = 100.
We get mean(W) = N(N+1)/4 = 2525 and STD(W) = sqrt(N(N+1)(2N+1)/24) = 290.84.
Z = (0 - 2525) / 290.84 = -8.6818 < -1.96 (z at alpha = 0.05, two-tailed), so H0 is rejected.
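A short Python sketch of the normal approximation used here, assuming the standard mean and standard deviation of the signed-rank statistic without a tie correction (the slide's numbers are consistent with that). The function name `signed_rank_z` is illustrative.

```python
import math


def signed_rank_z(w, n):
    """Normal approximation for a signed-rank sum W over n non-zero pairs."""
    mean_w = n * (n + 1) / 4
    std_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean_w) / std_w
    return mean_w, std_w, z


# Values from the slide: W+ = 0 with N = 100 pairs.
mean_w, std_w, z = signed_rank_z(0, 100)

# Two-tailed test at alpha = 0.05: reject when |z| exceeds 1.96.
reject_h0 = abs(z) > 1.96
print(mean_w, round(std_w, 2), round(z, 4), reject_h0)
```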
6
What is the Effect Size? (The effect of using Laplace smoothing on the accuracy of J48)

Run (Adult data)     Acc J48 (no Laplace)   Acc J48 (Laplace)
1                    85.83                  85.83
2                    85.91                  85.91
3                    86.12                  86.12
4                    85.82                  85.82
5                    86.28                  86.28
6                    86.42                  86.42
7                    85.91                  85.90
8                    86.10                  86.10
9                    85.95                  85.95
10                   86.12                  86.11
Mean                 86.05                  86.04
Standard deviation   0.18585                0.196002

SP^2 = (9 * (0.18585)^2 + 9 * (0.196002)^2) / 18 = 0.0365
SP = sqrt(0.0365) = 0.1910
d = (86.05 - 86.04) / 0.1910 = 0.0524
d is less than 0.2, so the effect is very small to nonexistent.
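The pooled standard deviation and Cohen's d on this slide can be reproduced from the summary statistics alone. The sketch below is a minimal Python version that takes the slide's rounded means and standard deviations as inputs rather than recomputing them from the raw runs; the function names are illustrative.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation of two samples."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)


def cohens_d(mean1, mean2, sp):
    """Cohen's d: difference of means in units of the pooled SD."""
    return (mean1 - mean2) / sp


# Summary statistics as printed on the slide (10 runs per configuration).
sp = pooled_sd(10, 0.18585, 10, 0.196002)
d = cohens_d(86.05, 86.04, sp)
print(round(sp, 4), round(d, 4))
```

Because the inputs are the rounded means, d is sensitive to that rounding; the conclusion (well below the 0.2 threshold for a small effect) does not depend on it.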
7
One-Way ANOVA (J48 on three domains)

Run   J48 Acc Adult   J48 Acc Pima   J48 Acc Credit
1     85.83           75.86          84.19
2     85.91           73.18          85.90
3     86.12           69.08          83.83
4     85.82           74.05          85.11
5     86.28           74.71          86.38
6     86.42           65.90          81.20
7     85.91           76.25          86.38
8     86.10           75.10          86.75
9     85.95           70.50          88.03
10    86.12           73.95          87.18

Source of variability   Sum of squares   Degrees of freedom   Mean squares   F = MS_G / MS_E   Prob > F (p-value)
Groups                  1113.2           2                    556.598        110.56            9.9E-14
Error                   135.92           27                   5.034
Total                   1249.12          29

Results: high F and very low p, so the groups are significantly different (see plot).
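The sums of squares and the F statistic in this table can be recomputed from the listed runs. The Python sketch below uses only the standard library; the function name `one_way_anova` is illustrative. Because the accuracies on the slide are rounded to two decimals, the recomputed values agree with the table only approximately.

```python
def one_way_anova(groups):
    """Return (ss_groups, ss_error, df_groups, df_error, f_stat)."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Between-group variability: group means around the grand mean.
    ss_groups = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group variability: observations around their own group mean.
    ss_error = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_groups = len(groups) - 1
    df_error = len(all_values) - len(groups)
    f_stat = (ss_groups / df_groups) / (ss_error / df_error)
    return ss_groups, ss_error, df_groups, df_error, f_stat


# J48 accuracies copied from the slide.
adult = [85.83, 85.91, 86.12, 85.82, 86.28, 86.42, 85.91, 86.10, 85.95, 86.12]
pima = [75.86, 73.18, 69.08, 74.05, 74.71, 65.90, 76.25, 75.10, 70.50, 73.95]
credit = [84.19, 85.90, 83.83, 85.11, 86.38, 81.20, 86.38, 86.75, 88.03, 87.18]

ss_groups, ss_error, df_groups, df_error, f_stat = one_way_anova([adult, pima, credit])
print(round(ss_groups, 2), round(ss_error, 2), df_groups, df_error, round(f_stat, 2))
```

The p-value additionally needs the survival function of the F distribution, for example `scipy.stats.f.sf(f_stat, df_groups, df_error)` if SciPy is available.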
8
One-Way ANOVA (J48 on three domains) [plot: J48 accuracy on Adult, Pima and Credit]
9
Two-Way ANOVA (J48 & N.B. on 3 domains)

Classifier   Run   Acc Adult   Acc Pima   Acc Credit
J48 (A)      1     85.83       75.86      84.19
J48 (A)      2     85.91       73.18      85.90
J48 (A)      3     86.12       69.08      83.83
J48 (A)      4     85.82       74.05      85.11
J48 (A)      5     86.28       74.71      86.38
NB (B)       1     83.08       78.54      74.36
NB (B)       2     83.07       74.33      76.07
NB (B)       3     83.63       71.37      78.30
NB (B)       4     83.16       76.72      79.57
NB (B)       5     83.71       78.93      80.00

Source of variability   Sum of squares   Degrees of freedom   Mean squares   F = MS_G / MS_E   Prob > F (p-value)
Columns (H0A)           517.7133         2                    258.8567       65.4636           1.9099E-10
Rows (H0B)              46.6503          1                    46.6503        11.7976           0.0021643
Interactions (H0AB)     125.7066         2                    62.8533        15.8953           4.0161E-05
Error                   94.9012          24                   3.9542
Total                   784.9711         29

All p-values are low. Columns (H0A) and interactions (H0AB) are significantly different; rows (H0B) are the least different, with the largest p-value (0.0022).
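The decomposition in this table can be sketched in Python for a balanced design with replication (five runs per classifier and domain). Rows are the classifiers and columns are the domains, following the table's labels. The function name `two_way_anova` and the dictionary layout are illustrative; p-values are omitted because they need the F distribution.

```python
def two_way_anova(cells):
    """cells maps (row, column) -> replicate values; the design must be balanced.

    Returns {source: (sum_squares, df, mean_square, f_stat)}; f_stat is None for error.
    """
    rows = sorted({r for r, _ in cells})
    cols = sorted({c for _, c in cells})
    n = len(next(iter(cells.values())))  # replicates per cell

    def mean(xs):
        return sum(xs) / len(xs)

    grand = mean([v for vals in cells.values() for v in vals])
    row_mean = {r: mean([v for c in cols for v in cells[(r, c)]]) for r in rows}
    col_mean = {c: mean([v for r in rows for v in cells[(r, c)]]) for c in cols}
    cell_mean = {k: mean(vals) for k, vals in cells.items()}

    ss = {
        "rows": n * len(cols) * sum((row_mean[r] - grand) ** 2 for r in rows),
        "columns": n * len(rows) * sum((col_mean[c] - grand) ** 2 for c in cols),
        "error": sum((v - cell_mean[k]) ** 2 for k, vals in cells.items() for v in vals),
    }
    # Interaction is what remains of the between-cell variability.
    ss_cells = n * sum((m - grand) ** 2 for m in cell_mean.values())
    ss["interaction"] = ss_cells - ss["rows"] - ss["columns"]

    df = {
        "rows": len(rows) - 1,
        "columns": len(cols) - 1,
        "interaction": (len(rows) - 1) * (len(cols) - 1),
        "error": len(rows) * len(cols) * (n - 1),
    }
    ms_error = ss["error"] / df["error"]
    table = {}
    for name in ("columns", "rows", "interaction", "error"):
        ms = ss[name] / df[name]
        table[name] = (ss[name], df[name], ms, None if name == "error" else ms / ms_error)
    return table


# Accuracies copied from the slide.
cells = {
    ("J48", "Adult"): [85.83, 85.91, 86.12, 85.82, 86.28],
    ("J48", "Pima"): [75.86, 73.18, 69.08, 74.05, 74.71],
    ("J48", "Credit"): [84.19, 85.90, 83.83, 85.11, 86.38],
    ("NB", "Adult"): [83.08, 83.07, 83.63, 83.16, 83.71],
    ("NB", "Pima"): [78.54, 74.33, 71.37, 76.72, 78.93],
    ("NB", "Credit"): [74.36, 76.07, 78.30, 79.57, 80.00],
}

anova = two_way_anova(cells)
for source, (ss_val, df_val, ms_val, f_val) in anova.items():
    print(source, round(ss_val, 4), df_val, round(ms_val, 4), f_val and round(f_val, 4))
```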