TAUCHI – Tampere Unit for Computer-Human Interaction ERIT 2015: Data analysis and interpretation (1 & 2) Hanna Venesvirta Tampere Unit for Computer-Human Interaction
Aims: see which analyses are used for the course projects, and how.
Overview of comparing samples: Our aim is simple. We wish to find out whether the means of our collected data samples are separated enough to conclude that the means are likely to be different.
Overview of comparing samples: Null hypothesis: no difference. Alternative hypothesis: there is a difference. The aim is to reject the null hypothesis. A result is “statistically significant” when data like ours would be unlikely if the null hypothesis were true: p-value < 0.05.
Analysis of Variance (ANOVA): Is used to find differences between the means of more than two samples (for two-sample designs, use t-tests). Can be used for testing the effects of more than one independent variable (IV) at a time: 2-way, 3-way, etc. designs.
Repeated measures ANOVA: Is used when all the participants have been measured under all the levels of the IV(s). Standard ANOVA cannot be used, as the data is correlated.
Analysis example step-by-step: Experimental task: select an object as fast as possible. Dependent variable: selection time (ms). One independent variable: diameter of an object, with three levels (25, 30, or 40 mm). All the participants performed the same tasks -> one-way within-subjects design.
Note! The values per participant per level of the IV are averages of several tasks: usually one exact task is repeated several times during the trial. Thus, the means across participants are averages of averages.
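The averaging described in the note can be sketched in a few lines of Python. This is only an illustration: the participant labels, the trial values, and the helper names `per_participant_means` and `level_means` are hypothetical and do not come from the course data.

```python
from statistics import mean

# Hypothetical raw selection times (ms): participant -> diameter (mm) -> repeated trials
raw_trials = {
    "P1": {25: [900, 1100], 30: [800, 820], 40: [700, 740]},
    "P2": {25: [1000, 1200], 30: [950, 1050], 40: [600, 800]},
}

def per_participant_means(trials):
    """Average the repeated trials so each participant has one value per level."""
    return {
        participant: {level: mean(values) for level, values in levels.items()}
        for participant, levels in trials.items()
    }

def level_means(cell_means):
    """Average the per-participant values of each level (an average of averages)."""
    levels = sorted(next(iter(cell_means.values())))
    return {
        level: mean(cell_means[p][level] for p in cell_means)
        for level in levels
    }

cell_means = per_participant_means(raw_trials)   # the values that go into SPSS
group_means = level_means(cell_means)            # the values shown in a column graph
```

The first step yields the data matrix (one row per participant, one column per level); the second step yields the means that are plotted.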
Visualize your data! (in this case: means) Good for initial inspection of the possible difference. Excellent for showing a summary of the results to the reader.
Visualize your data! Column graphs are good for presenting means. The one below shows only the means of the different levels of the IV. [Figure: column graph of the means]
Visualize your data! This one also shows the variability of the data sample, here as the Standard Error of the Mean (S.E.M.). If you add error bars to the graphs, see, e.g., …
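As a rough sketch of what the error bars represent, the S.E.M. of one level can be computed as the sample standard deviation divided by the square root of the number of participants. The values below are hypothetical, and `standard_error_of_mean` is an illustrative helper name.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error_of_mean(values):
    """S.E.M. = sample standard deviation / sqrt(number of values)."""
    return stdev(values) / sqrt(len(values))

# Hypothetical per-participant mean selection times (ms) for one diameter
times_25mm = [1000, 1100, 1200, 1300]

bar_height = mean(times_25mm)                    # height of the column
error_bar = standard_error_of_mean(times_25mm)   # half-length of the error bar
```

The same calculation is repeated for every level of the IV, giving one column and one error bar per level.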
The means differ! …significantly?
Data to SPSS?
1) Select the “Variable View” tab from the bottom left corner
2) Add descriptive variable names
3) Select the “Data View” tab from the bottom left corner
4) Add your data by, e.g., copy-and-paste from Excel
NOTE! Paste only the numbers: you defined the column headings already in steps 1-2.
Parametric tests – One-way repeated measures ANOVA and paired samples t-test
From the output, find the table called “Tests of Within-Subjects Effects”: this is where the ANOVA result is.
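To make the F-value in that table less of a black box, here is a minimal sketch of how a one-way repeated measures ANOVA partitions the variance, assuming sphericity and using only the Python standard library. The function name `rm_anova_f` and the small data matrix are hypothetical; SPSS remains the tool used on the course.

```python
from statistics import mean

def rm_anova_f(data):
    """One-way repeated measures ANOVA (sphericity assumed).

    data: one row per participant, one column per level of the IV.
    Returns (F, df_effect, df_error).
    """
    n, k = len(data), len(data[0])
    grand = mean(v for row in data for v in row)
    level_means = [mean(row[j] for row in data) for j in range(k)]
    subject_means = [mean(row) for row in data]

    ss_effect = n * sum((m - grand) ** 2 for m in level_means)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    # Variation between participants is removed from the error term
    ss_error = ss_total - ss_effect - ss_subjects

    df_effect, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_effect / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error

# Hypothetical scores: 3 participants (rows) x 3 levels (columns)
scores = [[10, 8, 6], [12, 11, 7], [14, 11, 11]]
f_value, df_effect, df_error = rm_anova_f(scores)
```

Removing the between-participant variation from the error term is what separates this from a standard ANOVA, which would treat the correlated columns as independent samples.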
…but which row to read?
Go back up and find the table called “Mauchly's Test of Sphericity”. It tests whether the data looks like this: [figure] …or more like this: [figure]
If the result from this test is significant, the variances of the differences between the levels of the IV are not equal, that is, sphericity cannot be assumed.
Back to the result table: as we cannot assume sphericity, we cannot read the first row of the result table, thus we read the second row (Greenhouse-Geisser).
…also with the Greenhouse-Geisser corrected degrees of freedom, the significance value is less than 0.05.
NOTE: If the result in Mauchly's table is not significant, we can assume sphericity, and thus we can use the result from the first row.
Thus there is a difference, but where? -> We find out by running pairwise comparisons with paired samples t-tests. NOTE: pairwise comparisons are not run if the ANOVA shows a non-significant result. Comparing 25 mm to 30 mm, 25 mm to 40 mm, and 30 mm to 40 mm -> 3 comparisons. Multiple comparisons: remember to adjust the significance level in order to avoid inflating the Type I error! Bonferroni correction: original significance level / number of comparisons. Here: 0.05/3 = ~0.017.
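The counting of comparisons and the Bonferroni adjustment can be written out as a short sketch. The variable names and the helper `is_significant` are illustrative only.

```python
from itertools import combinations

levels_mm = [25, 30, 40]
pairs = list(combinations(levels_mm, 2))   # every pair of levels, each once

alpha = 0.05
adjusted_alpha = alpha / len(pairs)        # Bonferroni: 0.05 / 3

def is_significant(p_value, threshold=adjusted_alpha):
    """A comparison counts as significant only below the adjusted threshold."""
    return p_value < threshold
```

With three levels there are three pairs, so each pairwise p-value is compared against roughly 0.017 instead of 0.05.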
From the output, find the table called “Paired Samples Test”: here are the results.
Paired Samples Test
Pair                                  Mean     t       df   Sig. (2-tailed)
Pair 1  Diameter25mm - Diameter30mm   534,…    …,010   19   ,059
Pair 2  Diameter25mm - Diameter40mm   634,…    …,012   19   ,007
Pair 3  Diameter30mm - Diameter40mm   100,…    ,985    19   ,337
The significance value of Pair 2 is smaller than the adjusted significance level (0.007 < 0.017), thus the significant difference is in this comparison (25 mm vs. 40 mm)… …and you can check the direction of the difference from, e.g., the graph you made.
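The t and df columns of the table follow from the pairwise differences between two levels. Below is a minimal sketch with hypothetical data and a hypothetical helper `paired_t`; it stops at the t-value, because the p-value additionally requires the t distribution, which a statistics package such as SPSS provides.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(sample_a, sample_b):
    """Return (t, df) for a paired samples t-test on two equally long samples."""
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    # Mean difference divided by the standard error of the differences
    t_value = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_value, n - 1

# Hypothetical selection times (ms) of four participants
times_25mm = [1000, 1100, 1200, 1300]
times_40mm = [900, 1050, 1000, 1250]

t_value, df = paired_t(times_25mm, times_40mm)
```

A positive t here means the first sample (25 mm) has the larger values, i.e. longer selection times, which is the same direction check the graph gives.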
ANOVA reporting: For reporting the ANOVA result, the following is needed: the (corrected) degrees of freedom (here: ~1.2 and ~22.4), the F-value (here: ~5.6), and the p-value (here: p < 0.05).
ANOVA reporting
Tests of Within-Subjects Effects (Measure: MEASURE_1)
Source        Correction           df      F       Sig.
size          Sphericity Assumed   …       5,567   ,008
              Greenhouse-Geisser   1,…     5,567   ,023
              Huynh-Feldt          1,…     5,567   ,022
              Lower-bound          1,…     5,567   ,029
Error(size)   Sphericity Assumed   …
              Greenhouse-Geisser   22,…
              Huynh-Feldt          23,…
              Lower-bound          19,…
“One-way within-subjects ANOVA with object diameter as a factor revealed a statistically significant effect of the object diameter, F(1.2, 22.4) = 5.6, p < 0.05.”
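The corrected degrees of freedom are the uncorrected ones multiplied by a sphericity estimate (epsilon). The sketch below assumes three levels and 20 participants, as in the example. The Greenhouse-Geisser epsilon is not given on the slides, so it is back-calculated here from the rounded value 22.4 and should be read as an approximation. `corrected_df` is an illustrative name.

```python
def corrected_df(k, n, epsilon):
    """Scale both ANOVA degrees of freedom by a sphericity estimate epsilon."""
    return epsilon * (k - 1), epsilon * (k - 1) * (n - 1)

k, n = 3, 20   # three diameters, 20 participants

uncorrected = corrected_df(k, n, 1.0)            # sphericity assumed
lower_bound = corrected_df(k, n, 1 / (k - 1))    # most conservative correction

# Assumption: epsilon reconstructed from the rounded error df reported above
epsilon_gg = 22.4 / ((k - 1) * (n - 1))
greenhouse_geisser = corrected_df(k, n, epsilon_gg)
```

The F-value itself does not change between the rows; only the degrees of freedom, and therefore the significance value, do.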
Reporting pairwise comparisons: For reporting the results from pairwise comparisons (with paired samples t-tests), the following is needed: the degrees of freedom (here: 19), the t-value (here: ~3.0), and the p-value (here: p < 0.01).
Reporting pairwise comparisons: “Post hoc pairwise comparisons for the object diameter showed that the participants pointed significantly faster at the 40 mm diameter object than at the 25 mm diameter object, t(19) = 3.0, p < 0.01. Other pairwise comparisons were not statistically significant.”
Task: parametric analysis (1)
Take the data from the following slide and:
Create a data matrix in Excel & SPSS
Visualize the data (a column graph is recommended)
Run the analysis with SPSS
Write down the results
Send the graph & the written-down results to Hanna via mail as .pdf
We shall take a look at this next week.
Task: parametric analysis (2): Experimental task: select an object as accurately as possible. Dependent variable: errors. One independent variable: diameter of an object, with three levels (25, 30, or 40 mm).
[Data table: errors of participants P1-P20 for the 25 mm, 30 mm, and 40 mm diameters]
Non-parametric analysis – Friedman’s test and Wilcoxon Repeated Measures Signed-Rank test
Non-parametric analysis: These analyses do not assume that the data follows any particular probability distribution (such as the normal distribution). Before the analysis, the data is transformed to ranks (by the statistical software). Usually a non-parametric test has an equivalent parametric test.
Equivalent tests
Parametric tests                   Non-parametric tests
One-way repeated measures ANOVA    Friedman’s rank test for k-correlated samples
Matched pairs t-test               Wilcoxon Repeated Measures Signed-Rank test
Friedman’s rank test for k-correlated samples
From the output, find “Test Statistics”.
Test Statistics (Friedman Test)
N             20
Chi-Square    10,900
df            2
Asymp. Sig.   ,004
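For intuition, the Friedman statistic can be sketched as follows: rank the values within each participant, sum the ranks per level, and compare the rank sums. This is a simplified sketch with hypothetical data and helper names. It applies no tie correction, so a statistics package may give a slightly different value when ties are present, and the p-value shortcut holds only for df = 2, i.e. three levels.

```python
from math import exp

def rank_row(row):
    """Rank the values of one participant (1 = smallest); ties share the average rank."""
    ordered = sorted(row)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in row]

def friedman_chi_square(data):
    """data: one row per participant, one column per level. No tie correction."""
    n, k = len(data), len(data[0])
    rank_sums = [0.0] * k
    for row in data:
        for j, rank in enumerate(rank_row(row)):
            rank_sums[j] += rank
    return 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)

def p_value_df2(chi_square):
    """Upper-tail chi-square probability; this closed form is valid for df = 2 only."""
    return exp(-chi_square / 2)

# Hypothetical selection times (ms): 4 participants x 3 diameters (25, 30, 40 mm)
times = [[900, 800, 700], [1000, 950, 600], [1100, 900, 850], [1200, 1000, 700]]
chi_square = friedman_chi_square(times)
p_value = p_value_df2(chi_square)
```

Applying `p_value_df2` to the chi-square value in the table above (10.9) gives a significance value that rounds to the reported ,004.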
Again, a difference; find out where -> pairwise comparisons, this time done with a non-parametric paired samples test. Comparing 25 mm to 30 mm, 25 mm to 40 mm, and 30 mm to 40 mm -> 3 comparisons. Multiple comparisons: remember to adjust the significance level in order to avoid inflating the Type I error! Bonferroni correction: original significance level / number of comparisons. Here: 0.05/3 = ~0.017.
Wilcoxon Repeated Measures Signed-Rank test
From the output, find “Test Statistics”.
Test Statistics (Wilcoxon Signed Ranks Test)
Comparison                     Z        Asymp. Sig. (2-tailed)
Diameter30mm - Diameter25mm    -2,165   ,030
Diameter40mm - Diameter25mm    -3,472   ,001
Diameter40mm - Diameter30mm    -,261    ,794
All Z values are based on positive ranks.
Again, the significance value of the 40 mm vs. 25 mm comparison is smaller than the adjusted significance level (0.001 < 0.017), thus the significant difference is here again.
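The asymptotic significance values in this table come from treating Z as a standard normal statistic. A small sketch, using the Z values read from the table and illustrative names, shows the conversion and the comparison against the Bonferroni-adjusted level.

```python
from math import erfc, sqrt

def two_tailed_p(z):
    """Two-tailed p-value for a standard normal test statistic z."""
    return erfc(abs(z) / sqrt(2))

# Z values as read from the Wilcoxon "Test Statistics" table
z_values = {
    "30 mm vs 25 mm": -2.165,
    "40 mm vs 25 mm": -3.472,
    "40 mm vs 30 mm": -0.261,
}

adjusted_alpha = 0.05 / len(z_values)   # Bonferroni over the three comparisons
significant = {
    pair: two_tailed_p(z) < adjusted_alpha for pair, z in z_values.items()
}
```

Because the function returns the unrounded p-value, it can also be used to check a result against a stricter level than the three decimals in the table allow.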
Reporting Friedman’s test: “Friedman's test showed that there was a statistically significant effect of object diameter, Χ²(2) = 10.9, p < 0.01.”
Reporting Wilcoxon test: Note: double-click the table (twice) and you will see a more accurate p-value; in this case it is below 0.01 / 3 = ~0.0033, thus the result is significant at the 0.01 level. “Post hoc pairwise comparisons with Wilcoxon signed-rank tests showed that the selection time was significantly shorter when the object diameter was 40 mm than when the diameter was 25 mm, Z = -3.47, p < 0.01. Other pairwise comparisons were not statistically significant.”
Task: 1) Check the previous task 2) Make a non-parametric analysis
1) Check your previous exercise and compare it to Hanna’s answer (the answer is on the course website/schedule). Try to fix yours if it is different.
2) Run the non-parametric analysis on the same error data in SPSS. Are the results different in any way? If the results differ, note it, and how.
Mail the possible fix of the 1st task and the written-down results of the 2nd to Hanna.