Quantitative Methods for Researchers Paul Cairns
Objectives: statistical argument; comparison of distributions; a fly-by of approaches
How are the abstracts? Questions? Problems? Restarts?
Statistical argument: inference is an argument form; prediction is essential (the alternative hypothesis, e.g. “X causes Y”); no prediction means measuring noise
Gold standard argument: 1. Collect data; 2. Data variation could be chance (null); 3. Predict the variations (alternative); 4. Statistics give probabilities; 5. Predicted variations that would be unlikely by chance “prove” your case
Implications: must have an alternative hypothesis; no multiple testing; no post hoc analysis; need multiple experiments
Silver standard argument: 1. Collect data; 2. Data variations could be chance (null); 3. Ask whether there are “real” patterns in the data; 4. Use statistics to suggest (unlikely) patterns; 5. Follow up findings with gold standard work
Fishing (this is bad science): 1. Collect lots of data (DVs and IVs); 2. Data variations could be chance; 3. Test until a significant result appears; 4. Report the tests that were significant; 5. Claim the result is important
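Why "test until a significant result appears" is dangerous can be shown with a little arithmetic. The sketch below assumes the tests are independent and that each uses a 0.05 threshold. Both are illustrative assumptions, not something the slides specify, and the function name is hypothetical.

```python
# Chance of at least one "significant" result when every null hypothesis is true.
# Assumption: k independent tests, each run at significance level alpha.
def familywise_error(k, alpha=0.05):
    # Each test avoids a false positive with probability (1 - alpha),
    # so all k avoid one with probability (1 - alpha) ** k.
    return 1 - (1 - alpha) ** k

for k in (1, 5, 10, 20):
    print(f"{k:2d} tests: P(at least one false positive) = {familywise_error(k):.2f}")
```

Under these assumptions, ten tests on pure noise already give roughly a 40% chance of something to report.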
Statistical inference: model comparison between a single distribution (null) and multiple distributions (alternative). From samples, which model is better? From samples, is the null likely?
What terms do you know? The statistical zoo!
Choosing a test: What’s the data type? Do you know the distribution? Within or between? What are you looking for?
Distributions: a theoretical stance; you must have this; it is not inferred from samples
Parametric tests: normal distribution; two parameters (mean and standard deviation); null = one underlying normal distribution; differences in location (mean)
t-test models
t-test: two samples; two means. Are the means showing natural variation? Compare the difference to the natural variation.
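"Compare the difference to the natural variation" can be made concrete with a minimal sketch of the two-sample t statistic. It assumes the pooled-variance (Student) form, which matches a null of one underlying normal distribution. The data and the function name are made up for illustration, and turning t into a p-value still needs the t distribution from a table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (assumes equal variances)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Natural variation: the two sample variances combined, weighted by their df.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    # Standard error of the difference between the two means.
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / se, df

# Hypothetical samples, chosen only to keep the arithmetic easy to follow.
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]
t_value, df = pooled_t(group_a, group_b)
print(f"t = {t_value:.2f} on {df} degrees of freedom")
```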
Effect size: How interesting is the difference? (e.g. a 2s difference in timings); significance is not the same as importance; Cohen’s d
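Cohen’s d expresses the difference between two means in units of standard deviation. The sketch below assumes the common pooled-standard-deviation version of d. The samples and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

# Hypothetical samples: means differ by 2, pooled standard deviation is about 1.58.
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]
d = cohens_d(group_a, group_b)
print(f"Cohen's d = {d:.2f}")
```

Unlike t, d does not grow with sample size, which is why it speaks to importance rather than significance.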
ANOVA: parametric; multiple groups. Why not do pairwise comparisons? Get an F value; follow-up tests.
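Where the F value comes from can be sketched for the simplest case, a one-way between-groups design. That design is an assumption, the slide does not fix one, and the groups and function name are invented for illustration. F is the variation between group means divided by the variation within groups.

```python
from statistics import mean

def one_way_f(groups):
    """F statistic for a one-way between-groups ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical data: three groups of three observations.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_between, df_within = one_way_f(groups)
print(f"F({df_between}, {df_within}) = {f_value:.2f}")
```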
ANOVA++: multiple IVs (so more F values!); within and between; effect size, η² (amount of variance predicted by the IV)
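"Amount of variance predicted by the IV" can be sketched as a ratio of sums of squares. This example assumes a single IV in a between-groups design, which is simpler than the multi-IV case the slide has in mind. The data and the function name are hypothetical.

```python
from statistics import mean

def eta_squared(groups):
    """Proportion of total variation accounted for by group membership."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    return ss_between / ss_total

# Hypothetical data: three groups of three observations.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
eta_sq = eta_squared(groups)
print(f"eta squared = {eta_sq:.4f}")
```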
Non-parametric tests: unknown underlying distribution; heterogeneity of variance; non-interval data; usually test location; effect size is tricky!
Wilcoxon test: see sheet
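The sheet is not reproduced here, and the slide does not say which Wilcoxon test is meant. As a rough sketch only, the code below assumes the rank-sum version for two independent samples: pool the data, rank it, and add up the ranks belonging to one group. The data and the function name are hypothetical.

```python
def rank_sum(a, b):
    """Sum of the ranks of sample a within the pooled data (ties get average ranks)."""
    pooled = sorted(a + b)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    w = sum(rank_of[v] for v in a)
    # The equivalent Mann-Whitney U statistic for sample a.
    u = w - len(a) * (len(a) + 1) / 2
    return w, u

# Hypothetical samples: a mostly sits below b, so its rank sum is small.
sample_a = [1.1, 2.3, 3.0]
sample_b = [2.9, 4.2, 5.5, 6.1]
w, u = rank_sum(sample_a, sample_b)
print(f"W = {w}, U = {u}")
```

Because only ranks are used, the statistic makes no assumption about the shape of the underlying distribution.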
Seeing location: boxplots; median, IQR, “range”; outliers
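The numbers a boxplot draws can be computed directly. This sketch assumes the common convention that whiskers extend to the furthest points within 1.5 × IQR of the quartiles, which is probably why “range” is in quotes. Plotting packages differ in how they compute quartiles, and the data and function name are made up.

```python
from statistics import quantiles

def box_summary(data):
    """Median, IQR, whisker ends and outliers, using the 1.5 x IQR convention."""
    q1, med, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "median": med,
        "iqr": iqr,
        "whiskers": (min(inside), max(inside)),
        "outliers": outliers,
    }

# Hypothetical timings with one extreme value.
timings = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50]
summary = box_summary(timings)
print(summary)
```

The extreme value barely moves the median or IQR, whereas it would pull a mean and standard deviation substantially.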
Multivariate: multiple DVs; multivariate normal distribution (normal no matter how you slice it); MANOVA; null = one underlying (multivariate) normal distribution
Issues: sample size; assumptions; interpretation; communication
Your abstract: What sort of data will you produce? Can you theorise about the distribution? What sort of test do you think you will need?
Health warnings: craft skill; simpler is better (doing it, interpreting it, communicating it); experiments as evidence; software packages are deceptively easy
Q & A: any question about any aspect; very general or very specific; any research method!
Useful reading: Cairns, Cox, Research Methods for HCI: chaps 6; Rowntree, Statistics Without Tears; Howell, Fundamental Statistics for the Behavioural Sciences, 6th edn.; Abelson, Statistics as Principled Argument; Silver, The Signal and the Noise
Monte Carlo: you know the process but not the distribution; generate a really large sample; compare it to your sample; still theoretically driven!
Example: event = 4 heads in a row from a set of 20 flips of a coin. You have a sample of 30 sets with 18 events. How likely? Get flipping!
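The flipping can be delegated to a computer. The sketch below makes several assumptions that the slide leaves open: the coin is fair, an "event" means a set contains at least one run of 4 or more heads, and "18 events" means 18 of the 30 sets contained one. The function names, seed and number of trials are arbitrary choices.

```python
import random

def has_run_of_heads(flips, run_length=4):
    """True if the sequence contains run_length or more heads in a row."""
    streak = 0
    for is_head in flips:
        streak = streak + 1 if is_head else 0
        if streak >= run_length:
            return True
    return False

def events_in_sample(rng, n_sets=30, n_flips=20):
    """Simulate n_sets sets of fair coin flips and count the sets with an event."""
    return sum(
        has_run_of_heads([rng.random() < 0.5 for _ in range(n_flips)])
        for _ in range(n_sets)
    )

rng = random.Random(0)   # fixed seed so the sketch is repeatable
n_trials = 2000          # number of simulated samples of 30 sets each
counts = [events_in_sample(rng) for _ in range(n_trials)]

p_event = sum(counts) / (n_trials * 30)                 # chance a single set has the event
p_tail = sum(c >= 18 for c in counts) / n_trials        # chance of 18 or more events in 30 sets
print(f"P(event in one set) ~ {p_event:.3f}")
print(f"P(18 or more events in 30 sets) ~ {p_tail:.3f}")
```

The simulated counts stand in for the distribution that was not known in advance. The observed 18 is then compared against them, just as a test statistic is compared against a theoretical distribution.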