Applied quantitative analysis: a practical introduction
SOSM-405 (5 cr)
Session 5
Very important!
Tue 8-10, Period III, Jan-Feb 2018
Faculty of Social Sciences / University of Helsinki
Teemu Kemppainen
teemu.t.kemppainen@helsinki.fi
https://teemunsivu.wordpress.com/applied-quantitative-analysis/
Contents
- Description and statistical inference
- Associations and causality
- Or, what a cross-sectional regression analysis can and cannot do
Populations and samples in surveys
- Random sample
- Non-response
- Data
- Different sampling techniques
- Statistical inference, study objectives
Statistical inference in a nutshell 1
- Let’s consider a bivariate result
- lrscale (0-5 / 6-10) & ownrdcc (ESS8)
Statistical inference in a nutshell 2
- Is the result we see only due to sampling?
  - There is nothing there in the population: random sampling just sometimes produces "false positives"
- Or is it really there in the population we study?
- Technical term: null hypothesis (H0)
  - There is nothing there: no differences, no results
- p-value (sig.)
  - How often does random sampling produce such a result when H0 is true?
  - Or, how much evidence do we have to reject H0?
  - Old conventions: 0.05, 0.01, 0.001 … just conventions, but often used!
  - Says nothing about the practical or substantive meaning or significance of the result
  - With big data you get small p-values for even very, very small real differences!
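The question "how often does random sampling produce such a result when H0 is true?" can be made concrete with a small simulation. The sketch below is only an illustration, not the method used in the course: it uses a permutation test, made-up 0/1 data (not ESS8), and hypothetical names such as `permutation_p_value`, `left`, `right` and `same`. Group labels are shuffled many times, which mimics a world where H0 holds, and the p-value is the share of shuffles that give a difference at least as large as the observed one.

```python
import random


def permutation_p_value(group_a, group_b, n_shuffles=2000, seed=1):
    """Approximate two-sided p-value for a difference in means (or proportions).

    Shuffling the pooled values breaks any link between group and outcome,
    so each shuffle is one draw from a world where H0 is true.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        # Small tolerance so that ties are not lost to floating-point rounding.
        if diff >= observed - 1e-12:
            hits += 1
    return hits / n_shuffles


# Made-up data: 1 = "yes", 0 = "no" on some binary survey item.
left = [1] * 30 + [0] * 70    # 30 % "yes"
right = [1] * 60 + [0] * 40   # 60 % "yes"
same = [1] * 33 + [0] * 67    # 33 % "yes", almost the same as `left`

p_clear = permutation_p_value(left, right)  # large difference
p_none = permutation_p_value(left, same)    # tiny difference

print("p-value, 30 % vs 60 %:", p_clear)
print("p-value, 30 % vs 33 %:", p_none)
```

With 100 respondents per group, the 30-point gap is almost never reproduced by shuffling, whereas the 3-point gap appears in a large share of shuffles and so gives little evidence against H0.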
Statistical inference in a nutshell 3
- SPSS: bivariate linear regression
- p < 0.0005: we can easily reject H0
- "The result is out there!"
- Causality?
- Is this interesting from a practical or substantive point of view?
- Non-response?
- Correct test / regression model type? "OLS" is a good starting point, but it is good to be careful
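A bivariate OLS regression can be sketched by hand to show why a very small p-value does not by itself make a result interesting. The example below is an assumption-laden illustration, not the SPSS output from the slide: the data are simulated, the true slope (0.02) and sample size are arbitrary choices, the function name `bivariate_ols` is hypothetical, and the p-value uses a normal approximation that is reasonable only for large samples.

```python
import math
import random


def bivariate_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    total_ss = sum((yi - mean_y) ** 2 for yi in y)
    se_slope = math.sqrt(residual_ss / (n - 2) / sxx)
    t_value = slope / se_slope
    # Two-sided p-value from the normal approximation (large n assumed).
    p_value = math.erfc(abs(t_value) / math.sqrt(2))
    return {
        "slope": slope,
        "intercept": intercept,
        "t": t_value,
        "p": p_value,
        "r_squared": 1 - residual_ss / total_ss,
    }


# Simulated "big data": a real but very weak association.
rng = random.Random(42)
n = 200_000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.02 * xi + rng.gauss(0, 1) for xi in x]

result = bivariate_ols(x, y)
print("slope:", round(result["slope"], 4))
print("p-value:", result["p"])
print("R squared:", round(result["r_squared"], 5))
```

Here H0 is rejected comfortably, yet the predictor explains only a tiny fraction of the variation in the outcome, so the practical question on the slide still has to be answered separately.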
On causality 1
- Philosophy of science
- E.g. J. L. Mackie & INUS (an insufficient but necessary part of an unnecessary but sufficient condition)
- Picture from: de Araújo, L. F. S., Dalgalarrondo, P., & Banzato, C. E. (2014). On the notion of causality in medicine: addressing Austin Bradford Hill and John L. Mackie. Archives of Clinical Psychiatry (São Paulo), 41(2), 56-61.
On causality 2
- Applied QUAN studies
  - Often observational (e.g. surveys), cf. RCT
  - Practical perspective: manipulation of the IV
  - Language: association etc. contra cause, effect, impact…
- Some typically used practical criteria
  - Association: easy
  - Temporal order (reverse causality): not always clear, reciprocal effects… a cross-sectional design does not help
  - No confounders: regression helps to some extent… but less so with cross-sectional data
  - Interpretation: regression may help + theory, reasoning
- E.g. Bradford Hill considerations
  - See Höfler, M. (2005). The Bradford Hill considerations on causality: a counterfactual perspective. Emerging Themes in Epidemiology, 2(1), 11.
Causality and regression
- X, IV, predictor
- Y, DV, outcome
- Mediators (intermediate vars, mechanism)
- Modifiers (interaction)
- Confounders (control vars, adjustments)
Causality and regression
- THEORY
- X, IV, predictor
- Y, DV, outcome
- Mediators (intermediate vars, mechanism)
- Modifiers (interaction)
- Confounders (control vars, adjustments)
- E.g. Is Z a confounder or a mediator?
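What "adjusting for a confounder" does can be illustrated with simulated data. In the sketch below, which rests entirely on assumed numbers and hypothetical names (`slope`, `residuals`, `crude`, `adjusted`), Z causes both X and Y while X has no effect on Y at all. The crude bivariate slope still suggests an association. The adjusted slope is obtained by first removing from X and Y the part explained by Z and then regressing the leftovers on each other, which gives the same coefficient for X as a regression of Y on X and Z together.

```python
import random


def slope(x, y):
    """OLS slope from a bivariate regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def residuals(y, z):
    """Part of y that is left after a bivariate regression of y on z."""
    n = len(y)
    mean_y = sum(y) / n
    mean_z = sum(z) / n
    b = slope(z, y)
    return [yi - (mean_y + b * (zi - mean_z)) for yi, zi in zip(y, z)]


# Simulated data: Z affects both X and Y; X does NOT affect Y.
rng = random.Random(7)
n = 20_000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

crude = slope(x, y)                                   # ignores Z
adjusted = slope(residuals(x, z), residuals(y, z))    # controls for Z

print("crude slope of Y on X:   ", round(crude, 3))
print("adjusted slope of Y on X:", round(adjusted, 3))
```

The code would produce the same two numbers if Z were a mediator (X affects Z, Z affects Y) rather than a confounder, because the regression only sees the associations. Deciding which role Z plays is a matter of theory and reasoning, not of the regression output.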