Tue 8-10, Period III, Jan-Feb 2018

Presentation transcript:

Applied quantitative analysis: a practical introduction
SOSM-405 (5 cr), Session 6
Tue 8-10, Period III, Jan-Feb 2018
Faculty of Social Sciences / University of Helsinki
Teemu Kemppainen, teemu.t.kemppainen@helsinki.fi
https://teemunsivu.wordpress.com/applied-quantitative-analysis/

Contents
Statistical inference, causality, regression: a clarifying example (hopefully)
Regression model evaluation & diagnostics -> some basic notions. Don’t worry!
Factor analysis, manual sum variables and regression: some important observations
Dummy coding

Statistical inference - take 2
Population: we first assume H0, e.g. difference between two groups = 0
Random sample -> sampling error, sampling distribution
Non-response
Data
Different sampling techniques
Statistical inference; e.g. p < 0.05 -> this kind of result in data is quite rare when H0 is true
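One way to make "quite rare when H0 is true" concrete is a small permutation test. This is only a sketch: the slides do not mention permutation tests, the function name `permutation_p_value` is hypothetical, and the scores are made-up numbers. The idea is to shuffle the group labels many times (which mimics H0: no difference between the groups) and count how often a difference at least as large as the observed one turns up.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=1):
    """Two-sided p-value for a difference in means under H0: no group difference."""
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random, as if H0 were true
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(mean_a - mean_b) >= observed:
            extreme += 1
    return extreme / n_permutations

# Made-up scores for two groups, for illustration only
group_a = [5.1, 6.0, 5.8, 6.4, 5.9, 6.2, 5.7, 6.1]
group_b = [4.2, 4.9, 5.0, 4.4, 4.8, 5.2, 4.6, 4.7]

p_clear = permutation_p_value(group_a, group_b)  # groups clearly differ
p_same = permutation_p_value(group_a, group_a)   # identical groups, H0 holds trivially
print(p_clear, p_same)
```

A small p-value here says only that the observed difference would be unusual if H0 were true; it says nothing about non-response or about how the sample was drawn.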

Causality and regression revisited
Mediators (intermediate vars, mechanism)
Modifiers (interaction)
X, IV, predictor
Y, DV, outcome
Confounders (control vars, adjustments)

Example: a typical cross-sectional survey study

RQs
How is the tenure structure of the estate related to perceived social disorder?
Does local social disadvantage mediate the association?
To what extent do social interaction and normative regulation of the estate explain why more disadvantaged estates expose their residents to social disorder?
-> two levels, residents in neighbourhoods -> hierarchical design -> multi-level regression

Diagram - logic of the study

Demonstration & statistical inference 1: univariate descriptives

Demonstration & statistical inference 2: bivariate associations, descriptives

Adjusted associations: regression
Ex. cont’d (RQ 1), regression model elaboration: I (bivariate/crude model) -> II (confounders) -> III (mediators)
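The elaboration I -> II -> III can be sketched on simulated data. Everything here is an assumption made for illustration: the variable names, the effect sizes and the helper `ols` are hypothetical and are not taken from the study in the example. Single-level OLS stands in for the multi-level model the study would need. The point is only to show how the coefficient of the predictor changes as confounders and then mediators enter the model.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, b1, b2, ...] via least squares."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

rng = np.random.default_rng(0)
n = 5000
confounder = rng.normal(size=n)              # affects both x and y
x = 0.8 * confounder + rng.normal(size=n)    # predictor of interest
mediator = 0.5 * x + rng.normal(size=n)      # lies on the path from x to y
y = 0.2 * x + 0.6 * mediator + 0.7 * confounder + rng.normal(size=n)

crude = ols(y, x)[1]                                  # Model I: bivariate
adjusted = ols(y, x, confounder)[1]                   # Model II: + confounder
with_mediator = ols(y, x, confounder, mediator)[1]    # Model III: + mediator

print(round(crude, 2), round(adjusted, 2), round(with_mediator, 2))
```

In this simulated setup the crude coefficient overstates the association, Model II recovers the total effect of x (0.2 + 0.6 * 0.5 = 0.5), and Model III leaves only the direct effect (0.2) once the mechanism is in the model.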

Regression models
Cf. description …
"All models are wrong but some are useful" (George Box)
How to evaluate the model? Or, when is our model good / useful?
Plain reason: we obtain a clear(er) idea of what is going on -> e.g. confounders taken into account, possible mechanism elucidated
Technical aspect: regression diagnostics

Regression diagnostics 1
Is the model correctly specified? It should include all relevant variables and nothing else -> What is relevant? Confounders vs. mediators. See slide 10 above. A theoretical, not a technical, matter!
Quality of measurement is sufficient for all variables -> reliability and validity (session 4)
DV-IV relationship: linear (straight line) -> the model may incorporate curvilinearities, e.g. a square term
Multicollinearity -> loss of power, larger standard errors
Outliers, weird cases? -> quite rare in typical surveys
Important: check residuals… (next slides)
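Multicollinearity is often screened with the variance inflation factor, VIF_j = 1 / (1 - R²_j), where R²_j comes from regressing predictor j on the other predictors. The slides do not name VIF, so treat this as one possible check rather than the course's method; the function `vif` and the simulated predictors are hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (columns = predictors)."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        target = X[:, j]
        # Regress predictor j on an intercept plus all the other predictors
        others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
        fitted = others @ np.linalg.lstsq(others, target, rcond=None)[0]
        ss_res = np.sum((target - fitted) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        r2 = 1 - ss_res / ss_tot
        factors.append(1 / (1 - r2))
    return factors

rng = np.random.default_rng(7)
n = 1000
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # almost a copy of x1: strongly collinear
x3 = rng.normal(size=n)              # unrelated to the others

vifs = vif(np.column_stack([x1, x2, x3]))
print([round(v, 1) for v in vifs])
```

Values near 1 indicate little overlap with the other predictors. Large values, as for x1 and x2 here, are the situation in which standard errors grow and power is lost.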

Regression diag. 2: residuals
Figure: Haslwanter 2013, Residuals for Linear Regression Fit. Wikimedia Commons.

Regression diag. 3: residuals
Residuals distributed with a mean of 0, normal distribution
Residuals should be independent of each other -> important. E.g. ESS -> if all Finns have a negative residual -> problem!
Residuals should be random -> no information, no structure, just random noise
Constant variance (homoskedasticity)
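The ESS example can be illustrated with a simulated sketch. The country codes "A", "B", "C", the effect sizes and the variable names are all assumptions. A model that ignores the grouping leaves residuals that average zero overall (as they always do in OLS with an intercept) but are systematically negative in one group, which is exactly the kind of structure residuals should not have.

```python
import numpy as np

rng = np.random.default_rng(42)
n_per_group = 500
groups = np.repeat(["A", "B", "C"], n_per_group)   # hypothetical country codes
group_shift = {"A": 0.0, "B": 0.0, "C": -1.5}      # group C sits systematically lower

x = rng.normal(size=groups.size)
shift = np.array([group_shift[g] for g in groups])
y = 1.0 + 0.5 * x + shift + rng.normal(size=groups.size)

# Fit y on x only, ignoring the grouping
X = np.column_stack([np.ones(groups.size), x])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ coefs

overall_mean = residuals.mean()
mean_by_group = {g: residuals[groups == g].mean() for g in ("A", "B", "C")}
print(overall_mean, mean_by_group)
```

A clearly non-zero group mean suggests that the grouping should enter the model, for example as dummy variables or through a multi-level design.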

Check the UCLA site for more information (if/when you need it)
SPSS and regression: https://stats.idre.ucla.edu/spss/seminars/introduction-to-regression-with-spss/
SPSS and reg. diagnostics: https://stats.idre.ucla.edu/spss/seminars/introduction-to-regression-with-spss/introreg-lesson2/
Stata and reg. diag.: https://stats.idre.ucla.edu/stata/webbooks/reg/chapter2/stata-webbooksregressionwith-statachapter-2-regression-diagnostics/

EFA, sum variables and regression 1
Value conservatism
Survey items: V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
Indicator for value conservatism
Political left-right scale
Indicator for political left-right scale
Why? Content validity & reliability -> better measurement, cf. regression evaluation. Measurement error.
See e.g. Vehkalahti, slides 4.

EFA, sum variables and regression 2
One or many dimensions in EFA?
Factor score variables often "extracted" with Varimax rotation -> the resulting indicators do not correlate with each other
If DV and IVs in the same EFA… problems!
Only IVs in EFA -> might be handy!
There are "oblique" rotations… interpretation?
Manual sum variables often a good alternative
Or taking DV and IVs in separate EFAs
Cf. the example above
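A manual sum variable can be as simple as the mean of a respondent's answers to the items that belong to one dimension. The sketch below is an assumption about how one might build it, not the procedure used in the course: the function `sum_variable`, the `min_valid` rule for missing answers and the item values are all hypothetical.

```python
def sum_variable(rows, min_valid=None):
    """Mean of the answered items per respondent.

    rows: one list of item values per respondent, with None for a missing answer.
    min_valid: minimum number of answered items required; defaults to all items.
    Returns None for respondents with too few answered items.
    """
    scores = []
    for items in rows:
        valid = [v for v in items if v is not None]
        needed = len(items) if min_valid is None else min_valid
        if valid and len(valid) >= needed:
            scores.append(sum(valid) / len(valid))
        else:
            scores.append(None)
    return scores

# Made-up answers to three items measuring the same dimension
answers = [
    [4, 5, 3],
    [2, None, 2],
    [None, None, 1],
]
scores = sum_variable(answers, min_valid=2)
print(scores)
```

Using the mean rather than the raw sum keeps the indicator on the original response scale, which makes regression coefficients easier to read.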

Dummy coding
Enables full control over categorical IVs
Different programs/procedures treat categorical IVs differently -> a real nuisance!
E.g. education, values 1/2/3/4
Make three dummy vars (0/1): e.g. edu2, edu3, edu4
The omitted category (edu1) is the "reference" category
Interpret results separately for each dummy
Difference to the reference (p-value)
What differences are important in your analysis? -> choose the reference category accordingly!
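The education example can be written out by hand, which is one way to keep full control over the coding. This is a minimal sketch: the function `dummy_code` and the data values are made up for illustration, and the names edu2, edu3, edu4 follow the slide.

```python
def dummy_code(values, reference, prefix="edu"):
    """Return {dummy_name: [0/1, ...]} for every category except the reference."""
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError(f"reference {reference!r} not found in data")
    return {
        f"{prefix}{cat}": [1 if v == cat else 0 for v in values]
        for cat in categories
        if cat != reference
    }

education = [1, 2, 3, 4, 2, 1, 4]   # made-up education codes 1/2/3/4
dummies = dummy_code(education, reference=1)
for name, column in dummies.items():
    print(name, column)
```

Respondents in the reference category (edu1) have 0 on all three dummies, so each dummy's coefficient is the difference between that category and the reference. Passing a different `reference` changes which comparisons the model reports.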

Next lecture: logistic regression
How to analyse binary/dichotomous/dummy outcomes
E.g. death, medications, crimes (registers)
Good self-rated health, moving intentions, insecurity in the neighbourhood (surveys, often used as dummies)
OLS / LPM / logistic or logit / probit …
Discipline, research area etc. Sociology and logistic?

Questions? Let’s practise!