Statistics for the Social Sciences


Statistics for the Social Sciences Psychology 340 Fall 2006 Prediction cont.

Outline (for week) Simple bi-variate regression, least-squares fit line The general linear model Residual plots Using SPSS Multiple regression Comparing models (Δr2)

From last time: Y = intercept + slope(X) + error. [Figure: scatterplot of Y against X with the fitted regression line]

From last time [Figure: scatterplot of Y against X with the residuals marked] The sum of the residuals should always equal 0; the least squares regression line splits the data in half. Additionally, the residuals should be randomly distributed: there should be no pattern to the residuals. If there is a pattern, it may suggest that there is more than a simple linear relationship between the two variables.
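Both properties of the least-squares line can be checked numerically. A minimal sketch in Python using NumPy (the data values here are made up for illustration):

```python
import numpy as np

# Toy data with a roughly linear relationship (hypothetical values).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])

# Fit Y = intercept + slope(X) by least squares.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

# Residuals from a least-squares line (with an intercept) sum to zero,
# up to floating-point error.
print(abs(residuals.sum()) < 1e-9)
```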

Seeing patterns in the error Residual plots are useful tools for examining the relationship even further. These are basically scatterplots of the residuals (often transformed into z-scores) against the explanatory (X) variable (or sometimes against the response variable).

Seeing patterns in the error Scatter plot Residual plot The scatter plot shows a nice linear relationship. The residual plot shows that the residuals fall randomly above and below the line. Critically, there doesn't seem to be a discernible pattern to the residuals.

Seeing patterns in the error Scatter plot Residual plot The scatter plot again shows a nice linear relationship, but the residual plot shows that the residuals get larger as X increases. This suggests that the variability around the line is not constant across values of X, which is referred to as a violation of homogeneity of variance.

Seeing patterns in the error Scatter plot Residual plot The scatter plot shows what may be a linear relationship. The residual plot suggests that a non-linear relationship may be more appropriate (see how a curved pattern appears in the residual plot).
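That curved-pattern diagnosis can be reproduced numerically: fit a straight line to data generated from a quadratic relationship and look at the sign pattern of the residuals. A sketch with hypothetical data:

```python
import numpy as np

# Data generated from a curved (quadratic) relationship -- hypothetical.
x = np.linspace(0, 10, 11)
y = x ** 2

# Fit a straight line anyway.
slope, intercept = np.polyfit(x, y, 1)
resid = y - (intercept + slope * x)

# A straight-line fit to curved data leaves a systematic pattern:
# residuals are positive at the extremes and negative in the middle.
print(resid[0] > 0, resid[5] < 0, resid[-1] > 0)
```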

Regression in SPSS Using SPSS Variables (explanatory and response) are entered into columns. Each row is a unit of analysis (e.g., a person).

Regression in SPSS Analyze: Regression, Linear

Regression in SPSS Enter: Predicted (criterion) variable into Dependent Variable field Predictor variable into the Independent Variable field

Regression in SPSS [SPSS output: the variables in the model; r and r2 (we'll get back to these numbers in a few weeks); unstandardized coefficients: the slope (labeled with the independent variable's name) and the intercept (Constant)]

Regression in SPSS Recall that r = the standardized β in bi-variate regression. [SPSS output: standardized coefficient β, labeled with the independent variable's name]

Multiple Regression Typically researchers are interested in predicting with more than one explanatory variable In multiple regression, an additional predictor variable (or set of variables) is used to predict the residuals left over from the first predictor.

Multiple Regression Bi-variate regression prediction models Y = intercept + slope (X) + error

Multiple Regression Bi-variate regression prediction model: Y = intercept + slope(X) + error, where intercept + slope(X) is the “fit” and error is the “residual”. Multiple regression prediction model: Y = intercept + slope1(X1) + slope2(X2) + … + error, with the same “fit” and “residual” parts.
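The multiple regression model can be fit by least squares using a design matrix with a column of ones for the intercept. A minimal sketch (the data are hypothetical and noise-free, so the coefficients are exactly recoverable):

```python
import numpy as np

# Hypothetical predictors and an outcome built from known coefficients.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y = 1.0 + 2.0 * X1 + 0.5 * X2  # intercept 1, slopes 2 and 0.5

# Design matrix: column of ones (intercept), then the predictors.
A = np.column_stack([np.ones_like(X1), X1, X2])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)

print(np.round(coef, 6))  # intercept, slope1, slope2
```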

Multiple Regression Multiple regression prediction model: Y = intercept + slope1(X1) + slope2(X2) + slope3(X3) + slope4(X4) + error. X1 through X4 are the first through fourth explanatory variables; the error term is whatever variability is left over.

Multiple Regression Predict test performance based on: study time, test time, what you eat for breakfast, and hours of sleep. These are the first through fourth explanatory variables; the error term is whatever variability is left over.

Multiple Regression Predict test performance based on: study time, test time, what you eat for breakfast, and hours of sleep. Typically your analysis consists of testing multiple regression models to see which “fits” best (comparing the r2s of the models). For example: the study-time-only model, versus the model that adds test time, versus the model with all four predictors.

Multiple Regression Model #1: total study time only. [Venn diagram: response variable = total variability in test performance; total study time, r = .6] Some co-variance between the two variables: if we know the total study time, we can predict 36% of the variance in test performance. R2 for the model = .36; 64% of the variance is unexplained.

Multiple Regression Model #2: add test time to the model. [Venn diagram: total study time, r = .6; test time, r = .1] Little co-variance between test performance and test time, but we can explain more of the variance in test performance. R2 for the model = .49; 51% of the variance is unexplained.

Multiple Regression Model #3: add breakfast food to the model. [Venn diagram: breakfast, r = .0] No co-variance between test performance and breakfast food: they are not related, so we can NOT explain any more of the variance in test performance. R2 for the model = .49; 51% of the variance is unexplained.

Multiple Regression Model #4: add hours of sleep to the model. [Venn diagram: hours of sleep, r = .45] Some co-variance between test performance and hours of sleep, so we can explain more of the variance. But notice what happens with the overlap (covariation between the explanatory variables): you can't just add the r's or r2's. R2 for the model = .60; 40% of the variance is unexplained.
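The point about overlap can be demonstrated by simulation: when two predictors covary, the R2 of the joint model is smaller than the sum of the individual r2's. A sketch with simulated (hypothetical) data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two correlated predictors (simulated, hypothetical data).
study = rng.normal(size=n)
sleep = 0.6 * study + 0.8 * rng.normal(size=n)  # overlaps with study time
y = study + sleep + rng.normal(size=n)

def r2(preds, y):
    """R-squared of a least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y))] + list(preds))
    yhat = A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return 1 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum()

r2_study = r2([study], y)
r2_sleep = r2([sleep], y)
r2_both = r2([study, sleep], y)

# Because the predictors overlap, the joint R^2 is less than the sum
# of the individual r^2 values.
print(r2_both < r2_study + r2_sleep)
```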

Multiple Regression in SPSS Setup as before: Variables (explanatory and response) are entered into columns A couple of different ways to use SPSS to compare different models

Regression in SPSS Analyze: Regression, Linear

Multiple Regression in SPSS Method 1: enter all the explanatory variables together. Enter the predicted (criterion) variable into the Dependent Variable field and all of the predictor variables into the Independent Variable field.

Multiple Regression in SPSS [SPSS output: the variables in the model; r and r2 for the entire model; unstandardized coefficients for var1 and var2 (labeled with the variable names)]

Multiple Regression in SPSS [SPSS output: the variables in the model; r and r2 for the entire model; standardized coefficients for var1 and var2 (labeled with the variable names)]

Multiple Regression Which β to use, standardized or unstandardized? Unstandardized β's are easier to use if you want to predict a raw score based on raw scores (no z-scores needed). Standardized β's are nice for directly comparing which variable is most “important” in the equation.

Multiple Regression in SPSS Method 2: enter the first model, then add another variable for the second model, etc. Enter the predicted (criterion) variable into the Dependent Variable field and the first predictor variable into the Independent Variable field, then click the Next button.

Multiple Regression in SPSS Method 2 cont.: enter the second predictor variable into the Independent Variable field, then click Statistics.

Multiple Regression in SPSS Click the ‘R squared change’ box

Multiple Regression in SPSS Shows the results of two models The variables in the first model (math SAT) The variables in the second model (math and verbal SAT)

Multiple Regression in SPSS Shows the results of two models: the variables in the first model (math SAT) and the variables in the second model (math and verbal SAT). [SPSS output: r2 for the first model; Model 1 coefficients for var1 (variable name)]

Multiple Regression in SPSS Shows the results of two models: the variables in the first model (math SAT) and the variables in the second model (math and verbal SAT). [SPSS output: r2 for the second model; Model 2 coefficients for var1 and var2 (variable names)]

Multiple Regression in SPSS Shows the results of two models The variables in the first model (math SAT) The variables in the second model (math and verbal SAT) Change statistics: is the change in r2 from Model 1 to Model 2 statistically significant?
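The R2-change comparison that SPSS reports can be sketched by hand: fit the two nested models, take the difference in R2, and form the F statistic for the change. The data below are simulated stand-ins for the math/verbal SAT example (all variable names and values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Simulated predictors and outcome -- hypothetical stand-ins.
math = rng.normal(size=n)
verbal = 0.4 * math + rng.normal(size=n)
outcome = 0.5 * math + 0.3 * verbal + rng.normal(size=n)

def r2(y, *preds):
    """R-squared of a least-squares fit with an intercept."""
    A = np.column_stack([np.ones(n), *preds])
    yhat = A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return 1 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum()

r2_1 = r2(outcome, math)          # Model 1: math only
r2_2 = r2(outcome, math, verbal)  # Model 2: math + verbal
delta = r2_2 - r2_1               # the "R Square Change"

# F statistic for the change: 1 added predictor, n - 3 error df
# (two predictors plus an intercept in Model 2).
F = (delta / 1) / ((1 - r2_2) / (n - 3))
print(delta > 0, F > 0)
```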

Cautions in Multiple Regression We can use as many predictors as we wish, but we should be careful not to use more predictors than are warranted. Simpler models are more likely to generalize to other samples. If you use as many predictors as you have participants in your study, you can predict 100% of the variance. Although this may seem like a good thing, it is unlikely that your results would generalize to any other sample, so they are not valid. You should probably have at least 10 participants per predictor variable (and probably aim for about 30).
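The warning about matching the number of predictors to the number of participants can be demonstrated directly: with pure noise and as many parameters as observations, the fit is "perfect" yet meaningless. A sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10

# Pure noise: the predictors have no real relationship to y.
X = rng.normal(size=(n, n - 1))  # n-1 predictors + intercept = n parameters
y = rng.normal(size=n)

# Fit with an intercept; the design matrix is square (n x n).
A = np.column_stack([np.ones(n), X])
yhat = A @ np.linalg.lstsq(A, y, rcond=None)[0]
r2 = 1 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum()

# R^2 is 1 (to floating-point precision) even though the predictors
# are meaningless -- this "model" would not generalize at all.
print(abs(r2 - 1) < 1e-8)
```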