Regression: single and multiple. Overview. Defined: a model for predicting one variable from one or more other variables. Variables: the IV(s) are continuous; the DV is continuous.


1 Regression: single and multiple

2 Overview. Defined: a model for predicting one variable from one or more other variables. Variables: the IV(s) are continuous; the DV is continuous. Relationship: the relationship among the variables. Example: Can we predict height from weight (or weight from height, or either one from multiple variables)? Assumptions: normality, linearity, and no multicollinearity among the IVs.

3 Regression is about finding the best straight line

4 The best straight line is the one that minimizes S, the sum of the squared vertical distances (residuals) between the observed points and the line.
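A minimal sketch of how the line that minimizes S can be found for one predictor, using the standard closed-form least-squares formulas. The data values and the function names `fit_line` and `sum_of_squares` are hypothetical and chosen only for illustration; they are not from the slides.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_of_squares(xs, ys, intercept, slope):
    """S: the sum of squared differences between observed and predicted Y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
best_s = sum_of_squares(xs, ys, a, b)
print(f"intercept={a:.3f} slope={b:.3f} S={best_s:.3f}")

# Any other line gives a larger S, for example one with a slightly steeper slope.
print(sum_of_squares(xs, ys, a, b + 0.1) > best_s)
```

Nudging the intercept or slope away from the fitted values always increases S, which is what "best" means here.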

5 Once we find the best straight line, we know its “intercept” (where the line crosses the Y axis) and its “slope” (the change in Y for each one-unit change in X).

6 Same intercept, different slope

7 Same slope, different intercept

8 Relationship between correlation and regression. Correlation expresses the strength and direction of the relationship between two variables. Regression is an extension of correlation that allows you to make predictions about one variable from one or more other variables. Bivariate regression (1 IV and 1 DV) produces the same result as correlation: the standardized slope equals the correlation coefficient. Multiple regression (2 or more IVs and 1 DV) goes a step further than correlation.
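The link between bivariate regression and correlation can be checked numerically. The sketch below computes Pearson's r and the standardized regression slope for the same data and shows that they agree. The data and the function names are hypothetical, for illustration only.

```python
from math import sqrt


def deviations(values):
    """Return each value minus the mean of the list."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    dx, dy = deviations(xs), deviations(ys)
    sxy = sum(a * b for a, b in zip(dx, dy))
    return sxy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def standardized_slope(xs, ys):
    """Raw least-squares slope rescaled by (SD of X / SD of Y)."""
    dx, dy = deviations(xs), deviations(ys)
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    raw_slope = sum(a * b for a, b in zip(dx, dy)) / sxx
    # sqrt(sxx / syy) equals SD(X) / SD(Y) because the (n - 1) terms cancel.
    return raw_slope * sqrt(sxx / syy)


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
beta = standardized_slope(xs, ys)
print(f"r={r:.4f} standardized slope={beta:.4f}")
```

The raw (unstandardized) slope generally differs from r; only after rescaling by the two standard deviations do the numbers match.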

9 Relationship between correlation and regression. Hypothesis: What is the relationship between gun ownership and murder rate within a city? Correlation: Imagine you are a researcher interested in the relationship between the number of registered weapons (“weapons”) and the murder rate (“murder”), so you collect data on those two variables from many different cities. You find a strong positive relationship (r = .885) between the two variables that is statistically significant (p = .003).

10 Relationship between correlation and regression. Regression: Now, imagine you are the Mayor of Los Angeles. You are considering lifting the ban on automatic weapons. You want to predict whether lifting the ban (and so increasing the number of automatic weapons on the streets) will affect the murder rate. You are going to use the data from the 8 cities to PREDICT the murder rate for a 9th city, Los Angeles. As before, the data show a strong positive relationship (r = .885) between the two variables that is statistically significant (p = .003).

11 Relationship between correlation and regression. Regression: We can now use numbers from the output to create a “regression line”. The regression line is Y = a + b * X, where:
Y = the unknown score on the variable you are predicting.
a = the Y-intercept of the regression line.
b = the slope of the regression line.
X = the known score on the other variable you are using to make a prediction.
For this example: Murders = 4.047 + .853 * Weapons

12 Relationship between correlation and regression. Regression: Y = a + b * X, so Murders = 4.047 + .853 * Weapons. If you are the Mayor of Los Angeles, simply insert the number of weapons on the street in Los Angeles (X) into the regression equation, and you can predict the number of murders (Y):
If 1000 weapons, then predicted murders = 857
If 2000 weapons, then predicted murders = 1710
If 3000 weapons, then predicted murders = 2563
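The three predictions can be reproduced by plugging each weapons count into the regression equation. This is a small sketch; the intercept (4.047) and slope (.853) come from the slide, while the function name `predict_murders` is made up for illustration.

```python
# Coefficients taken from the slide's regression line.
INTERCEPT = 4.047
SLOPE = 0.853


def predict_murders(weapons):
    """Apply Y = a + b * X to a given number of weapons."""
    return INTERCEPT + SLOPE * weapons


for weapons in (1000, 2000, 3000):
    # Round to the nearest whole number, as the slide does.
    print(weapons, round(predict_murders(weapons)))
```

Rounded to whole numbers, the results match the slide: 857, 1710, and 2563.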

13 Multiple Regression. Using several “predictors” simultaneously. Example: a study about internalizing violence (the DV), with three predictors:
X1 = degree of witnessing violence
X2 = measure of life stress
X3 = measure of social support

14 Multiple Regression. Given this diagram, what would you want to know? (1) When all three predictors are entered, the overall prediction (variance explained) of the DV.

15 Multiple Regression. (2) The unique prediction of each variable.

17 Multiple Regression. The three things you typically want to know are:
Overall effect (of all variables) = R²
Unique effect of each variable, while controlling for the others = Beta
Effect of each variable, without controlling for the others = correlation matrix (same as separate bivariate regressions)
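The three quantities above can be computed from the correlations alone. The sketch below relies on a standard result: with every variable standardized, the Betas solve (correlations among the IVs) × Beta = (correlations of each IV with the DV), and R² is the sum of each Beta times its IV's correlation with the DV. The data values, variable names, and helper functions are all hypothetical and are not results from the study in the slides.

```python
from math import sqrt


def corr(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def standardized_regression(predictors, dv):
    """Return (Betas, R squared, zero-order correlations with the DV)."""
    r_xx = [[corr(p, q) for q in predictors] for p in predictors]
    r_xy = [corr(p, dv) for p in predictors]
    betas = solve(r_xx, r_xy)
    r_squared = sum(b * r for b, r in zip(betas, r_xy))
    return betas, r_squared, r_xy


# Hypothetical scores for 8 cases, for illustration only.
witness = [1, 3, 2, 5, 4, 6, 8, 7]
stress = [2, 1, 4, 3, 6, 5, 7, 8]
support = [8, 6, 7, 5, 4, 4, 2, 1]
internalizing = [3, 4, 5, 6, 7, 8, 10, 10]

betas, r_squared, zero_order = standardized_regression(
    [witness, stress, support], internalizing
)
print("R squared:", round(r_squared, 3))
print("Betas:", [round(b, 3) for b in betas])
print("Zero-order correlations:", [round(r, 3) for r in zero_order])
```

When the IVs are correlated with each other, each Beta differs from the corresponding zero-order correlation, which is the distinction the slide draws between "controlling for the others" and "without controlling for the others".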

18 Multiple Regression. What we have just talked about is Entry (all variables entered simultaneously). But you have other options as well:
Hierarchical (you specify the order)
Stepwise methods (the computer chooses based on criteria): Backward, Forward, and Stepwise

19 Hierarchical. You enter the variables in a specified order (called steps or blocks). Block 1 tells you the effect of the first variable(s). Block 2 tells you the unique effect of the new variable(s), over and above what Block 1 already explained. And so forth.

20 Forward. The computer first enters the predictor with the highest correlation with the DV. The computer then enters the predictor with the highest semi-partial correlation with the DV (if V1 explained 40% of the DV, then 60% is unexplained, so which variable best explains that 60%?). The computer then enters the next predictor with the highest semi-partial correlation with the DV (if V1 and V2 explained 80%, then which variable best explains the remaining 20%?), and so forth. It stops when no new variable significantly explains the residual variation.
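The Forward procedure can be sketched in a few lines. The version below picks, at each step, the candidate with the largest squared semi-partial correlation with the DV (which equals the gain in R²), then removes that predictor's influence from the remaining candidates. One loud assumption: statistical packages stop on a significance test, whereas this sketch stops on a simple minimum R² gain (`min_gain`), a stand-in chosen to keep the example short. All names and data are hypothetical.

```python
def center(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def forward_select(predictors, dv, min_gain=0.01):
    """Return [(name, R squared gain), ...] in order of entry.

    predictors: dict mapping a name to a list of scores.
    min_gain: ASSUMED stopping rule (not a significance test).
    """
    y = center(dv)
    syy = dot(y, y)
    remaining = {name: center(vals) for name, vals in predictors.items()}
    entered = []
    while remaining:
        # Squared semi-partial correlation of each candidate with the DV.
        gains = {}
        for name, x in remaining.items():
            sxx = dot(x, x)
            # A candidate fully explained by earlier entries has no variance left.
            gains[name] = dot(x, y) ** 2 / (sxx * syy) if sxx > 1e-12 else 0.0
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:
            break
        entered.append((best, gains[best]))
        q = remaining.pop(best)
        sqq = dot(q, q)
        # Residualize the other candidates on the predictor just entered.
        remaining = {
            name: [a - (dot(x, q) / sqq) * b for a, b in zip(x, q)]
            for name, x in remaining.items()
        }
    return entered


# Hypothetical scores for 8 cases, for illustration only.
data = {
    "witness": [1, 3, 2, 5, 4, 6, 8, 7],
    "stress": [2, 1, 4, 3, 6, 5, 7, 8],
    "support": [8, 6, 7, 5, 4, 4, 2, 1],
}
internalizing = [3, 4, 5, 6, 7, 8, 10, 10]

for name, gain in forward_select(data, internalizing):
    print(f"entered {name}: R squared gain = {gain:.3f}")
```

At the first step nothing has been entered yet, so the semi-partial correlation is just the ordinary correlation, matching the slide's description. The gains of the entered variables add up to the model's R².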

21 Backward. The computer enters all variables and calculates the unique contribution of each. A retention criterion is set, and any variable that does not meet it is removed from the analysis. The new model is then analyzed, and again any variable that does not meet the criterion is removed. It stops when every remaining variable meets the criterion.

22 Stepwise. A combination of Forward and Backward. Similar to Forward in that the computer first enters the predictor with the highest correlation with the DV, and then enters the predictor with the highest semi-partial correlation with the DV. Similar to Backward in that a retention criterion is set, and any variable that does not meet it is removed from the analysis.

23 How to choose which variables and how.
Correlation matrix: the IVs should be somewhat correlated with the DV, and not too highly correlated with the other IVs.
Regression: analyze your hypothesis first, then start “exploratory” analysis. Statisticians frown upon too much exploratory work as “fishing”.
Entry and Hierarchical are preferred over the stepwise methods. If using a stepwise method, Backward is preferred over the others.

