Understanding Multivariate Research
Berry & Sanders
Regression Assumptions
1. The independent variables are measured at the interval level or are dichotomous (1/0).
2. The dependent variable is continuous.
3. The variables in the model are measured perfectly (no measurement error).
4. The effect of the independent variable, X, on the dependent variable, Y, is linear.
5. The error or disturbance term is completely uncorrelated with the independent variables.
6. The effects of all independent variables on the dependent variable are additive.
Multivariate Regression
Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + b_3 X_{3i} + e_i
Each slope coefficient (b_1, b_2, b_3) measures the responsiveness of the dependent variable to a one-unit change in the associated independent variable when the other independent variables are held constant.
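Below is a minimal sketch of estimating such a model by OLS in Python with numpy and statsmodels; the variable names and simulated data are illustrative (they mirror the body-weight example on the next slide), not from Berry & Sanders.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
food = rng.normal(2200, 300, n)     # X1: average daily calories eaten
exercise = rng.normal(400, 100, n)  # X2: average daily calories expended
male = rng.integers(0, 2, n)        # X3: 1 = male, 0 = female

# Simulate the dependent variable from known coefficients plus noise
weight = 152.0 + 0.028 * food - 0.045 * exercise + 35.0 * male + rng.normal(0, 10, n)

X = sm.add_constant(np.column_stack([food, exercise, male]))  # prepend intercept column
fit = sm.OLS(weight, X).fit()
print(fit.params)  # estimates of b_0, b_1, b_2, b_3
```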
Example
Y: body weight (lbs)
X_1: food intake (average daily calories)
X_2: exercise (average daily calorie expenditure)
X_3: gender (1 = male, 0 = female)
Table 3.1 Regression Model of Body Weight

Variable    Coefficient
Intercept   152.0
FOOD        0.028
EXERCISE    -0.045
MALE        35.00
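The fitted equation implied by Table 3.1 is Weight = 152.0 + 0.028 FOOD - 0.045 EXERCISE + 35.0 MALE. Here is a hypothetical helper that turns it into a prediction; the coefficients come from the table, but the function itself is illustrative.

```python
def predicted_weight(food: float, exercise: float, male: int) -> float:
    """Predicted body weight (lbs) from the Table 3.1 coefficients."""
    return 152.0 + 0.028 * food - 0.045 * exercise + 35.0 * male

print(predicted_weight(food=2000, exercise=500, male=0))  # 152 + 56 - 22.5 = 185.5
print(predicted_weight(food=2000, exercise=500, male=1))  # 185.5 + 35 = 220.5
```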
Interpretation
Intercept: A female who eats no food and does not exercise is predicted to weigh 152 pounds.
FOOD: A one-calorie increase in average daily food intake increases predicted weight by 0.028 pounds. A 100-calorie increase results in a 2.8-pound increase in weight (100 × 0.028).
EXERCISE: A one-calorie increase in calories expended through exercise decreases predicted weight by 0.045 pounds.
Interpretation (continued)
MALE (dichotomous): The coefficient can be interpreted as the difference in the expected value of Y between a case for which X = 0 and a case for which X = 1, holding all other independent variables constant. For two individuals with identical food intake and exercise, a man can expect to weigh 35 pounds more than a woman (see the check below).
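An illustrative check of this interpretation: with FOOD and EXERCISE held fixed, the male-female difference in predicted weight is exactly the MALE coefficient. The predicted_weight helper is the same hypothetical function sketched above, repeated so the snippet runs on its own.

```python
def predicted_weight(food, exercise, male):
    # Hypothetical helper built from the Table 3.1 coefficients
    return 152.0 + 0.028 * food - 0.045 * exercise + 35.0 * male

gap = predicted_weight(2200, 400, male=1) - predicted_weight(2200, 400, male=0)
print(gap)  # 35.0 pounds
```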
Elements of a Regression Model
1. Measuring the fit of the model: based on a comparison of the actual and predicted values of Y. The further the data points fall from the regression line, the worse the fit.
2. R²: the proportion of the variation in Y that is explained by the independent variables, or equivalently the squared correlation between the actual and predicted values. R² ranges from 0 to 1, with 1 indicating a perfect fit (all points on the regression line). Both computations appear in the sketch below.
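A minimal sketch computing R² both ways on simulated data; the fit uses numpy's polyfit for a simple one-variable regression, and for an OLS fit with an intercept the two numbers coincide.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 3 + 2 * x + rng.normal(size=200)    # simulated actual values

slope, intercept = np.polyfit(x, y, 1)  # simple OLS fit
y_hat = intercept + slope * x           # predicted values

ss_res = np.sum((y - y_hat) ** 2)       # unexplained variation
ss_tot = np.sum((y - y.mean()) ** 2)    # total variation in Y
r2_explained = 1 - ss_res / ss_tot      # share of variation explained

r2_corr = np.corrcoef(y, y_hat)[0, 1] ** 2  # squared actual-predicted correlation

print(r2_explained, r2_corr)  # identical for OLS with an intercept
```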
Elements (continued)
3. Statistical Significance
H_0: β_i = 0
H_1: β_i > 0 (or β_i < 0, or β_i ≠ 0)
t = b_i / s.e.(b_i)
Rule of thumb: if t > 2 or t < -2, then the coefficient is statistically significant (we reject the null hypothesis that the coefficient is zero).
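A minimal sketch of this rule of thumb; the coefficient and standard error are hypothetical values, not estimates from the text.

```python
b = 0.028   # hypothetical estimated slope (e.g., FOOD)
se = 0.010  # hypothetical standard error

t = b / se
significant = t > 2 or t < -2  # rule-of-thumb cutoff, roughly a 5% two-tailed test

print(t, significant)  # 2.8 True -> reject H0: beta_i = 0
```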
Elements (continued)
Confidence level (95%): We estimate a partial slope coefficient (b_i) from a sample. We can calculate a confidence interval around this estimate, within which we would expect the true (population) coefficient (β_i) to fall in 95 of 100 samples.
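A minimal sketch of the conventional interval b ± t_crit × s.e.; the estimate, standard error, and degrees of freedom are hypothetical, and scipy supplies the critical value.

```python
from scipy import stats

b, se = 0.028, 0.010  # hypothetical estimate and standard error
df = 496              # residual degrees of freedom (n - k - 1), hypothetical

t_crit = stats.t.ppf(0.975, df)  # about 1.96 in large samples
lower, upper = b - t_crit * se, b + t_crit * se

print(f"95% CI: ({lower:.4f}, {upper:.4f})")  # interval excluding 0 -> significant at 5%
```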
Potential Problems
Multicollinearity: a high (or perfect) correlation among any of the independent variables (e.g., education and income); a common diagnostic is sketched below.
Heteroskedasticity: non-constant variance in the errors.
Autocorrelation (or serial correlation): the error terms are correlated with one another; very common in time-series data.
All of these problems create inefficiencies (increasing standard errors), but they do not affect our slope coefficients (they remain unbiased).
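The slides do not name a specific diagnostic, but one common check for multicollinearity is the variance inflation factor (VIF); a rule of thumb flags values above about 10. A sketch on simulated data, using statsmodels:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
education = rng.normal(14, 2, 300)
income = 3 * education + rng.normal(0, 2, 300)  # strongly correlated with education
age = rng.normal(40, 10, 300)

X = sm.add_constant(np.column_stack([education, income, age]))
for i, name in enumerate(["education", "income", "age"], start=1):
    print(name, round(variance_inflation_factor(X, i), 1))  # education and income inflate
```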