REGRESSION REVISITED
PATTERNS IN SCATTER PLOTS OR LINE GRAPHS
- Pattern
- Strength
- Regression line
  - Linear: y = mx + b
  - Quadratic: y = ax^2 + bx + c
[Figure: example plots labeled "Negative linear - weak", "No relationship", "Positive linear - strong", and "Curvilinear"]
MODEL EVALUATION
- We want to test whether the overall regression model is "statistically significant".
- Null: The model is not significant (no effect).
- Alternative: The model is significant (there is an effect); that is, not all model coefficients other than the intercept are zero.
[Figure: # of cedar trees in 1-acre plot vs. time (years)]
- R^2 gives the % of the variation in the number of cedar trees that is explained by this linear model. It measures how well the model fits the data. It does not tell us whether the overall model is statistically significant.
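As a rough illustration of the "% of variation explained" idea, the sketch below fits y = mx + b by least squares and computes R^2 in plain Python. The data values and the function names (fit_line, r_squared) are hypothetical, chosen only for illustration; they are not the cedar counts from the slide.

```python
def fit_line(x, y):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def r_squared(x, y, m, b):
    """Fraction of the variation in y explained by the line y = m*x + b."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - ss_resid / ss_total


# Hypothetical data: counts that rise and then fall over 13 yearly observations.
years = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
trees = [2, 6, 9, 13, 15, 17, 18, 17, 15, 13, 10, 6, 3]

m, b = fit_line(years, trees)
print(f"slope = {m:.3f}, intercept = {b:.3f}")
print(f"R^2 = {r_squared(years, trees, m, b):.1%} of the variation explained")
```

For data that rise and then fall like this, a straight line explains very little of the variation, which is the situation where a different model is worth trying.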
MODEL EVALUATION
- In Excel: Data > Data Analysis > Regression
- An F-test and its associated significance value (Significance F) are used to test overall model significance.
- A t-test and its associated p-value are used to test whether an individual coefficient is significantly different from zero, that is, whether that variable significantly predicts the y values (response).
- Reject the null if p < your chosen significance level, typically 0.01 or 0.05.
SUMMARY OUTPUT (layout only)
- Regression Statistics: Multiple R, R Square, Adjusted R Square, Standard Error, Observations (13)
- ANOVA columns: df, SS, MS, F, Significance F. Rows: Regression, Residual, Total
- Coefficient table columns: Coefficients, Standard Error, t Stat, P-value, Lower 95%, Upper 95%, Lower 95.0%, Upper 95.0%. Rows: Intercept, X Variable
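To make the F and t quantities in this output less of a black box, here is a minimal sketch of how they can be computed for a one-variable linear model using only the Python standard library. It is not Excel's implementation. The helper names (f_test_pvalue, linear_regression_tests) and the data are hypothetical, and the p-value comes from a textbook continued-fraction evaluation of the regularized incomplete beta function.

```python
import math


def _beta_continued_fraction(a, b, x, max_iter=300, eps=3e-14):
    """Continued fraction used by the regularized incomplete beta function."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m))
        d = 1.0 + aa * d
        c = 1.0 + aa / c
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0))
        d = 1.0 + aa * d
        c = 1.0 + aa / c
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b


def f_test_pvalue(f_stat, df1, df2):
    """Upper-tail probability P(F > f_stat) for an F(df1, df2) distribution."""
    if f_stat <= 0.0:
        return 1.0
    return _reg_inc_beta(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f_stat))


def linear_regression_tests(x, y):
    """F-test for the model and t-test for the slope of y = m*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = mean_y - m * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    df_reg, df_resid = 1, n - 2
    ms_resid = ss_resid / df_resid  # assumes the fit is not perfect
    f_stat = ((ss_total - ss_resid) / df_reg) / ms_resid
    t_stat = m / math.sqrt(ms_resid / sxx)
    # A two-sided t-test p-value equals the F(1, df) upper tail at t^2.
    return {"slope": m, "intercept": b, "f_stat": f_stat, "t_stat": t_stat,
            "p_model": f_test_pvalue(f_stat, df_reg, df_resid),
            "p_slope": f_test_pvalue(t_stat ** 2, 1, df_resid)}


# Hypothetical data, not the values behind the slide's output.
years = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
trees = [2, 6, 9, 13, 15, 17, 18, 17, 15, 13, 10, 6, 3]
result = linear_regression_tests(years, trees)
alpha = 0.05
print(result)
print("reject the null" if result["p_model"] < alpha else "fail to reject the null")
```

With a single predictor, the model F-test and the slope t-test give the same p-value (F = t^2), so the two tests only diverge once a second variable is added.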
MODEL EVALUATION
- No linear relationship. What about a quadratic relationship?
- The coefficients are not as easy to interpret as the slope (= rate of change) in the linear model.
- Model interpretation: initially the number of cedar trees increases, and eventually the numbers fall off.
- Biological explanation: cedar numbers grow quickly, but eventually taller deciduous trees dominate the forest canopy, which restricts the cedars' growth and decreases their numbers.
- 89% of the variation in the number of cedar trees is explained by this quadratic model. This still doesn't tell us whether the overall model is statistically significant.
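Although the individual quadratic coefficients are hard to read directly, one quantity that follows from them is the turning point of the curve: for y = ax^2 + bx + c with a < 0, the peak is at x = -b / (2a). The sketch below computes it. The function name quadratic_peak and the coefficient values are hypothetical, not the fitted values from the slide.

```python
def quadratic_peak(a, b, c):
    """Return (x, y) at the maximum of y = a*x^2 + b*x + c; requires a < 0."""
    if a >= 0:
        raise ValueError("a must be negative for the curve to have a peak")
    x_peak = -b / (2 * a)
    y_peak = c - b * b / (4 * a)
    return x_peak, y_peak


# Hypothetical coefficients for a rise-then-fall trend.
x_peak, y_peak = quadratic_peak(a=-0.5, b=6.0, c=2.0)
print(f"peak at x = {x_peak} years, y = {y_peak} trees")
```

Read this way, the model says when the increase is expected to turn into a decline, which ties the fitted curve back to the biological explanation.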
MODEL EVALUATION
- In Excel you have to manually create a column of x^2 values to use in Data > Data Analysis > Regression.
  - X Variable 1 = x values
  - X Variable 2 = x^2 values
SUMMARY OUTPUT (layout only)
- Regression Statistics: Multiple R, R Square, Adjusted R Square, Standard Error, Observations (13)
- ANOVA columns: df, SS, MS, F, Significance F. Rows: Regression (only the fragment "E-05" survives), Residual, Total
- Coefficient table columns: Coefficients, Standard Error, t Stat, P-value, Lower 95%, Upper 95%, Lower 95.0%, Upper 95.0%. Rows: Intercept, X Variable 1, X Variable 2
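The same "add an x^2 column, then run a two-variable regression" idea can be sketched outside Excel. The example below builds the x^2 column, solves the least-squares normal equations with a small Gaussian elimination, and reports R^2 and the overall F-test. All names (solve_linear_system, fit_quadratic) and the data are hypothetical. The p-value uses the closed form for an F distribution with 2 numerator degrees of freedom, so it applies to this two-predictor case only.

```python
def solve_linear_system(a, rhs):
    """Solve a small square system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - s) / aug[r][r]
    return sol


def fit_quadratic(x, y):
    """Fit y = a*x^2 + b*x + c by regressing y on the columns x and x^2."""
    n = len(x)
    x_sq = [xi ** 2 for xi in x]  # the manually built x^2 column
    rows = [[1.0, xi, xs] for xi, xs in zip(x, x_sq)]
    # Normal equations (X^T X) beta = X^T y for beta = [c, b, a].
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    c, b, a = solve_linear_system(xtx, xty)
    mean_y = sum(y) / n
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (a * xs + b * xi + c)) ** 2
                   for xi, xs, yi in zip(x, x_sq, y))
    df_reg, df_resid = 2, n - 3
    # Assumes the fit is not perfect, so ss_resid > 0.
    f_stat = ((ss_total - ss_resid) / df_reg) / (ss_resid / df_resid)
    # Upper tail of F(2, df_resid); this closed form holds only for df_reg = 2.
    p_model = (1.0 + 2.0 * f_stat / df_resid) ** (-df_resid / 2.0)
    return {"a": a, "b": b, "c": c,
            "r_squared": 1.0 - ss_resid / ss_total,
            "f_stat": f_stat, "p_model": p_model}


# Hypothetical data: counts that rise and then fall over 13 yearly observations.
years = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
trees = [2, 6, 9, 13, 15, 17, 18, 17, 15, 13, 10, 6, 3]
fit = fit_quadratic(years, trees)
print(fit)
print("model is significant" if fit["p_model"] < 0.05 else "model is not significant")
```

A negative coefficient on the x^2 column corresponds to the rise-then-fall shape; the t-tests on the individual coefficients are left out here to keep the sketch short.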
REGRESSION RECAP
- Plot the data using either a line graph or a scatter plot.
- Visually determine the pattern and the strength of the pattern.
- Add either a linear or a quadratic trendline.
- Evaluate the model based upon:
  - R^2 for model fit
  - F-test and individual t-tests for model and variable significance
- Reject the null and conclude that the model and/or variable(s) are significant if p < 0.05 or 0.01.
- If the model is not significant, try a different model and evaluate that model.
- Once you find a significant model, interpret the model or trend.
- Give possible biological explanations for the model or trend.