MARE 250 Dr. Jason Turner Linear Regression
Linear regression investigates and models the linear relationship between a response (Y) and one or more predictors (X).
Both the response and the predictors are continuous variables ("Responses").
Linear regression analysis is used to:
- determine how the response variable changes as a particular predictor variable changes
- predict the value of the response variable for any value of the predictor variable
Regression vs. Correlation
Linear regression - investigates and models the linear relationship between a response (Y) and one or more predictors (X). Both the response and the predictors are continuous variables ("Responses").
Correlation coefficient (Pearson) - measures the extent of a linear relationship between two continuous variables ("Responses").
When Regression vs. Correlation?
Linear regression - used to predict relationships, extrapolate data, and quantify the change in one variable relative to the other; the relationship is directional.
Correlation coefficient (Pearson) - used to determine whether or not there is a relationship.
If regression is used, it matters which variable is the response (Y) and which is the predictor (X):
- Y is the dependent variable; X is the independent variable
- the model treats X as causing change in Y (the Y outcome depends on X)
- the model does not treat Y as causing change in X (X is independent)
Linear Regression
Regression provides a line that "best" fits the data (response and predictor).
The least-squares criterion (the method used to draw this "best" line) requires that the best-fitting regression line is the one with the smallest sum of squared error terms (the squared vertical distances of the points from the line).
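For reference, the least-squares criterion above can be written out for a single predictor. This is the standard textbook form rather than anything specific to these slides; the symbols $n$ (number of observations), $\bar{x}$ and $\bar{y}$ (sample means) are introduced here for illustration.

```latex
% Sum of squared errors that the least-squares line minimizes
\mathrm{SSE}(b_0, b_1) = \sum_{i=1}^{n} \bigl( y_i - (b_0 + b_1 x_i) \bigr)^2

% Values of the slope and intercept that minimize SSE
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}
```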
Linear Regression
The $R^2$ and adjusted $R^2$ values represent the proportion of variation in the response data explained by the predictors.
Adjusted $R^2$ is a modified $R^2$ that has been adjusted for the number of terms in the model. If you include unnecessary terms, $R^2$ can be artificially high.
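The usual definitions make the adjustment explicit. The notation below is an assumption for illustration and does not appear on the slides: SSE is the sum of squared errors, SST is the total sum of squares of the response about its mean, $n$ is the number of observations, and $p$ is the number of predictor terms (not counting the intercept).

```latex
R^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}
\qquad
R^2_{\mathrm{adj}} = 1 - \bigl(1 - R^2\bigr)\,\frac{n - 1}{n - p - 1}
```

Because $(n - 1)/(n - p - 1)$ grows as $p$ grows, an extra term raises adjusted $R^2$ only if it reduces SSE by enough to offset the penalty.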
Linear Regression
$y = b_0 + b_1 x$
- $y$ = dependent variable
- $b_0$ and $b_1$ are constants
- $b_0$ = y-intercept
- $b_1$ = slope
- $x$ = independent variable
Example: Urchin density = $b_0 + b_1$ (% coral)
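A minimal sketch of how the intercept and slope could be estimated and then used for prediction. The data values and the names `pct_coral`, `urchins`, `fit_line`, and `predict` are hypothetical and chosen only to mirror the urchin and coral example; they are not results from the lecture.

```python
# Hypothetical survey data (NOT from the lecture): percent coral cover
# and urchin density at six sites.
pct_coral = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
urchins = [2.0, 3.0, 5.0, 6.0, 8.0, 9.0]


def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx           # slope
    b0 = y_bar - b1 * x_bar  # y-intercept
    return b0, b1


def predict(b0, b1, x_new):
    """Predicted response for a given predictor value."""
    return b0 + b1 * x_new


b0, b1 = fit_line(pct_coral, urchins)
print(f"Urchin density = {b0:.3f} + {b1:.3f} * (% coral)")
print(f"Predicted density at 35% coral: {predict(b0, b1, 35.0):.2f}")
```

The least-squares line always passes through the point of means, so predicting at the mean coral cover returns the mean urchin density.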
Effects of Outliers
Outliers may be influential observations.
An influential observation is a data point whose removal causes the regression equation (line) to change considerably.
Consider removing it, as you would any other outlier. If there is no explanation for the point, the decision is up to the researcher.
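One simple way to look for an influential observation is to refit the line with each point left out in turn and see how much the slope moves. The sketch below uses made-up data and hypothetical helper names (`fit_line`, `leave_one_out_slopes`); it illustrates the idea and is not a formal influence diagnostic.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def leave_one_out_slopes(x, y):
    """Slope of the fitted line after dropping each observation in turn."""
    slopes = []
    for i in range(len(x)):
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        _, b1 = fit_line(x_rest, y_rest)
        slopes.append(b1)
    return slopes


# Made-up data: five points on a perfect line plus one point far from it.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 2.0]

_, full_slope = fit_line(x, y)
loo = leave_one_out_slopes(x, y)
changes = [abs(s - full_slope) for s in loo]
most_influential = changes.index(max(changes))

print(f"Slope with all points: {full_slope:.3f}")
for i, s in enumerate(loo):
    print(f"Slope without point {i}: {s:.3f}")
print(f"Largest change comes from removing point {most_influential}")
```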
Warning on Regression
Regression is based on the assumption that the data points are scattered about a straight line.
What can we do to determine whether a regression is warranted?
Coefficient of Determination ($R^2$)
Coefficient of determination ($R^2$) - the proportion of the total variability in the response that is attributable to its dependence on all of the predictors in the model.
$R^2$ is used for assessing the "goodness of fit" of a regression model.
Coefficient of Determination ($R^2$)
Use adjusted $R^2$, as it is a more conservative measure.
$R^2$ values range from 0 to 100%. An $R^2$ of 100% means that all of the variability in the data can be explained by the model.
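A short sketch of how $R^2$ and adjusted $R^2$ could be computed for a one-predictor fit, using the standard formulas $R^2 = 1 - \mathrm{SSE}/\mathrm{SST}$ and $R^2_{adj} = 1 - (1 - R^2)(n - 1)/(n - p - 1)$. The data and the function name `r_squared` are hypothetical, not values from the lab.

```python
def r_squared(x, y, n_predictors=1):
    """Return (R^2, adjusted R^2) for the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # SSE: variation left unexplained by the line.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # SST: total variation of the response about its mean.
    sst = sum((yi - y_bar) ** 2 for yi in y)

    r2 = 1.0 - sse / sst
    # The adjustment penalizes each additional term in the model.
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return r2, r2_adj


# Hypothetical data: percent coral cover and urchin density at six sites.
pct_coral = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
urchins = [2.0, 3.0, 5.0, 6.0, 8.0, 9.0]

r2, r2_adj = r_squared(pct_coral, urchins)
print(f"R^2 = {100 * r2:.1f}%, adjusted R^2 = {100 * r2_adj:.1f}%")
```

Whenever the fit is not perfect, adjusted $R^2$ comes out below $R^2$, which is the sense in which it is the more conservative measure.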
Coefficient Relationships
For a regression with a single predictor, the coefficient of determination ($r^2$) is the square of the linear correlation coefficient ($r$).
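The relationship can be checked numerically for a one-predictor regression. The sketch below computes Pearson's $r$ and the regression $R^2$ separately and compares them; the data and the names `pearson_r` and `regression_r2` are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient between x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def regression_r2(x, y):
    """R^2 = 1 - SSE/SST for the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


# Hypothetical data, not from the lab.
x = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
y = [2.0, 3.0, 5.0, 6.0, 8.0, 9.0]

r = pearson_r(x, y)
r2 = regression_r2(x, y)
print(f"r = {r:.4f}, r squared = {r * r:.4f}, regression R^2 = {r2:.4f}")
```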
In Lab…
Regression Analysis: _ Urchins versus % Coral