1
Simple Linear Regression
Question: Is annual carbon dioxide concentration related to annual global temperature? We want to extend our inquiry into data beyond the comparison of two groups, to the relationship between two quantitative variables.
2
Simple Linear Regression
Response variable, Y: annual global temperature (°C). Explanatory (predictor) variable, x: annual atmospheric CO2 concentration (ppmv).
3
Regression model: $Y = \mu_{Y|x} + \varepsilon$. $Y$ represents a value of the response variable.
$\mu_{Y|x}$ represents the population mean response for a given value of the explanatory variable, x. $\varepsilon$ represents the random error. For each individual in our population we can model the value of the variable of interest (in our example, global temperature) as a population mean value that is related to the carbon dioxide concentration, plus some random error (the random error can be positive or negative).
4
Linear Model: $\mu_{Y|x} = \beta_0 + \beta_1 x$. $\beta_0$ is the Y-intercept parameter. $\beta_1$ is the slope parameter.
The linear model says that the mean response for values of x is linearly related to the x values. The linear relationship has a Y-intercept parameter and a slope parameter.
5
Conditions: The relationship is linear. The random error term, $\varepsilon$, is
independent, identically distributed, and normally distributed with standard deviation $\sigma$. These are the usual normal model conditions covered previously. They are really no different from the one-sample or two-sample model conditions. We will come back to these later, after we have fit a linear model to the data and examined its usefulness. Draw picture on the blackboard.
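To make the model and its conditions concrete, here is a minimal Python sketch that simulates responses as a mean on a straight line plus independent, normally distributed random error. The function name `simulate_response` and all parameter values are hypothetical, chosen only for illustration; they are not estimates from the CO2/temperature data.

```python
import random

def simulate_response(x, beta0, beta1, sigma, rng):
    """Draw one response: mean on the line plus Normal(0, sigma) random error."""
    mean_response = beta0 + beta1 * x   # population mean response at this x
    error = rng.gauss(0.0, sigma)       # independent normal random error
    return mean_response + error

# Hypothetical parameter values (assumptions, not fitted to any real data).
BETA0, BETA1, SIGMA = 10.0, 0.01, 0.1

rng = random.Random(42)  # fixed seed so the run is repeatable
ys = [simulate_response(350.0, BETA0, BETA1, SIGMA, rng) for _ in range(5)]
print(ys)  # five responses scattered around the mean 10.0 + 0.01 * 350 = 13.5
```

Each simulated value differs from the mean response only by its random error, which can be positive or negative.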
6
There appears to be a positive trend: as CO2 increases, so does temperature. That is, above (below) average values of temperature are related to above (below) average values of CO2. The trend appears to be somewhat linear. It is a fairly strong linear trend because the points are closely clustered around the overall positive trend. There are no unusual points.
7
Describe the plot. Direction – positive/negative.
Form – linear/non-linear. Strength. Unusual points?
8
Method of Least Squares
Find estimates of $\beta_0$ and $\beta_1$ such that the sum of squared vertical deviations from the estimated straight line is the smallest possible.
9
Least Squares Estimates
These are the formulas but we will be using JMP to do the heavy lifting.
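As a rough illustration of what the software computes, the standard textbook least squares formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x) can be sketched in Python. The function name `least_squares_fit` is hypothetical and the data are made-up numbers, not the CO2/temperature data.

```python
def least_squares_fit(x, y):
    """Return (intercept, slope) that minimize the sum of squared vertical deviations."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up illustrative data (an assumption, NOT the data behind these slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

b0, b1 = least_squares_fit(x, y)
print(f"Predicted y = {b0:.2f} + {b1:.2f} * x")
```

For these made-up numbers the fitted line is approximately Predicted y = 0.09 + 1.97 * x.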
11
Linear Fit Predicted Temp = 9.8815 + 0.012584*CO2
The first equation is generic. Always clearly state that the linear fit gives you the predicted response as a linear function of the explanatory (predictor) variable within the context of the problem.
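The linear fit can be used as a small prediction function. The sketch below uses the intercept and slope from the fit; the function name `predicted_temp` and the input of 350 ppmv are illustrative assumptions, and a prediction is only sensible for CO2 values within the range of the data.

```python
B0 = 9.8815     # estimated Y-intercept from the linear fit (degrees C)
B1 = 0.012584   # estimated slope from the linear fit (degrees C per ppmv)

def predicted_temp(co2_ppmv):
    """Predicted annual global temperature (degrees C) for a given CO2 level (ppmv)."""
    return B0 + B1 * co2_ppmv

# 350 ppmv is a hypothetical input chosen only to show the arithmetic.
print(round(predicted_temp(350), 4))  # 9.8815 + 0.012584 * 350 = 14.2859
```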
12
Interpretation Estimated Y-intercept.
This does not have an interpretation within the context of the problem. Having no CO2 in the atmosphere is not reasonable given the data.
13
Interpretation Estimated slope.
For each additional 1 ppmv of CO2, the annual global temperature goes up 0.012584 °C, on average. The interpretation of the estimated slope is very important. Learn to interpret the estimated slope correctly. The slope is rise over run, so a 1 ppmv increase in carbon dioxide gives a 0.012584 °C increase in temperature. But the slope is based on data, and so it is actually an average change in the response. This idea of an average change is very important. Always put the interpretation into the context of the problem.
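The "rise over run" reading of the slope can be checked with a few lines of Python. This is a sketch using the slope from the linear fit; the function name `average_temp_change` and the 10 ppmv example are illustrative assumptions.

```python
SLOPE = 0.012584  # estimated slope from the linear fit (degrees C per ppmv)

def average_temp_change(delta_co2_ppmv):
    """Average change in predicted temperature (degrees C) for a given change in CO2 (ppmv)."""
    return SLOPE * delta_co2_ppmv

print(average_temp_change(1))             # a 1 ppmv increase: the slope itself
print(round(average_temp_change(10), 5))  # a 10 ppmv increase: 0.12584
```

The change is an average change in the response: individual years will sit above or below the line.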
14
The least squares regression line always goes through the point (mean x, mean y). For our CO2/Temp data, if you plug the average CO2 into the prediction equation, you should get the average Temp as the predicted value.
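This property is easy to verify numerically. The sketch below fits a least squares line to made-up data (an assumption, not the CO2/Temp data) and checks that the prediction at the mean of x equals the mean of y; the function name `fit_line` is hypothetical.

```python
def fit_line(x, y):
    """Least squares (intercept, slope) using the standard textbook formulas."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up illustrative data (NOT the data behind these slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

b0, b1 = fit_line(x, y)
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

prediction_at_mean = b0 + b1 * x_bar
print(prediction_at_mean, y_bar)  # the two values agree up to rounding error
```

The agreement is not a coincidence of the data: the intercept is defined as the mean of y minus the slope times the mean of x, which forces the line through (mean x, mean y).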
15
How Strong? The strength of a linear relationship can be measured by $R^2$, the coefficient of determination. This is RSquare in JMP output.
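One common way to compute the coefficient of determination is 1 minus the ratio of the unexplained variation (sum of squared residuals) to the total variation (sum of squared deviations from the mean of y). The Python sketch below applies that definition to made-up data; the function name `r_squared` is hypothetical, and this is not a reproduction of the JMP output.

```python
def r_squared(x, y):
    """Coefficient of determination for a least squares line fit of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Unexplained variation: squared vertical deviations from the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation: squared deviations from the mean response.
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst

# Made-up illustrative data (NOT the data behind these slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

r2 = r_squared(x, y)
print(f"R-squared = {r2:.4f}")
```

A value near 1 means the points cluster tightly around the line; a value near 0 means the line explains little of the variation in the response.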
16
How Strong?
17
Interpretation 80.6% of the variation in the global temperature can be explained by the linear relationship with carbon dioxide concentration. 19.4% is unexplained.
18
Interpretation There is a fairly strong positive linear relationship between carbon dioxide concentration and global temperature. Cause and effect? Be careful. Even though there is a fairly strong positive linear relationship between carbon dioxide concentration and global temperature we cannot say that the carbon dioxide concentration is causing the rise in global temperature based on these data and linear regression alone. Remember that this is an observational study and so there can be all sorts of variables that we have not looked at.
19
Cause and Effect? There is a strong positive linear relationship between the number of 2nd graders in communities and the number of crimes committed in those communities. Does this mean that 2nd graders are causing the crimes? No, not necessarily. There is a lurking variable (size of the community) that may be related to both variables. Larger communities have more 2nd graders and more crimes; smaller communities have fewer 2nd graders and fewer crimes. Although there is not a cause and effect relationship, we can still use the number of 2nd graders to predict the number of crimes.
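A small simulation can show how a lurking variable produces a strong correlation without any cause and effect. In the Python sketch below, community size drives both the number of 2nd graders and the number of crimes, and neither variable appears in the other's formula. All rates, noise levels, and names are made-up assumptions for illustration only.

```python
import math
import random

def pearson_r(x, y):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Lurking variable: community size (hypothetical populations).
sizes = [rng.uniform(1_000, 100_000) for _ in range(200)]

# Both variables depend on size plus their own independent noise.
second_graders = [0.012 * s + rng.gauss(0, 50) for s in sizes]
crimes = [0.03 * s + rng.gauss(0, 200) for s in sizes]

r = pearson_r(second_graders, crimes)
print(f"correlation between 2nd graders and crimes: {r:.3f}")
```

The correlation comes out strongly positive even though the simulation contains no link from 2nd graders to crimes, which is why the relationship is still useful for prediction but not evidence of causation.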
20
Connection to Correlation
If you square the correlation coefficient, r, relating carbon dioxide to global temperature, you get $R^2$, the coefficient of determination. Working backwards, take the square root of RSquare and attach the appropriate sign (+ or -), which is the sign of the estimated slope.
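Working backwards from RSquare to the correlation can be sketched in Python. The values 0.806 (from the 80.6% interpretation) and the positive slope 0.012584 come from the fit; the function name `correlation_from_r_squared` is a hypothetical helper.

```python
import math

def correlation_from_r_squared(r_squared, slope):
    """Square root of R-squared, carrying the sign of the estimated slope."""
    return math.copysign(math.sqrt(r_squared), slope)

r = correlation_from_r_squared(0.806, 0.012584)
print(round(r, 3))  # positive, because the estimated slope is positive
```

Because the estimated slope is positive, the correlation is the positive square root, roughly 0.898.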
21
Connection to Correlation