Published by Baldric Briggs. Modified over 9 years ago.
1
Regression using lm
lmRegression.R
Basics
Prediction
World Bank CO2 Data
2
Simple linear regression
Simple linear model: y = b_1 + x b_2 + error
y: the dependent variable
x: the independent variable
b_1, b_2: intercept and slope coefficients
error: random departures between the model and the response
Coefficients are estimated by least squares.
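As a rough illustration of "estimated by least squares", the sketch below fits a simple linear model with lm and compares the result with the textbook closed-form estimates. The data are simulated here purely for illustration; the seed, sample size, and true coefficients (2 and 0.5) are assumptions, not values from the slides.

```r
# Simulated data (an assumption for illustration; not from the slides)
set.seed(1)
n <- 50
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 1)   # true intercept 2, true slope 0.5

# Least squares fit using lm
fit <- lm(y ~ x)

# Closed-form least squares estimates for one predictor:
# slope = Sxy / Sxx, intercept = mean(y) - slope * mean(x)
b2 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b1 <- mean(y) - b2 * mean(x)

# The two sets of estimates should agree up to rounding error
print(coef(fit))
print(c(intercept = b1, slope = b2))
```

The agreement between coef(fit) and the hand-computed values shows that lm is doing ordinary least squares for this model.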
3
Multiple regression
y = b_0 + x_1 b_1 + x_2 b_2 + x_3 b_3 + … + error
4
Annual Boulder Temperatures
Temperature is the dependent variable; Year is the independent variable.
Errors = ????
Linear = ???
5
CO2 Emissions by Country
Independent: GDP/capita
Dependent: CO2 emission
Linear ??
Errors ??
6
The R lm function
Takes a formula to describe the regression, where ~ plays the role of the equals sign in the model (the response on the left, the predictors on the right)
Works best when the data set is a data frame
Returns a complicated list that can be used in summary, predict, print, plot
lmFit <- lm(y ~ x1 + x2)
7
Or, more generally, using a data frame:
lmFit <- lm(y ~ x1 + x2, data = dataset)
The variables in the formula are then taken from dataset$y, dataset$x1, dataset$x2.
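A minimal sketch of the data-frame workflow, passing the fitted object to print, summary, and predict. The data frame is simulated and the new predictor values in newdata are hypothetical; only the names dataset, lmFit, y, x1, and x2 come from the slides.

```r
# Simulated data frame (an assumption for illustration; not from the slides)
set.seed(2)
dataset <- data.frame(x1 = rnorm(40), x2 = rnorm(40))
dataset$y <- 1 + 2 * dataset$x1 - 0.5 * dataset$x2 + rnorm(40, sd = 0.3)

# The formula refers to column names; lm looks them up in `dataset`
lmFit <- lm(y ~ x1 + x2, data = dataset)

print(lmFit)            # the call and the estimated coefficients
print(summary(lmFit))   # standard errors, t-tests, R-squared

# Predict at new predictor values (hypothetical), with 95% prediction intervals.
# The column names of newdata must match the predictors in the formula.
newdata <- data.frame(x1 = c(0, 1), x2 = c(0, 1))
pred <- predict(lmFit, newdata = newdata, interval = "prediction")
print(pred)             # matrix with columns fit, lwr, upr
```

Using the data argument keeps the formula short and lets predict find the predictors by name in newdata.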
8
Analysis of World Bank data set
Best to work on a log scale; GDP has the strongest linear relationship
Some additional pattern is left over in the residuals
Try other variables
Try a more complex curve
Check the predictions using cross-validation
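One way the log-scale fit and the "more complex curve" could be set up is sketched below. The data are simulated stand-ins, not the World Bank data, and the names wb, gdp, co2, logFit, and quadFit are hypothetical. The quadratic term is just one possible choice of a more complex curve.

```r
# Simulated stand-in data (NOT the World Bank data; names are hypothetical)
set.seed(3)
n <- 100
wb <- data.frame(gdp = exp(rnorm(n, mean = 8, sd = 1.5)))
wb$co2 <- exp(-8 + 1.0 * log(wb$gdp) + rnorm(n, sd = 0.5))

# Straight-line fit on the log scale
logFit <- lm(log(co2) ~ log(gdp), data = wb)

# Residuals: look for leftover pattern against the fitted values
res <- residuals(logFit)
# plot(fitted(logFit), res)   # uncomment to inspect the residuals visually

# A more complex curve: add a quadratic term in log(gdp)
quadFit <- lm(log(co2) ~ log(gdp) + I(log(gdp)^2), data = wb)

# F-test comparing the straight line with the quadratic curve
cmp <- anova(logFit, quadFit)
print(cmp)
```

Because the simulated relationship is exactly linear on the log scale, the quadratic term is not expected to help here; with real data the comparison could go either way.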
9
Leave-one-out cross-validation
A robust way to check a model's predictions and the uncertainty measure
Four steps:
1. Sequentially leave out each observation
2. Refit the model with the remaining data
3. Predict the omitted observation
4. Compare the prediction and confidence interval to the actual observation
A check on the consistency of the statistical model, because the omitted observation is not used to make the prediction.
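The four steps above can be sketched with a plain loop around lm and predict. The data are simulated for illustration, and the names dat, loo, cvError, rmse, and coverage are hypothetical. The sketch uses a 95% prediction interval, which is the interval suited to a single new observation; that choice is an assumption about what the slide intends by "confidence interval".

```r
# Simulated data (an assumption for illustration; not from the slides)
set.seed(4)
n <- 30
dat <- data.frame(x = runif(n, 0, 10))
dat$y <- 1 + 0.8 * dat$x + rnorm(n)

# One row per omitted observation: prediction and interval limits
loo <- matrix(NA_real_, nrow = n, ncol = 3,
              dimnames = list(NULL, c("fit", "lwr", "upr")))

for (i in seq_len(n)) {
  # Steps 1 and 2: leave out observation i and refit with the rest
  fit_i <- lm(y ~ x, data = dat[-i, ])
  # Step 3: predict the omitted observation with a 95% prediction interval
  loo[i, ] <- predict(fit_i, newdata = dat[i, , drop = FALSE],
                      interval = "prediction", level = 0.95)
}

# Step 4: compare predictions and intervals to the actual observations
cvError  <- dat$y - loo[, "fit"]
rmse     <- sqrt(mean(cvError^2))
coverage <- mean(dat$y >= loo[, "lwr"] & dat$y <= loo[, "upr"])

print(rmse)       # typical size of the out-of-sample prediction error
print(coverage)   # fraction of intervals that contain the omitted value
```

If the statistical model is consistent with the data, the coverage should be close to the nominal 0.95, although with only a few dozen observations it will vary noticeably.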