
[1] Simple Linear Regression

The general equation of a line is Y = c + mX or Y = α + βX. (The slide's panels illustrate lines for different sign combinations of the coefficients: β > 0, β = 0 and β < 0, with α positive, zero or negative.)

[3] Regression analysis is a technique for quantifying the relationship between a response variable (or dependent variable) and one or more predictor (independent or explanatory) variables. Two Main Purposes: To predict the dependent variable based on specified values for the predictor variable(s). To understand how the predictor variable(s) influence or relate to the dependent variable.

Example - Humidity Data. The raw material used in the production of a certain synthetic fiber is stored in a location without humidity control. Measurements of the relative humidity in the storage location and the moisture content (in %) of a sample of the raw material were taken over 15 days.
Rel. Humidity: 46, 53, 29, 61, 36, 39, 47, 49, 52, 38, 55, 32, 57, 54, 44
Mois. Content: 12, 15, 7, 17, 10, 11, 11, 12, 14, 9, 16, 8, 18, 14, 12
Relative Humidity takes the role of explanatory variable; Moisture Content takes the role of dependent variable.

[6] The Regression Model. The simple linear regression model can be stated as Yi = α + βXi + εi, where Yi is the value of the response variable in the i-th trial; α and β are the intercept and slope parameters; Xi is a known constant, namely the value of the explanatory variable in the i-th trial; and εi is an unobservable random error term with εi ~ N(0, σ²). εi is also referred to as the stochastic element of the regression model.
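The stochastic element εi can be made concrete with a small simulation; the following is an illustrative sketch (the parameter values α = 2, β = 0.5, σ = 1 are arbitrary choices, not from the slides):

```python
import random
import statistics

random.seed(1)

alpha, beta, sigma = 2.0, 0.5, 1.0   # arbitrary illustrative parameter values
X = [i * 0.1 for i in range(200)]    # the known constants X_i

# Y_i = alpha + beta * X_i + eps_i, with eps_i ~ N(0, sigma^2)
Y = [alpha + beta * x + random.gauss(0, sigma) for x in X]

# Recover the error terms (possible here only because we simulated them)
eps = [y - (alpha + beta * x) for x, y in zip(X, Y)]
mean_eps = statistics.mean(eps)
sd_eps = statistics.stdev(eps)
print(f"mean of errors = {mean_eps:.2f}, sd of errors = {sd_eps:.2f}")
```

The sample mean of the errors comes out near 0 and their standard deviation near σ = 1, as the N(0, σ²) assumption requires.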

[7] Minimise Vertical Distances of Data to ‘Best Fit Line’

[8] Formulae for the Least Squares Method

LEAST SQUARES ESTIMATES
Slope: b = Sxy / Sxx = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²
Intercept: a = Ȳ − bX̄
The fitted line is then Ŷ = a + bX.
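These formulae can be checked against the humidity data from the example above; a minimal Python sketch (the variable names are mine, not the slides'):

```python
# Least-squares estimates b = Sxy / Sxx and a = ybar - b * xbar,
# applied to the 15-day humidity example.
humidity = [46, 53, 29, 61, 36, 39, 47, 49, 52, 38, 55, 32, 57, 54, 44]
moisture = [12, 15, 7, 17, 10, 11, 11, 12, 14, 9, 16, 8, 18, 14, 12]

n = len(humidity)
xbar = sum(humidity) / n
ybar = sum(moisture) / n

Sxx = sum((x - xbar) ** 2 for x in humidity)   # corrected sum of squares of X
Sxy = sum((x - xbar) * (y - ybar)
          for x, y in zip(humidity, moisture)) # corrected cross-product

b = Sxy / Sxx          # slope estimate
a = ybar - b * xbar    # intercept estimate
print(f"Moisture = {a:.3f} + {b:.3f} * Humidity")
```

The fitted slope works out to about 0.32 % moisture per unit of relative humidity.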


[14] RESIDUAL = DATA − MODEL; the sum of the squared residuals gives the error sum of squares, SSE.

[16] Total Variation = Explained Variation + Unexplained Variation (SST = SSR + SSE), where p equals the number of parameters being estimated; in our case p = 2 (intercept and slope).
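For the humidity data this decomposition can be verified numerically; a sketch, refitting the line from scratch so the block stands alone:

```python
# Verify SST = SSR + SSE and compute R-squared for the humidity example.
humidity = [46, 53, 29, 61, 36, 39, 47, 49, 52, 38, 55, 32, 57, 54, 44]
moisture = [12, 15, 7, 17, 10, 11, 11, 12, 14, 9, 16, 8, 18, 14, 12]

n = len(humidity)
xbar = sum(humidity) / n
ybar = sum(moisture) / n
b = sum((x - xbar) * (y - ybar) for x, y in zip(humidity, moisture)) \
    / sum((x - xbar) ** 2 for x in humidity)
a = ybar - b * xbar
fitted = [a + b * x for x in humidity]

SST = sum((y - ybar) ** 2 for y in moisture)               # total variation
SSR = sum((f - ybar) ** 2 for f in fitted)                 # explained variation
SSE = sum((y - f) ** 2 for y, f in zip(moisture, fitted))  # unexplained variation

R2 = SSR / SST
print(f"SST = SSR + SSE holds: {abs(SST - (SSR + SSE)) < 1e-6}; R-sq = {100 * R2:.1f}%")
```

The computed R-squared agrees with the 91.1% shown on the output slide below.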

[17] A Measure of the Relative Goodness of Fit. R² is interpreted as the percentage of the variation in the response variable Y that is explained by the simple linear regression on the explanatory variable X.

[18] MINITAB output: the regression equation is Moisture = … + … × Humidity; a coefficient table (Predictor, Coef, StDev, T, P) with rows for Constant and Humidity; S = …, R-Sq = 91.1%; and an Analysis of Variance table (Source, DF, SS, MS, F, P) with rows for Regression, Error (13 degrees of freedom) and Total. (The numerical entries were lost in transcription.)

[19] Estimating a Confidence Interval for β. Using statistical theory we can derive a formula for the standard error of β̂, and we may use a confidence interval to quantify the uncertainty associated with the slope. The confidence interval is calculated as the point estimate ± a value from the t tables × the standard error of the point estimate.

[20] The table value comes from a t-distribution on (n − 2) = 13 degrees of freedom; the standard error is read from the MINITAB output.

[21] Hypothesis Testing About β. H0: β = 0 (no change in % moisture per unit of relative humidity); Ha: β ≠ 0. With a 0.05 level of significance, the decision rule is: reject H0 if t* falls in either 2.5% tail of the t-distribution on 13 df (95% in the centre); otherwise do not reject H0: β = 0.
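A sketch of this test for the humidity data; the cutoff 2.160 is the upper 2.5% point of the t-distribution on 13 df, taken from tables:

```python
import math

humidity = [46, 53, 29, 61, 36, 39, 47, 49, 52, 38, 55, 32, 57, 54, 44]
moisture = [12, 15, 7, 17, 10, 11, 11, 12, 14, 9, 16, 8, 18, 14, 12]

n = len(humidity)
xbar = sum(humidity) / n
ybar = sum(moisture) / n
Sxx = sum((x - xbar) ** 2 for x in humidity)
b = sum((x - xbar) * (y - ybar) for x, y in zip(humidity, moisture)) / Sxx
a = ybar - b * xbar

SSE = sum((y - (a + b * x)) ** 2 for x, y in zip(humidity, moisture))
s = math.sqrt(SSE / (n - 2))   # estimate of sigma
se_b = s / math.sqrt(Sxx)      # standard error of the slope estimate

t_star = b / se_b
t_crit = 2.160                 # t_{0.025} on 13 df, from tables
print(f"t* = {t_star:.2f}; reject H0: {abs(t_star) > t_crit}")

# 95% confidence interval for beta: point estimate +/- table value * SE
ci = (b - t_crit * se_b, b + t_crit * se_b)
```

Here t* ≈ 11.6, far beyond the 2.160 cutoff, so H0: β = 0 is rejected; the same quantities give the 95% confidence interval for β, which excludes 0.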

[22] The MINITAB output again, with the t-statistic for Humidity (t* = …) read from the T column; S = …, R-Sq = 91.1%.

[23] Statistical Inference for 

[24] The MINITAB output again, with S highlighted as the estimate of σ (σ̂ = S = …); R-Sq = 91.1%.

[25] F-Test. H0: β = 0 (no change in % moisture per unit of relative humidity); Ha: β ≠ 0. Note: large values of F* lead to the rejection of H0. Critical value = F0.05 with 1 df in the numerator and 13 df in the denominator.

[26] Decision rule: do not reject H0 if F* = MSR/MSE < 4.67; reject H0 if F* = MSR/MSE > 4.67, where 4.67 cuts off an upper-tail area of 5%.
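The F statistic for the humidity data can be computed directly; note that in simple linear regression F* equals the square of the slope's t statistic, so the two tests agree. A sketch:

```python
humidity = [46, 53, 29, 61, 36, 39, 47, 49, 52, 38, 55, 32, 57, 54, 44]
moisture = [12, 15, 7, 17, 10, 11, 11, 12, 14, 9, 16, 8, 18, 14, 12]

n = len(humidity)
xbar = sum(humidity) / n
ybar = sum(moisture) / n
b = sum((x - xbar) * (y - ybar) for x, y in zip(humidity, moisture)) \
    / sum((x - xbar) ** 2 for x in humidity)
a = ybar - b * xbar
fitted = [a + b * x for x in humidity]

SSR = sum((f - ybar) ** 2 for f in fitted)
SSE = sum((y - f) ** 2 for y, f in zip(moisture, fitted))

MSR = SSR / 1          # regression mean square, 1 numerator df
MSE = SSE / (n - 2)    # error mean square, 13 denominator df
F_star = MSR / MSE
print(f"F* = {F_star:.1f}; reject H0: {F_star > 4.67}")
```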

[27] R code for the analysis:

options(show.signif.stars = FALSE)

# Enter the data
humidity = c(46, 53, 29, 61, 36, 39, 47, 49, 52, 38, 55, 32, 57, 54, 44)
moisture = c(12, 15, 7, 17, 10, 11, 11, 12, 14, 9, 16, 8, 18, 14, 12)

# Fit the simple linear regression
slr = lm(moisture ~ humidity)
slr
summary(slr)  # coefficient table, s and R-squared
anova(slr)    # analysis of variance table

# Scatter plot with the fitted line
plot(x = humidity, y = moisture)
abline(slr, col = "red", lwd = 2)

# 95% confidence intervals for the coefficients
confint(slr)

# Approximate 95% confidence band for the fitted line
fits = predict(slr, data.frame(humidity = seq(30, 60, by = 0.1)), se.fit = TRUE)
lines(seq(30, 60, by = 0.1), fits$fit + 2 * fits$se.fit, col = "blue", lty = 2)
lines(seq(30, 60, by = 0.1), fits$fit - 2 * fits$se.fit, col = "blue", lty = 2)

[28] Mail Processing Hours (Fiscal Years )

[29] Line plots of Manhours and Volume

[30] Line plots of Manhours and Volume Christmas excluded

[31] Scatter plots of Manhours and Volume

[32] Scatter plots of Manhours and Volume with curve representing return to scale

[33] Simple linear regression model with Normal model for chance variation Y = α + β X + ε

[34] The simple linear regression model Y = α + βX + ε. Y is the response variable; X is the explanatory variable. Model parameters: α and β are the linear parameters; the hidden parameter, the standard deviation σ, measures the spread of the Normal curve.

[35] The simple linear regression model
– Choosing values for the regression coefficients: the method of least squares
– Interpreting the fitted line
– Using the fitted line; prediction
– A model for chance causes of variation
– Estimating σ

[36] Case study: Mail processing costs in a U.S. Post Office

[37] Scatter plots of Manhours and Volume

[38] Scatter plot with grid (to assist in reading x- and y-values)

[39] Simple linear regression model with Normal model for chance variation Y = α + β X + ε

[40] The simple linear regression model Y = α + βX + ε. Y is the response variable; X is the explanatory variable. Model parameters: α and β are the linear parameters; the hidden parameter, the standard deviation σ, measures the spread of the Normal curve.

[41] Choosing values for the regression coefficients. Given values for α and β, the fitted values of Y are α + βX1, α + βX2, α + βX3, …, α + βXn.

[42] Choosing values for the regression coefficients. Find values for α and β that minimise the deviations Y1 − α − βX1, Y2 − α − βX2, Y3 − α − βX3, …, Yn − α − βXn.

[43] Trial regression lines, with "residuals"

[44] The method of least squares. Find values for α and β that minimise the sum of the squared deviations: (Y1 − α − βX1)² + (Y2 − α − βX2)² + (Y3 − α − βX3)² + … + (Yn − α − βXn)².
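A quick numerical check of this property, using the humidity data from the first example since the Post Office figures are not reproduced in this transcript: perturbing the minimising coefficients in any direction increases the sum of squared deviations.

```python
X = [46, 53, 29, 61, 36, 39, 47, 49, 52, 38, 55, 32, 57, 54, 44]
Y = [12, 15, 7, 17, 10, 11, 11, 12, 14, 9, 16, 8, 18, 14, 12]

def sum_sq(a, b):
    """Sum of squared deviations (Y_i - a - b * X_i)^2."""
    return sum((y - a - b * x) ** 2 for x, y in zip(X, Y))

n = len(X)
xbar = sum(X) / n
ybar = sum(Y) / n
b_hat = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) \
    / sum((x - xbar) ** 2 for x in X)
a_hat = ybar - b_hat * xbar

best = sum_sq(a_hat, b_hat)
# every perturbation of the least-squares solution does strictly worse
for da in (-0.5, 0.0, 0.5):
    for db in (-0.05, 0.0, 0.05):
        if (da, db) != (0.0, 0.0):
            assert sum_sq(a_hat + da, b_hat + db) > best
print(f"minimum sum of squares = {best:.2f}")
```

The minimised sum of squares is exactly the SSE of the fitted line.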

[45] "Least squares" regression line, with "residuals"

[46] The method of least squares. Solution: β̂ = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)², α̂ = Ȳ − β̂X̄. For these data, the numerical values follow by substitution.

[47] Interpretation. β̂ is the marginal change in Y for a unit change in X; check the measurement units! α̂ is the overheads. WARNING: this reads the line at X = 0, outside the range of the data.

[48] "Least squares" regression line, with non-linear extensions

[49] Using the fitted line; prediction. Prediction equation: Ŷ = α̂ + β̂X. Prediction equation allowing for chance variation: Ŷ ± 2s. Original model: Y = α + βX + ε, SD = σ.
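A sketch of prediction with the ±2s allowance, reusing the humidity example since the Post Office data are not reproduced here; the new humidity value 50 is an illustrative choice of mine:

```python
import math

humidity = [46, 53, 29, 61, 36, 39, 47, 49, 52, 38, 55, 32, 57, 54, 44]
moisture = [12, 15, 7, 17, 10, 11, 11, 12, 14, 9, 16, 8, 18, 14, 12]

n = len(humidity)
xbar = sum(humidity) / n
ybar = sum(moisture) / n
b = sum((x - xbar) * (y - ybar) for x, y in zip(humidity, moisture)) \
    / sum((x - xbar) ** 2 for x in humidity)
a = ybar - b * xbar

# s estimates sigma from the residual sum of squares
SSE = sum((y - (a + b * x)) ** 2 for x, y in zip(humidity, moisture))
s = math.sqrt(SSE / (n - 2))

x_new = 50                             # illustrative humidity value
y_hat = a + b * x_new                  # prediction equation
lo, hi = y_hat - 2 * s, y_hat + 2 * s  # allowing for chance variation
print(f"predicted moisture: {y_hat:.2f}%, roughly {lo:.2f}% to {hi:.2f}%")
```

This rough ±2s band ignores the sampling error in the estimated coefficients themselves.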

[50] Simple linear regression model with Normal model for chance variation Y = α + β X + ε

[51] Estimating σ. σ measures the spread of deviations from the true line. Estimate σ by s, the standard deviation of deviations from the fitted line, computed via the fitted values Ŷi = α̂ + β̂Xi and the residuals ei = Yi − Ŷi: s = √(Σei² / (n − 2)). s ≈ 20 for our example.
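The computation of s can be sketched as follows, using the humidity data from the first example (the Post Office data behind the s ≈ 20 figure are not reproduced in this transcript):

```python
import math

X = [46, 53, 29, 61, 36, 39, 47, 49, 52, 38, 55, 32, 57, 54, 44]
Y = [12, 15, 7, 17, 10, 11, 11, 12, 14, 9, 16, 8, 18, 14, 12]

n = len(X)
xbar = sum(X) / n
ybar = sum(Y) / n
b = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) \
    / sum((x - xbar) ** 2 for x in X)
a = ybar - b * xbar

fitted = [a + b * x for x in X]                 # Yhat_i
residuals = [y - f for y, f in zip(Y, fitted)]  # e_i = Y_i - Yhat_i

# divide by n - 2 because two parameters (intercept and slope) were fitted
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
print(f"s = {s:.3f}")
```

As a side check, least-squares residuals always sum to zero when an intercept is fitted.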

[52] The estimated model. Exercise: use the prediction formula to estimate the loss incurred through equipment breakdown in Period 6, Fiscal 1962, when Y was 765 and X was 180.

[53] Homework. Given the Volume figures for periods 1, 6 and 7 of Fiscal Year 1963, what predictions, including prediction errors, would you make for the Manhours requirement? How do these predictions relate to the actual manhours used? Comment.

[54] Case study: Mail processing costs in a U.S. Post Office

[55] Scatter plots of Manhours and Volume

[56] Simple linear regression model with Normal model for chance variation Y = α + β X + ε

[57] Calculating the regression by formula: β̂ = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)², α̂ = Ȳ − β̂X̄. For these data, the numerical values follow by substitution.

[58] Calculating the regression by computer

[59] The "constant" variable? Y = α + βX + ε Y = α × 1 + β × X + ε

[60] Calculating the prediction formula: Manhours = α̂ + β̂ × Volume ± 2 × 18.93, with the estimated coefficients read from the computer output.

[61] Standard errors of estimated regression coefficients. A regression coefficient estimate is subject to chance variation; a Normal model applies; and the standard deviation of that Normal model is the standard error of the coefficient estimate.

[62] Application 1: confidence interval for the marginal change. Recall the form of a confidence interval: estimate ± t × standard error. The confidence interval for β is therefore β̂ ± t × SE(β̂).

[63] More results. Exercise: calculate a 95% confidence interval for β, and calculate a 95% CI for the change in manhours corresponding to a 10m increase in pieces of mail handled.

[64] Point estimate, standard error and approximate 95% CI for β: β̂ ± 2 × SE(β̂), giving an interval from … to 4.026.

[65] Point estimate, standard error and 95% CI for β: β̂ ± t × SE(β̂) with t on 21 df, compared with the approximate interval … to 4.026 obtained using β̂ + 2 SE(β̂).

[66] 95% CI for the change in Manhours per 10m increase in Volume: 10 × (β̂ ± 2 × SE(β̂)), giving an interval from … to 40.26.

[67] Application 2: testing the statistical significance of the slope. Formal test: H0: β = 0. Test statistic: t = β̂ / SE(β̂). Calculated value: 9.84. Critical value: 2.080 (t-distribution, 21 df), or 2 approximately. Comparison: |9.84| > cutoff. Conclusion: REJECT H0.