Presentation transcript:

732G21/732A35/732G28

Formal statement of the simple linear regression model: Y_i = β_0 + β_1 X_i + ε_i, where
- Y_i is the i-th response value
- β_0, β_1 are the model (regression) parameters: intercept and slope
- X_i is the i-th predictor value
- ε_i are i.i.d. normally distributed random variables with expectation zero and variance σ²
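A minimal simulation sketch of this model in Python; the parameter values, sample size, and predictor range below are illustrative assumptions, not taken from the course material:

```python
import numpy as np

rng = np.random.default_rng(1)

beta0, beta1, sigma = 10.0, 2.0, 1.5   # illustrative true parameter values
n = 15
x = rng.uniform(20, 60, size=n)        # predictor values X_i
eps = rng.normal(0.0, sigma, size=n)   # i.i.d. N(0, sigma^2) error terms
y = beta0 + beta1 * x + eps            # response values Y_i
```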

Inference about regression coefficients and response:
- Interval estimates and tests concerning the coefficients
- Confidence interval for Y
- Prediction interval for Y
- ANOVA table

- After fitting the data, we obtain a regression line
- Is the estimated slope significant, or is it non-zero only because of random variation (in which case there is no linear dependence between Y and X)?
- How to decide?
  - Use hypothesis testing (later)
  - Derive a confidence interval for β_1: if 0 does not fall within this interval, there is a linear dependence

The estimated slope b_1 is a random variable (look at the formula). Properties of b_1:
- Normally distributed (shown in lecture)
- E(b_1) = β_1
- Variance: σ²(b_1) = σ² / Σ(X_i - X̄)², estimated by s²(b_1) = MSE / Σ(X_i - X̄)²
Further: the test statistic t* = (b_1 - β_1) / s(b_1) follows the t(n - 2) distribution.
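A NumPy sketch of these quantities; the small x, y arrays are made-up illustrative numbers, not the Salary dataset:

```python
import numpy as np

x = np.array([25., 30., 35., 40., 45., 50., 55.])        # illustrative predictor values
y = np.array([130., 142., 151., 158., 170., 179., 188.]) # illustrative responses

n = len(x)
Sxx = np.sum((x - x.mean()) ** 2)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx   # least-squares slope
b0 = y.mean() - b1 * x.mean()                        # least-squares intercept

mse = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)     # estimate of sigma^2
s_b1 = np.sqrt(mse / Sxx)                            # estimated standard error of b1
t_star = b1 / s_b1                                   # test statistic for H0: beta1 = 0
```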

- See Table B.2 (p. 1317)
- Example: one-sided 95% quantile with 15 observations, i.e., n - 2 = 13 degrees of freedom: t(0.95; 13) ≈ 1.771
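The same quantile can be looked up numerically instead of in Table B.2 (a sketch assuming SciPy is available):

```python
from scipy.stats import t

# one-sided 95% critical value with n - 2 = 13 degrees of freedom
print(round(t.ppf(0.95, df=13), 3))   # about 1.771
```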

- Confidence interval for β_1 (shown in lecture): b_1 ± t(1 - α/2; n - 2) · s(b_1)
- If the variance in the data is unknown, σ² is replaced by MSE, which is why the t(n - 2) quantile is used
Example: compute a confidence interval for the slope in the Salary dataset.
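A sketch of the interval b_1 ± t(1 - α/2; n - 2) · s(b_1); the data are the same illustrative numbers as above, not the Salary dataset:

```python
import numpy as np
from scipy.stats import t

x = np.array([25., 30., 35., 40., 45., 50., 55.])
y = np.array([130., 142., 151., 158., 170., 179., 188.])
n = len(x)

b1, b0 = np.polyfit(x, y, 1)                      # least-squares slope and intercept
Sxx = np.sum((x - x.mean()) ** 2)
mse = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)
s_b1 = np.sqrt(mse / Sxx)

alpha = 0.05
t_crit = t.ppf(1 - alpha / 2, df=n - 2)
ci_lower, ci_upper = b1 - t_crit * s_b1, b1 + t_crit * s_b1   # 95% CI for beta1
```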

- Often, we have a sample and want to test a hypothesis H_0 against H_a at some significance level α. How is this done?
- Step 1: Find and compute an appropriate test statistic T = T(sample, λ_0)
- Step 2: Plot the test statistic's distribution and mark the critical region determined by α
- If T falls in the critical region, reject H_0 (accept H_a); otherwise, do not reject H_0

- Test H_0: β_1 = 0 against H_a: β_1 ≠ 0
- Step 1: compute t* = b_1 / s(b_1)
- Step 2: plot the t(n - 2) distribution, mark the points ±t(1 - α/2; n - 2) and the critical region
- Step 3: locate t* and reject H_0 if it falls in the critical region
Example: test this hypothesis for the Salary dataset:
- Manually, also computing the P-value
- With Minitab
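A sketch of the two-sided test of H_0: β_1 = 0 with its P-value, again on the illustrative data rather than the Salary dataset:

```python
import numpy as np
from scipy.stats import t

x = np.array([25., 30., 35., 40., 45., 50., 55.])
y = np.array([130., 142., 151., 158., 170., 179., 188.])
n = len(x)

b1, b0 = np.polyfit(x, y, 1)
Sxx = np.sum((x - x.mean()) ** 2)
mse = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)

t_star = b1 / np.sqrt(mse / Sxx)                          # step 1: test statistic
p_value = 2 * t.sf(abs(t_star), df=n - 2)                 # two-sided P-value
alpha = 0.05
reject_H0 = abs(t_star) > t.ppf(1 - alpha / 2, df=n - 2)  # step 3: decision
```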

- Sometimes we need to know whether β_0 = 0. Construct confidence intervals and perform hypothesis tests in the same way as for β_1, using the formulas below.
Properties of b_0:
- Normally distributed (shown in lecture)
- E(b_0) = β_0
- Variance (shown in lecture): σ²(b_0) = σ² [1/n + X̄² / Σ(X_i - X̄)²], estimated with MSE in place of σ²
Further: the test statistic t* = (b_0 - β_0) / s(b_0) follows the t(n - 2) distribution.

- If the error distribution is not normal: a slight departure is OK; otherwise the results hold only asymptotically
- The spacing of the X values affects the variance (larger spacing gives smaller variance)
Example: test β_0 = 0 for the Salary data.

- Estimate of the mean response at X = X_h (X_h can be any value): Ŷ_h = b_0 + b_1 X_h
Properties of the estimator Ŷ_h:
- Normally distributed (shown in lecture)
- E(Ŷ_h) = E(Y_h)
- Variance: σ²(Ŷ_h) = σ² [1/n + (X_h - X̄)² / Σ(X_i - X̄)²], estimated with MSE in place of σ²
Further: the test statistic (Ŷ_h - E(Y_h)) / s(Ŷ_h) follows the t(n - 2) distribution.
Confidence interval: Ŷ_h ± t(1 - α/2; n - 2) · s(Ŷ_h)
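A sketch of the confidence interval for E(Y_h), with X_h = 35 as an illustrative level and the same made-up data as before:

```python
import numpy as np
from scipy.stats import t

x = np.array([25., 30., 35., 40., 45., 50., 55.])
y = np.array([130., 142., 151., 158., 170., 179., 188.])
n = len(x)

b1, b0 = np.polyfit(x, y, 1)
mse = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)

x_h = 35.0
y_hat_h = b0 + b1 * x_h                                      # point estimate of E(Y_h)
s_yhat = np.sqrt(mse * (1 / n + (x_h - x.mean()) ** 2 /
                        np.sum((x - x.mean()) ** 2)))        # s(Y_hat_h)
t_crit = t.ppf(0.975, df=n - 2)
ci = (y_hat_h - t_crit * s_yhat, y_hat_h + t_crit * s_yhat)  # 95% CI for E(Y_h)
```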

Make a plot…
- POINT ESTIMATE
- CONFIDENCE INTERVAL: we estimate the position of the mean in the population with X = X_h
- PREDICTION INTERVAL: we estimate the position of an individual observation in the population with X = X_h

- When the parameters are unknown, the mean E(Y_h) may have more than one possible location
- New observation = mean + random error, so the prediction interval should be wider than the confidence interval

Prediction interval for a new observation at X = X_h
- How do we estimate s(pred)? A new observation is any value of b_0 + b_1 X_h + ε. Hence
- Standard error (shown in lecture): s²(pred) = MSE [1 + 1/n + (X_h - X̄)² / Σ(X_i - X̄)²]
- Further: the statistic (Y_h(new) - Ŷ_h) / s(pred) follows the t(n - 2) distribution, so the prediction interval is Ŷ_h ± t(1 - α/2; n - 2) · s(pred)

Example
- Calculate confidence and prediction intervals for a 35-year-old person
- Compare with the output in Minitab
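A sketch using statsmodels to obtain both intervals at X_h = 35 in one call; the x, y arrays are illustrative stand-ins for the Salary dataset (age vs. salary), not the real data:

```python
import numpy as np
import statsmodels.api as sm

x = np.array([25., 30., 35., 40., 45., 50., 55.])          # illustrative ages
y = np.array([130., 142., 151., 158., 170., 179., 188.])   # illustrative salaries

model = sm.OLS(y, sm.add_constant(x)).fit()

x_new = np.array([[1.0, 35.0]])          # [intercept, X_h] row for a 35-year-old
pred = model.get_prediction(x_new)
print(pred.summary_frame(alpha=0.05))
# mean_ci_lower/upper give the confidence interval for E(Y_h),
# obs_ci_lower/upper give the prediction interval for a new observation
```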

- Total sum of squares: SSTO = Σ(Y_i - Ȳ)²
- Error sum of squares: SSE = Σ(Y_i - Ŷ_i)²
- Regression sum of squares: SSR = Σ(Ŷ_i - Ȳ)²
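A sketch of the decomposition on the illustrative data; the last line checks that SSTO = SSE + SSR:

```python
import numpy as np

x = np.array([25., 30., 35., 40., 45., 50., 55.])
y = np.array([130., 142., 151., 158., 170., 179., 188.])

b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

ssto = np.sum((y - y.mean()) ** 2)     # total sum of squares
sse = np.sum((y - y_hat) ** 2)         # error sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)  # regression sum of squares

assert np.isclose(ssto, sse + ssr)     # SSTO = SSE + SSR
```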

Degrees of freedom:
- SSTO has n - 1 (the deviations sum to zero)
- SSE has n - 2 (two model parameters are estimated)
- SSR has 1 (the fitted values lie on the regression line, which has 2 degrees of freedom, minus 1 because the deviations sum to zero)
These add up: n - 1 = (n - 2) + 1, and SSTO = SSE + SSR.
Important: MS_xx = SS_xx / degrees_of_freedom

ANOVA table:

Source of variation | SS   | df    | MS
Regression          | SSR  | 1     | MSR = SSR / 1
Error               | SSE  | n - 2 | MSE = SSE / (n - 2)
Total               | SSTO | n - 1 |

Expected mean squares
- E(MSE) = σ², regardless of the slope (even when it is zero)
- E(MSR) = σ² + β_1² Σ(X_i - X̄)², so E(MSR) = E(MSE) when the slope is zero
- Therefore: if MSR is much larger than MSE, the slope is not zero; if they are approximately the same, the slope can be zero

- Test statistic F* = MSR/MSE, compared with the F(1, n - 2) distribution (see p. 1320)
Decision rule:
- If F* > F(1 - α; 1, n - 2), conclude H_a
- If F* ≤ F(1 - α; 1, n - 2), conclude H_0
Note: the F test and the t test about β_1 are equivalent (F* = (t*)²).
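A sketch of the F-test on the same illustrative data:

```python
import numpy as np
from scipy.stats import f

x = np.array([25., 30., 35., 40., 45., 50., 55.])
y = np.array([130., 142., 151., 158., 170., 179., 188.])
n = len(x)

b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

ssr = np.sum((y_hat - y.mean()) ** 2)
sse = np.sum((y - y_hat) ** 2)
msr, mse = ssr / 1, sse / (n - 2)

f_star = msr / mse
f_crit = f.ppf(0.95, dfn=1, dfd=n - 2)   # F(1 - alpha; 1, n - 2) with alpha = 0.05
p_value = f.sf(f_star, dfn=1, dfd=n - 2)
conclude_Ha = f_star > f_crit            # reject H0: beta1 = 0 if True
```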

General approach (general linear test)
- Full model: Y_i = β_0 + β_1 X_i + ε_i (linear)
- Reduced model: Y_i = β_0 + ε_i (constant)

- It is known (why?) that SSE(F) ≤ SSE(R). A large difference suggests different models; a small difference means they can be the same
- Test statistic: F* = [(SSE(R) - SSE(F)) / (df_R - df_F)] / [SSE(F) / df_F]
- For the univariate linear model this is equivalent to F* = MSR/MSE
- F* follows the F(df_R - df_F, df_F) distribution (plot the critical region…)
- Test rule: F* > F(1 - α; df_R - df_F, df_F) → reject H_0
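A sketch of the general linear test for this pair of models; for the reduced (constant-only) model, SSE(R) equals SSTO:

```python
import numpy as np
from scipy.stats import f

x = np.array([25., 30., 35., 40., 45., 50., 55.])
y = np.array([130., 142., 151., 158., 170., 179., 188.])
n = len(x)

b1, b0 = np.polyfit(x, y, 1)
sse_full = np.sum((y - (b0 + b1 * x)) ** 2)  # SSE(F), df_F = n - 2
sse_red = np.sum((y - y.mean()) ** 2)        # SSE(R), df_R = n - 1 (constant-only model)
df_full, df_red = n - 2, n - 1

f_star = ((sse_red - sse_full) / (df_red - df_full)) / (sse_full / df_full)
reject_H0 = f_star > f.ppf(0.95, dfn=df_red - df_full, dfd=df_full)
```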

Example: for the Salary dataset
- Compose the ANOVA table and compare with MINITAB
- Perform the F-test and compare with MINITAB

- Coefficient of determination: R² = SSR/SSTO = 1 - SSE/SSTO
- Coefficient of correlation: r = ±√R² (with the sign of b_1)
Limitations:
- A high R² does not necessarily mean a good fit
- A low R² does not mean that X and Y are not related
Example: for the Salary dataset, compute R² and compare with MINITAB.
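A sketch computing R² and r on the same illustrative data:

```python
import numpy as np

x = np.array([25., 30., 35., 40., 45., 50., 55.])
y = np.array([130., 142., 151., 158., 170., 179., 188.])

b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

ssto = np.sum((y - y.mean()) ** 2)
sse = np.sum((y - y_hat) ** 2)

r_squared = 1 - sse / ssto             # coefficient of determination
r = np.sign(b1) * np.sqrt(r_squared)   # coefficient of correlation, sign follows b1
```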

- Reading: Chapter 2, up to page …