Chapter 12 Simple Linear Regression

Department of Epidemiology and Health Statistics, Tongji Medical College (Dr. Chuanhua Yu)

Chapter 12 Simple Linear Regression, Chuanhua Yu (宇传华)

Terminology
Linear regression: 线性回归
Response (dependent) variable: 反应(应)变量
Explanatory (independent) variable: 解释(自)变量
Linear regression model: 线性回归模型
Regression coefficient: 回归系数
Slope: 斜率
Intercept: 截距
Method of least squares: 最小二乘法
Error sum of squares or residual sum of squares: 残差(剩余)平方和
Coefficient of determination: 决定系数
Outlier: 异常点(值)
Homoscedasticity: 方差齐同
Heteroscedasticity: 方差非齐同

Contents
12.1 An example
12.2 The Simple Linear Regression Model
12.3 Estimation: The Method of Least Squares
12.4 Error Variance and the Standard Errors of Regression Estimators
12.5 Confidence Intervals for the Regression Parameters
12.6 Hypothesis Tests about the Regression Relationship
12.7 How Good is the Regression?
12.8 Analysis of Variance Table and an F Test of the Regression Model
12.9 Residual Analysis
12.10 Prediction Interval and Confidence Interval

12.1 An example
Table 18.1 IL-6 levels in brain and serum (pg/ml) of 10 patients with subarachnoid hemorrhage

Patient i    Serum IL-6 (pg/ml), x    Brain IL-6 (pg/ml), y
1            22.4                     134.0
2            51.6                     167.0
3            58.1                     132.3
4            25.1                      80.2
5            65.9                     100.0
6            79.7                     139.1
7            75.3                     187.2
8            32.4                      97.2
9            96.4                     192.3
10           85.7                     199.4

Scatterplot
This scatterplot locates pairs of observations of serum IL-6 on the x-axis and brain IL-6 on the y-axis. We notice that:
Larger (smaller) values of brain IL-6 tend to be associated with larger (smaller) values of serum IL-6.
The scatter of points tends to be distributed around a positively sloped straight line.
The pairs of values of serum IL-6 and brain IL-6 are not located exactly on a straight line.
The scatterplot reveals a more or less strong tendency rather than a precise linear relationship. The line represents the nature of the relationship on average.

Examples of Other Scatterplots
[Figure: examples of other scatterplots]

12.2 The Simple Linear Regression Model
The population simple linear regression model:
$y = \alpha + \beta x + \varepsilon$, or equivalently $\mu_{y|x} = \alpha + \beta x$
Here $\alpha + \beta x$ is the nonrandom (systematic) component and $\varepsilon$ is the random component.
Where y is the dependent (response) variable, the variable we wish to explain or predict; x is the independent (explanatory) variable, also called the predictor variable; and $\varepsilon$ is the error term, the only random component in the model, and thus the only source of randomness in y.
$\mu_{y|x}$ is the mean of y when x is specified, also called the conditional mean of Y.
$\alpha$ is the intercept of the systematic component of the regression relationship.
$\beta$ is the slope of the systematic component.
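The split into a systematic and a random component can be made concrete with a small simulation. The following is only a sketch: the parameter values (alpha = 70, beta = 1.2, sigma = 30), the x values, and the function name `simulate` are illustrative assumptions, not values from the slides.

```python
import random

def simulate(xs, alpha, beta, sigma, seed=0):
    """Generate y = alpha + beta*x + eps, with eps ~ N(0, sigma^2), for fixed x values.

    All parameter values passed in are hypothetical; they are not estimates from the slides.
    """
    rng = random.Random(seed)
    means = [alpha + beta * x for x in xs]           # systematic component: conditional mean of y
    errors = [rng.gauss(0.0, sigma) for _ in xs]     # random component: the error term
    ys = [m + e for m, e in zip(means, errors)]      # observed value = mean + error
    return means, errors, ys

xs = [20, 40, 60, 80, 100]
means, errors, ys = simulate(xs, alpha=70.0, beta=1.2, sigma=30.0)
for x, m, e, y in zip(xs, means, errors, ys):
    print(f"x={x:5.1f}  mean={m:7.2f}  error={e:7.2f}  y={y:7.2f}")
```

Because x is treated as fixed, rerunning with a different seed changes only the error column, and therefore y.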

Picturing the Simple Linear Regression Model
The simple linear regression model posits an exact linear relationship between the expected or average value of Y, the dependent variable, and X, the independent or predictor variable:
$\mu_{y|x} = \alpha + \beta x$
Actual observed values of Y (y) differ from the expected value ($\mu_{y|x}$) by an unexplained or random error ($\varepsilon$):
$y = \mu_{y|x} + \varepsilon = \alpha + \beta x + \varepsilon$
[Figure: regression plot of the line $\mu_{y|x} = \alpha + \beta x$ with intercept $\alpha$ and slope $\beta$, showing the error $\varepsilon$ as the gap between an observed y and the line at x; axes X and Y]

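Errors in Regression
$y = a + bx + e$
[Figure: an observed point $(x_i, y_i)$ and its error, the vertical gap between the point and the regression line; axes X and Y]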

12.3 Estimation: The Method of Least Squares
The sum of squared errors in regression is:
$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
(SSE: error, or residual, sum of squares)
The least squares regression line is that which minimizes the SSE with respect to the estimates a and b.
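The minimization can be sketched in a few lines of Python using the IL-6 data from Section 12.1. The closed-form solution used here (b = Sxy / Sxx, a = mean(y) - b * mean(x)) is the standard textbook result; the slides' own formulas were not recoverable, and the names `least_squares` and `sse_for` are illustrative.

```python
# IL-6 data from the table in Section 12.1
serum = [22.4, 51.6, 58.1, 25.1, 65.9, 79.7, 75.3, 32.4, 96.4, 85.7]        # x
brain = [134.0, 167.0, 132.3, 80.2, 100.0, 139.1, 187.2, 97.2, 192.3, 199.4]  # y

def least_squares(x, y):
    """Return (a, b), the intercept and slope that minimize SSE."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)                        # spread of x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # co-variation of x and y
    b = sxy / sxx              # slope estimate
    a = y_bar - b * x_bar      # intercept estimate: the line passes through (x_bar, y_bar)
    return a, b

def sse_for(a, b, x, y):
    """Sum of squared errors of the line y_hat = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

a, b = least_squares(serum, brain)
residuals = [yi - (a + b * xi) for xi, yi in zip(serum, brain)]
sse = sse_for(a, b, serum, brain)
print(f"intercept a = {a:.3f}, slope b = {b:.4f}, SSE = {sse:.2f}")
```

Any other choice of a or b gives a larger SSE for the same data, which is what "least squares" means.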

Example 12-1

Example 12-1: Using a Computer (Excel)
The results at the bottom are the output created by selecting the REGRESSION option from the DATA ANALYSIS toolkit. After a full installation of Office, the Data Analysis add-in can be installed from the Tools menu under Add-Ins.

Total Variance and Error Variance
[Figure: two views of the same scatterplot; axes X and Y]
What you see when looking at the total variation of Y.
What you see when looking along the regression line at the error variance of Y.

12.4 Error Variance and the Standard Errors of Regression Estimators
Square and sum all regression errors to find SSE.
[Figure: scatterplot with regression line and regression errors; axes X and Y]

Standard Errors of Estimates in Regression
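The slide's formulas did not survive extraction, so the sketch below uses the standard textbook expressions, stated here as an assumption about what the slide showed: error variance estimate s^2 = SSE / (n - 2), s(b) = s / sqrt(Sxx), and s(a) = s * sqrt(1/n + mean(x)^2 / Sxx). It is applied to the IL-6 data from Section 12.1.

```python
import math

# IL-6 data from the table in Section 12.1
serum = [22.4, 51.6, 58.1, 25.1, 65.9, 79.7, 75.3, 32.4, 96.4, 85.7]        # x
brain = [134.0, 167.0, 132.3, 80.2, 100.0, 139.1, 187.2, 97.2, 192.3, 199.4]  # y

n = len(serum)
x_bar = sum(serum) / n
y_bar = sum(brain) / n
sxx = sum((x - x_bar) ** 2 for x in serum)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(serum, brain))

b = sxy / sxx                 # least squares slope
a = y_bar - b * x_bar         # least squares intercept

sse = sum((y - (a + b * x)) ** 2 for x, y in zip(serum, brain))
mse = sse / (n - 2)           # estimated error variance, n - 2 degrees of freedom
s = math.sqrt(mse)            # standard error of the regression

se_b = s / math.sqrt(sxx)                         # standard error of the slope
se_a = s * math.sqrt(1 / n + x_bar ** 2 / sxx)    # standard error of the intercept

print(f"MSE = {mse:.2f}, s = {s:.3f}")
print(f"s(a) = {se_a:.3f}, s(b) = {se_b:.4f}")
```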

12.5 Confidence Intervals for the Regression Parameters
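As a hedged illustration, 95% confidence intervals of the usual form "estimate ± t × standard error" can be computed for the IL-6 data of Section 12.1. The critical value 2.306 is the two-sided 95% point of the t distribution with n - 2 = 8 degrees of freedom, taken from a standard t table and hard-coded here as an assumption; the standard-error formulas are the usual textbook ones, not recovered from the slide.

```python
import math

# IL-6 data from the table in Section 12.1
serum = [22.4, 51.6, 58.1, 25.1, 65.9, 79.7, 75.3, 32.4, 96.4, 85.7]        # x
brain = [134.0, 167.0, 132.3, 80.2, 100.0, 139.1, 187.2, 97.2, 192.3, 199.4]  # y

T_CRIT = 2.306  # assumed table value: t(0.975) with 8 degrees of freedom

n = len(serum)
x_bar = sum(serum) / n
y_bar = sum(brain) / n
sxx = sum((x - x_bar) ** 2 for x in serum)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(serum, brain))
b = sxy / sxx
a = y_bar - b * x_bar

sse = sum((y - (a + b * x)) ** 2 for x, y in zip(serum, brain))
s = math.sqrt(sse / (n - 2))
se_b = s / math.sqrt(sxx)
se_a = s * math.sqrt(1 / n + x_bar ** 2 / sxx)

# Interval = point estimate plus or minus critical value times standard error
ci_b = (b - T_CRIT * se_b, b + T_CRIT * se_b)
ci_a = (a - T_CRIT * se_a, a + T_CRIT * se_a)

print(f"95% CI for slope:     ({ci_b[0]:.3f}, {ci_b[1]:.3f})")
print(f"95% CI for intercept: ({ci_a[0]:.2f}, {ci_a[1]:.2f})")
```

If the interval for the slope excludes 0, that agrees with rejecting H0: slope = 0 at the 5% level.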

12.6 Hypothesis Tests about the Regression Relationship
[Figure: three scatterplots, each consistent with $H_0: \beta = 0$: Constant Y, Unsystematic Variation, Nonlinear Relationship; axes X and Y]
A hypothesis test for the existence of a linear relationship between X and Y:
$H_0: \beta = 0$
$H_1: \beta \neq 0$
Test statistic for the existence of a linear relationship between X and Y:
$t = \frac{b}{s_b}$
where b is the least-squares estimate of the regression slope and $s_b$ is the standard error of b. When the null hypothesis is true, the statistic has a t distribution with n - 2 degrees of freedom.

Hypothesis Tests for the Regression Slope
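A minimal sketch of the slope test on the IL-6 data from Section 12.1 follows. It assumes the usual standard error s(b) = s / sqrt(Sxx) and compares |t| with 2.306, the two-sided 5% critical value for 8 degrees of freedom from a standard t table (hard-coded as an assumption, since the Python standard library has no t quantile function).

```python
import math

# IL-6 data from the table in Section 12.1
serum = [22.4, 51.6, 58.1, 25.1, 65.9, 79.7, 75.3, 32.4, 96.4, 85.7]        # x
brain = [134.0, 167.0, 132.3, 80.2, 100.0, 139.1, 187.2, 97.2, 192.3, 199.4]  # y

T_CRIT = 2.306  # assumed table value: t(0.975) with n - 2 = 8 degrees of freedom

n = len(serum)
x_bar = sum(serum) / n
y_bar = sum(brain) / n
sxx = sum((x - x_bar) ** 2 for x in serum)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(serum, brain))
b = sxy / sxx
a = y_bar - b * x_bar

sse = sum((y - (a + b * x)) ** 2 for x, y in zip(serum, brain))
se_b = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)

t_stat = b / se_b                    # test statistic under H0: slope = 0
reject_h0 = abs(t_stat) > T_CRIT     # two-sided test at the 5% level

print(f"t = {t_stat:.3f} with {n - 2} degrees of freedom")
print("reject H0" if reject_h0 else "do not reject H0")
```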

12.7 How Good is the Regression?
The coefficient of determination, $R^2$, is a descriptive measure of the strength of the regression relationship, a measure of how well the regression line fits the data.
[Figure: for one observation, the total deviation split into the explained deviation and the unexplained deviation; axes X and Y]
$R^2$ is the percentage of total variation explained by the regression:
$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$

The Coefficient of Determination
[Figure: three scatterplots showing how SST splits into SSR and SSE, with $R^2 = 0$, $R^2 = 0.50$, and $R^2 = 0.90$; axes X and Y]
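The decomposition SST = SSR + SSE behind the coefficient of determination can be checked numerically. This sketch uses the IL-6 data from Section 12.1; the variable names are illustrative.

```python
# IL-6 data from the table in Section 12.1
serum = [22.4, 51.6, 58.1, 25.1, 65.9, 79.7, 75.3, 32.4, 96.4, 85.7]        # x
brain = [134.0, 167.0, 132.3, 80.2, 100.0, 139.1, 187.2, 97.2, 192.3, 199.4]  # y

n = len(serum)
x_bar = sum(serum) / n
y_bar = sum(brain) / n
sxx = sum((x - x_bar) ** 2 for x in serum)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(serum, brain))
b = sxy / sxx
a = y_bar - b * x_bar
fitted = [a + b * x for x in serum]

sst = sum((y - y_bar) ** 2 for y in brain)                 # total deviation, squared and summed
ssr = sum((f - y_bar) ** 2 for f in fitted)                # explained by the regression
sse = sum((y - f) ** 2 for y, f in zip(brain, fitted))     # unexplained (residual)

r_squared = ssr / sst
print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"R^2 = {r_squared:.4f}")
```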

12.8 Analysis of Variance Table and an F Test of the Regression Model
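The slide's table was not recoverable, so the following is a sketch of the usual ANOVA layout for simple linear regression (regression with 1 degree of freedom, error with n - 2, total with n - 1), computed for the IL-6 data of Section 12.1. The critical value 5.32 is the 5% point of F(1, 8) from a standard F table and is hard-coded as an assumption.

```python
# IL-6 data from the table in Section 12.1
serum = [22.4, 51.6, 58.1, 25.1, 65.9, 79.7, 75.3, 32.4, 96.4, 85.7]        # x
brain = [134.0, 167.0, 132.3, 80.2, 100.0, 139.1, 187.2, 97.2, 192.3, 199.4]  # y

F_CRIT = 5.32  # assumed table value: F(0.95) with (1, 8) degrees of freedom

n = len(serum)
x_bar = sum(serum) / n
y_bar = sum(brain) / n
sxx = sum((x - x_bar) ** 2 for x in serum)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(serum, brain))
b = sxy / sxx
a = y_bar - b * x_bar

sst = sum((y - y_bar) ** 2 for y in brain)
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(serum, brain))
ssr = sst - sse

msr = ssr / 1           # mean square for regression (1 degree of freedom)
mse = sse / (n - 2)     # mean square error (n - 2 degrees of freedom)
f_stat = msr / mse

print(f"{'Source':<12}{'SS':>12}{'df':>5}{'MS':>12}{'F':>9}")
print(f"{'Regression':<12}{ssr:>12.2f}{1:>5}{msr:>12.2f}{f_stat:>9.3f}")
print(f"{'Error':<12}{sse:>12.2f}{n - 2:>5}{mse:>12.2f}")
print(f"{'Total':<12}{sst:>12.2f}{n - 1:>5}")
print("reject H0" if f_stat > F_CRIT else "do not reject H0")
```

In simple linear regression the F statistic equals the square of the t statistic for the slope, so the two tests always agree.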

12.9 Residual Analysis
[Figure: four residual plots, with residuals on the vertical axis]
Homoscedasticity: Residuals appear completely random. No indication of model inadequacy.
Curved pattern in residuals resulting from an underlying nonlinear relationship.
Residuals exhibit a linear trend with time.
Heteroscedasticity: Variance of residuals changes when x changes.

Assumptions of the Simple Linear Regression Model
The relationship between X and Y is a straight-line (Linear) relationship.
The values of the independent variable X are assumed fixed (not random); the only randomness in the values of Y comes from the error term $\varepsilon$.
The errors $\varepsilon$ are uncorrelated (i.e., Independent) in successive observations.
The errors $\varepsilon$ are Normally distributed with mean 0 and variance $\sigma^2$ (Equal variance). That is: $\varepsilon \sim N(0, \sigma^2)$.
LINE assumptions of the Simple Linear Regression Model
[Figure: identical normal distributions of errors, $N(\mu_{y|x}, \sigma_{y|x}^2)$, all centered on the regression line $\mu_{y|x} = \alpha + \beta x$; axes X and Y]

12.10 Prediction Interval and Confidence Interval
Point Prediction
A single-valued estimate of Y for a given value of X, obtained by inserting the value of X in the estimated regression equation.
Prediction Interval
For an individual value of Y given a value of X, the interval reflects two sources of variation:
Variation in the regression line estimate
Variation of points around the regression line
Confidence Interval
For the average value of Y given a value of X, the interval reflects only the variation in the regression line estimate.

Confidence Interval for the Average Value of Y and Prediction Interval for the Individual Value of Y
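The two intervals can be compared numerically. This sketch uses the standard textbook half-widths, assumed here because the slide's formulas were lost: t × s × sqrt(1/n + (x0 - mean(x))^2 / Sxx) for the average value of Y, and the same expression with an extra 1 under the square root for an individual value. The data are the IL-6 values from Section 12.1, the query point x0 = 60 is hypothetical, and 2.306 is the table value of t(0.975) with 8 degrees of freedom.

```python
import math

# IL-6 data from the table in Section 12.1
serum = [22.4, 51.6, 58.1, 25.1, 65.9, 79.7, 75.3, 32.4, 96.4, 85.7]        # x
brain = [134.0, 167.0, 132.3, 80.2, 100.0, 139.1, 187.2, 97.2, 192.3, 199.4]  # y

T_CRIT = 2.306  # assumed table value: t(0.975) with n - 2 = 8 degrees of freedom

n = len(serum)
x_bar = sum(serum) / n
y_bar = sum(brain) / n
sxx = sum((x - x_bar) ** 2 for x in serum)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(serum, brain))
b = sxy / sxx
a = y_bar - b * x_bar
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(serum, brain))
s = math.sqrt(sse / (n - 2))

def intervals(x0):
    """Return (point prediction, CI for the mean of Y, PI for an individual Y) at x0."""
    y_hat = a + b * x0
    line_part = 1 / n + (x0 - x_bar) ** 2 / sxx          # variation in the regression line estimate
    half_mean = T_CRIT * s * math.sqrt(line_part)        # half-width for the average value of Y
    half_pred = T_CRIT * s * math.sqrt(1 + line_part)    # adds variation of points around the line
    return (y_hat,
            (y_hat - half_mean, y_hat + half_mean),
            (y_hat - half_pred, y_hat + half_pred))

y_hat, ci_mean, pi_indiv = intervals(60.0)  # x0 = 60 is a hypothetical serum IL-6 value
print(f"point prediction: {y_hat:.2f}")
print(f"95% CI for the average Y: ({ci_mean[0]:.2f}, {ci_mean[1]:.2f})")
print(f"95% PI for an individual Y: ({pi_indiv[0]:.2f}, {pi_indiv[1]:.2f})")
```

Both intervals are narrowest at x0 = mean(x) and widen as x0 moves away from it, which produces the two curved bands around the fitted line.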

Summary
1. Regression analysis is applied for prediction and for controlling the effect of the independent variable X.
2. The principle of least squares in solving for the regression parameters is to minimize the residual sum of squares.
3. The coefficient of determination, $R^2$, is a descriptive measure of the strength of the regression relationship.
4. There are two confidence bands: one for mean predictions and the other for individual prediction values.
5. Residual analysis is used to check the conditions under which the model holds.

Assignments
1. What are the main distinctions and associations between correlation analysis and simple linear regression?
2. What is the least squares method for estimating a regression line?
3. Please describe the main steps for fitting a simple linear regression model to data.

Main distinctions
1. Data source: correlation analysis requires that both x and y follow a normal distribution; simple linear regression requires only that y follow a normal distribution.
2. Application: correlation analysis is employed to measure the association between two random variables (x and y are treated symmetrically); simple linear regression is employed to measure the change in y for a change in x (x is the independent variable, y is the dependent variable).
3. r is a dimensionless number with no unit of measurement; b has units, namely units of y per unit of x.

Main associations
1. $t_r = t_b$: the t statistic for testing the correlation coefficient r equals the t statistic for testing the slope b.
2. r and b have the same sign.
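Both associations can be checked numerically on the IL-6 data from Section 12.1. The sketch assumes the usual test statistic for a correlation coefficient, t_r = r × sqrt(n - 2) / sqrt(1 - r^2), and the usual slope statistic t_b = b / s(b); neither formula appears explicitly in the slides.

```python
import math

# IL-6 data from the table in Section 12.1
serum = [22.4, 51.6, 58.1, 25.1, 65.9, 79.7, 75.3, 32.4, 96.4, 85.7]        # x
brain = [134.0, 167.0, 132.3, 80.2, 100.0, 139.1, 187.2, 97.2, 192.3, 199.4]  # y

n = len(serum)
x_bar = sum(serum) / n
y_bar = sum(brain) / n
sxx = sum((x - x_bar) ** 2 for x in serum)
syy = sum((y - y_bar) ** 2 for y in brain)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(serum, brain))

# Correlation side: r treats x and y symmetrically
r = sxy / math.sqrt(sxx * syy)
t_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Regression side: slope of y on x and its standard error
b = sxy / sxx
a = y_bar - b * x_bar
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(serum, brain))
se_b = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
t_b = b / se_b

print(f"r = {r:.4f}, b = {b:.4f}")
print(f"t_r = {t_r:.4f}, t_b = {t_b:.4f}")
```

The two t values agree up to floating-point rounding, and r and b share the sign of Sxy because both are Sxy divided by a positive quantity.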