Reading – Linear Regression: Le (Chapter 8 through 8.1.6); C & S (Chapter 5: F, G, H)

Issues with hypothesis testing
Significance does not imply causality
  – Need a proper prospective experiment
Significance does not imply practical importance
  – Trivial but significant differences
Run lots of tests, and some will show a significant difference by chance
  – With α = 0.05, expect 1 in 20 results to be significant by chance when no real effect exists
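The "1 in 20" point can be made concrete with a short sketch. It assumes the tests are independent and that every null hypothesis is true, which the slide does not state; the function names are illustrative only.

```python
def expected_false_positives(num_tests, alpha=0.05):
    """Expected number of significant results when every null hypothesis is true."""
    return num_tests * alpha


def prob_at_least_one_false_positive(num_tests, alpha=0.05):
    """Chance of one or more significant results, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for m in (1, 5, 20):
        print(m, expected_false_positives(m), round(prob_at_least_one_false_positive(m), 3))
```

Under these assumptions, 20 tests give one expected false positive and roughly a 64% chance of at least one.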

Issues with hypothesis testing
Large p-values because sample size is small
  – An effect could exist, but the sample may not be large enough to detect it
Outliers may cause problems, especially in small samples

Issues with hypothesis testing
What is the population of inference?
Example: A statistics class of n = 15 women and n = 5 men yields the following exam scores:
  Women: mean = 90%, SD = 10%
  Men: mean = 85%, SD = 11%
Test the hypothesis that women did better on the exam than men.
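One way to work the exam-score example is a pooled two-sample t statistic computed from the summary figures on the slide. This is a sketch that assumes equal population variances and a one-sided test at α = 0.05; the slide does not say which test was intended, and the critical value 1.734 is the standard t-table entry for 18 degrees of freedom.

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled variance (equal-variance assumption)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_error, df


# Summary statistics from the slide: women first, men second.
t_stat, df = pooled_t_statistic(90.0, 10.0, 15, 85.0, 11.0, 5)

# One-sided 5% critical value for 18 degrees of freedom, taken from a t table.
T_CRITICAL_ONE_SIDED_05_DF18 = 1.734
significant = t_stat > T_CRITICAL_ONE_SIDED_05_DF18

if __name__ == "__main__":
    print(f"t = {t_stat:.3f} on {df} df; significant at 0.05: {significant}")
```

The statistic falls well short of the critical value, so under these assumptions the 5-point difference is not significant. The question of which population the class represents remains open either way.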

If the 95% CI for a difference excludes 0, then the two-sided p-value will be < 0.05.

Linear Regression
Investigate the relationship between two variables
  – Does blood pressure relate to age?
  – Does weight loss relate to blood pressure loss?
  – Does income relate to education?
  – Do sales relate to years of experience?
Dependent variable
  – The variable that is being predicted or explained
Independent variable
  – The variable that is doing the predicting or explaining
Think of data in pairs $(x_i, y_i)$

Linear Regression - Purpose
Is there an association between the two variables?
  – Does weight change relate to BP change?
Estimation of impact
  – How much BP change occurs per pound of weight change?
Prediction
  – If a person loses 10 pounds, how much of a drop in blood pressure can be expected?

Regression History
Sir Francis Galton studied the relationship between a father's height and his son's height. He found that although the two heights were related, the relationship was not perfect. If the father was above average in height, so (typically) was the son, but not by as much. This was called regression to the mean.

Example of Regression Equation
We know systolic BP increases with age. How much does it increase per year, and is the increase constant over time?
SBP = 90 + 0.8*AGE
  Y or Dependent Variable: SBP
  X or Independent Variable: AGE
Interpretation: For each year of age, SBP increases by 0.8 mmHg.
At age 50: SBP = 90 + 0.8*50 = 130 mmHg
At age 60: SBP = 90 + 0.8*60 = 138 mmHg
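The prediction step can be sketched in a few lines. The slope of 0.8 mmHg per year comes from the slide; the intercept of 90 is inferred from the slide's own figures (130 mmHg at age 50 and 138 mmHg at age 60) and should be read as an assumption.

```python
INTERCEPT_MMHG = 90.0       # inferred: 130 - 0.8 * 50
SLOPE_MMHG_PER_YEAR = 0.8   # stated on the slide


def predict_sbp(age_years):
    """Predicted systolic blood pressure (mmHg) for a given age."""
    return INTERCEPT_MMHG + SLOPE_MMHG_PER_YEAR * age_years


if __name__ == "__main__":
    for age in (50, 60):
        print(age, predict_sbp(age))
```

Because the model is a straight line, any 10-year gap in age corresponds to the same 8 mmHg difference in predicted SBP.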

Simple Linear Regression Equation
The simple linear regression equation is: $E(y) = \beta_0 + \beta_1 x$
The graph of the regression equation is a straight line.
$\beta_0$ is the y intercept of the regression line.
$\beta_1$ is the slope of the regression line.
$E(y)$ is the mean value of y for a given x value.

Simple Linear Regression Model
The equation that describes how y is related to x and an error term is called the regression model.
The simple linear regression model is: $y = \beta_0 + \beta_1 x + \varepsilon$
$\beta_0$ and $\beta_1$ are called parameters of the model.
$\varepsilon$ is a random variable called the error term.

Simple Linear Regression Equation: Positive Linear Relationship
[Figure: regression line of $E(y)$ against x with intercept $\beta_0$; the slope $\beta_1$ is positive]

Simple Linear Regression Equation: Negative Linear Relationship
[Figure: regression line of $E(y)$ against x with intercept $\beta_0$; the slope $\beta_1$ is negative]

Simple Linear Regression Equation: No Relationship
[Figure: regression line of $E(y)$ against x with intercept $\beta_0$; the slope $\beta_1$ is 0]

Estimated Simple Linear Regression Equation
The estimated simple linear regression equation is: $\hat{y} = b_0 + b_1 x$
The graph is called the estimated regression line.
$b_0$ is the y intercept of the line.
$b_1$ is the slope of the line.
$\hat{y}$ is the estimated value of y for a given x value.

Estimation Process
Regression model: $y = \beta_0 + \beta_1 x + \varepsilon$
Regression equation: $E(y) = \beta_0 + \beta_1 x$
Unknown parameters: $\beta_0$, $\beta_1$
Sample data: pairs $(x_1, y_1), \ldots, (x_n, y_n)$
Estimated regression equation: $\hat{y} = b_0 + b_1 x$
Sample statistics: $b_0$, $b_1$
$b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$.

Least Squares Method
Least squares criterion: choose $\beta_0$ and $\beta_1$ to minimize
$S = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$
where:
$y_i$ = observed value of the dependent variable for the ith observation
$x_i$ = observed value of the independent variable for the ith observation
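The step from the criterion to the estimates is standard calculus and can be sketched as follows. The derivation is not taken from the slides; $\bar{x}$ and $\bar{y}$ denote the sample means, and $b_0$, $b_1$ are the minimizing values.

```latex
\begin{align*}
S(\beta_0, \beta_1) &= \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 \\
\frac{\partial S}{\partial \beta_0} &= -2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0 \\
\frac{\partial S}{\partial \beta_1} &= -2 \sum_{i=1}^{n} x_i \, (y_i - \beta_0 - \beta_1 x_i) = 0 \\
\text{First equation:}\quad b_0 &= \bar{y} - b_1 \bar{x} \\
\text{Substituting into the second:}\quad
b_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\end{align*}
```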

Estimation

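The Least Squares Estimates
Slope: $b_1 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
Intercept: $b_0 = \bar{y} - b_1 \bar{x}$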
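The slope and intercept estimates can be computed directly from paired data. The sketch below uses the usual sum-of-cross-products form; the five data points are made up for illustration and are not the restaurant data from the slides.

```python
def least_squares_fit(x, y):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1 * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross products and sum of squared deviations of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical toy data, for illustration only.
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]
b0, b1 = least_squares_fit(toy_x, toy_y)

if __name__ == "__main__":
    print(f"y-hat = {b0:.2f} + {b1:.2f} * x")
```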

Example
Restaurant    Student Population (Thousands)    Quarterly Sales

X-Y PLOT OF DATA

Calculations
Obs    Xi    Yi    Xi – XBAR    Yi – YBAR    (Xi – XBAR)*(Yi – YBAR)    (Xi – XBAR)^2
Tot

Estimates for Dataset
b1 = 2840/568 = 5
b0 = 130 – 5*14 = 60
Y = Sales; X = # thousands of students
Equation: Y = 60 + 5*X
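The arithmetic on this slide can be checked from the totals alone. In the sketch below the four input numbers come from the slide (2840, 568, and the means 14 and 130); the 16-thousand-student prediction is an added illustration, and the variable names are hypothetical.

```python
# Totals and means as given on the slide.
SUM_CROSS_PRODUCTS = 2840.0   # sum of (Xi - XBAR) * (Yi - YBAR)
SUM_SQUARED_DEV_X = 568.0     # sum of (Xi - XBAR)^2
X_BAR = 14.0                  # mean student population (thousands)
Y_BAR = 130.0                 # mean quarterly sales

slope = SUM_CROSS_PRODUCTS / SUM_SQUARED_DEV_X
intercept = Y_BAR - slope * X_BAR


def predict_sales(student_pop_thousands):
    """Predicted quarterly sales for a given student population (thousands)."""
    return intercept + slope * student_pop_thousands


if __name__ == "__main__":
    print(f"Y = {intercept:g} + {slope:g} * X")
    print("Predicted sales at X = 16:", predict_sales(16))
```

Each additional thousand students is associated with 5 more units of quarterly sales, in whatever units the sales column used.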

DATA sales;
  INFILE DATALINES;
  INPUT restaurant studentpop quarsales;
  DATALINES;
;

PROC PRINT DATA=sales;
PROC MEANS DATA=sales;
PROC REG DATA=sales SIMPLE;
  MODEL quarsales = studentpop;
  PLOT quarsales * studentpop;
RUN;

OUTPUT FROM PROC REG
The REG Procedure
Descriptive Statistics
Variable      Sum    Mean    Uncorrected SS    Variance    Standard Deviation
Intercept
studentpop
quarsales

Parameter Estimates
Variable      DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept
studentpop                                                             <.0001
REGRESSION EQUATION: Y = 60 + 5*X
QUARSALES = 60 + 5*STUDENTPOP

The Coefficient of Determination
Relationship among SST, SSR, SSE:
SST = SSR + SSE
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error

The Coefficient of Determination
The coefficient of determination is: $r^2 = \mathrm{SSR}/\mathrm{SST}$
where:
SST = total sum of squares
SSR = sum of squares due to regression
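The decomposition SST = SSR + SSE and the resulting r² can be verified numerically. The sketch below fits a least squares line to made-up toy data (not the restaurant data) and computes each sum of squares from its usual definition.

```python
def sums_of_squares(x, y):
    """Fit a least squares line and return (sst, ssr, sse)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # total
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # due to regression
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # due to error
    return sst, ssr, sse


# Hypothetical toy data, for illustration only.
sst, ssr, sse = sums_of_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r_squared = ssr / sst

if __name__ == "__main__":
    print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}, r^2 = {r_squared:.2f}")
```

For this toy data the line explains 60% of the variation in y.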

OUTPUT FROM PROC REG
Dependent Variable: quarsales
Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1    SSR                                         <.0001
Error               8    SSE
Corrected Total     9    SST
Root MSE, R-Square (the coefficient of determination), Dependent Mean, Coeff Var
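The entries of the analysis of variance table are linked by simple formulas, sketched below for simple linear regression. The degrees of freedom (1, n − 2, n − 1) match the table above for n = 10; the SSR and SSE inputs are hypothetical placeholders, since the numeric output did not survive in the transcript.

```python
import math


def anova_summary(ssr, sse, n):
    """ANOVA quantities for simple linear regression with n observations."""
    df_model = 1
    df_error = n - 2
    msr = ssr / df_model
    mse = sse / df_error
    return {
        "df_model": df_model,
        "df_error": df_error,
        "df_total": n - 1,
        "sst": ssr + sse,
        "mean_square_model": msr,
        "mean_square_error": mse,
        "f_value": msr / mse,
        "root_mse": math.sqrt(mse),
        "r_square": ssr / (ssr + sse),
    }


# Hypothetical sums of squares with n = 10, for illustration only.
summary = anova_summary(ssr=80.0, sse=20.0, n=10)

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name}: {value}")
```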

Your Turn
First value is age; second value is SBP.
Find the regression equation: SBP = b0 + b1*age