Module 19: Simple Linear Regression. This module focuses on simple linear regression and thus begins the process of exploring one of the most widely used and powerful statistical tools. Reviewed 11 May 05 /MODULE 19

An ophthalmologist who is assessing intraocular pressures as a part of a community program for the prevention of glaucoma is interested in using a portable device (Tono-Pen) for making these measurements. An important question is how well the measurements made with this device compare to those made with a more standard device (Goldman) used in clinical settings. To address this question, the ophthalmologist compared the two devices by using each on n = 40 eyes. For this comparison, each eye was measured once with each device. Goldman-Tono-Pen Example

Goldman-Tono-Pen Example Data

One approach to comparing the two devices would be to do a paired t-test, which would be appropriate since the measurements made by the two devices on the same eyes could not be considered independent, and since the differences between the two measurements are of interest. Comparing the Two Devices
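A sketch of this paired comparison in code. The pressure readings here are hypothetical illustrations, not the module's actual n = 40 data set:

```python
import math

def paired_t(x, y):
    """Paired t statistic: t = mean(d) / SE(d), where d = x - y.
    Returns (t, df) with df = n - 1."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((di - mean_d) ** 2 for di in d) / (n - 1)
    se_d = math.sqrt(var_d / n)
    return mean_d / se_d, n - 1

# Hypothetical paired readings (mmHg); the module's actual data are n = 40 eyes.
goldman = [14, 16, 15, 18, 20, 17, 15, 19]
tonopen = [13, 17, 14, 17, 21, 16, 14, 18]
t, df = paired_t(goldman, tonopen)
```

The computed t would then be compared with the two-sided critical value ±t(df), exactly as in the worksheet and test steps that follow.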

Goldman-Tono-Pen Worksheet

Columns: ID, X = G (Goldman), Y = T (Tono-Pen), d = G − T, d²
n = 40; Sum(x²): 16,497 (Goldman), 15,… (Tono-Pen); per-eye values, sums, means, and SDs were given in the slide table
t = mean(d)/SE(d) = 1.58
df = n − 1 = 39
t0.975(39) = 2.02

1. The hypothesis: H0: δ = μG − μT = 0 vs H1: δ ≠ 0
2. The assumptions: the differences are a random sample from a normal distribution
3. The α-level: α = 0.05
4. The test statistic: t = mean(d)/SE(d)
5. The rejection region: reject H0 if t is not between ±t0.975(39) = ±2.02
6. The result: t = 1.58
7. The conclusion: accept H0: δ = μG − μT = 0, since t is between ±2.02

Hence, from this standpoint, we do not have compelling evidence that the two devices are measuring intraocular pressures differently. Is this a sufficient assessment of the situation, or should we look further?

One way to look further at this situation is to think about the relationship between the measurements made by the two machines in terms of simple linear regression. In this context, we would wonder if higher values on one machine more directly imply higher values on the other. Simple linear regression focuses on a possible straight line relationship between the measurements made by the two machines. Looking Further

In general, simple linear regression finds the best straight line for describing the relationship between two variables. In its simplest form, which is what we consider here, it does not do a very good job of assessing how well the line describes the data, but nevertheless provides useful information. Simple Linear Regression Concepts

a = intercept, that is, the point where the line crosses the y-axis, which is the value of y at x = 0.
b = slope of the regression line, that is, the number of units of increase (positive slope) or decrease (negative slope) in y for each unit increase in x.
[Figure: regression line crossing the y-axis at a; x-axis: independent variable, y-axis: dependent variable]

The Regression Line: ŷ = a + bx

The context for simple linear regression is that we have a random sample of persons from a set of well-defined populations, each defined by a specific value of the x-variable. We also have measurements of another variable, the y-variable, so that we have two variables for each person. For simple linear regression, we focus on a straight line that depicts the relationship between these two variables. The best straight line is the one for which the sum of the squared vertical distances of each point from the line is the least. This "least squares" line has slope b = SS(xy)/SS(x) and intercept a = ȳ − b·x̄.

For this situation, the sample line ŷ = a + bx is an estimate of the population line, and a and b are estimates of α and β respectively. For a specific value of x, such as x = 10, the value for y calculated from the regression equation is ŷ = a + b(10), which is called the regression estimate of Y at the value x = 10.
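The least-squares formulas above can be sketched directly in code. The (x, y) pairs below are hypothetical, chosen only to illustrate the calculation:

```python
def least_squares(x, y):
    """Least-squares fit: b = SS(xy) / SS(x), a = ybar - b * xbar."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    ss_x = sum((xi - xbar) ** 2 for xi in x)
    ss_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = ss_xy / ss_x
    a = ybar - b * xbar
    return a, b

# Hypothetical (x, y) pairs for illustration only.
x = [0, 5, 10, 15, 20]
y = [72, 70, 69, 67, 66]
a, b = least_squares(x, y)
y_hat_10 = a + b * 10  # regression estimate of Y at x = 10
```

For these illustrative data the fitted line slopes downward, so the regression estimate at x = 10 falls between the observed y-values on either side of it.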

Simple Regression Example The following data are diastolic blood pressure (DBP) measurements taken at different times after an intervention for n = 5 persons. For each person, the data available include the time of the measurement and the DBP level. Of interest is the relationship between these two variables.

Worksheet columns: Patient, x (time), x², y (DBP), y², xy; n = 5, with per-patient rows and Sum and Mean entries for each column (the numeric values were given in the slide table).

Example: AJPH, Dec. 2003; 93:

Never Smoking Regression Worksheet

For the never-smoking data, the slopes are:

The intercepts are:

The best lines are:

y_male = … + … x
y_female = … + … x

Regression ANOVA. If the regression line is flat, in the sense that the regression estimate of Y, namely ŷ, is the same for all values of x, then there is no gain from considering the x-variable, since it has no impact on ŷ. This situation occurs when the estimated slope b = 0. An important question is whether or not the population parameter β = 0, that is, whether the truth is that there is no linear relationship between y and x. To test this, we can proceed with a formal test.

1. The hypothesis: H0: β = 0 vs H1: β ≠ 0
2. The α-level: α = 0.05
3. The assumptions: random normal samples for the y-variable from populations defined by the x-variable
4. The test statistic: F = MS(Regression)/MS(Residual) = [SS(Reg)/1] / [SS(Res)/(n − 2)]
5. The rejection region: reject H0: β = 0 if the value calculated for F is greater than F0.95(1, n − 2)

R² is the proportion of the total variation in the dependent variable y explained by its regression relationship with x, R² = SS(Reg)/SS(Total).
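The regression ANOVA quantities and R² can be sketched as follows, again with hypothetical data; the partition SS(Total) = SS(Reg) + SS(Res), with SS(Reg) = SS(xy)²/SS(x), is the standard one underlying the worksheet notation used later in this module:

```python
def regression_anova(x, y):
    """Partition SS(Total) = SS(Reg) + SS(Res); return (F, R^2).
    F = MS(Reg) / MS(Res) with 1 and n - 2 degrees of freedom."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    ss_x = sum((xi - xbar) ** 2 for xi in x)
    ss_y = sum((yi - ybar) ** 2 for yi in y)          # SS(Total)
    ss_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    ss_reg = ss_xy ** 2 / ss_x                        # regression SS, 1 df
    ss_res = ss_y - ss_reg                            # residual SS, n - 2 df
    f = ss_reg / (ss_res / (n - 2))
    r2 = ss_reg / ss_y
    return f, r2

# Hypothetical data for illustration only.
x = [0, 5, 10, 15, 20]
y = [72, 70, 69, 67, 66]
f_stat, r2 = regression_anova(x, y)
```

The computed F would then be compared with F0.95(1, n − 2), matching the rejection rule in the test steps above.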

Blood Pressure Example

We can apply these tools to the Goldman-Tono-Pen example. Note that while we test the null hypothesis H0: β = 0, it is of little interest here, as it is not a very meaningful hypothesis. Goldman-Tono-Pen Example

Create a new table

1. The hypothesis: H0: β = 0 vs H1: β ≠ 0
2. The assumptions: random samples, x measured without error, y normally distributed for each level of x
3. The α-level: α = 0.05
4. The test statistic: ANOVA
5. The rejection region: reject H0: β = 0 if F > F0.95(1, n − 2)
Regression ANOVA – Goldman-Tono-Pen Example

Example: AJPH, Aug. 1999; 89:

y = … + … x
At x = 45, y = …
r = 0.70

Regression ANOVA – Social Capital and Self-Rated Health Example
1. The hypothesis: H0: β = 0 vs H1: β ≠ 0
2. The assumptions: random samples, x measured without error, y normally distributed for each level of x
3. The α-level: α = 0.05
4. The test statistic: ANOVA
5. The rejection region: reject H0: β = 0 if F > F0.95(1, n − 2)

Example: AJPH, July 1999; 89:

Socioeconomic Environment and Adult Health Example

Men: SS(x) = …, SS(y) = …, SS(xy) = …, b = 1.38, a = …, r = …, SS(Reg) = …, SS(Res) = …, SS(Total) = …
Women: SS(x) = …, SS(y) = …, SS(xy) = …, b = 1.56, a = …, r = …, SS(Reg) = …, SS(Res) = …, SS(Total) = …
Socioeconomic Environment and Adult Health Example

1. The hypothesis: Men: H0: β = 0 vs H1: β ≠ 0; Women: H0: β = 0 vs H1: β ≠ 0
2. The assumptions: Men: random samples, x measured without error, y normally distributed for each level of x; Women: the same as for men
3. The α-level: Men: α = 0.05; Women: α = 0.05
4. The test statistic: Men: ANOVA; Women: ANOVA
5. The rejection region: Men: reject H0: β = 0 if F > F0.95(1, n − 2); Women: the same as for men
Socioeconomic Environment and Adult Health Example

The result (ANOVA, men and women; the numeric SS, MS, and F values were given side by side on the slide):
Source       df    SS    MS    F
Regression    1     …     …    …
Residual     11     …     …
Total        12     …
The conclusion: Reject H0: β = 0, since F > F0.95(1, 11) = 4.08.
Regression ANOVA – Socioeconomic Environment and Adult Health Example