Linear regression involves finding the equation of the line of best fit on a scatter graph. The equation obtained can then be used to make an estimate.

Presentation transcript:

Regression

Linear regression involves finding the equation of the line of best fit on a scatter graph. The equation obtained can then be used to make an estimate of one variable given the value of the other variable. There are two cases to consider, depending upon whether:
1. we wish to find a value of y given a value for x, or
2. we want to estimate x given y.
S1 deals with the first situation.


The best fitting line is the one that minimizes the sum of the squared deviations, $\sum d_i^2$, where $d_i$ is the vertical distance between the $i$th point and the line. The distances $d_i$ are sometimes referred to as residuals.

[Figure: scatter graph with the vertical distances $d_1$ to $d_6$ marked between each point and the line.]
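As a rough illustration of the quantity being minimized, the sketch below computes the sum of squared vertical distances for two candidate lines. The three data points and the function name `sum_squared_residuals` are hypothetical and are not taken from the slides.

```python
def sum_squared_residuals(xs, ys, a, b):
    """Return the sum of d_i**2, where d_i = y_i - (a + b * x_i)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data, chosen only to keep the arithmetic small.
xs = [1, 2, 3]
ys = [2, 4, 5]

# The least squares line for this toy data is y = 2/3 + 1.5x.
ssr_least_squares = sum_squared_residuals(xs, ys, a=2 / 3, b=1.5)

# An alternative line through the first two points: y = 2x.
ssr_alternative = sum_squared_residuals(xs, ys, a=0.0, b=2.0)

print(ssr_least_squares, ssr_alternative)
```

The alternative line fits two of the points exactly, yet its sum of squared residuals is larger because of the one point it misses badly.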

As stated previously, the best fitting line should pass through the mean point, $(\bar{x}, \bar{y})$.

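The line that minimizes the sum of squared deviations is formally known as the least squares regression line of y on x. The equation of the least squares regression line of y on x is:

$$y = a + bx$$

where:

$$b = \frac{S_{xy}}{S_{xx}} \quad \text{and} \quad a = \bar{y} - b\bar{x}$$

Recall:

$$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} \quad \text{and} \quad S_{xx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n}$$

b is sometimes referred to as the regression coefficient.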
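A minimal sketch of the calculation, assuming the standard least squares expressions b = S_xy / S_xx and a = ȳ − b·x̄, with S_xy = Σxy − (Σx)(Σy)/n and S_xx = Σx² − (Σx)²/n. The function name `least_squares_line` and the toy data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the regression line of y on x, y = a + b * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    if s_xx == 0:
        raise ValueError("all x values are equal, so the gradient is undefined")

    b = s_xy / s_xx                   # gradient (regression coefficient)
    a = sum_y / n - b * (sum_x / n)   # intercept: a = y_bar - b * x_bar
    return a, b


# Hypothetical toy data lying exactly on y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
a, b = least_squares_line(xs, ys)

# The fitted line passes through the mean point (x_bar, y_bar).
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
print(a, b, a + b * x_bar, y_bar)
```

Because a is defined as ȳ − b·x̄, substituting x = x̄ always returns ȳ, which is why the line passes through the mean point.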

Example: The table shows the latitude, x, and mean January temperature (°C), y, for a sample of 10 cities in the northern hemisphere. Calculate the equation of the regression line of y on x and use it to predict the mean January temperature for the city of Los Angeles, which has a latitude of 34°N.

City            Latitude (x)    Mean Jan. temp. (°C) (y)
Belgrade        45              1
Bangkok         14              32
Cairo           30              14
Dublin          50              3
Havana          23              22
Kuala Lumpur    3               27
Madrid          40              5
New York        41              0
Reykjavik       30              -1
Tokyo           36              5


We begin by finding summary statistics for the table:

$$n = 10, \quad \sum x = 312, \quad \sum y = 108, \quad \sum x^2 = 11636, \quad \sum xy = 2000$$

We then use these to calculate the gradient (b) and y-intercept (a) for the regression line.

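To find the gradient, we need $S_{xy}$ and $S_{xx}$:

$$S_{xy} = 2000 - \frac{312 \times 108}{10} = -1369.6$$

$$S_{xx} = 11636 - \frac{312^2}{10} = 1901.6$$

Therefore:

$$b = \frac{-1369.6}{1901.6} = -0.720 \text{ (to 3 s.f.)}$$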

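To find the y-intercept we also need $\bar{x}$ and $\bar{y}$:

$$\bar{x} = \frac{312}{10} = 31.2 \quad \text{and} \quad \bar{y} = \frac{108}{10} = 10.8$$

So:

$$a = \bar{y} - b\bar{x} = 10.8 - (-0.720 \times 31.2) = 33.3 \text{ (to 3 s.f.)}$$

Therefore, the equation of the regression line is:

$$y = 33.3 - 0.720x$$

So, when x = 34, y = 33.3 - 0.720 × 34 = 8.82°C. This is our estimate of the mean January temperature in Los Angeles.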
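The worked example can be checked numerically. The sketch below takes the latitude and temperature values exactly as listed in the table and follows the same steps: summary statistics, then S_xy and S_xx, then b and a, then the prediction at x = 34. The variable names are illustrative only.

```python
# Data as listed in the table (latitude x, mean January temperature y).
latitudes = [45, 14, 30, 50, 23, 3, 40, 41, 30, 36]
temps = [1, 32, 14, 3, 22, 27, 5, 0, -1, 5]

n = len(latitudes)
sum_x = sum(latitudes)
sum_y = sum(temps)
sum_x2 = sum(x * x for x in latitudes)
sum_xy = sum(x * y for x, y in zip(latitudes, temps))

s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n

b = s_xy / s_xx                   # gradient
a = sum_y / n - b * (sum_x / n)   # y-intercept

# The slides round a and b to 3 s.f. before predicting.
a_rounded = round(a, 1)
b_rounded = round(b, 3)
estimate_rounded = a_rounded + b_rounded * 34

# Prediction without intermediate rounding, for comparison.
estimate_full = a + b * 34

print(f"b = {b:.4f}, a = {a:.4f}")
print(f"rounded coefficients: {estimate_rounded:.2f} °C")
print(f"full precision:       {estimate_full:.2f} °C")
```

Carrying full precision through gives an estimate of roughly 8.78°C rather than 8.82°C; the small difference comes only from rounding a and b before substituting x = 34.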

This prediction for the mean January temperature in Los Angeles is based purely on the city’s latitude. There are likely to be additional factors that can affect the climate of a city, for example: altitude; proximity to the coast; ocean currents; prevailing winds. The concept of regression we have considered here can be extended to incorporate other relevant factors, producing a new formula. This can allow for more accurate prediction.

The dangers of extrapolation

A regression equation can only confidently be used to predict values of y that correspond to x values that lie within the range of the data values available. It can be dangerous to extrapolate, i.e. to predict a value for y that corresponds to a value of x lying beyond the range of the values in the data set. This is because we cannot be sure that the relationship between the two variables will continue to hold outside that range.

It is reasonably safe to make predictions within the range of the data. It is unwise to extrapolate beyond the given data.
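One way to make this rule concrete is to have a prediction helper refuse x values outside the observed range. The sketch below is only an illustration: the function name `predict_within_range` is hypothetical, the coefficients are the rounded values from the latitude example (a = 33.3, b = -0.720), and the range 3 to 50 is the smallest and largest latitude in that table.

```python
def predict_within_range(x, a, b, x_min, x_max):
    """Return a + b * x, but only for x inside the observed data range."""
    if not x_min <= x <= x_max:
        raise ValueError(
            f"x = {x} lies outside the data range [{x_min}, {x_max}]; "
            "predicting here would be extrapolation"
        )
    return a + b * x


# Rounded coefficients and latitude range from the worked example.
A, B = 33.3, -0.720
X_MIN, X_MAX = 3, 50

# Latitude 34 is inside the range, so a prediction is returned.
inside = predict_within_range(34, A, B, X_MIN, X_MAX)
print(f"{inside:.2f}")

# Latitude 70 is beyond the largest latitude in the data, so it is rejected.
try:
    predict_within_range(70, A, B, X_MIN, X_MAX)
except ValueError as err:
    print(err)
```

Raising an error is a deliberately strict choice; a warning would also be reasonable where extrapolated values are wanted but should be flagged.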

Examination-style question: regression

The average weight and wingspan of 9 species of British birds are given in the table.

Bird           Weight (g)    Wingspan (cm)
Wren           10            15
Robin          18            21
Chaffinch      18            24
Cuckoo         57            33
Blackbird      100           37
Pigeon         300           67
Lapwing        220           70
Crow           500           99
Common gull

a) Plot the data on a scatter graph. Comment on the relationship between the variables.
b) Calculate the regression line of wingspan on weight.
c) Use your regression line to estimate the wingspan of a jay, if its average weight is 160 g.
d) Explain why it would be inappropriate to use your line to estimate the wingspan of a duck, if the average weight of a duck is 1 kg.

a) The graph indicates that there is a fairly strong positive correlation between weight and wingspan. This means that wingspan tends to be longer in heavier birds.

b) Let x = weight and y = wingspan. The summary values for the paired data can be used to find the gradient of the regression line, $b = S_{xy} / S_{xx}$. Therefore b = 0.176 (to 3 s.f.).

To find the y-intercept we also need $\bar{x}$ and $\bar{y}$, since $a = \bar{y} - b\bar{x}$. Therefore, the equation of the regression line is y = a + 0.176x, where y = wingspan and x = weight.

c) When the weight is 160 g, we can predict the wingspan to be: y = a + (0.176 × 160) = 48.2 cm (to 3 s.f.), where a is the y-intercept of the regression line.

d) The average weight of a duck is outside the range of weights provided in the data. It would therefore be inappropriate to use the regression line to predict the wingspan of a duck, as we cannot be certain that the same relationship will continue to be true at higher weights.

Note: The regression coefficient (0.176) can be interpreted here as follows: as the weight increases by 1 g, the wingspan increases by 0.176 cm, on average.