CIVE2602 - Engineering Mathematics 2.2 (20 credits) Statistics and Probability Lecture 11 Dr Duncan Borman Linear Regression -Techniques to assess how.

Presentation transcript:

CIVE Engineering Mathematics 2.2 (20 credits) Statistics and Probability
Lecture 11, Dr Duncan Borman
Linear Regression
- Techniques to assess how good the regression is
  1) Coefficient of determination
  2) Examine residuals
  3) Significance testing on the residuals
- Non-linear regression: transformations
- Multiple linear regression
©Claudio Nunez 2010, sourced from _Building_destroyed_in_Concepci%C3%B3n.jpg?uselang=en-gb Available under creative commons license

Regression analysis
Aim: To predict y from x (‘regressing y on x’)
Residual variation ~ $N(0, \sigma^2)$
[Figure: y plotted against x]

Dependent variable (y). Independent or control variable (x). What’s the “best fit” line? The one that “minimises squared residuals”.

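How do we calculate the values of a and b (the least squares estimators)?
$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$ and $a = \bar{y} - b\bar{x}$
$\bar{x}$ and $\bar{y}$ are just the means of the sample data x and y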
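As a rough illustration of the least squares estimators, the sketch below fits the line y = a + bx using the standard textbook formulas for the slope b and intercept a. The function name `least_squares` and the sample numbers are made up for illustration and are not taken from the lecture.

```python
def least_squares(x, y):
    """Return (a, b) for the best-fit line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n  # mean of the x sample
    y_bar = sum(y) / n  # mean of the y sample
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx          # slope (gradient)
    a = y_bar - b * x_bar    # intercept
    return a, b


# Hypothetical sample data (not from the lecture)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = least_squares(x, y)
print(f"y = {a:.3f} + {b:.3f} x")
```

The line always passes through the point of means, which is why the intercept can be recovered from the slope and the two sample means.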

[Example data: emissions (PPM) against idling time (s)]

How good is the regression? How well does our data fit the straight line we have produced?

How good is the regression? 3 techniques to be aware of:
1) Coefficient of Determination ($R^2$): a value between 0 and 1 (i.e. it can be read as a percentage)
2) Examine the residuals
3) Significance tests
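A minimal sketch of the first technique, assuming the usual definition of the coefficient of determination as 1 minus (residual sum of squares divided by total sum of squares). The function name `r_squared` and the data are hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination for the least squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    # Variation left over after fitting the line
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation of y about its mean
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical sample data (not from the lecture)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(f"R^2 = {r_squared(x, y):.4f}")
```

A value near 1 means the line accounts for most of the variation in y; it says nothing on its own about how steep the line is.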

3) Significance testing on regression parameters
e.g. how well does X predict Y in the model? Statistical testing on $\alpha$ and $\beta$, i.e. does x explain any of the variability in y?
We can perform a hypothesis test to see whether or not a variable X actually explains any of the variability in Y.
If there was no relationship between X and Y, we would expect the slope of the best fit line to be zero.
Take $\beta$, the gradient of the line (estimated by b).
Null hypothesis $H_0$: $\beta = 0$

Null hypothesis $H_0$: $\beta = 0$
Statistics give us a tool for saying what is large and what is small. Technically this is done by converting the measured slope into a t-statistic: $t = b / s_b$
where $s_b$ is the equivalent of the standard error when we were comparing sample means

Null hypothesis $H_0$: $\beta = 0$. The t-statistic for the measured slope is $t = b / s_b$ (p90 of notes). Excel/SPSS can be made to generate these values automatically!

Null hypothesis $H_0$: $\beta = 0$
Once we have a t-statistic we can use t-tables to look up a P-value (remembering to double the value from tables for a two-tailed test).
We compare the P-value with the significance level (say 5%):
- If P > 5%, KEEP $H_0$. This means that the slope of the graph has NOT been proven significantly different from 0. Model is not a good predictor of y.
- If P < 5%, REJECT $H_0$. This means that the slope of the graph IS significantly different from 0. Model is a good predictor of y.
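The test on the slope can be sketched as follows. This is an illustration under stated assumptions, not the method in the course notes: it assumes the usual standard error of the slope, s_b = sqrt(SSE / (n - 2) / Sxx), with n - 2 degrees of freedom, and it replaces the t-table lookup with a numerical integration of the t density. All function names and the data are hypothetical.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=20000):
    """Two-tailed P-value, via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_area)


def slope_t_test(x, y):
    """Return (t, P) for the null hypothesis that the true slope is zero."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_b = math.sqrt(sse / (n - 2) / s_xx)  # standard error of the slope
    t = b / s_b
    return t, two_tailed_p(t, n - 2)


# Hypothetical sample data (not from the lecture)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
t, p = slope_t_test(x, y)
print("REJECT H0" if p < 0.05 else "KEEP H0")
```

Computing the two-tailed area directly means no doubling step is needed, unlike reading a one-tailed value from tables.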

Null hypothesis $H_0$: $\alpha = 0$
We can do something very similar for $\alpha$ (the intercept, estimated by a). Repeat a similar process (p92 of notes). Again generate P-values to compare with the 1% or 5% significance level.

Also we can find confidence intervals for $\alpha$ and $\beta$ (p90 of notes).
This lets us find, with 95% confidence, an interval either side of a and b within which we will find $\alpha$ and $\beta$, i.e. if we had taken a different sample of data we would have got a different set of values for a and b, so how good are the ones we’ve found?
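A minimal sketch of 95% confidence intervals for the intercept and slope, assuming the standard formulas (estimate plus or minus t-critical times standard error, with n - 2 degrees of freedom). The critical value is passed in as if read from t-tables; the function name and the data are hypothetical and the formulas in the course notes may be laid out differently.

```python
import math


def confidence_intervals(x, y, t_crit):
    """Return ((a_low, a_high), (b_low, b_high)) for the line y = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))                     # residual standard deviation
    s_b = s / math.sqrt(s_xx)                        # standard error of the slope
    s_a = s * math.sqrt(1 / n + x_bar ** 2 / s_xx)   # standard error of the intercept
    return (a - t_crit * s_a, a + t_crit * s_a), (b - t_crit * s_b, b + t_crit * s_b)


# Hypothetical sample data (not from the lecture)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
T_CRIT = 3.182  # two-tailed 5% value from t-tables for n - 2 = 3 degrees of freedom
ci_a, ci_b = confidence_intervals(x, y, T_CRIT)
print("intercept:", ci_a)
print("slope:", ci_b)
```

If the interval for the slope excludes zero, that agrees with rejecting the null hypothesis of zero slope at the same significance level.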

Clickers

Regression questions

Don’t be confused between $R^2$ and b. [Figure panels: high $R^2$, low $R^2$, high b, low b]

Some data… Amplitude of vibrations measured on a bridge support vs number of cars driving across at any one time. Graphs>interactive>scatter> [Scatter plot: amplitude of vibration (mm) against number of cars] ©Terraplanner 2007, sourced from Available under creative commons license

Assumptions of regression
- The independent (x) variable is measured without error (!)
- ‘Errors’ in dependent (y) variable are normally distributed
- Variance in dependent variable is constant
- Relationship between variables is linear

Assumptions of regression: the independent (x) variable is measured without error (!)
- Seldom true… but nearly true when experimental treatments are used
- Can (should?) be tested by remeasuring x variable
- Only a serious problem if errors in x are nearly as large as those in y
- If so: use other techniques

Assumptions of regression: ‘errors’ in dependent (y) variable are normally distributed
- To check: try a scatterplot [Figure panels: OK, BAD]
- More formally: can plot residuals, or save them and test for normality

Assumptions of regression: variance in dependent variable is constant
- Again: try a scatterplot [Figure panels: OK (homoscedastic), BAD (heteroscedastic)]
- There are tests available
- But generally: your eye will be more sensitive than most tests!

Assumptions of regression: relationship between variables is linear
- Try a scatterplot (again!) or plot residuals… [Figure panels: residuals OK, residuals BAD]
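To examine residuals numerically before plotting them, something like the sketch below can be used. It computes the residuals of the fitted line and then a crude spread ratio (mean squared residual in the upper half of x divided by that in the lower half). The ratio is only an informal aid invented for this illustration, not a formal test and not something from the lecture; the data are made up so that the scatter grows with x.

```python
def fit_residuals(x, y):
    """Residuals y_i - (a + b*x_i) from the least squares line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def spread_ratio(x, res):
    """Mean squared residual for the upper half of x over the lower half."""
    pairs = sorted(zip(x, res))  # order the residuals by x
    half = len(pairs) // 2
    low = [r * r for _, r in pairs[:half]]
    high = [r * r for _, r in pairs[-half:]]
    return (sum(high) / len(high)) / (sum(low) / len(low))


# Hypothetical data whose scatter grows with x (heteroscedastic)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise = [0.1, -0.1, 0.2, -0.2, 0.8, -0.9, 1.5, -1.6]
y = [2.0 * xi + e for xi, e in zip(x, noise)]

res = fit_residuals(x, y)
print("residuals:", [round(r, 3) for r in res])
print("spread ratio (upper/lower):", round(spread_ratio(x, res), 1))
```

A ratio far from 1 hints that the variance is not constant, but plotting the residuals against x remains the more reliable check.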

If we suspect that relation between the x and y is NOT linear we can try to apply transforms to x and/or y to see if we can find a relationship

Testing the assumptions. Variance: not OK. Linearity: OK. So let’s log transform the y variable… [Scatter plot: amplitude of vibration (mm) against number of cars, $R^2 = 0.74$]

What happens… log transform applied to the y variable. $R^2 = 0.54$. Residuals: not OK. Variance: OK. Linearity: not OK. [Scatter plot: ln(y) of vibrations (mm) against number of cars]

Transformation affects linearity… [Figure panels, before and after: Log(x); Log(y); Log(x), Log(y)]
There are lots of other transforms you can try, e.g. squaring or cubing x or y or both etc.

May need to transform x variable as well… $R^2 = 0.79$. Residuals: OK. Variance: OK. Linearity: OK. [Scatter plot: vibrations (mm) (ln(y)) against number of cars (ln(x))]
$\ln(y) = \ln(1.88) + 1.47\ln(x)$
$\ln(y) = \ln(1.88) + \ln(x^{1.47})$
$\ln(y) = \ln(1.88x^{1.47})$
$y = 1.88x^{1.47}$
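The log-log approach can be sketched as below: fit a straight line to ln(y) against ln(x), then back-transform the intercept to recover a power law of the form y = A x^k. The function name `fit_power_law` is hypothetical, and the data are generated exactly from y = 1.88 x^1.47 purely to show that the fit recovers those constants; they are not the bridge measurements.

```python
import math


def fit_power_law(x, y):
    """Fit y = A * x**k by least squares on ln(y) against ln(x)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    lx_bar, ly_bar = sum(lx) / n, sum(ly) / n
    s_xx = sum((u - lx_bar) ** 2 for u in lx)
    s_xy = sum((u - lx_bar) * (v - ly_bar) for u, v in zip(lx, ly))
    k = s_xy / s_xx                  # slope in log-log space = exponent
    ln_a = ly_bar - k * lx_bar       # intercept in log-log space = ln(A)
    return math.exp(ln_a), k


# Synthetic data lying exactly on y = 1.88 * x**1.47 (for illustration only)
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [1.88 * v ** 1.47 for v in x]
A, k = fit_power_law(x, y)
print(f"y = {A:.2f} x^{k:.2f}")
```

Both variables must be strictly positive for the logarithms to exist, so this transform is not suitable for data containing zeros or negative values.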

Another log log graph. $R^2 = 0.82$. Residuals: OK. Variance: OK. Linearity: OK. [Scatter plot: failure strength (kN) (ln(y)) against sample diameter (ln(x))]
$\ln(y) = \ln(2) + 1.5\ln(x)$
$\ln(y) = \ln(2) + \ln(x^{1.5})$
$\ln(y) = \ln(2x^{1.5})$
$y = 2x^{1.5}$
What is the equation for y in terms of x? $\ln(y) = 3 + 5x$
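For the last question, where only y has been log transformed, one way to work it through is to exponentiate both sides. This is a worked sketch of the algebra rather than the answer as given in the lecture:

```latex
\ln(y) = 3 + 5x
\;\Rightarrow\; y = e^{3 + 5x} = e^{3}\,e^{5x} \approx 20.09\,e^{5x}
```

So a straight line on a semi-log plot corresponds to an exponential relationship, whereas a straight line on a log-log plot corresponds to a power law.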

More than 1 independent (‘predictor’) variable: Multiple Regression, e.g. bridge vibrations (z) as a function of number of cars (x) and wind speed (y). [3D plot: z (vibrations) against no. of cars and wind speed]

Multiple Regression
- Fit best plane (x, y) to explain z: minimise (residual)² just as in linear
- Conceptually identical to linear model
- Can similarly use any number of predictor variables, fitting hyperplanes in multidimensional space…
Model: $z = a + b_1 x + b_2 y$, with z (amplitude of vibrations), x (no. of cars), y (wind speed)
Note: effects of x & y are both linear, and are additive
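A minimal sketch of fitting the plane z = a + b1 x + b2 y, assuming the least squares solution is obtained from the normal equations and solved by Gaussian elimination. The helper names `solve` and `fit_plane` are hypothetical, and the data are generated exactly from z = 1 + 2x + 3y so that the recovered coefficients can be checked; they are not real bridge measurements.

```python
def solve(m, v):
    """Solve the square system m * u = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(aug[r][c] * u[c] for c in range(r + 1, n))
        u[r] = (aug[r][n] - s) / aug[r][r]
    return u


def fit_plane(x, y, z):
    """Return [a, b1, b2] minimising the squared residuals of z = a + b1*x + b2*y."""
    rows = [(1.0, xi, yi) for xi, yi in zip(x, y)]
    # Normal equations: (X^T X) coeffs = X^T z
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xtz = [sum(r[i] * zi for r, zi in zip(rows, z)) for i in range(3)]
    return solve(xtx, xtz)


# Synthetic data lying exactly on z = 1 + 2x + 3y (for illustration only)
cars = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
wind = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]
vib = [1.0 + 2.0 * c + 3.0 * w for c, w in zip(cars, wind)]
a, b1, b2 = fit_plane(cars, wind, vib)
print(f"z = {a:.2f} + {b1:.2f} x + {b2:.2f} y")
```

Adding more predictors only adds columns to each row; the same normal equations then describe a hyperplane rather than a plane.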

CIVE Engineering Mathematics 2.2 feedback
TURN ON
1) Enter the character in <> brackets seen on the yellow bar (if it asks for a user ID put a “0”; this makes you anonymous)
2) When asked a question you can enter a letter or number and press enter (green button)
Use the scale below for A-E to answer the TEST question:
A: Strongly agree, B: Agree, C: Neither agree nor disagree, D: Disagree, E: Strongly disagree

CIVE Engineering Mathematics 2.2: feedback on resources for the whole level 2 Engineering Maths module (Stats and other maths). All questions use the same A-E scale (A: Strongly agree, B: Agree, C: Neither agree nor disagree, D: Disagree, E: Strongly disagree).

Question 1: In general I have found the range of online resources for this module useful (e.g. VLE material, Mathlab, support, links to online resources etc).

Question 2: I have found the Mathlab tasks have helped with my understanding of the module (Eng Math and Stats).

Question 2b: I have found the weekly Examples classes useful for the Statistics part of the module.

Question 3: I have made (or intend to make) use of the online lecture slides or lecture videos that are posted on the VLE (the ones of the lectures).

Question 4: I have made use of some of the other online links to Maths resources that have been linked to from the VLE page.

Question 5: I feel the approaches used in the Engineering Maths module (which include working through examples in lectures and using directed out-of-lecture tasks) have helped to improve my understanding of the maths material.

Question 6: It is difficult to read/follow the text added to the slides using the Tablet.

Question 7: I find it easier to follow mathematical material when it is written during the lecture on the tablet.

Question 8: The use of the A, B, C, D cards during a lecture is helpful for feeding back understanding of lecture material.

Question 9: The use of the A, B, C, D cards can be useful for helping me to engage with a lecture.

Question 10: The general interactive elements in the lectures help me to engage with the lecture material.

Question 11: I feel I understand the majority of the material covered in this module.

Question 12: I would have liked this module to have covered more Civil Engineering Maths examples.

Question 13: I feel I am confident with my maths ability.

Question 4: I have found the weekly Examples classes useful for the Statistics part of the module.

Question 5: For the Limits and series section of the module I found the Problem Sheets useful for developing my understanding of the material.

Question 6: Having material hand written using the Tablet computer was difficult to read.

Question 7: I find it easier to follow maths material when it is written during the lecture on the tablet.

Question 8: I can see the value of using the A, B, C, D cards during the lecture.

Question 9: Interactive elements in lectures have helped me to engage with the material.

Question 10: I can see the relevance of the mathematical content of this module to my degree course.

Question 11: I feel I understand the majority of the material covered in this module.

Coursework
- Due in on Tuesday 16th March
- 12 sides maximum
- Need to submit online (VLE) and hardcopy by 4pm (late penalties apply until both submitted)
- Major coursework rules apply. Please take care not to plagiarise! It will be taken very seriously; group work on this major coursework would constitute plagiarism (plagiarism software is now used as normal practice on all submissions)
- Final lecture tomorrow is in Computer cluster 504. If you have any questions regarding coursework etc., I will make time to answer them (examples classes continue into next week)

More than one answer is allowed!

Definition of a mutually exclusive event: if event A happens, then event B cannot, or vice-versa. The two events "it rained on Tuesday" and "it did not rain on Tuesday" are mutually exclusive events.
Independent events: the outcome of event A has no effect on the outcome of event B, such as "It rained on Tuesday" and "My chair broke at work".
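The difference between the two definitions can be checked by counting outcomes. The sketch below uses two fair dice, which is an example chosen here for illustration and not one from the lecture: mutually exclusive events have zero probability of happening together, while independent events satisfy P(A and B) = P(A) × P(B).

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Exact probability of an event, given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def first_is_six(o):
    return o[0] == 6


def first_is_one(o):
    return o[0] == 1


def second_is_six(o):
    return o[1] == 6


# Mutually exclusive: the first die cannot show both 6 and 1
p_exclusive_both = prob(lambda o: first_is_six(o) and first_is_one(o))

# Independent: the first die has no effect on the second
p_independent_both = prob(lambda o: first_is_six(o) and second_is_six(o))

print("P(first is 6 and first is 1) =", p_exclusive_both)
print("P(both sixes) =", p_independent_both)
print("P(first is 6) * P(second is 6) =", prob(first_is_six) * prob(second_is_six))
```

Note that mutually exclusive events with non-zero probabilities are never independent, since knowing one happened tells you the other did not.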

Where S x is the larger

Is $F_{20,20} < F_{\text{calculated}}$? If no, then NO significant difference.

z or t