Correlation and regression

Correlation and regression http://sst.tees.ac.uk/external/U0000504

Introduction
Scientific rules and principles are often expressed mathematically.
There are two main approaches to finding a mathematical relationship between variables:
Analytical: based on theory
Empirical: based on observation and experience

The straight line (1)
Most graphs based on numerical data are curves; the straight line is a special case.
Data are often manipulated to yield straight-line graphs, because the straight line is relatively easy to analyse.

The straight line (2)
Straight line equation: y = mx + c
Slope: m = Δy/Δx
Intercept: c
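As a small worked illustration (not part of the original slides, using two invented data points), the slope and intercept can be found directly from the definitions above:

```python
# Minimal sketch: slope and intercept of a straight line from two hypothetical points.
x1, y1 = 2.0, 5.0   # first point (assumed for illustration)
x2, y2 = 6.0, 13.0  # second point (assumed for illustration)

m = (y2 - y1) / (x2 - x1)  # slope = Δy/Δx = 8/4 = 2
c = y1 - m * x1            # from y = mx + c  =>  c = 5 - 2*2 = 1
print(f"y = {m}x + {c}")   # y = 2.0x + 1.0
```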

Correlation & regression
These are statistical processes which:
suggest the existence of a relationship
determine the best equation to fit the data
Correlation is a measure of the strength of a relationship between two variables.
Regression is the process of determining that relationship.

Correlation and Regression The next few slides illustrate correlation and regression

No correlation

Positive correlation

Negative correlation

Curvilinear correlation

Correlation coefficient
A statistical measure of the strength of a relationship between two variables:
Pearson's product-moment correlation coefficient, r
Spearman's rank correlation coefficient, ρ
Both take a value in the range -1.0 to +1.0
r or ρ = +1.0 represents a perfect positive correlation
r or ρ = -1.0 represents a perfect negative correlation
r or ρ = 0.0 represents no correlation
Values of r or ρ are associated with a probability of there being a relationship.
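As a rough sketch (not in the original slides), both coefficients can be computed with SciPy; the data values below are invented for illustration:

```python
# Sketch: Pearson's r and Spearman's rho for hypothetical paired data.
from scipy import stats

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

r, p_pearson = stats.pearsonr(x, y)       # product-moment correlation and its p-value
rho, p_spearman = stats.spearmanr(x, y)   # rank correlation and its p-value

print(f"Pearson  r   = {r:.3f} (p = {p_pearson:.4f})")
print(f"Spearman rho = {rho:.3f} (p = {p_spearman:.4f})")
# Both coefficients lie between -1.0 and +1.0; values near ±1 indicate a strong relationship.
```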

Linear regression
Linear regression is the process of fitting the best straight line to a set of data.
The usual method is based on minimising the squares of the errors between the data and the predicted line.
For this reason it is called "the method of least squares".
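A minimal sketch of the method of least squares applied by hand, using the standard textbook formulas and invented data (this example is not from the original slides):

```python
# Sketch: least-squares slope and intercept from the standard formulas
#   b = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)   and   a = y_bar - b*x_bar
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data
y = np.array([2.2, 4.1, 5.8, 8.3, 9.9])

x_bar, y_bar = x.mean(), y.mean()
b = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)  # slope
a = y_bar - b * x_bar                                             # intercept

print(f"best-fit line: y = {a:.3f} + {b:.3f}x")
```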

Linear regression - assumptions
The error in the independent (x) variable is negligible relative to the error in the dependent (y) variable.
The errors are normally, independently and identically distributed with mean 0 and constant variance: NIID(0, σ²).

Linear regression model
For a set of data (x, y), the equation that best fits the data has the form y = a + bx + e, where:
x is the independent (predictor) variable
y is the measured dependent variable
Y = a + bx is the calculated (predicted) dependent variable
e is the error term, and accounts for that part of y not "explained" by x
For any individual data point i, the difference between the observed and predicted values of y is called the residual, r_i, i.e.
r_i = y_i - Y_i = y_i - (a + b·x_i)
The residuals provide a measure of the error term.
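A short sketch (added here, not from the slides) showing fitted values and residuals for hypothetical data, using SciPy's simple linear regression:

```python
# Sketch: predicted values Y = a + bx and residuals r_i = y_i - Y_i.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # hypothetical data
y = np.array([3.1, 4.8, 7.2, 9.1, 10.8, 13.2])

fit = stats.linregress(x, y)          # least-squares estimates of a (intercept) and b (slope)
Y = fit.intercept + fit.slope * x     # calculated (predicted) values
residuals = y - Y                     # the residuals estimate the error term

print(f"Y = {fit.intercept:.3f} + {fit.slope:.3f}x")
print("residuals:", np.round(residuals, 3))
```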

Regression analysis (1)
Check the correlation coefficient.
Null hypothesis H0: there is no correlation between x and y
Alternative hypothesis H1: there is a correlation between x and y
Decision rule: reject H0 if |r| ≥ the critical value at α = 0.05
If you cannot reject H0 then proceed no further; otherwise carry out a full regression.
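As an illustrative sketch (invented data, not part of the slides), the same decision can be made from the p-value that SciPy reports for the correlation, which is equivalent to comparing |r| with a tabulated critical value:

```python
# Sketch: testing H0 "no correlation" at alpha = 0.05 before running a full regression.
from scipy import stats

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # hypothetical data
y = [1.9, 4.2, 5.8, 8.3, 9.7, 12.1, 14.2, 15.8]

r, p = stats.pearsonr(x, y)
alpha = 0.05

if p < alpha:
    print(f"r = {r:.3f}, p = {p:.4f}: reject H0, carry out a full regression")
else:
    print(f"r = {r:.3f}, p = {p:.4f}: cannot reject H0, proceed no further")
```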

Regression analysis (2)
Regression analysis can be carried out using either Excel or Minitab (Excel needs the Analysis ToolPak add-in installed).
The output from both Minitab and Excel will give the following information:
the regression equation (in the form y = a + bx)
probabilities that a ≠ 0 and b ≠ 0
the coefficient of determination, R²
analysis of variance
In addition you will need to produce at least one of:
residuals vs. fitted values
residuals vs. x values
residuals vs. y values
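The slides describe Excel and Minitab output; as an alternative sketch (my addition, assuming the statsmodels library is available), a similar summary table with coefficients, p-values, R² and the ANOVA F-test can be produced in Python:

```python
# Sketch: a regression summary comparable to Excel's Analysis ToolPak or Minitab output.
import numpy as np
import statsmodels.api as sm

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])   # hypothetical data
y = np.array([2.3, 4.1, 6.2, 7.9, 10.2, 11.8, 14.1, 15.9])

X = sm.add_constant(x)        # add the intercept column so the model is y = a + bx
model = sm.OLS(y, X).fit()

print(model.summary())        # coefficients with p-values, R², and the ANOVA F-test
print("fitted values:", model.fittedvalues)
print("residuals:   ", model.resid)
```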

Interpreting output
Regression equation: this is the equation that best fits the data and provides the predicted values of y.
Analysis of variance: determines what proportion of the variation in y can be accounted for by the regression equation and what proportion is accounted for by the error term. The p-value arising from this tells us how well the regression equation fits the data.
The proportion of the variation in the data accounted for by the regression equation is called the coefficient of determination, R², and is equal to the square of the correlation coefficient.
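A small sketch (invented data, added for illustration) showing that R² computed from the variation split equals the square of the correlation coefficient for simple linear regression:

```python
# Sketch: R² = 1 - SS_res/SS_tot, and it equals r² for a simple linear fit.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # hypothetical data
y = np.array([2.0, 4.3, 5.9, 8.2, 9.7, 12.3])

fit = stats.linregress(x, y)
Y = fit.intercept + fit.slope * x

ss_res = np.sum((y - Y) ** 2)          # variation left to the error term
ss_tot = np.sum((y - y.mean()) ** 2)   # total variation in y
R2 = 1 - ss_res / ss_tot               # proportion explained by the regression

print(f"R² = {R2:.4f}, r² = {fit.rvalue**2:.4f}")  # the two agree
```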

Output plots
The output plots are used to check the assumptions about the errors.
The normal probability plot should show the residuals lying on a straight line.
The residual plots should have no obvious pattern, and should not show the residuals increasing or decreasing as the fitted or measured values increase.
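A sketch of how these diagnostic plots might be drawn (my addition, assuming matplotlib and SciPy; the data are invented):

```python
# Sketch: normal probability plot and residuals-vs-fitted plot for checking the error assumptions.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])    # hypothetical data
y = np.array([2.1, 4.4, 5.8, 8.1, 9.6, 12.3, 13.9, 16.2])

fit = stats.linregress(x, y)
fitted = fit.intercept + fit.slope * x
residuals = y - fitted

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
stats.probplot(residuals, plot=ax1)    # normal probability plot: points should lie roughly on a line
ax1.set_title("Normal probability plot")
ax2.scatter(fitted, residuals)         # residuals vs fitted: should show no obvious pattern
ax2.axhline(0.0, linestyle="--")
ax2.set_xlabel("Fitted values")
ax2.set_ylabel("Residuals")
plt.tight_layout()
plt.show()
```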

Non-linear relationships
Many functions can be manipulated mathematically to yield a straight-line equation. Some examples are given in the next few slides.

Linearisation (2)

Linearisation (3)

Functions involving logs (1)
Some functions can be linearised by taking logs.
These are y = A·x^n and y = A·e^(kx).

Functions involving logs (2)
For y = A·x^n, taking logs gives log y = log A + n·log x.
A graph of log y vs. log x gives a straight line, slope n and intercept log A.
To find A you must take antilogs (10^x).
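A minimal sketch of this log-log linearisation (my addition; the data are generated from A = 3, n = 2 purely for illustration):

```python
# Sketch: fit y = A·x^n by linearising with base-10 logs.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 3.0 * x ** 2.0                 # hypothetical data from A = 3, n = 2

n, log_A = np.polyfit(np.log10(x), np.log10(y), 1)  # slope = n, intercept = log10(A)
A = 10 ** log_A                                      # take antilogs to recover A

print(f"n = {n:.3f}, A = {A:.3f}")  # ≈ 2.000 and 3.000
```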

Functions involving logs (3)
For y = A·e^(kx), we must use natural logs: ln y = ln A + kx.
This gives a straight line, slope k and intercept ln A.
To find A we must take antilogs (e^x).
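And the corresponding semi-log sketch (again my addition; data generated from A = 2, k = 0.5 for illustration):

```python
# Sketch: fit y = A·e^(kx) by linearising with natural logs.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * np.exp(0.5 * x)              # hypothetical data from A = 2, k = 0.5

k, ln_A = np.polyfit(x, np.log(y), 1)  # slope = k, intercept = ln(A)
A = np.exp(ln_A)                       # take antilogs (e^x) to recover A

print(f"k = {k:.3f}, A = {A:.3f}")     # ≈ 0.500 and 2.000
```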

Polynomials
These are functions of the general form y = a + bx + cx² + dx³ + …
They cannot be linearised, but techniques for fitting polynomials exist.
Both Excel and Minitab provide for fitting polynomials to data.
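As a final sketch (my addition, not from the slides), a polynomial can be fitted directly in Python with numpy.polyfit, the counterpart of the polynomial-fitting facilities in Excel and Minitab; the data here are invented from y = 1 + 2x + 0.5x² with a little noise:

```python
# Sketch: fitting a second-order polynomial to hypothetical noisy data.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 12)
y = 1.0 + 2.0 * x + 0.5 * x ** 2 + rng.normal(0.0, 0.1, x.size)

c2, c1, c0 = np.polyfit(x, y, 2)     # coefficients returned highest power first
print(f"y ≈ {c0:.2f} + {c1:.2f}x + {c2:.2f}x²")
```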