Correlation and Regression Analysis


Correlation and Regression Analysis Many engineering design and analysis problems involve factors that are interrelated and dependent, e.g., (1) runoff volume and rainfall; (2) evaporation, temperature, and wind speed; (3) peak discharge, drainage area, and rainfall intensity; (4) crop yield, irrigation water, and fertilizer. Because of the inherent complexity of system behavior and the lack of a full understanding of the processes involved, the relationships among the relevant factors or variables are established empirically or semi-empirically. Regression analysis is a useful and widely used statistical tool for investigating the relationship between two or more variables that are related in a non-deterministic fashion. If a variable Y is related to several variables X1, X2, …, XK, their relationship can be expressed in general as Y = g(X1, X2, …, XK), where g(.) is a general functional form, Y is the dependent (or response) variable, and X1, X2, …, XK are the independent (or explanatory) variables.

Correlation When a problem involves two dependent random variables, the degree of linear dependence between the two can be measured by the correlation coefficient ρ(X,Y), which is defined as ρ(X,Y) = Cov(X,Y)/(σX σY), where Cov(X,Y) is the covariance between the random variables X and Y, defined as Cov(X,Y) = E[(X − μX)(Y − μY)] = E(XY) − μX μY, with −∞ < Cov(X,Y) < ∞ and −1 ≤ ρ(X,Y) ≤ 1. Various correlation coefficients have been developed in statistics for measuring the degree of association between random variables. The one defined above is called the Pearson product-moment correlation coefficient, or simply the correlation coefficient. If the two random variables X and Y are independent, then ρ(X,Y) = Cov(X,Y) = 0; however, the reverse statement is not necessarily true.

Cases of Correlation (scatter-plot panels): (1) perfectly linearly correlated in the opposite direction; (2) strongly and positively correlated in a linear fashion; (3) perfectly correlated in a nonlinear fashion, but uncorrelated linearly; (4) uncorrelated.

Calculation of Correlation Coefficient Given a set of n paired sample observations of two random variables, (xi, yi), i = 1, 2, …, n, the sample correlation coefficient r can be calculated as r = Σ(xi − x̄)(yi − ȳ) / √[Σ(xi − x̄)² · Σ(yi − ȳ)²], where x̄ and ȳ are the sample means of the x and y observations, respectively.
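As a quick illustration of this sample formula, here is a minimal sketch in Python; the data values are made up for illustration and are not from the lecture.

```python
import numpy as np

# Hypothetical paired observations (illustrative only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Sample correlation: r = sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2))
dx = x - x.mean()
dy = y - y.mean()
r = np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

print(r)                          # close to +1 for this nearly linear data
print(np.corrcoef(x, y)[0, 1])    # NumPy's built-in value as a cross-check
```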

Auto-correlation Consider the following daily stream flows (in 1000 m3) for June 2001 at the Chung Mei Upper station (610 ha), located upstream of a river feeding Plover Cove Reservoir. Determine the 1-day auto-correlation coefficient, i.e., r(Qt, Qt+1). There are 29 pairs: {(Qt, Qt+1)} = {(Q1, Q2), (Q2, Q3), …, (Q29, Q30)}. With n = 29 and the relevant sample statistics computed from these pairs, the 1-day auto-correlation is 0.439.
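The same sample-correlation computation applied to the lag-1 pairs gives the auto-correlation. A minimal sketch follows; the flow values below are hypothetical placeholders, not the actual Chung Mei record.

```python
import numpy as np

# Hypothetical daily flows in 1000 m^3 (placeholder series, not the Chung Mei data)
q = np.array([5.1, 4.8, 6.3, 12.0, 9.4, 7.2, 6.5, 6.0, 5.8, 15.2,
              11.1, 8.9, 7.6, 7.0, 6.6, 6.2, 5.9, 5.7, 5.5, 5.4])

# Form the lag-1 pairs (Q_t, Q_t+1) and compute their sample correlation
qt, qt1 = q[:-1], q[1:]
dqt, dqt1 = qt - qt.mean(), qt1 - qt1.mean()
r1 = np.sum(dqt * dqt1) / np.sqrt(np.sum(dqt**2) * np.sum(dqt1**2))
print(r1)
```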

Chung Mei Upper Daily Flow (figure: daily flow time series for June 2001)

Regression Models Due to the presence of uncertainties, a deterministic functional relationship is generally not very appropriate or realistic. The deterministic model form can be modified to account for uncertainties as Y = g(X1, X2, …, XK) + e, where e is the model error term with E(e) = 0 and Var(e) = σ². In engineering applications, the functional forms commonly used for establishing empirical relationships are: Additive: Y = b0 + b1X1 + b2X2 + … + bKXK + e; Multiplicative: Y = b0 X1^b1 X2^b2 … XK^bK + e.
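When the multiplicative form is the better description, a common practical route is to take logarithms, which makes the model linear in its coefficients and fittable by ordinary least squares. The sketch below uses hypothetical data and two explanatory variables, and assumes the error behaves approximately multiplicatively so that the log-transform is reasonable.

```python
import numpy as np

# Hypothetical observations for a multiplicative model Y = b0 * X1^b1 * X2^b2
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.5, 3.0, 2.5, 4.0])
y  = np.array([3.1, 4.2, 9.8, 9.1, 19.5])

# Log-transform: ln Y = ln b0 + b1 ln X1 + b2 ln X2 (additive in the logs)
A = np.column_stack([np.ones_like(x1), np.log(x1), np.log(x2)])
coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
b0, b1, b2 = np.exp(coef[0]), coef[1], coef[2]
print(b0, b1, b2)
```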

Least Square Method Suppose that there are n pairs of data, (xi, yi), i = 1, 2, …, n, and that a scatter plot of these data suggests a roughly linear trend. What is a plausible mathematical model describing the relation between x and y?

Least Square Method Consider an arbitrary straight line, y = b0 + b1x, to be fitted through these data points. The question is: which line is the most representative?

Least Square Criterion What are the values of b0 and b1 such that the resulting line "best" fits the data points? But wait: which goodness-of-fit criterion should be used to choose among all possible combinations of b0 and b1? The least squares (LS) criterion states that the sum of the squares of the errors (or residuals, deviations) is a minimum. Mathematically, the LS criterion can be written as: minimize D = Σ ei² = Σ (yi − b0 − b1xi)², with the sums taken over i = 1, …, n. Are there any other criteria that could be used?

Normal Equations for LS Criterion The necessary conditions for D to attain its minimum are ∂D/∂b0 = 0 and ∂D/∂b1 = 0. Expanding these two conditions yields the normal equations: n b0 + b1 Σ xi = Σ yi and b0 Σ xi + b1 Σ xi² = Σ xi yi.

LS Solution (2 Unknowns) Solving the two normal equations simultaneously gives b1 = [n Σ xi yi − (Σ xi)(Σ yi)] / [n Σ xi² − (Σ xi)²] and b0 = ȳ − b1 x̄.
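These closed-form expressions translate directly into code. A minimal sketch in Python with hypothetical data:

```python
import numpy as np

# Hypothetical (x, y) observations
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

n = len(x)
# Slope and intercept from the normal-equation solution
b1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
b0 = y.mean() - b1 * x.mean()
print(b0, b1)
```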

Fitting a Polynomial Eq. By LS Method The same LS criterion applies to a pth-order polynomial y = b0 + b1x + b2x² + … + bpx^p + e; fitting it requires finding the p + 1 coefficients from the data, which leads to p + 1 normal equations.
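A minimal sketch of a polynomial LS fit with hypothetical data; the design matrix holds the columns 1, x, x², …, x^p.

```python
import numpy as np

# Hypothetical data showing curvature
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 2.9, 7.1, 13.2, 20.8, 31.1])

p = 2  # polynomial order
# Build the design matrix [1, x, x^2, ..., x^p] and solve the LS problem
X = np.vander(x, p + 1, increasing=True)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)                        # b0, b1, ..., bp
print(np.polyfit(x, y, p)[::-1])   # same coefficients via NumPy's helper
```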

Fitting a Linear Function of Several Variables

Matrix Form of Multiple Regression by LS In matrix notation the model is, in short, y = Xb + e, where y is the vector of observations, X is the matrix of explanatory variables (with a leading column of ones for the intercept), and b is the vector of coefficients. The LS criterion is to minimize D = e'e = (y − Xb)'(y − Xb). The LS solution is b = (X'X)⁻¹X'y.
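A minimal sketch of the matrix-form solution, using hypothetical data with two explanatory variables and an intercept:

```python
import numpy as np

# Hypothetical observations
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0, 4.5])
y  = np.array([4.9, 5.1, 10.8, 10.9, 15.2, 14.6])

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones_like(x1), x1, x2])

# b = (X'X)^(-1) X'y via the normal equations, and via lstsq (numerically preferred)
b_normal = np.linalg.solve(X.T @ X, X.T @ y)
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b_normal)
print(b_lstsq)
```

In practice, lstsq (or a QR factorization) is preferred over explicitly forming X'X, which can be ill-conditioned.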

Measure of Goodness-of-Fit
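The slide's formulas are not reproduced in this transcript; the standard measure here is the coefficient of determination, R² = 1 − SSE/SST, i.e., the fraction of the total variation of y about its mean that is explained by the fitted model. A minimal sketch continuing the simple linear fit above:

```python
import numpy as np

# Hypothetical (x, y) observations and the fitted straight line
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)
b1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

sse = np.sum((y - y_hat)**2)       # residual (error) sum of squares
sst = np.sum((y - y.mean())**2)    # total sum of squares about the mean
r_squared = 1.0 - sse / sst
print(r_squared)
```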

Example 1 (LS Method)

Example 1 (LS Method)

LS Example

LS Example (Matrix Approach)

LS Example (by Minitab w/ b0)

LS Example (by Minitab w/o b0)

LS Example (Output Plots)