
Creating Empirical Models Constructing a Simple Correlation and Regression-based Forecast Model Christopher Oludhe, Department of Meteorology, University of Nairobi. CLIPS Training Workshop for Eastern and Southern Africa, DMCN, 30th July 2002

Simple Linear Correlation Analysis Many problems in seasonal climate prediction begin by trying to establish a linear relationship between two sets of variables. An example would be to test whether the sea surface temperature (SST) over any of the global oceans (variable one) is related to rainfall (variable two) at a given location of the globe.

Simple Linear Correlation Cont.. Knowledge of such a relationship would be useful because the expected rainfall at the given location could then be predicted if the SSTs of the global oceans are known in advance. The strength of the relationship between the two variables can be determined by computing Pearson's coefficient of correlation, r.
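As an illustration, r can be computed directly from its definition. The sketch below is in Python and the SST and rainfall index values are invented for the example, not taken from the slides:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical standardised SST and rainfall indices (illustrative values only)
sst = [1.2, 0.8, -0.5, -1.0, 0.3]
rain = [-0.9, -0.6, 0.4, 1.1, -0.2]
r = pearson_r(sst, rain)  # strong negative correlation, r ≈ -0.99
```

A value near −1 or +1 indicates a strong linear association; a value near 0 indicates little linear association.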

Simple Linear Correlation Cont.. The statistical significance of the computed correlation coefficient r may be tested using the t-statistic

t = r √(n − 2) / √(1 − r²),

which has n − 2 degrees of freedom, where n is the number of data pairs. Accept or reject the null hypothesis (ρ = 0) depending on the comparison between the computed and tabulated values of t.
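The test statistic above can be sketched in Python; the values of r and n in the example are hypothetical:

```python
from math import sqrt

def t_statistic(r, n):
    """t statistic for testing H0: rho = 0; compare against the tabulated
    t value with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

# Hypothetical example: r = -0.75 computed from n = 20 paired seasons
t = t_statistic(-0.75, 20)  # |t| ≈ 4.81, well beyond t_0.025(18) ≈ 2.10
```

Since |t| exceeds the tabulated two-tailed 5% value, the null hypothesis of zero correlation would be rejected in this hypothetical case.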

Simple Linear Regression Equation A simple linear regression equation gives the functional relationship between two variables: Y = α + βx, where x is the independent variable (predictor) and Y is the dependent variable (response, or predictand). The regression constants α (Y-intercept) and β (slope of the line) are estimated by the method of least squares.

Regression Cont.. The least-squares solutions for the regression constants are given by the relations:

β = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²  and  α = ȳ − β x̄
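These two relations can be sketched directly in Python; the data below are illustrative, chosen so the points lie exactly on the line y = 1 + 2x:

```python
def least_squares(x, y):
    """Least-squares estimates of the intercept a and slope b of y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return a, b

# Illustrative data lying exactly on y = 1 + 2x
a, b = least_squares([0, 1, 2, 3], [1, 3, 5, 7])  # a = 1.0, b = 2.0
```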

Plotting of paired data Given a set of paired standardised rainfall (yᵢ) and SST (xᵢ) data, each (xᵢ, yᵢ) pair can be plotted as a point on a scatter diagram. (The table of data values in the original slide is not legible in this transcript.)

Scatter Plot and Line of Best Fit

Linear Regression Fit It can be seen that the relationship is linear but negative: when the SST index increases (decreases), the rainfall index decreases (increases). That is, a positive SST index is associated with a negative rainfall index, or drier-than-average conditions. Using this type of relationship, it is possible to make a qualitative statement about the expected rainfall for a coming season if the lagged seasonal SST index can be obtained just before the beginning of the season to be forecast.

Goodness of fit measure The goodness of "fit" of a regression model can be determined by examining the mean-squared error (MSE) in the ANOVA table output. This measure indicates the variability of the observed values around the fitted regression line. A perfect linear relationship between the predictor and predictand gives an MSE of zero, while poor fits result in large values of MSE. Another measure of the fit of a regression is the coefficient of determination (R²), which is the squared value of the Pearson correlation coefficient between predictor and predictand.

Measure Cont.. Qualitatively, R² can be interpreted as the proportion of the variance of the predictand that is described or accounted for by the regression. For a perfect regression R² = 1, while an R² close to 0 indicates that very little of the variance is explained by the regression line. In the majority of applications, however, the response of a predictand can be predicted more adequately from a collection of variables rather than on the basis of a single independent input variable.
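Both measures can be computed from the residuals of the fitted line; a minimal Python sketch (the perfect-fit data are illustrative):

```python
def fit_stats(x, y, a, b):
    """MSE and coefficient of determination R^2 for the fitted line y = a + b*x."""
    n = len(y)
    my = sum(y) / n
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # sum of squared errors
    sst = sum((yi - my) ** 2 for yi in y)                        # total sum of squares
    mse = sse / (n - 2)   # residual mean square, as in the ANOVA table
    r2 = 1.0 - sse / sst
    return mse, r2

# A perfect linear relationship gives MSE = 0 and R^2 = 1
mse, r2 = fit_stats([0, 1, 2, 3], [1, 3, 5, 7], a=1.0, b=2.0)
```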

Multiple Linear Regression In a multiple linear regression model, a single predictand Y (e.g. SOND rainfall) has more than one predictor variable; for example, it can be influenced by ENSO, the QBO, and SSTs over the Indian Ocean and/or the Atlantic Ocean. For k predictors: Y = β₀ + β₁x₁ + β₂x₂ + … + β_k x_k. The procedure for estimating the regression coefficients is the same as for simple linear regression models.
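A multiple regression fit can be sketched with NumPy's least-squares solver; the two predictor columns and the coefficients below are synthetic values invented for the example:

```python
import numpy as np

# Hypothetical predictors, e.g. an ENSO index and an Indian Ocean SST index
X = np.array([
    [ 1.0,  0.5],
    [ 0.2, -0.3],
    [-0.7,  0.1],
    [ 0.4,  0.9],
    [-1.1, -0.6],
])
# Synthetic predictand built with known coefficients beta = (0.5, 2.0, -1.0)
y = 0.5 + 2.0 * X[:, 0] - 1.0 * X[:, 1]

# Prepend a column of ones for the intercept beta_0, then solve by least squares
A = np.column_stack([np.ones(len(y)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # recovers (0.5, 2.0, -1.0)
```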

Stepwise Regression analysis Forward Selection: potential predictors are examined individually and added to the model equation one at a time, starting with the one that explains the most variance and continuing with whichever remaining predictor improves the model the most. Backward Elimination: the regression model starts with all potential predictors, and at each step of model construction the least important predictor is removed, until only the best predictors remain. A stopping criterion should be selected in both cases.
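Forward selection can be sketched as a greedy loop over R². This is an illustrative Python sketch, not the exact procedure of any particular package; the min_gain stopping threshold and the toy data are assumptions for the example:

```python
import numpy as np

def r_squared(A, y):
    """R^2 of a least-squares fit of y on the columns of A (intercept added)."""
    A1 = np.column_stack([np.ones(len(y)), A])
    beta, *_ = np.linalg.lstsq(A1, y, rcond=None)
    resid = y - A1 @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

def forward_select(X, y, min_gain=0.01):
    """Greedy forward selection: repeatedly add the predictor that raises R^2
    the most, stopping once the best improvement falls below min_gain."""
    chosen, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        score, j = max((r_squared(X[:, chosen + [j]], y), j) for j in remaining)
        if score - best < min_gain:
            break
        chosen.append(j)
        remaining.remove(j)
        best = score
    return chosen

# Toy data: the predictand depends only on the first predictor column
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0],
              [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]])
y = 2.0 * X[:, 0]
```

Backward elimination would run the same loop in reverse, starting from all predictors and dropping the one whose removal hurts R² the least.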

Cross-Validation Each cross-validation model is fitted with one year withheld from the data: Model 1 omits Year 1, Model 2 omits Year 2, and so on through Model 6, which omits Year 6. The final model (Model 7) is then fitted using all years.
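The scheme above can be sketched in Python for a simple linear model: the leave-one-out loop refits the regression once per omitted year and forecasts that year with the refitted line. The data in the example are illustrative:

```python
def leave_one_out_forecasts(x, y):
    """For each year i, fit the simple regression on all other years,
    then forecast year i with the resulting line (cross-validated forecast)."""
    n = len(x)
    preds = []
    for i in range(n):
        xs = [x[j] for j in range(n) if j != i]
        ys = [y[j] for j in range(n) if j != i]
        mx, my = sum(xs) / (n - 1), sum(ys) / (n - 1)
        b = sum((xj - mx) * (yj - my) for xj, yj in zip(xs, ys)) / \
            sum((xj - mx) ** 2 for xj in xs)
        a = my - b * mx
        preds.append(a + b * x[i])
    return preds

# Illustrative data lying exactly on y = 1 + 2x, so every forecast is exact
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
preds = leave_one_out_forecasts(x, y)
```

Because each forecast year never enters its own fit, the resulting skill estimate is less optimistic than one computed on the training data itself.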

Forecast Skill Estimation (Contingency Table)

        O_A   O_N   O_B
F_A      R     S     T
F_N      U     V     W
F_B      X     Y     Z

O: Observed; F: Forecast; A: Above-normal; N: Near-normal; B: Below-normal

Accuracy Measures of Multicategory Forecasts (1) Hit Score (HS): the number of times a correct category is forecast: HS = R + V + Z

Accuracy Measures of Multicategory Forecasts (2) False Alarm Ratio (FAR): The fraction of forecast events that failed to materialize Best FAR=0; worst FAR=1 For Above-Normal=(S+T)/(R+S+T) For Near-Normal=(U+W)/(U+V+W) For Below-Normal=(X+Y)/(X+Y+Z)

Accuracy Measures of Multicategory Forecasts (3) Bias: Comparison of the average forecast with the average observation Bias > 1 : overforecasting Bias < 1 : underforecasting For Above-Normal=(R+S+T)/(R+U+X) For Near-Normal=(U+V+W)/(S+V+Y) For Below-Normal=(X+Y+Z)/(T+W+Z)
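The three scores above can be computed together from the contingency table. A Python sketch using the same A/N/B category ordering (the function name is invented for the example; the example counts match the worked example later in these slides):

```python
def forecast_scores(table):
    """Scores for a 3x3 contingency table: table[i][j] is the number of times
    category i was forecast while category j was observed (order: A, N, B).
    Returns the hit score, per-category FAR, and per-category bias."""
    hit_score = sum(table[i][i] for i in range(3))
    # FAR: fraction of forecasts of each category that did not verify
    far = [(sum(table[i]) - table[i][i]) / sum(table[i]) for i in range(3)]
    # Bias: forecasts of each category divided by observations of that category
    bias = [sum(table[i]) / sum(row[i] for row in table) for i in range(3)]
    return hit_score, far, bias

# Illustrative table: rows F_A, F_N, F_B; columns O_A, O_N, O_B
hs, far, bias = forecast_scores([[4, 1, 0], [2, 2, 4], [0, 3, 2]])
```

Note that a category with no forecasts (an all-zero row) or no observations (an all-zero column) would make the corresponding ratio undefined; real verification code would need to guard against that.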

Example of Model Testing (1) Regression lines for some of the 20 cross-validation models (the fitted coefficients, written here generically as aᵢ and bᵢ, are not legible in this transcript):
Mod. 1: Y₁ = a₁ + b₁x (years 2 to 20)
Mod. 2: Y₂ = a₂ + b₂x (years 1 and 3 to 20)
…
Mod. 18: Y₁₈ = a₁₈ + b₁₈x (years 1 to 17, 19 and 20)
…
Mod. 20: Y₂₀ = a₂₀ + b₂₀x (years 1 to 19)

Example of Model Testing (2) Linear fits of the 20 cross-validation models. The red line is the fit of Model 18, which excludes the outlier at about (1.5, 1.5).

Example of Model Testing (3) Cross-validated forecasts (dashed) and observed values (solid) using data from 18 seasons. The horizontal lines on either side of the zero line mark the upper and lower limits of the Near-Normal category.

Example of Model Testing (4)

        O_A   O_N   O_B
F_A      4     1     0
F_N      2     2     4
F_B      0     3     2

HS = 4 + 2 + 2 = 8
BIAS_A = (4+1+0)/(4+2+0) = 5/6; BIAS_N = (2+2+4)/(1+2+3) = 8/6; BIAS_B = (0+3+2)/(0+4+2) = 5/6
FAR_A = (1+0)/(4+1+0) = 0.2; FAR_N = (2+4)/(2+2+4) = 0.75; FAR_B = (0+3)/(0+3+2) = 0.6