Statistical Evaluation of High-resolution Numerical Weather Model Forecasts of Temperature in the San Francisco Bay Area

Yilin Lu & Dave Dempsey
Department of Earth & Climate Sciences

Introduction

Why are weather predictions not always accurate? Inspired by this question, I pursued this research with the ultimate goal of improving a forecast model by first spotting its potential defects. In particular, I used the Weather Research & Forecasting (WRF) model, which can make weather forecasts on user-specified bounded regions in space (domains) with high spatial and temporal resolution.

In this project, we begin by checking whether forecast errors grow as the model forecasts further ahead in time. We used weather station observations to evaluate the accuracy of the model temperature forecasts in the San Francisco Bay Area by posing and testing the following statistical hypothesis:

Research Question: Do WRF model temperature forecasts become less and less accurate as forecast time increases?
Null Hypothesis (H0): No, the model won't make worse forecasts as forecast time increases.
Strategy: Test H0 statistically at the 95% confidence level.

Methods

Model Configuration

WRF Forecast = Initial Weather Conditions (at discrete grid points) + Changes over a period (solving a set of equations)

A lower-resolution National Weather Service forecast model provides the initial weather conditions and the forecast boundary conditions on the domain, which are required for solving the equations.

Our model forecast runs:
Daily frequency: 4 times/day (4 am, 10 am, 4 pm, & 10 pm PST)
Forecast length: 48 hours
Output interval: 1 hour
Test period: 111 days from January 1 to April 22

Weather Observations

Source: Meteorological Assimilation Data Ingest System (MADIS)
Number of stations: About 90 in the San Francisco Bay Area domain
Quality control: Reject missing and unreliable data

Figure 1: Low-resolution model grid (right), and the high-resolution WRF model grid for our San Francisco Bay Area domain (left), where there is one model forecast temperature at each grid point. Color-filled contours show an example of a forecast temperature pattern. Blue stars (left) show locations of weather stations used to evaluate the model forecasts.

Statistical Evaluation

How do we evaluate forecast accuracy?

Mean Absolute Error (MAE) = $\frac{1}{n}\sum_{i=1}^{n} |F_i - O_i|$, where $F_i$ is a forecast temperature, $O_i$ is the matching observed temperature, and $n$ is the number of forecast-observation pairs.

Problem 1: Model grid points and observations are at different locations (Fig. 1).
Solution: Interpolate forecasts from grid points to observation points.

Problem 2: In a plot of MAE against forecast hour, two patterns appear: a 6-hour cyclic pattern and an increasing trend (Fig. 2). The non-random cyclic pattern interferes with our hypothesis test of the statistical significance of the trend. (The linear regression residuals must be random.)
Solution: Remove the cyclic pattern by applying a 6-hour "moving average" to the mean absolute errors (Fig. 4).

Why is there a 6-hour cyclic pattern? Because there is a 12-hour semi-diurnal pattern in forecast errors, and the model is run every 6 hours.

Figure 2: Mean absolute error (MAE) in temperature from 4 runs/day for 111 days, vs. forecast hour.

Figure 3: MAE in temperature from 111 days for the 10 pm forecasts only, vs. forecast hour.

Figure 4: Temperature MAE smoothed with a 6-hour moving average (compare with Fig. 2). A least squares linear regression line is fit to the data.

Problem 3: MAEs at successive hours, like both observations and forecasts, are autocorrelated (i.e., not entirely independent). This dependence interferes with our hypothesis test of the statistical significance of the trend. (The linear regression residuals must be independent.)
Solution: Calculate the autocorrelation function for MAE vs. forecast hour. Use it to identify a sampling interval that ensures sample independence.
Result: Sample every 5 forecast hours.

Problem 4: In Figs. 2 & 4, each MAE is an average over 4 forecasts/day for 111 days, which sacrifices independent information from individual forecasts. We want to use this information, but MAEs on successive days are also autocorrelated.
Solution: (a) Average MAEs for each day (4 forecasts/day) to remove the diurnal cycle; (b) calculate the autocorrelation function for daily average MAE vs. day. Use it to identify a sampling interval that ensures sample independence.
Result: Sample the one-day MAE average every 3 days. (Fig. 5 shows the results of the solutions to both Problems 3 & 4.)

Figure 5: Mean absolute error in forecast temperature, sampled every 5 forecast hours and every 3 days, in the San Francisco Bay Area from January 1 to April 22.

We are now ready to test our hypothesis about the trend of MAE vs. forecast hour.

Statistical t-test & Results

Null Hypothesis: The trend (slope) of MAE vs. forecast hour ≤ 0 °C/hr
Select confidence level: 95%
Sample size of MAE = 297 (degrees of freedom = 295)
Estimate the slope using least squares linear regression (result: 0.005 °C/hr)
Calculate a sample t-statistic: (estimated slope - hypothesized slope) / standard error of the slope

Table 1: Results of right-tailed t-test

Confidence Level    Critical t    Sample t-Statistic    Result of test
95%                 1.65          2.461                 Reject H0

Discussion & Conclusion

The sample t-statistic (2.461) exceeds the critical value (1.65), so we can reject the null hypothesis and conclude with 95% confidence that the WRF model shows a statistically significant positive trend in forecast errors vs. forecast hour. Although the trend seems relatively small (0.24 °C in 48 hours), the impact of decreasing accuracy with forecast hour will depend on the application.

Additional observations leading to future research questions:
Forecast errors appear at the initialization time. We don't know how this might affect the trend.
The semi-diurnal pattern in MAE was a surprising discovery. Understanding the cause could lead to improvement of model forecasts.
Many more!
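The evaluation steps described above (MAE, the 6-hour moving average, sampling every 5 forecast hours, and the right-tailed t-test on the regression slope) can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: it assumes NumPy and SciPy are available, the function names are hypothetical, and the demo data are synthetic values invented only to exercise the functions.

```python
import numpy as np
from scipy import stats


def mean_absolute_error(forecast, observed):
    """MAE over all forecast-observation pairs."""
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean(np.abs(forecast - observed)))


def moving_average(values, window=6):
    """Moving average over complete windows only.

    The output is shorter than the input by window - 1 points.
    """
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(values, dtype=float), kernel, mode="valid")


def right_tailed_slope_test(x, y, hypothesized_slope=0.0, confidence=0.95):
    """Test H0: slope <= hypothesized_slope against H1: slope > hypothesized_slope.

    Returns (slope, t_statistic, critical_t, reject_h0).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    fit = stats.linregress(x, y)          # least squares fit; fit.stderr is the slope's standard error
    dof = len(x) - 2                      # two fitted parameters: slope and intercept
    t_stat = (fit.slope - hypothesized_slope) / fit.stderr
    t_crit = stats.t.ppf(confidence, dof) # one-sided critical value
    return fit.slope, t_stat, t_crit, bool(t_stat > t_crit)


if __name__ == "__main__":
    # Synthetic MAE series (an assumption for illustration, not the poster's data):
    # a small upward trend, a 6-hour cycle, and random noise.
    rng = np.random.default_rng(0)
    hours = np.arange(1, 49)
    mae = (1.5 + 0.005 * hours
           + 0.2 * np.sin(2 * np.pi * hours / 6)
           + rng.normal(0.0, 0.05, hours.size))

    smoothed = moving_average(mae, window=6)          # removes the 6-hour cycle
    smoothed_hours = moving_average(hours, window=6)  # window-centre hours

    # Sample every 5 forecast hours, the interval reported in the text.
    slope, t_stat, t_crit, reject = right_tailed_slope_test(
        smoothed_hours[::5], smoothed[::5])
    print(f"slope={slope:.4f}  t={t_stat:.3f}  critical t={t_crit:.3f}  reject H0: {reject}")
```

For 295 degrees of freedom, `stats.t.ppf(0.95, 295)` gives approximately 1.65, the critical value listed in Table 1. The sketch does not compute the autocorrelation function used to choose the sampling intervals; it simply takes the 5-hour interval as given.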
Acknowledgements

Atmospheric Sciences Education and Research Grants (ASERG)

Reference

Wilks, D.S. (2006). Statistical Methods in the Atmospheric Sciences. Elsevier Inc.