Summary of NDM Data Sample Analysis

Presentation transcript:

Summary of NDM Data Sample Analysis Option C: Regression Analysis

Contents
Regression Analysis per LDZ
In-Sample Results
Out-of-Sample Model fit
CWV Contribution
Conclusion

Regression Analysis
Regression model as follows:
Dummy variables (Bank Holidays, Easter, Christmas and so forth).
Weather variables introduced as per DESC meeting on 4th April (e.g. Temperature, Global Radiation, Rainfall and so forth).
Time intervals used based on office hours and domestic habits:
  Slot 1 from 5am to 8am
  Slot 2 from 9am to 4pm
  Slot 3 from 5pm to 10pm
  Slot 4 from 11pm to 4am
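The slot definitions above can be sketched as a small lookup. This is only an illustration: the function name `slot_for_hour` is hypothetical, and it assumes each boundary on the slide names a whole clock hour, inclusive (so "5am to 8am" covers hours 5, 6, 7 and 8), which the slide does not state.

```python
def slot_for_hour(hour: int) -> int:
    """Return the slot number (1-4) for an hour of the day (0-23).

    Assumption: slot boundaries are whole clock hours, inclusive at both ends.
    """
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be in 0..23, got {hour}")
    if 5 <= hour <= 8:
        return 1  # Slot 1: 5am to 8am
    if 9 <= hour <= 16:
        return 2  # Slot 2: 9am to 4pm
    if 17 <= hour <= 22:
        return 3  # Slot 3: 5pm to 10pm
    return 4      # Slot 4: 11pm to 4am, wrapping past midnight
```

Under that reading the four slots cover all 24 hours with no overlap.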

Regression Analysis
Data normalised by AQ because of erratic level changes observed year on year.
Yearly cut-off date is 1st April, due to the time span of the original files and the data deletion process.
Binary permutations of the variables were used to seek out the best regression fit at the p ≤ 5% significance level.
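One possible reading of "binary permutation" is an on/off inclusion mask over the candidate variables, with every mask tried in turn. The sketch below follows that reading and is not taken from the slides: `best_subset` and the `fit` callable are hypothetical names, and `fit` is assumed to return the R² of a model together with a p-value per variable.

```python
from itertools import product
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical interface: given variable names, return (R2, {name: p-value}).
FitFn = Callable[[Sequence[str]], Tuple[float, Dict[str, float]]]


def best_subset(
    candidates: Sequence[str], fit: FitFn, alpha: float = 0.05
) -> Tuple[Optional[Tuple[str, ...]], float]:
    """Try every 0/1 inclusion mask and keep the best-R2 subset in which
    every included variable is significant at the alpha level."""
    best_vars: Optional[Tuple[str, ...]] = None
    best_r2 = float("-inf")
    for mask in product((0, 1), repeat=len(candidates)):
        chosen = tuple(name for name, bit in zip(candidates, mask) if bit)
        if not chosen:
            continue  # skip the empty model
        r2, pvalues = fit(chosen)
        if any(pvalues[name] > alpha for name in chosen):
            continue  # at least one variable fails the significance test
        if r2 > best_r2:
            best_vars, best_r2 = chosen, r2
    return best_vars, best_r2
```

The number of masks doubles with each extra candidate, so an exhaustive search like this is only practical for a modest number of variables.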

Regression Analysis: Models used
A benchmark model was used for each LDZ:
  Normalised Consumption = Intercept + a0 * CWV
Using binary permutations of the candidate variables, the best-fitting linear regression model (highest R²) is chosen. It has the form:
  Normalised Consumption = Intercept + a0 * CWV + a1 * Temperature + a2 * Windspeed + a3 * Solar Radiation + …
In-Sample data runs from April 2008 to March 2011; Out-of-Sample data runs from April 2011 to March 2012.
These models were applied to End-User Category 1 (EUC1) only.
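The benchmark model and the sample split can be sketched as follows. This is a minimal illustration assuming ordinary least squares and whole-month boundaries (1 April 2008 to 31 March 2011, and 1 April 2011 to 31 March 2012). The function names and the row layout `(day, cwv, normalised_consumption)` are hypothetical.

```python
from datetime import date
from statistics import mean

# Assumed period boundaries, read from the months stated on the slide.
IN_SAMPLE = (date(2008, 4, 1), date(2011, 3, 31))
OUT_OF_SAMPLE = (date(2011, 4, 1), date(2012, 3, 31))


def split_by_period(rows):
    """Split (day, cwv, normalised_consumption) rows into the two samples."""
    ins = [r for r in rows if IN_SAMPLE[0] <= r[0] <= IN_SAMPLE[1]]
    out = [r for r in rows if OUT_OF_SAMPLE[0] <= r[0] <= OUT_OF_SAMPLE[1]]
    return ins, out


def fit_benchmark(rows):
    """Least-squares fit of: normalised consumption = intercept + a0 * CWV.

    Returns (intercept, a0).
    """
    xs = [r[1] for r in rows]
    ys = [r[2] for r in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("CWV has no variation; slope is undefined")
    a0 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - a0 * x_bar, a0


def predict_benchmark(intercept, a0, cwv):
    """Predicted normalised consumption for one CWV value."""
    return intercept + a0 * cwv
```

Fitting on the in-sample rows and predicting on the out-of-sample rows keeps the two periods separate, which is what makes the out-of-sample results a check on model fit.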

Regression Analysis: Parameters (1 of 2)
LDZ columns: EA EM NE NO NW SC SE SW WM WS
Values are listed in slide order; rows with fewer than ten values do not show which LDZ each value belongs to.
Intercept: 0.006914 0.006223 0.005724 0.005377 0.006518 0.005753 0.007031 0.007086 0.006198 0.006572
CWV: -0.00039 -0.00038 -0.00032 -0.00033 -0.00029 -0.0004 -0.00044 -0.00035
mean_Temp: -0.00009 0.00011 0.000075 -0.00002 -0.00014 0.000019 -0.00006 0.000024 0.00000351 -0.00005
mean_Windspeed: 0.000025 0.000015 0.00006
mean_WindDirection: -7.55E-07 -1.03E-06 -7.45E-07 -2.85E-07
mean_Humidity: -1.16E-07 0.00000266 0.000012 0.00000427 0.00000174
mean_Global_Radiation: -5.13E-07 -2.52E-07 -1.15E-06 4.23E-07 6.97E-07
mean_Rainfall: 0.00024 0.000178 0.000179 0.000476 0.000145
mean_Temp_lag1: -0.00001 -8.53E-06 -8.35E-06 -8.83E-06 0.0000034
mean_Windspeed_lag1: 0.000022 0.000026 0.000014 0.000041 0.00003
mean_WindDirection_lag1: 4.623E-07 3.49E-07 3.48E-07
mean_Humidity_lag1: -7.05E-07 -2.05E-06 -2.06E-06 -1.87E-06
mean_Global_Radiation_lag1: 1.376E-07 -1.47E-07 6.75E-08
mean_Rainfall_lag1: 0.000113 -0.00007 0.000096
WeekEnd: 0.000094 0.000073 0.000071
Mon_Fri: 0.00000264
WeekEnd_from__Friday: (none listed)
Bank__Hols: 0.000064
School_Hols: 0.000066 0.00007 0.00012 -2.35E-06
Mon_Thurs: -0.00013 -0.00003
Slot1_Windspeed: (none listed)
Slot1_Rainfall: -0.0001

Regression Analysis: Parameters (2 of 2)
LDZ columns: EA EM NE NO NW SC SE SW WM WS
Slot1_GlobalRadiation: 2.293E-07 3.061E-07 0.00000351
Slot1_Temp: 0.000032 -0.00005 -0.00004 0.000021
Slot1_WindDirection: -3.68E-07 2.86E-07
Slot1_Humidity: -3.08E-06
Slot2_Windspeed: 9.999E-06 -0.00003
Slot2_Rainfall: 0.000033
Slot2_GlobalRadiation: -3.51E-08 -2.45E-07 -4.49E-07
Slot2_Temp: -0.00002 0.000059
Slot2_WindDirection: 5.639E-07 3.091E-07 -4.25E-07
Slot2_Humidity: 3.617E-06
Slot3_Temp: 0.000053 0.000028 0.000029 0.000013 0.000043 0.000022
Slot3_Windspeed: -7.84E-06
Slot3_GlobalRadiation: 1.357E-08 2.51E-07
Slot3_Rainfall: -0.00009 0.000061
Slot3_WindDirection: 4.538E-07
Slot3_Humidity: 3.044E-06
Slot4_Temp: 0.000015
Slot4_WindDirection: 3.71E-07
Slot4_Humidity: -1.75E-06 -5.51E-06 -2.28E-06
Slot4_GlobalRadiation: (none listed)
Slot4_Windspeed: (none listed)
Slot4_Rainfall: 0.000031 0.000072

In-Sample MAPE Results

In-Sample R² Results

Out-of-Sample MAPE Results

Out-of-Sample R² Results
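The four results slides report MAPE and R², but the transcript does not give the formulas used. The sketch below assumes the standard definitions (mean absolute percentage error, and one minus the ratio of residual to total sum of squares). The function names are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero (raises ZeroDivisionError).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    y_bar = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values have no variation; R2 is undefined")
    return 1.0 - ss_res / ss_tot
```

Computed this way, R² on out-of-sample data can fall below zero when a model predicts worse than the sample mean, so in-sample and out-of-sample values are not directly interchangeable.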

Analysis of Contribution of CWV in Optimised Models

Conclusion
Improvements against the benchmark results are made by using weather and/or calendar effects on top of CWV.
The significance, or non-significance, of Weekend/Weekday/Bank Holiday is very much LDZ-specific.
Global Radiation is a significant variable in all LDZs.
Time intervals (i.e. Slots 1 to 4) and the Monday-to-Thursday dummy variable help explain customer behaviour in some LDZs.
Relative Humidity stands out in almost every LDZ.
CWV contributes heavily to the optimised models obtained.
No cross-effects were used in the regression models.
LDZs SO and NT need further investigation.