Quantifying efficiency of homogenisation methods Dr. Peter Domonkos COST HOME ES0601.



Measuring efficiency: our expectations. Recovering the real climatic trends; recovering the real trends and fluctuations; identifying large inhomogeneity shifts one by one; identifying as many shifts as we can.

Measuring efficiency: general practice. Usually the rate of correct detection is examined (Ducré-Robitaille, Mestre, Menne and Williams, etc.). Menne and Williams (2005) apply the hit rate (or power, H), the false detection rate (F), the false alarm rate (FAR), the bias of detection frequency (B), and the improvement in skill compared to random forecasts (HSS).
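The scores named above can be sketched from a 2x2 contingency table. This is a minimal illustration that assumes the standard forecast-verification definitions and reads HSS as the Heidke skill score; the slide does not spell out the formulas Menne and Williams used, and the function and argument names are hypothetical.

```python
def detection_scores(hits, false_alarms, misses, correct_negatives):
    """Standard 2x2 verification scores (assumed definitions, see lead-in)."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    return {
        # Hit rate (power): detected breaks / all real breaks
        "H": a / (a + c),
        # False detection rate: false alarms / all break-free cases
        "F": b / (b + d),
        # False alarm rate: false alarms / all detections
        "FAR": b / (a + b),
        # Bias of detection frequency: detections / real breaks
        "B": (a + b) / (a + c),
        # Heidke skill score: improvement over random forecasts
        "HSS": 2 * (a * d - b * c)
        / ((a + c) * (c + d) + (a + b) * (b + d)),
    }


# Hypothetical counts, for illustration only
scores = detection_scores(hits=30, false_alarms=10, misses=20,
                          correct_negatives=940)
```

A bias B below 1 means the method reports fewer breaks than really exist; above 1, it reports more.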


Measuring efficiency: this presentation. Arbitrary but reasonable choices. Unit of magnitude: 1 = standard deviation of the estimated noise.
Factual shift: a shift of magnitude M ≥ M_0 between two adjacent 3-year-long periods; M_0 = 2 or M_0 = 3 here.
Right detection: a shift with M ≥ 1.5 for M_0 = 2 (M ≥ 2 for M_0 = 3) is detected with a lapse of at most 1 year.
False detection: a shift with M ≥ 1.5 for M_0 = 2 (M ≥ 2 for M_0 = 3) is detected at year j, but there is no shift with M > 0 in the same direction as the detected one within the (j-2, j+2) period.

Measuring efficiency: this presentation. Let m be the number of time series, k the total number of factual shifts, D_R the number of right detections and D_F the number of false detections. (The formulas built from these quantities are not preserved in this transcript.)
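The counting rules of this presentation can be sketched as follows. This is an illustration under stated assumptions, not the author's code: shifts are given as (year, signed size in noise standard deviations), the true shift list is taken as known, the (j-2, j+2) period is read as inclusive, each factual shift is matched by at most one detection, and the comparison signs follow the reading M ≥ M_0. The names `Shift` and `count_detections` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shift:
    year: int
    size: float  # signed magnitude, in units of the noise standard deviation


# Minimum detected magnitude that is evaluated, per factual threshold M_0
DETECTION_MINIMUM = {2: 1.5, 3: 2.0}


def count_detections(true_shifts, detected_shifts, m0=2):
    """Return (k, D_R, D_F) for one time series."""
    min_size = DETECTION_MINIMUM[m0]
    factual = [s for s in true_shifts if abs(s.size) >= m0]
    matched = set()
    right = false = 0
    for det in detected_shifts:
        if abs(det.size) < min_size:
            continue  # too small to be evaluated
        # Right detection: within 1 year of a not yet matched factual shift
        hit = next((i for i, f in enumerate(factual)
                    if i not in matched and abs(det.year - f.year) <= 1),
                   None)
        if hit is not None:
            matched.add(hit)
            right += 1
            continue
        # False detection: no true shift of the same direction within 2 years
        same_direction_nearby = any(
            abs(det.year - t.year) <= 2 and t.size * det.size > 0
            for t in true_shifts)
        if not same_direction_nearby:
            false += 1
    return len(factual), right, false
```

Detections that miss every factual shift but sit next to a small true shift of the same direction are counted as neither right nor false, which follows the definitions above. Summing the three counts over all m series gives k, D_R and D_F for a dataset.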

Measuring efficiency: this presentation. Reliability of trends? Let the mean bias of trend slopes caused by inhomogeneities be t_0 before homogenisation and t after homogenisation. The improvement in trend reliability is then indicated by a formula in t_0 and t, followed by the general (combined) efficiency (Domonkos, 5th Seminar, 2006); neither formula is preserved in this transcript.
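How a single shift biases a trend slope can be illustrated with an ordinary least-squares fit. This sketch only shows the slope bias of one series; how the biases are averaged into t_0 and t, and the efficiency formulas themselves, are not given in the transcript and are not reconstructed here. The true trend of 0.02 per year and the function names are hypothetical.

```python
def trend_slope(series):
    """Ordinary least-squares slope of a yearly series (per year)."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def slope_bias(observed, truth):
    """Trend slope error caused by inhomogeneities in `observed`."""
    return trend_slope(observed) - trend_slope(truth)


# Hypothetical example: 100 years, true trend 0.02 per year,
# one shift of M = 3 starting at year 50 (noise omitted for clarity)
truth = [0.02 * t for t in range(100)]
observed = [y + (3.0 if t >= 50 else 0.0) for t, y in enumerate(truth)]
bias = slope_bias(observed, truth)  # about 0.045 per year
```

In this example the uncorrected shift adds a spurious slope more than twice the size of the true trend, which is why trend reliability is evaluated separately from shift detection.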

Properties of time series. Five versions of simulated datasets are examined here. Each dataset contains 10,000 time series, each one hundred years long. The properties range widely, from a single inhomogeneity per time series to very complex inhomogeneity structures, the "Hungarian standard" (Domonkos, 5th Seminar, 2006).
(1) 1 shift with M = 3;
(2) 1 shift with M = 3 and 4 shifts with M = 1.5;
(3) and (4) shifts with a frequency of 1 per decade, exponential distribution of M above 1 and uniform distribution of M below 1, with (3) M_max < 2 and (4) M_max < 3;
(5) Hungarian standard.
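Dataset (1) can be sketched as below. The slide does not describe the generator, so several choices are assumptions: Gaussian white noise with unit standard deviation, a positive shift, a shift year drawn uniformly while keeping 5 years clear of each end, and hypothetical function names.

```python
import random


def step_signal(n_years, shifts):
    """Noise-free signal: each (year, size) raises all values from `year` on."""
    signal = [0.0] * n_years
    for year, size in shifts:
        for t in range(year, n_years):
            signal[t] += size
    return signal


def simulate_series(n_years=100, shifts=((50, 3.0),), seed=None):
    """Step signal plus white noise with standard deviation 1 (assumed)."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, 1.0) for s in step_signal(n_years, shifts)]


def simulate_dataset_1(n_series=10_000, n_years=100, seed=0):
    """Dataset (1): one shift with M = 3 per series, at a random year."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_series):
        year = rng.randrange(5, n_years - 5)  # assumed placement rule
        dataset.append(
            simulate_series(n_years, ((year, 3.0),), seed=rng.getrandbits(32)))
    return dataset
```

Dataset (2) would follow from the same helper by passing five (year, size) pairs, one with size 3.0 and four with size 1.5.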

Distribution of the difference (in percent) between the detected inhomogeneity properties of simulated and real climatic time series for the Hungarian standard. k: simple; wk: weighted by sample size.

Homogenisation methods. 15 objective homogenisation methods: two versions each of the Bayes test [Bay, Ba1], Buishand test [Bu1, Bu2], SNHT [SNH, SNT] and t-test [tt1, tt2]; the Caussinus-Mestre test [C-M], Easterling-Peterson test [E-P], Mann-Kendall test [M-K], MASH [MAS], Multiple Linear Regression [MLR], Pettitt test [Pet] and Wilcoxon Rank Sum test [WRS].

Method parameterisation. With the original parameterisations, the chance of detecting at least 1 inhomogeneity in pure white noise is ~5%. The minimum length of subperiods for calculating their own statistical properties is usually 5 years, but 1 year in C-M and MAS, and 3 years in E-P. Outliers are prefiltered. For multiple inhomogeneities, the semihierarchic algorithm of Moberg and Alexandersson (1997) is included in Bay, Ba1, Bu1, Bu2, M-K, MLR, Pet, SNH, SNT and WRS. In a few experiments an optimised parameterisation is applied (its use is indicated).
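The ~5% white-noise criterion can be illustrated with a Monte Carlo calibration. The test statistic here is a hypothetical stand-in (the largest standardised difference of means over all break positions, with a minimum subperiod of 5 years), not one of the 15 methods above; unit-variance Gaussian noise and the function names are also assumptions.

```python
import random


def max_shift_statistic(series, min_len=5):
    """Largest |mean(right) - mean(left)| / standard error over break positions."""
    n = len(series)
    total = sum(series)
    left_sum = sum(series[:min_len - 1])
    best = 0.0
    for k in range(min_len, n - min_len + 1):
        left_sum += series[k - 1]  # now the sum of the first k values
        diff = (total - left_sum) / (n - k) - left_sum / k
        se = (1.0 / k + 1.0 / (n - k)) ** 0.5  # for unit-variance noise
        best = max(best, abs(diff) / se)
    return best


def critical_value(n_years=100, n_sim=1000, alpha=0.05, seed=1):
    """Threshold exceeded in about `alpha` of pure white-noise series."""
    rng = random.Random(seed)
    stats = sorted(
        max_shift_statistic([rng.gauss(0.0, 1.0) for _ in range(n_years)])
        for _ in range(n_sim)
    )
    index = min(n_sim - 1, round((1.0 - alpha) * n_sim))
    return stats[index]
```

Because the maximum is taken over many break positions, the calibrated threshold lies well above the 1.96 that a single two-sided test at the 5% level would use.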

Colour key: red = C-M, blue = MASH, green = E-P, black = t-test (tt1), brown = SNHT for shifts, lilac = MLR.

Identification A, 1 shift (M=3)

Identification A, 1 shift (M=3) + 4 small shifts

Identification A of M ≥ 3, Exp. M < 6

Identification A of M ≥ 2, Hu standard

Identification A of M ≥ 3, Hu standard

Identification B of M ≥ 2, Exp. M < 2

Identification B of M ≥ 3, Exp. M < 2

Absence of large shifts. Number of kinds: 7; best: tt1, C-M, Bay.

Trends, 1 shift (M = 3). Filled columns = optimised parameters.

Trends, 1 shift + 4 small shifts

Trends, Exp. M<2

Trends, Exp. M<6

Trends, Hu standard

Identification A, 1 shift

Identification A, 1 shift + 4 small shifts

Identification B of M ≥ 2, Exp. M < 2

Identification B of M ≥ 3, Exp. M < 2

Identification A of M ≥ 2, Hu standard

Identification A of M ≥ 3, Hu standard

Identification A of M ≥ 3, Exp. M < 6

Discussion. MASH is best at identifying shifts with M > 3, but its reproduction of climatic trends is not among the best results. This drawback of MASH can be reduced with parameter optimisation. C-M is at the top in many results, except where the rate of large inhomogeneities is very low. If the evaluations of shorter than 3-year sections are excluded, and detection results with M 3 exceeds the performance of MASH.

Conclusions. The efficiency ranking of homogenisation methods depends strongly on the properties of the time series, the purposes and priorities of the homogenisation, and the way efficiency is evaluated. Direct methods for identifying multiple inhomogeneities (C-M and MASH) usually perform better than the other methods. When avoiding false detections is of enhanced importance, the t-test and E-P methods are also competitive. Parameter optimisation may yield improved results.

Thank you for your attention! COST HOME ES0601