On the multiple breakpoint problem and the number of significant breaks in homogenisation of climate records Separation of true from spurious breaks Ralf.

Presentation transcript:

On the multiple breakpoint problem and the number of significant breaks in homogenisation of climate records
Separation of true from spurious breaks
Ralf Lindau & Victor Venema, University of Bonn

12th EMS Annual Meeting, Lodz, Poland – 13 September 2012

Internal and External Variance
Consider the differences between one station and a neighbour or a reference. Breaks are defined as abrupt changes in this station-minus-reference time series.
Internal variance: the variance within the subperiods.
External variance: the variance between the means of the different subperiods.
Criterion: maximum external variance attained with a minimum number of breaks.
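The criterion can be illustrated with a minimal sketch that, for a fixed number of breaks k, tries every possible set of break positions and keeps the one with the largest external variance. The function names `external_variance` and `best_breaks` and the toy series are assumptions made for this illustration. Exhaustive search is only feasible for very short series and is not claimed to be the search strategy used by the authors.

```python
from itertools import combinations
from statistics import fmean


def external_variance(series, breaks):
    """Variance between the subperiod means, weighted by subperiod length.

    `breaks` holds the sorted indices at which a new subperiod starts.
    """
    n = len(series)
    overall = fmean(series)
    edges = [0, *breaks, n]
    return sum(
        (stop - start) * (fmean(series[start:stop]) - overall) ** 2
        for start, stop in zip(edges[:-1], edges[1:])
    ) / n


def best_breaks(series, k):
    """Exhaustively search for the k break positions that maximise the external variance."""
    candidates = combinations(range(1, len(series)), k)
    return max(candidates, key=lambda b: external_variance(series, b))


# Hypothetical difference series with level shifts after the 4th and 8th value.
series = [0.0, 0.1, -0.1, 0.0, 2.0, 2.1, 1.9, 2.0, -1.0, -1.1, -0.9, -1.0]
found = best_breaks(series, k=2)
print(found)
```

For this toy series the search returns the two positions at which the level shifts, because any other segmentation mixes different levels within one subperiod and leaves more variance unexplained.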

Decomposition of Variance
$n$: total number of years
$N$: number of subperiods
$n_i$: number of years within subperiod $i$
The sum of external and internal variance is constant.
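The statement that external and internal variance add up to a constant can be checked numerically. The sketch below is an illustration under assumed names (`variance_decomposition`, the toy series and the break positions are not from the slides). It computes both parts for two different segmentations of the same series and compares their sum with the total variance.

```python
from statistics import fmean


def variance_decomposition(series, breaks):
    """Return (external, internal, total) variance for one segmentation.

    `breaks` holds the indices at which a new subperiod starts.
    """
    n = len(series)
    overall = fmean(series)
    edges = [0, *sorted(breaks), n]
    external = 0.0
    internal = 0.0
    for start, stop in zip(edges[:-1], edges[1:]):
        segment = series[start:stop]
        seg_mean = fmean(segment)
        # Between-subperiod part: squared deviation of the subperiod mean, weighted by n_i.
        external += len(segment) * (seg_mean - overall) ** 2
        # Within-subperiod part: squared deviations from the subperiod's own mean.
        internal += sum((x - seg_mean) ** 2 for x in segment)
    total = sum((x - overall) ** 2 for x in series)
    return external / n, internal / n, total / n


series = [0.1, -0.2, 0.0, 1.1, 0.9, 1.0, 1.2, -0.5, -0.4]
ext_a, int_a, tot_a = variance_decomposition(series, breaks=[3, 7])
ext_b, int_b, tot_b = variance_decomposition(series, breaks=[2, 5])
print(ext_a + int_a, ext_b + int_b, tot_a)
```

Moving the breaks shifts variance between the external and the internal part, while their sum stays equal to the total variance of the series.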

First Question
How do random data behave? Needed as stop criterion for the number of significant breaks.

Random Time Series with stddev = 1
Segment averages $\bar{x}_i$ scatter randomly with mean 0 and standard deviation $1/\sqrt{n_i}$, because any deviation from zero can be seen as inaccuracy due to the limited number of members.
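A small Monte Carlo experiment can make the scatter of segment averages concrete. The sketch below assumes independent standard normal data. The function name `segment_mean_spread`, the number of trials and the seed are arbitrary choices for this illustration.

```python
import random
from statistics import fmean, pstdev


def segment_mean_spread(n_i, trials=20000, seed=1):
    """Simulate averages of n_i standard normal values; return their mean and stddev."""
    rng = random.Random(seed)
    means = [
        fmean([rng.gauss(0.0, 1.0) for _ in range(n_i)])
        for _ in range(trials)
    ]
    return fmean(means), pstdev(means)


mean_4, sd_4 = segment_mean_spread(4)     # theory: stddev 1/sqrt(4) = 0.5
mean_25, sd_25 = segment_mean_spread(25)  # theory: stddev 1/sqrt(25) = 0.2
print(mean_4, sd_4, mean_25, sd_25)
```

The simulated standard deviations should lie close to 0.5 and 0.2, so longer segments have averages that scatter less around zero.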

χ²-distribution
The external variance is a weighted measure of the variability of the subperiods' means.
It is equal to the mean square sum of a standard normally distributed random variable.

From χ² to β-distribution
As the total variance is normalized to 1, a kind of normalized χ²-distribution is expected: this is the β-distribution.
The exceeding probability P, given by the incomplete beta function, gives the best (maximum) solution for v.
[Figure: distribution of the external variance v for k = 7 breaks in n = 21 years, data compared with the β-distribution.]
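The β-distribution of the normalized external variance can be checked by simulation for the slide's example of k = 7 breaks in n = 21 years. The sketch rests on an assumption that should be read carefully: the break positions are drawn at random, independently of the data, so it does not reproduce the search for the best segmentation. For such fixed segmentations, standard analysis-of-variance theory gives a β-distribution with parameters k/2 and (n - 1 - k)/2, whose mean is k/(n - 1) = 0.35. The function names are hypothetical.

```python
import random
from statistics import fmean


def external_share(series, breaks):
    """External variance divided by total variance for one segmentation."""
    n = len(series)
    overall = fmean(series)
    edges = [0, *sorted(breaks), n]
    between = sum(
        (stop - start) * (fmean(series[start:stop]) - overall) ** 2
        for start, stop in zip(edges[:-1], edges[1:])
    )
    total = sum((x - overall) ** 2 for x in series)
    return between / total


def simulate_mean_share(n=21, k=7, trials=5000, seed=7):
    """Average external share over random series with randomly placed breaks."""
    rng = random.Random(seed)
    shares = []
    for _ in range(trials):
        series = [rng.gauss(0.0, 1.0) for _ in range(n)]
        breaks = rng.sample(range(1, n), k)  # positions chosen blindly, not optimised
        shares.append(external_share(series, breaks))
    return fmean(shares)


mean_share = simulate_mean_share()
print(mean_share)  # should lie close to k / (n - 1) = 0.35
```

Choosing the best of all possible segmentations, as a homogenisation algorithm does, selects from the upper tail of this distribution, which is where the exceeding probability comes in.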

Added variance per break
The incomplete β-function is transformed to dv/dk, the external variance added per break.
[Figure: dv/dk, with curves labelled mean, 90% and 95%.]

The existing algorithm Prodige
Original formulation of Caussinus and Mestre for the penalty term in Prodige.
Translation into the terms used here.
Normalisation by k* = k / (n - 1).
Differentiation to find the minimum.
In Prodige it is postulated that the relative gain of external variance is a constant for a given n.
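The formulas of this slide are not preserved in the transcript. The following is a reconstruction sketch, assuming the penalised log-likelihood criterion as it is commonly cited from Caussinus and Mestre, with v denoting the external variance normalized by the total variance. It should be read as an assumption, not as the slide's exact notation.

```latex
% Assumed form of the criterion to be minimised:
C_k = \ln(1 - v) + \frac{2k}{n-1}\,\ln n \;\to\; \min
% With the normalisation k^* = k/(n-1):
C_k = \ln(1 - v) + 2\,k^*\,\ln n
% Setting the derivative with respect to k^* to zero:
\frac{dC_k}{dk^*} = -\frac{1}{1-v}\,\frac{dv}{dk^*} + 2\ln n = 0
\quad\Longrightarrow\quad
\frac{1}{1-v}\,\frac{dv}{dk^*} = 2\ln n
```

Under this assumed form, the gain of external variance relative to the remaining internal variance 1 - v depends only on n, which matches the postulate stated above.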

Shorter length, less certainty
[Figure: two panels, n = 21 years and n = 101 years, with curves for exceeding probabilities 1/128, 1/64, 1/32, 1/16, 1/8 and 1/4.]

Second Question
How do true breaks behave?

True Breaks

Identical Behaviour
True breaks behave identically to random data, but the abscissa scale is now $k/n_k$ instead of $k/n$, where $n_k$ is the number of true breaks.
Compared to random time series, the external variance grows faster by the factor $n/n_k$.
[Figure: data versus theory for $n_k$ = 19 true breaks within a time series of n = 100 years; abscissa: assumed / true break number $k/n_k$.]

Break vs Scatter Regime
Simulated data with 19 breaks, superimposed by scatter.
The internal variance decreases as a function of the break number.
In the break regime the variance decreases faster by the factor $n/n_k$ (time series length divided by number of true breaks).
15 breaks are detectable, depending on the signal-to-noise ratio.

Conclusions
The analysis of random data shows that the external variance is β-distributed, which leads to a new formulation for the penalty term.
True breaks are also β-distributed. Their external variance increases faster by a factor of $n/n_k$ compared to random scatter.