Break Position Errors in Climate Records
Ralf Lindau & Victor Venema, University of Bonn, Germany

Presentation transcript:

12th International Meeting on Statistical Climatology, Jeju, Korea – 24. June 2013

Internal and External Variance
Consider the differences between one station and a neighboring reference. Breaks are defined as abrupt changes in the station-minus-reference time series.
Internal variance: the variance within the subperiods.
External variance: the variance between the means of different subperiods.
Break criterion: maximum external variance.

Decomposition of Variance
n: total number of years. N: number of subperiods. n_i: number of years within subperiod i.
The sum of external and internal variance is constant.
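
The constancy of the sum can be checked numerically. The sketch below assumes that both variances are normalized by n and that the external part weights each subperiod mean by its length n_i; the function name and the sample series are hypothetical and not taken from the slides.

```python
from statistics import fmean

def variance_decomposition(series, breaks):
    """Return (internal, external, total) variance for a segmented series.

    `breaks` holds the indices at which a new subperiod starts; every
    resulting subperiod must contain at least one value.
    """
    n = len(series)
    edges = [0, *sorted(breaks), n]
    overall_mean = fmean(series)
    internal = external = 0.0
    for start, stop in zip(edges[:-1], edges[1:]):
        segment = series[start:stop]
        segment_mean = fmean(segment)
        # Internal part: scatter of the values around their own subperiod mean.
        internal += sum((x - segment_mean) ** 2 for x in segment)
        # External part: squared offset of the subperiod mean, weighted by n_i.
        external += len(segment) * (segment_mean - overall_mean) ** 2
    total = sum((x - overall_mean) ** 2 for x in series)
    return internal / n, external / n, total / n

# Hypothetical 10-year difference series with breaks after year 4 and year 8.
series = [0.1, -0.3, 0.2, 0.0, 2.1, 1.8, 2.2, 1.9, 0.9, 1.1]
internal, external, total = variance_decomposition(series, breaks=[4, 8])
print(f"internal + external = {internal + external:.6f}, total = {total:.6f}")
```

Whatever break positions are chosen, the two parts add up to the total variance, so maximizing the external variance is equivalent to minimizing the internal variance.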

Position errors
Consider two segments of lengths n_1 and n_2 with means x_1 and x_2. A subsegment of length m with mean x_0 is erroneously moved from segment 2 to segment 1. As a result, x_1 is strongly reduced while x_2 changes only slightly, so x_1 and x_2 converge. This reduces the external variance, and the wrong segmentation is rejected.

Change of external variance
The change of external variance, Δv, is a function only of the means and lengths of the two segments and of the exchanged subsegment.
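
The formula on this slide is not preserved in the transcript. As a sketch only, assuming the external variance is defined as the length-weighted variance of the segment means (first line below), the change caused by moving a subsegment of length m and mean x_0 from segment 2 to segment 1 can be worked out as follows. The primed means x_1' and x_2' are the segment means after the exchange; the overall mean is unchanged, so its term cancels.

```latex
\begin{aligned}
v_{\mathrm{ext}} &= \frac{1}{n}\sum_{i=1}^{N} n_i\,(\bar{x}_i-\bar{x})^2 \\
x_1' &= \frac{n_1 x_1 + m x_0}{n_1+m},\qquad
x_2' = \frac{n_2 x_2 - m x_0}{n_2-m} \\
n\,\Delta v &= (n_1+m)\,x_1'^2 + (n_2-m)\,x_2'^2 - n_1 x_1^2 - n_2 x_2^2 \\
&= m\left[\frac{n_2\,(x_0-x_2)^2}{n_2-m} - \frac{n_1\,(x_0-x_1)^2}{n_1+m}\right]
\end{aligned}
```

Only the two segment means, the two segment lengths, and the mean and length of the exchanged subsegment appear in the result.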

Express x_0 by x_2 plus scatter
The mean of the exchanged subsegment, x_0, equals x_2, the mean of the segment it stems from, plus a random scatter variable δ: x_0 = x_2 + δ. Because δ is a mean over m random numbers, it depends on the internal variance σ² and the length m: for independent numbers its variance is σ²/m.

Quadratic function for Δv
Replace x_0 by x_2 + δ and normalize by the square of the jump height d. The change of the normalized external variance v*, which is the decision criterion for break detection, is a quadratic function of the random variable δ, whose distribution depends on the signal-to-noise ratio and on the length of the exchanged subsegment.

Zero points
If the parabola becomes positive, shifting the break position by m items increases the external variance, so this wrong solution is preferred by mistake. The parabola has two zero points, one negative and one positive; between them the change is negative and the correct break position is retained.
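
The zero-point formula itself is missing from the transcript. The sketch below rests on an assumption: that the normalized change has the form m [A δ*² − B (δ* − 1)²] with A = n_2 / (n_2 − m), B = n_1 / (n_1 + m) and δ* = δ / d, which follows if the external variance is the length-weighted variance of the segment means. Function names and the segment lengths of 50 are illustrative.

```python
import math

def normalized_change(delta_star, n1, n2, m):
    """n * (change of external variance) / d**2 for a break shifted by m items.

    delta_star is the scatter of the exchanged subsegment mean around x2,
    expressed in units of the jump height d = x1 - x2.
    """
    a = n2 / (n2 - m)
    b = n1 / (n1 + m)
    return m * (a * delta_star ** 2 - b * (delta_star - 1.0) ** 2)

def zero_points(n1, n2, m):
    """Return the (negative, positive) zero points of the parabola."""
    root_a = math.sqrt(n2 / (n2 - m))
    root_b = math.sqrt(n1 / (n1 + m))
    return -root_b / (root_a - root_b), root_b / (root_a + root_b)

for m in (1, 2, 3):
    negative, positive = zero_points(50, 50, m)
    print(f"m={m}: zero points at {negative:.2f} and {positive:.3f}")
```

Under this assumption the positive zero point lies just below half the jump height, while the negative one lies many jump heights away, so it takes extreme scatter to reach it.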

Simulated data
10,000 random time series of length 100. Internal σ = 1, jump height = 2, hence SNR = 1.
The data confirm the existence of different parabolas for different m. However, the data cover only scatter values near zero and never reach the negative solution.
[Figure axes: δ and (n Δv) / 4. Curves for m = 1, m = 2 and m = 3.]
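
A simulation of this kind can be sketched as follows. The setup is an assumption: one jump placed in the middle of each series, Gaussian noise, and a search for the single break position that maximizes the external variance. Function names, the seed and the reduced number of series in the usage line are illustrative.

```python
import random
from collections import Counter

def best_break(series):
    """Index of the single break that maximizes the external variance."""
    n = len(series)
    total = sum(series)
    best_pos, best_score, left_sum = None, float("-inf"), 0.0
    for k in range(1, n):  # candidate break between item k-1 and item k
        left_sum += series[k - 1]
        right_sum = total - left_sum
        # n * external variance, up to the constant term total**2 / n
        score = left_sum ** 2 / k + right_sum ** 2 / (n - k)
        if score > best_score:
            best_pos, best_score = k, score
    return best_pos

def position_errors(n_series=10_000, length=100, jump=2.0, sigma=1.0, seed=1):
    """Histogram of (detected - true) break position for one mid-series jump."""
    rng = random.Random(seed)
    true_pos = length // 2
    errors = Counter()
    for _ in range(n_series):
        series = [rng.gauss(jump if i < true_pos else 0.0, sigma)
                  for i in range(length)]
        errors[best_break(series) - true_pos] += 1
    return errors

errors = position_errors(n_series=2_000)
for shift in range(-3, 4):
    print(f"position error {shift:+d}: {errors[shift] / 2_000:.3f}")
```

The fraction at position error 0 is the hit rate for the chosen SNR; the remaining mass spreads over both too early and too late positions.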

The negative solution
Typical situation: extremely low SNR and a drastically disturbed measurement near the break. Exchanging it shifts x_1 to x_1' and x_2 to x_2' in such a way that the two means diverge, so the external variance grows.
[Figure labels: x_1, x_1', x_2, x_2'.]

The positive solution
A subsegment adjacent to the true break is randomly lifted by more than half of the jump height. Including it in the neighboring segment then reduces the internal variance, and an erroneous break position is concluded.
Criterion: maximum hatched area.

Brownian motion with drift
Drift = −SNR
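
One way to read the drift picture is sketched below. It assumes that each additional item by which the break is shifted adds an independent Gaussian increment with mean −SNR and unit variance to the decision criterion, so the chosen break position corresponds to the step at which this discrete walk reaches its maximum. The increment model, function names, number of walks and seed are assumptions for illustration.

```python
import random
from collections import Counter

def argmax_of_walk(snr, steps, rng):
    """Step at which a walk with N(-snr, 1) increments peaks (0 = start)."""
    position, best_value, best_step = 0.0, 0.0, 0
    for step in range(1, steps + 1):
        position += rng.gauss(-snr, 1.0)
        if position > best_value:
            best_value, best_step = position, step
    return best_step

def argmax_distribution(snr, walks=5_000, steps=50, seed=7):
    """Relative frequency of each peak position over many simulated walks."""
    rng = random.Random(seed)
    counts = Counter(argmax_of_walk(snr, steps, rng) for _ in range(walks))
    return {step: counts[step] / walks for step in sorted(counts)}

for snr in (0.5, 1.0, 2.0):
    dist = argmax_distribution(snr)
    head = ", ".join(f"{dist.get(step, 0.0):.3f}" for step in range(4))
    print(f"SNR={snr}: P(peak at step 0..3) = {head}")
```

A peak at step 0 means the walk never rose above its starting value, which corresponds to finding the true break position on that side.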

Theoretical retrace

Distribution of the time of the maximum of a Brownian motion with drift
Strictly valid only for continuous processes. Source: Buffet, 2003, J Appl Math Stoch Anal.
[Figure panels: SNR = 0.5, SNR = 1, SNR = 2. Legend: Buffet (dashed line); numerical simulation of a discrete Brownian motion with drift; complete break search simulation.]

Two more problems
The hit rate is not accurately reproduced by Buffet (2003).
Break errors are a two-sided symmetric process: both too early and too late breaks are possible.

Hit rate
The hit rate h can be estimated for all drifts d by:
[Figure legend: true, estimated.]

Two-sided processes

Practical application
The hit rate drops from 95% for SNR = 2 to 29% for SNR = 0.5.
SNR > 1: break positions quickly become very exact.
SNR < 1: break positions quickly become very inexact.
[Figure legend: true, estimated. Panels: SNR = 0.5, SNR = 1, SNR = 2.]
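
These numbers can be roughly cross-checked with a classical result from fluctuation theory (the Sparre Andersen/Spitzer identity), which is a swapped-in technique and not necessarily the estimator used on the slides. The sketch assumes a discrete walk with independent N(−SNR, 1) increments on each side of the break and treats the two sides as independent, so the two-sided hit rate is the square of the one-sided one. Function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def one_sided_hit_rate(snr, max_terms=10_000):
    """P(a walk with N(-snr, 1) increments never rises above its start).

    Uses P = exp(-sum_k P(S_k > 0) / k), where S_k ~ N(-snr * k, k).
    """
    total = 0.0
    for k in range(1, max_terms + 1):
        term = normal_cdf(-snr * math.sqrt(k)) / k
        total += term
        if term < 1e-15:  # remaining terms are negligible
            break
    return math.exp(-total)

def two_sided_hit_rate(snr):
    """Hit rate when too early and too late errors are independent."""
    return one_sided_hit_rate(snr) ** 2

for snr in (0.5, 1.0, 2.0):
    print(f"SNR={snr}: two-sided hit rate = {two_sided_hit_rate(snr):.3f}")
```

Under these assumptions the values come out close to the 95% and 29% quoted above, with the steep drop around SNR = 1.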

Conclusions
Break position errors can be described by the distribution of the time of the maximum of a Brownian motion with drift. The drift parameter equals the signal-to-noise ratio, given by the ratio of half the jump height between homogeneous subperiods to the internal standard deviation within them.

Hit rate simulation
The hit rate is the probability that the initial value is never exceeded. For realistic drift sizes the value converges after a few steps.
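
The convergence can be illustrated with a small Monte Carlo experiment. The sketch assumes a discrete walk with independent Gaussian increments of mean −SNR and unit variance; the function name, number of walks and seed are illustrative choices.

```python
import random

def never_exceeded(snr, max_steps, walks=20_000, seed=3):
    """Fraction of walks that stayed at or below their start after 1..max_steps."""
    rng = random.Random(seed)
    survivors = [0] * (max_steps + 1)
    for _ in range(walks):
        position = 0.0
        for step in range(1, max_steps + 1):
            position += rng.gauss(-snr, 1.0)
            if position > 0.0:
                break  # initial value exceeded: this walk is no longer a hit
            survivors[step] += 1
    return [count / walks for count in survivors[1:]]

rates = never_exceeded(snr=1.0, max_steps=30)
for step in (1, 2, 5, 10, 30):
    print(f"after {step:2d} steps: {rates[step - 1]:.3f}")
```

The fraction can only decrease with the number of steps, and for a drift of this size it barely changes after the first handful of steps.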

Preliminary maximum

Hit rate estimate