On the reliability of using the maximum explained variance as criterion for optimum segmentations Ralf Lindau & Victor Venema University of Bonn Germany.

Slides:



Advertisements
Similar presentations
Computational Statistics. Basic ideas  Predict values that are hard to measure irl, by using co-variables (other properties from the same measurement.
Advertisements

FTP Biostatistics II Model parameter estimations: Confronting models with measurements.
COURSE: JUST 3900 INTRODUCTORY STATISTICS FOR CRIMINAL JUSTICE Instructor: Dr. John J. Kerbs, Associate Professor Joint Ph.D. in Social Work and Sociology.
Break Position Errors in Climate Records Ralf Lindau & Victor Venema University of Bonn Germany.
Chapter 13: Inference for Distributions of Categorical Data
Chapter 10: Hypothesis Testing
Introduction to Hypothesis Testing
Error Propagation. Uncertainty Uncertainty reflects the knowledge that a measured value is related to the mean. Probable error is the range from the mean.
MAE 552 Heuristic Optimization
Topic 2: Statistical Concepts and Market Returns
Diplomanden-Doktoranden-Seminar Bonn – 18. Mai 2008 LandCaRe 2020 Temporal downscaling of heavy precipitation and some general thoughts about downscaling.
Efficient Estimation of Emission Probabilities in profile HMM By Virpi Ahola et al Reviewed By Alok Datar.
7. Homogenization Seminar Budapest – October 2011 What is the correct number of break points hidden in a climate record? Ralf Lindau Victor Venema.
Daily Stew Kickoff – 27. January 2011 First Results of the Daily Stew Project Ralf Lindau.
So are how the computer determines the size of the intercept and the slope respectively in an OLS regression The OLS equations give a nice, clear intuitive.
Relationships Among Variables
Constant process Separate signal & noise Smooth the data: Backward smoother: At any give T, replace the observation yt by a combination of observations.
Two and a half problems in homogenization of climate series concluding remarks to Daily Stew Ralf Lindau.
Psy B07 Chapter 1Slide 1 ANALYSIS OF VARIANCE. Psy B07 Chapter 1Slide 2 t-test refresher  In chapter 7 we talked about analyses that could be conducted.
Determining Sample Size
Dr. Richard Young Optronic Laboratories, Inc..  Uncertainty budgets are a growing requirement of measurements.  Multiple measurements are generally.
Week 9 Testing Hypotheses. Philosophy of Hypothesis Testing Model Data Null hypothesis, H 0 (and alternative, H A ) Test statistic, T p-value = prob(T.
RMTD 404 Lecture 8. 2 Power Recall what you learned about statistical errors in Chapter 4: Type I Error: Finding a difference when there is no true difference.
Estimation of Statistical Parameters
Detection of inhomogeneities in Daily climate records to Study Trends in Extreme Weather Detection of Breaks in Random Data, in Data Containing True Breaks,
Statistical inference. Distribution of the sample mean Take a random sample of n independent observations from a population. Calculate the mean of these.
Chapter 8 Introduction to Hypothesis Testing
Measures of Variability In addition to knowing where the center of the distribution is, it is often helpful to know the degree to which individual values.
Physics 114: Exam 2 Review Lectures 11-16
On the multiple breakpoint problem and the number of significant breaks in homogenisation of climate records Separation of true from spurious breaks Ralf.
DDS – 12. December 2011 What is the correct number of break points hidden in a climate record?
Research Process Parts of the research study Parts of the research study Aim: purpose of the study Aim: purpose of the study Target population: group whose.
Breaks in Daily Climate Records Ralf Lindau University of Bonn Germany.
1 CS 391L: Machine Learning: Experimental Evaluation Raymond J. Mooney University of Texas at Austin.
Copyright © 2012 Pearson Education. All rights reserved © 2010 Pearson Education Copyright © 2012 Pearson Education. All rights reserved. Chapter.
Lecture 16 Section 8.1 Objectives: Testing Statistical Hypotheses − Stating hypotheses statements − Type I and II errors − Conducting a hypothesis test.
Section 10.1 Confidence Intervals
Chapter 8 Delving Into The Use of Inference 8.1 Estimating with Confidence 8.2 Use and Abuse of Tests.
7. Homogenization Seminar Budapest – 24. – 27. October 2011 What is the correct number of break points hidden in a climate record? Ralf Lindau Victor Venema.
Economics 173 Business Statistics Lecture 4 Fall, 2001 Professor J. Petry
Two-Sample Hypothesis Testing. Suppose you want to know if two populations have the same mean or, equivalently, if the difference between the population.
Lecture 17 Dustin Lueker.  A way of statistically testing a hypothesis by comparing the data to values predicted by the hypothesis ◦ Data that fall far.
Chapter 3: Statistical Significance Testing Warner (2007). Applied statistics: From bivariate through multivariate. Sage Publications, Inc.
Robust Estimators.
Data Analysis.
Correlation & Regression Analysis
Correction of spurious trends in climate series caused by inhomogeneities Ralf Lindau.
Chapter 8: Introduction to Hypothesis Testing. Hypothesis Testing A hypothesis test is a statistical method that uses sample data to evaluate a hypothesis.
The joint influence of break and noise variance on break detection Ralf Lindau & Victor Venema University of Bonn Germany.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 11 Inference for Distributions of Categorical.
Hypothesis Tests. An Hypothesis is a guess about a situation that can be tested, and the test outcome can be either true or false. –The Null Hypothesis.
Chapter 7: Hypothesis Testing. Learning Objectives Describe the process of hypothesis testing Correctly state hypotheses Distinguish between one-tailed.
MODEL DIAGNOSTICS By Eni Sumarminingsih, Ssi, MM.
Fundamentals of Data Analysis Lecture 11 Methods of parametric estimation.
J. A. Landsheer & G. van den Wittenboer
Break and Noise Variance
CHAPTER 11 Inference for Distributions of Categorical Data
The break signal in climate records: Random walk or random deviations
Adjustment of Temperature Trends In Landstations After Homogenization ATTILAH Uriah Heat Unavoidably Remaining Inaccuracies After Homogenization Heedfully.
Dipdoc Seminar – 15. October 2018
NanoBPM Status and Multibunch Mark Slater, Cambridge University
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
Product moment correlation
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
STA 291 Spring 2008 Lecture 17 Dustin Lueker.
Presentation transcript:

On the reliability of using the maximum explained variance as criterion for optimum segmentations Ralf Lindau & Victor Venema University of Bonn Germany

8 th Seminar for Homogenization and QC in Climatological Databases – Internal and External Variance Consider the differences of one station compared to a neighbor reference. The dominating natural variance is cancelled out, because it is very similar at both stations. Breaks become visible by abrupt changes in the station-reference time series. Internal variance within the subperiods External variance between the means of different subperiods Break criterion: Maximum external variance

Part I True skill (signal RMS) and explained variance are only weakly correlated

RMS 2 as skill measure Consider the time series of the inhomogeneities as a signal that we want to detect. 8 th Seminar for Homogenization and QC in Climatological Databases –

RMS 2 as skill measure 8 th Seminar for Homogenization and QC in Climatological Databases – Consider the time series of the inhomogeneities as a signal that we want to detect. This is hampered by superimposed noise.

RMS 2 as skill measure 8 th Seminar for Homogenization and QC in Climatological Databases – Consider the time series of the inhomogeneities as a signal that we want to detect. This is hampered by superimposed noise. Homogenization algorithms search for the maximum external variance of the noisy data. This is the proposed signal.

RMS 2 as skill measure 8 th Seminar for Homogenization and QC in Climatological Databases – Consider the time series of the inhomogeneities as a signal that we want to detect. This is hampered by superimposed noise. Homogenization algorithms search for the maximum external variance of the noisy data. This is the proposed signal. The Mean Squared Difference (RMS 2 ) between proposed and true signal is an appropriate skill measure.

Explained Variance versus Signal RMS We start from very simplistic (random) segmentations to see the full variety of solutions and their correlation. We have two measures: 1.The variance explained by the tested breaks in the noisy data. 2.The Mean Squared Deviation between proposed and true signal. For real cases, 1 is the only available measure as the true signal is not known. With simulated data we are able to compare 1 and 2. 8 th Seminar for Homogenization and QC in Climatological Databases –

100 instead of 1 time series Repeat the exercise for 100 instead of one time series. The best of 100 random solutions are marked for each of the 100 time series. For the explained variance with 0 For the really best solution with + 8 th Seminar for Homogenization and QC in Climatological Databases –

Best solutions Show only the best solutions (0 and +) for each time series. Crosses and circles are clearly separated. 8 th Seminar for Homogenization and QC in Climatological Databases –

From 100 to 1000 Increase numbers from 100 by 100 to 1000 by Show only circles, the normally proposed solutions, determined by the maximum explained variance. Mean explained variance is Mean RMS 2 is 0.881, not far away from 1 (no skill). 8 th Seminar for Homogenization and QC in Climatological Databases –

Dynamic Programming Now use Dynamic Programming to find the optimum in explained variance, instead of choosing just the best of Explained variance increases, but also the signal deviation. With it is larger than 1, which is worse than doing nothing. Continuing the search until the true number of breaks is reached, produces very bad solutions. 8 th Seminar for Homogenization and QC in Climatological Databases –

Standard search 8 th Seminar for Homogenization and QC in Climatological Databases –

RMS Standard vs. arbitrary For SNR = ½, the skills of standard search and an arbitrary segmentation are comparable. Obviously, the standard search is mainly optimizing the noise, producing completely random results. 8 th Seminar for Homogenization and QC in Climatological Databases –

Increased Stop Criterion Does a higher stop criterion help? Increase the stop criterion by a factor of 1.5 (from 2 ln(n) to 3 ln(n)). The signal deviation even increases from to The reason is that more zero solutions are produced (with RMS of 1), which is not compensated by more accurate non-zero solutions. 8 th Seminar for Homogenization and QC in Climatological Databases –

Which SNR is sufficient? RMS skill for: 0Random segmentation +Standard search for different SNRs. So far we considered SNR = ½ Random segmentation and standard search have comparable skills. Only for SNR > 1, the standard search is significantly better. 8 th Seminar for Homogenization and QC in Climatological Databases – Random Standard

Conclusions Part I Break search algorithm rely on the explained variance to identify the breakpoints. For signal to noise ratios of ½, the explained variance does not reflect the true skill. Consequently, the obtained segmentations do not differ significantly from random. 8 th Seminar for Homogenization and QC in Climatological Databases –

Part II Theoretical Explanation: Break and Noise Variance

Behavior of Noise 8 th Seminar for Homogenization and QC in Climatological Databases – optimum mean (random) Lindau, R. and V. Venema, 2013: On the multiple breakpoint problem and the number of significant breaks in homogenization of climate records, Idöjaras - Quarterly Journal of the Hungarian Meteorological Service, 117, No. 1, 1-34.

True Breaks 8 th Seminar for Homogenization and QC in Climatological Databases – For true breaks, constant periods exist. Tested segment averages are the (weighted) means of such (few) constant periods. This is quite the same situation as for random scatter, only that less independent data is underlying. Obviously, the number of breaks n k plays the same role as the time series length n did before for random scatter. Consequently, we expect the same mathematical behaviour, but on another scale.

Behavior of breaks 8 th Seminar for Homogenization and QC in Climatological Databases – optimum mean (random) k / n k

Why k/(n k +k) ? Short explanation: Consider a random segmentation trial with k break positions, where k is equal to the correct number of breaks n k. k = n k Each test segment spans (in average) over two true segments. This means that always 2 segment means are averaged, which reduces the variance by a factor of 2. 8 th Seminar for Homogenization and QC in Climatological Databases –

Four formulae 8 th Seminar for Homogenization and QC in Climatological Databases –

Best and mean break variance Signal to noise ratio = 1 / 3. V break = 0.1 V noise = 0.9 Draw known formulas for v(k). Best break segmentation Bb reaches full break variance early before n k. Mean break segmentation Bm reaches half break variance at n k. 8 th Seminar for Homogenization and QC in Climatological Databases –

Best and mean noise variance Best noise segmentation grows with about 4% per break; Mean noise segmentation with 1% per break. The correct segmentation combines the best break Bb with mean noise Nm (solid). An alternative combination is best noise Nb with mean break Bm (dashed). Here, only the noise is optimally segmented. 8 th Seminar for Homogenization and QC in Climatological Databases –

Wrong and right combination The false (noise) segmentation is always larger than the true (break) segmentation. However, there is still the stop criterion, which may reject any segmentation at all, preventing in this way these wrong solutions. 8 th Seminar for Homogenization and QC in Climatological Databases –

Stop criterion as last help Only solutions exceeding the stop criterion are accepted. So it seems that it actually prevents the false combinations. However, we will see that this is not always the case. Consider not only the two extremes (completely wrong, completely right), but all transitions in between. 8 th Seminar for Homogenization and QC in Climatological Databases – Stop criterion

Break search with simulated data Create 1000 time series of length 100 with 7 breaks and SNR of 1/3. Search for the best segmentation and check, which part of the break variance and which part of the noise variance is explained. 1: Break part 2: Noise part 3: Sum of both 4: Totally explained As the best solution is chosen, 1 and 2 are typically correlated, enhancing the total explained variance (4) compared to (3). 8 th Seminar for Homogenization and QC in Climatological Databases –

Solutions are varying At first glance, the totally explained variance does not exceed the threshold. However, up to now we looked at the means over 1000 realisations. But these solutions are varying so that the threshold is often exceeded, at least for low break numbers. 8 th Seminar for Homogenization and QC in Climatological Databases –

Conclusions Part II Random segmentations are able to explain a considerable fraction of the break variance. For reasonable break numbers they explain about one half. Consequently, the breaks are set to positions where a maximum of noise is explained. Hereby, the explained noise part is increased by a factor of four compared to random. Unfortunately, this is a profitable strategy as the signal part decreases in return only by a factor of 2, compared to the optimum. 8 th Seminar for Homogenization and QC in Climatological Databases –

A priori formula The different reaction of breaks and noise on randomly inserted breaks makes it possible to estimate break variance and break number a priori. If we insert many breaks, almost the entire break variance is explained plus a known fraction of noise. At k = n k half of the break variance is reached (22.8% in total). 8 th Seminar for Homogenization and QC in Climatological Databases –

Break variance Repeated for all station pairs we find a mean break variance of about 0.2 Thus the ratio of break and noise variance is 0.2 / 0.8 = ¼ The signal to noise ratio SNR = ½ 8 th Seminar for Homogenization and QC in Climatological Databases –

Trend differences from Data German climate stations have SNR of 0.5. Trend differences of neighboring stations reflect the true uncertainty of trends (position of crosses). Errrors calculated by assuming homogeneous data are much smaller (vertical extend of crosses). We conclude that the data is strongly influenced by breaks. 8 th Seminar for Homogenization and QC in Climatological Databases –

Overall Conclusions For signal-to-noise ratios of ½ standard break search algorithms are not superior to random segmentations. This can be understood by considering the theoretical behavior of break and noise variance. For monthly temperature at German climate stations the SNR can be estimated by an a priori method to ½. Although the relative break variance might be small, breaks influence the trend estimates strongly. 8 th Seminar for Homogenization and QC in Climatological Databases –

Interpretation of k/(n-1) 8 th Seminar for Homogenization and QC in Climatological Databases –

Why k/(n k +k) ? For a random segmentation of true breaks: Originally, the time series contains n k +1 independent values. Each inserted break k cuts a true segment into two pieces, which contribute then to two different tested segments. The effective number of independents is increased from n k +1 to n k +1+k. n-1 is replaced by n k +k 8 th Seminar for Homogenization and QC in Climatological Databases –

Signal Noise to Ratio = ½ For SNR = ½, things do not look much better. The correct combination is slightly better than the false one, but still comparable in magnitude. In return, the threshold is easier to exceed. 8 th Seminar for Homogenization and QC in Climatological Databases –