Published by Dwight Oliver. Modified over 6 years ago.
1
The break signal in climate records: Random walk or random deviations?
Ralf Lindau
2
Break signal
Climate records are affected by breaks resulting from station relocations or changes in measuring techniques. For break detection, difference series between neighboring stations are used, which reduces the dominating natural variance. Homogenization algorithms identify breaks by searching for the segmentation with maximum external variance (the variance explained by the jumps).
Dipdoc Seminar – 30. May 2016
3
Benchmark datasets
Benchmark datasets are used to assess the skill of homogenization algorithms. They are artificial datasets with known breaks, so that the algorithms can be evaluated. However, benchmark datasets should reflect the statistical properties of real data as closely as possible. An important question is how to model the breaks:
- as a free random walk (Brownian motion), or
- as random deviations from a fixed level (random noise).
4
Conceptual model
Same signal, two approaches: random deviations and Brownian motion.
Which of the two ΔT is assumed to be an independent random variable: the deviations from a fixed level, or the jumps between consecutive levels? Depending on this choice, the break signal has different statistical properties.
5
Different effects of identical σ
Difference: For the same σ, the break variance introduced by random deviations is larger by a factor of 2 than that introduced by Brownian motion.
Reason: For random deviations every jump combines two independent random numbers (the levels before and after the break), whereas for Brownian motion each jump is a single random number.
Preliminary result: $Var_{BM} = \frac{1}{2} Var_{RD}$
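The factor of 2 can be checked numerically. The following is a minimal sketch, not the author's code: the function name `jump_variances` and the sample size are illustrative assumptions. It draws levels (RD) or jumps (BM) from the same normal distribution and compares the resulting jump variances.

```python
import random
import statistics

def jump_variances(sigma=1.0, n_breaks=100_000, seed=0):
    """Return (jump variance for RD, jump variance for BM) for the same sigma."""
    rng = random.Random(seed)
    # RD: the segment levels are iid, so each jump is the difference
    # of two independent random numbers.
    levels = [rng.gauss(0.0, sigma) for _ in range(n_breaks + 1)]
    rd_jumps = [after - before for before, after in zip(levels, levels[1:])]
    # BM: the jumps themselves are iid, one random number per jump.
    bm_jumps = [rng.gauss(0.0, sigma) for _ in range(n_breaks)]
    return statistics.pvariance(rd_jumps), statistics.pvariance(bm_jumps)

var_rd, var_bm = jump_variances()
print(var_rd / var_bm)  # close to 2 for a large sample
```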
6
Linearly growing variance of BM
The variance of a Brownian motion grows linearly in time. At the end of a BM time series the variance is
$Var = k \sigma^2$, with k the number of breaks and $\sigma^2$ the break variance.
The average variance (over time) is $Var_{BM} = \frac{k}{2} \sigma^2$.
For RD the variance is (as shown before) $Var_{RD} = 2 \sigma^2$.
Hence $Var_{BM} = \frac{k}{4} Var_{RD}$.
k is on the order of 5, and twice that (10) for difference time series, so k/4 is on the order of 2.5. Brownian motion created with the same σ is therefore much easier to detect.
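Written out as a worked equation, under the assumption that the breaks are spread evenly so that the BM variance rises linearly from 0 to $k\sigma^2$ over a series of length T (T is a symbol introduced here for illustration):

```latex
\overline{Var}_{BM} = \frac{1}{T} \int_0^T k \sigma^2 \, \frac{t}{T} \, dt = \frac{k}{2} \sigma^2 ,
\qquad
\frac{\overline{Var}_{BM}}{Var_{RD}} = \frac{\frac{k}{2} \sigma^2}{2 \sigma^2} = \frac{k}{4} \approx 2.5 \quad (k = 10)
```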
7
Which type is more realistic?
There are indications for both break types.
For random deviations: Relocations are bound to a fixed position. Stations have geographical names, and their positions are not free to drift away.
For random walk: Changes in measuring techniques can be seen as the elimination of error sources one after the other. In the ideal case most errors are eliminated today, and the break signal can then be seen as Brownian motion backward in time.
8
Different “schools”
For a long time we were not aware that these two approaches exist. Williams et al. (2012) modelled a random walk. Venema et al. (2012) modelled random deviations. Only the applied standard deviations were communicated, but these are not comparable between RD and BM.
9
Platforms & Stairs
Venema et al. (2012) analyzed the statistics of the retrieved signal to decide whether breaks are of BM or RD type, using the probability p of platforms:
p (RD) = 0.67
p (actual) = 0.59
p (BM) = 0.50
However, the result was hardly significant because of the small number of cases. More importantly, the result depends on the performance of the homogenization algorithm.
[Figure: a platform and a stair, each sketched with three consecutive levels T1, T2, T3]
10
Platforms are difficult to detect
Running a homogenization algorithm on artificial pure RD data yields a platform frequency of 0.62 – 0.64 (< 0.67): platforms are underestimated in the retrieved signal. The detected frequency is therefore not suited as an independent indicator to distinguish RD from BM. It would be preferable to be independent of the retrieved break signal and to derive the break parameters directly from the data.
11
Two superimposed signals
We assume that the climate time series consists of two superimposed signals, inhomogeneities and noise:
$x_i = \varepsilon_b(S(i)) + \varepsilon_n(i), \quad \varepsilon_b \sim N(0, \sigma_b^2), \quad \varepsilon_n \sim N(0, \sigma_n^2)$
Each yearly value can be thought of as the sum of two random numbers, $\varepsilon_b$ and $\varepsilon_n$, where $\varepsilon_b$ depends on the segment number S, defined as the number of breaks that precede the year in question.
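A minimal sketch of how such a series could be generated for either break type. The function name `simulate_series`, the default parameter values, and the uniform placement of the k breaks among the n-1 gaps between years are assumptions made for illustration. For the BM variant the first segment is given its own random offset, which is one possible convention.

```python
import random

def simulate_series(n=100, k=5, sigma_b=1.0, sigma_n=0.5, kind="RD", rng=None):
    """Return (x, segment): yearly values and the segment number S(i) of each year."""
    rng = rng or random.Random()
    # k distinct break positions; a break at i lies between year i-1 and year i.
    breaks = set(rng.sample(range(1, n), k))
    level = rng.gauss(0.0, sigma_b)       # level of the first segment (assumption)
    s = 0                                 # number of breaks preceding the current year
    x, segment = [], []
    for i in range(n):
        if i in breaks:
            s += 1
            eps_b = rng.gauss(0.0, sigma_b)
            # RD: new independent level; BM: jump added to the previous level.
            level = eps_b if kind == "RD" else level + eps_b
        segment.append(s)
        x.append(level + rng.gauss(0.0, sigma_n))  # add the yearly noise eps_n
    return x, segment

x, segment = simulate_series(n=50, k=4, kind="BM", rng=random.Random(1))
print(len(x), segment[0], segment[-1])
```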
12
Random deviation breaks
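In the case of random deviation breaks we calculate the lag covariance C(L):
$C(L) = \frac{1}{n-L} \sum_{i=1}^{n-L} (x_i - \bar{x})(x_{i+L} - \bar{x})$
For internal pairs (both years in the same segment) the expected product is $\sigma_b^2$; for external pairs (years in different segments) it is 0. With $p_{int}$ the probability that a pair is internal:
$C(L) = p_{int} \cdot \sigma_b^2$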
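The lag covariance can be computed directly from its definition. A minimal sketch follows; the function name `lag_covariance` and the toy series are illustrative assumptions.

```python
def lag_covariance(x, lag):
    """C(L): mean product of deviations from the series mean for pairs (i, i+L)."""
    n = len(x)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    mean = sum(x) / n
    pairs = n - lag
    return sum((x[i] - mean) * (x[i + lag] - mean) for i in range(pairs)) / pairs

# Toy series: one break in the middle, levels +1 and -1, no noise.
series = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
print(lag_covariance(series, 0))  # 1.0
print(lag_covariance(series, 1))  # 0.6
```

In the toy series four of the five lag-1 pairs lie within one segment and contribute +1 each; the single pair that straddles the break contributes -1. Only in expectation over many series do pairs from different segments contribute 0.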
13
Probability of internal pairs
The probability of internal pairs increases with segment length l and decreases with time lag L.
Probability of a specific year to belong to a segment of length l:
$p_{year}(l) = \frac{l\,(k+1)}{n} \cdot \frac{\binom{n-1-l}{k-1}}{\binom{n-1}{k}}$
Probability of a specific year to have sufficient spacing to the next break:
$p_{early}(l, L) = \frac{l - \min(l, L)}{l}$
The probability of internal pairs is the sum over all lengths of the product:
$p_{int}(L) = \sum_{l=1}^{n-k} p_{year}(l) \cdot p_{early}(l, L)$
14
Probability of internal pairs
The long version of the product:
$p_{int} = \sum_{l=1}^{n-k} \frac{l\,(k+1)}{n} \cdot \frac{\binom{n-1-l}{k-1}}{\binom{n-1}{k}} \cdot \frac{l - \min(l, L)}{l}$
By some purely arithmetic transformations we get:
$p_{int} = \frac{\binom{n-1-L}{k}}{\binom{n-1}{k}}$
By some further approximations we get:
$p_{int} \approx e^{-\frac{kL}{n-k}}$
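To get a feeling for the quality of the exponential approximation, the binomial form and the exponential form can be evaluated side by side. This is a small sketch; the function names and the values n = 100, k = 5 are assumptions chosen for illustration.

```python
from math import comb, exp

def p_int_exact(n, k, lag):
    """Binomial form: none of the k breaks falls into the `lag` gaps between the two years."""
    if not 0 <= lag <= n - 1:
        raise ValueError("lag must satisfy 0 <= lag <= n - 1")
    return comb(n - 1 - lag, k) / comb(n - 1, k)

def p_int_approx(n, k, lag):
    """Exponential approximation exp(-k L / (n - k))."""
    return exp(-k * lag / (n - k))

n, k = 100, 5
for lag in (1, 5, 10, 20):
    print(lag, round(p_int_exact(n, k, lag), 3), round(p_int_approx(n, k, lag), 3))
```

For these values the two forms stay close at small lags and drift apart slowly as the lag grows.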
15
Lag covariance for RD
The covariance is an exponential function of the time lag:
C(L) = a exp(-bL)
a = $\sigma_b^2$ (gives the break strength $\sigma_b$)
b = k/(n-k) (gives the number of breaks k)
As a byproduct we obtain a convenient method to retrieve the strength and number of breaks directly from the data.
Input: $\sigma_b$ = 1.000, k = 5.000
Output: k = 4.984
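One way such a retrieval could look is a straight-line fit to log C(L). The sketch below is not the procedure used for the numbers on the slide; it rests on several assumptions: n = 100 years, noise σ_n = 0.5, 1000 simulated series, lags 1 to 15, and products taken about the known zero mean instead of the sample mean. All function names are hypothetical.

```python
import math
import random

def rd_series(n, k, sigma_b, sigma_n, rng):
    """One synthetic series with k random-deviation breaks plus white noise."""
    breaks = set(rng.sample(range(1, n), k))
    level = rng.gauss(0.0, sigma_b)
    x = []
    for i in range(n):
        if i in breaks:
            level = rng.gauss(0.0, sigma_b)  # new independent level
        x.append(level + rng.gauss(0.0, sigma_n))
    return x

def fit_rd(n=100, k=5, sigma_b=1.0, sigma_n=0.5, n_series=1000, max_lag=15, seed=1):
    """Estimate (sigma_b, k) from the fit C(L) = a exp(-b L)."""
    rng = random.Random(seed)
    lags = list(range(1, max_lag + 1))
    sums = [0.0] * len(lags)
    for _ in range(n_series):
        x = rd_series(n, k, sigma_b, sigma_n, rng)
        for j, lag in enumerate(lags):
            # Assumption: the true mean is 0, so no sample mean is subtracted.
            sums[j] += sum(x[i] * x[i + lag] for i in range(n - lag)) / (n - lag)
    log_c = [math.log(s / n_series) for s in sums]
    # Ordinary least squares for log C(L) = log(a) - b * L.
    mean_lag = sum(lags) / len(lags)
    mean_log = sum(log_c) / len(log_c)
    slope = (sum((lag - mean_lag) * (y - mean_log) for lag, y in zip(lags, log_c))
             / sum((lag - mean_lag) ** 2 for lag in lags))
    a = math.exp(mean_log - slope * mean_lag)
    b = -slope
    # a = sigma_b^2 and b = k / (n - k), hence k = b n / (1 + b).
    return math.sqrt(a), b * n / (1.0 + b)

sigma_b_hat, k_hat = fit_rd()
print(round(sigma_b_hat, 2), round(k_hat, 2))
```

Because the exponential is itself an approximation, the retrieved k from this sketch is only expected to land in the neighborhood of the input value.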
16
Brownian motion type
For Brownian motion type breaks the covariance depends only on the segment number of the earlier of the two years, because the two years have in common all random numbers $\varepsilon_b$ that constitute the break signal at x(i):
$Cov(x(i), x(j)) = S(i)\,\sigma_b^2, \quad i < j$
The segment number is a stochastic variable that grows linearly in time on average:
$S(i) = \frac{i-1}{n-1}\,k, \quad i \le n$
Consequently, the covariance also grows linearly with time:
$Cov(x(i), x(j)) = \frac{i-1}{n-1}\,k\,\sigma_b^2, \quad i < j \le n$
17
Time dependent Cov for BM
The covariance is a linear function of time:
C(i) = a i + b
a = k/(n-1) $\sigma_b^2$
b = (1 - k/(n-1)) $\sigma_b^2$
Input: $\sigma_b$ = 1.000, k = 5.000
Output: $\sigma_b$ = 1.005, k = 4.920
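A sketch of how the linear relation could be used for retrieval. It is an illustration under stated assumptions, not the computation behind the numbers above: n = 100, σ_n = 0.5, 4000 simulated series, the first segment carrying its own random offset (which is what makes the intercept b non-zero), C(i) estimated as the ensemble mean of x(i)·x(i+1), and hypothetical function names.

```python
import random

def bm_series(n, k, sigma_b, sigma_n, rng):
    """One synthetic series with k Brownian-motion-type breaks plus white noise."""
    breaks = set(rng.sample(range(1, n), k))
    level = rng.gauss(0.0, sigma_b)  # assumption: first segment has its own offset
    x = []
    for i in range(n):
        if i in breaks:
            level += rng.gauss(0.0, sigma_b)  # jump added to the previous level
        x.append(level + rng.gauss(0.0, sigma_n))
    return x

def fit_bm(n=100, k=5, sigma_b=1.0, sigma_n=0.5, n_series=4000, seed=2):
    """Estimate (sigma_b, k) from the linear fit C(i) = a i + b."""
    rng = random.Random(seed)
    sums = [0.0] * (n - 1)
    for _ in range(n_series):
        x = bm_series(n, k, sigma_b, sigma_n, rng)
        for i in range(n - 1):
            sums[i] += x[i] * x[i + 1]  # product of a year with the following year
    years = list(range(1, n))           # 1-based index of the earlier year
    c = [s / n_series for s in sums]
    mean_i = sum(years) / len(years)
    mean_c = sum(c) / len(c)
    a = (sum((i - mean_i) * (ci - mean_c) for i, ci in zip(years, c))
         / sum((i - mean_i) ** 2 for i in years))
    b = mean_c - a * mean_i
    var_b = a + b                       # a + b = sigma_b^2 from the two relations above
    return var_b ** 0.5, a * (n - 1) / var_b

sigma_b_hat, k_hat = fit_bm()
print(round(sigma_b_hat, 2), round(k_hat, 2))
```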
18
Conclusion
Brownian motion and random deviation break types can be distinguished by calculating:
- the lag covariance C(L)
- the time-dependent covariance C(i)
For random deviations C(L) decreases with L. For Brownian motion C(i) increases with i. The two other combinations remain constant. As a byproduct we get an estimate of break size and number without running a full homogenization algorithm.
19
Platforms & Stairs
Venema et al. (2012) analyzed the statistics of the retrieved signal to decide whether breaks are of BM or RD type. They distinguish platforms from stairs.
[Figure: a platform and a stair, each sketched with three consecutive levels T1, T2, T3]
20
Platform probability for RD
For RD break types T1, T2, T3 are iid random variables (not the case for BM). There are 6 possible rank orders, all with the same probability:
T1 < T2 < T3 (upward stair)
T1 < T3 < T2
T2 < T1 < T3
T2 < T3 < T1
T3 < T1 < T2
T3 < T2 < T1 (downward stair)
Upward and downward stairs each have the probability 1/6. Every other combination is a platform (T2 is either the smallest or the largest element of the triple). For RD break types the probability of platforms is therefore 2/3.
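The counting argument can be reproduced by enumerating the rank orders, and the BM value of 0.50 can be checked by simulation. This is a minimal sketch; the helper names `is_platform` and `p_platform_bm` and the trial count are illustrative assumptions.

```python
import random
from fractions import Fraction
from itertools import permutations

def is_platform(t1, t2, t3):
    """Platform: the middle level is the smallest or the largest of the three."""
    return t2 < min(t1, t3) or t2 > max(t1, t3)

# RD: the levels are iid, so all 6 rank orders are equally likely.
orders = list(permutations((1, 2, 3)))
p_rd = Fraction(sum(is_platform(*order) for order in orders), len(orders))
print(p_rd)  # 2/3

def p_platform_bm(n_trials=20_000, sigma=1.0, seed=0):
    """BM: consecutive levels differ by iid jumps; estimate the platform frequency."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        t1 = rng.gauss(0.0, sigma)
        t2 = t1 + rng.gauss(0.0, sigma)
        t3 = t2 + rng.gauss(0.0, sigma)
        hits += is_platform(t1, t2, t3)
    return hits / n_trials

p_bm = p_platform_bm()
print(p_bm)  # close to 0.5: a platform needs two jumps of opposite sign
```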