The break signal in climate records: Random walk or random deviations

1 The break signal in climate records: Random walk or random deviations
The break signal in climate records: Random walk or random deviations? Ralf Lindau

2 Break signal Climate records are affected by breaks resulting from relocations or changes in the measuring techniques. For the detection, differences of neighboring stations are considered to reduce the dominating natural variance. Homogenization algorithms identify breaks by searching for the maximum external variance (explained by the jumps). Dipdoc Seminar – 30. May 2016

3 Benchmark datasets
Benchmark datasets are used to assess the skill of homogenization algorithms. They are artificial datasets with known breaks, which makes an evaluation of the algorithms possible. However, benchmark datasets should reflect the statistical properties of real data as closely as possible. An important question is how to model the breaks: as a free random walk (Brownian motion), or as random deviations from a fixed level (random noise).

4 Conceptual model
Same signal, two approaches:
Which of the two ΔT is assumed to be an independent random variable: the deviations or the jumps? Depending on this choice, the break signal has different statistical properties.
[Figure: the same break signal interpreted as random deviations and as Brownian motion]

5 Different effects of identical σ
[Figure: random deviations and Brownian motion generated with the same σ]
Difference: The break variance introduced by random deviations is larger by a factor of 2 than that introduced by Brownian motion.
Reason: For random deviations every jump is the difference of two random numbers, while for Brownian motion it is a single random number.
Preliminary: $Var_{BM} = \frac{1}{2} Var_{RD}$
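The factor of 2 in the jump variance can be checked numerically. The following is a minimal sketch, not taken from the presentation: the function names, the sample size, and the seed are illustrative assumptions, and Gaussian random numbers are assumed for both break types.

```python
import random

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def jump_variances(sigma=1.0, n_jumps=100_000, seed=0):
    """Return (BM jump variance, RD jump variance) for the same sigma."""
    rng = random.Random(seed)
    # Brownian motion: each jump is itself one independent random number.
    bm_jumps = [rng.gauss(0.0, sigma) for _ in range(n_jumps)]
    # Random deviations: each segment level is an independent random number,
    # so each jump is the difference of two neighboring levels.
    levels = [rng.gauss(0.0, sigma) for _ in range(n_jumps + 1)]
    rd_jumps = [b - a for a, b in zip(levels, levels[1:])]
    return variance(bm_jumps), variance(rd_jumps)

var_bm, var_rd = jump_variances()
print(f"BM jump variance: {var_bm:.3f}")
print(f"RD jump variance: {var_rd:.3f}")
print(f"ratio RD/BM:      {var_rd / var_bm:.3f}")
```

With a sample this large the printed ratio should be close to 2, up to sampling noise.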

6 Linearly growing variance of BM
The variance of a Brownian motion grows linearly in time. At the end of a BM time series the variance is:
$Var = k \sigma^2$, with $k$: number of breaks and $\sigma^2$: break variance
The average variance (over time) is:
$Var_{BM} = \frac{k}{2} \sigma^2$
For RD the variance is (shown before):
$Var_{RD} = 2 \sigma^2$
Hence:
$Var_{BM} = \frac{k}{4} Var_{RD}$
k is on the order of 5; for difference time series it is twice that, about 10. Thus k/4 is on the order of 2.5. Brownian motion created with the same σ is therefore much easier to detect.
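The linear growth of the BM variance can be illustrated with a small ensemble simulation. This is a sketch under stated assumptions, not the author's code: Gaussian jumps, a signal starting at zero, and hypothetical names and ensemble size. It averages the variance over the segment index, which corresponds to an average over time only if all segments are equally long.

```python
import random

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def bm_variance_by_segment(k=10, sigma=1.0, n_series=40_000, seed=1):
    """Ensemble variance of a Brownian-motion break signal after 0..k jumps."""
    rng = random.Random(seed)
    # levels[j] collects the signal value after j jumps over all series.
    levels = [[] for _ in range(k + 1)]
    for _ in range(n_series):
        value = 0.0
        levels[0].append(value)
        for j in range(1, k + 1):
            value += rng.gauss(0.0, sigma)  # one independent jump per break
            levels[j].append(value)
    return [variance(v) for v in levels]

variances = bm_variance_by_segment()
end_variance = variances[-1]                      # expected: k * sigma^2
mean_variance = sum(variances) / len(variances)   # expected: k/2 * sigma^2
print(f"variance after k jumps: {end_variance:.2f}")
print(f"average over segments:  {mean_variance:.2f}")
```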

7 Which type is more realistic?
There are indications for both break types.
For random deviations: Relocations are bound to a fixed position. Stations have geographical names, and their positions are not free to drift away.
For random walk: Changes in measuring techniques can be seen as the elimination of error sources one after the other. Ideal case: today most errors are eliminated. The break signal can then be seen as Brownian motion backward in time.

8 Different “schools”
For a long time we were not aware that these two approaches exist. Williams et al. (2012) modelled a random walk. Venema et al. (2012) modelled random deviations. Only the standard deviations applied were communicated, but these are not comparable between RD and BM.

9 Platforms & Stairs
Venema et al. (2012) analyzed the statistics of the retrieved signal to decide whether breaks are of BM or RD type.
[Figure: platform and stair configurations of three consecutive segment levels T1, T2, T3]
Platform probability: p(RD) = 0.67, p(actual) = 0.59, p(BM) = 0.50
However, the result was hardly significant because of the small number of cases. More importantly, the result depends on the performance of the homogenization algorithm.

10 Platforms are difficult to detect
Running a homogenization algorithm on artificial pure RD data results in a platform frequency of 0.62 – 0.64 (< 0.67). In the retrieved signal, the platforms are underestimated. The detected frequency is therefore not suited as an independent indicator to distinguish RD from BM. It would be preferable to be independent of the retrieved break signal and to derive the break parameters directly from the data.

11 Two superimposed signals
We assume that the climate time series consists of two superimposed signals, inhomogeneities and noise:
$x_i = \varepsilon_b(S_i) + \varepsilon_n(i), \quad \varepsilon_b \sim N(0, \sigma_b^2), \quad \varepsilon_n \sim N(0, \sigma_n^2)$
Each yearly value can be thought of as the sum of two random numbers, $\varepsilon_b$ and $\varepsilon_n$, where $\varepsilon_b$ depends on the segment number $S$, which is defined as the number of breaks lying temporally behind.
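A minimal sketch of how such a series could be generated for the random deviation case follows. The function name, the default parameters, and the uniform random placement of breaks are assumptions made for illustration; "breaks lying temporally behind" is read here as the number of breaks before year i.

```python
import random

def simulate_series(n=100, k=5, sigma_b=1.0, sigma_n=1.0, seed=0):
    """Return (x, segment): a series of n years with k breaks of RD type."""
    rng = random.Random(seed)
    # k distinct break positions; a break at position p separates
    # year p-1 from year p (assumption: positions are uniformly random).
    breaks = set(rng.sample(range(1, n), k))
    segment = []   # S_i: number of breaks before year i
    s = 0
    for i in range(n):
        if i in breaks:
            s += 1
        segment.append(s)
    # One break-signal value per segment, drawn independently.
    eps_b = [rng.gauss(0.0, sigma_b) for _ in range(k + 1)]
    # Each year: break signal of its segment plus independent noise.
    x = [eps_b[segment[i]] + rng.gauss(0.0, sigma_n) for i in range(n)]
    return x, segment

x, segment = simulate_series()
print(len(x), segment[0], segment[-1])
```

For the Brownian motion case, the per-segment values would instead be cumulative sums of independent jumps.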

12 Random deviation breaks
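In the case of random deviation breaks we calculate the lag covariance C(L):
$C(L) = \frac{1}{n-L} \sum_{i=1}^{n-L} (x_i - \bar{x})(x_{i+L} - \bar{x})$
Internal pairs (both years in the same segment) share the same $\varepsilon_b$, so E(C(L)) = $\sigma_b^2$.
External pairs (years in different segments) have independent $\varepsilon_b$, so E(C(L)) = 0.
With $p_{int}$ the probability of internal pairs:
$C(L) = p_{int} \cdot \sigma_b^2$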
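The lag covariance defined above translates directly into a few lines of code. This is an illustrative sketch; the function name and the small example series are not from the presentation.

```python
def lag_covariance(x, lag):
    """C(L): mean product of anomalies of all pairs of values `lag` steps apart."""
    n = len(x)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    mean = sum(x) / n
    pairs = n - lag
    return sum((x[i] - mean) * (x[i + lag] - mean) for i in range(pairs)) / pairs

example = [1.0, 2.0, 3.0, 4.0]
c0 = lag_covariance(example, 0)   # lag 0 gives the ordinary variance
c1 = lag_covariance(example, 1)
print(c0, c1)
```

Applied to a difference time series, C(L) for increasing L is the quantity whose decay is analyzed on the following slides.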

13 Probability of internal pairs
The probability of internal pairs increases with segment length $l$ and decreases with time lag $L$.
Probability that a specific year belongs to a segment of length $l$:
$p_{year}(l) = \frac{l\,(k+1)}{n} \cdot \frac{\binom{n-1-l}{k-1}}{\binom{n-1}{k}}$
Probability that a specific year has sufficient spacing to the next break:
$p_{early}(l, L) = \frac{l - \min(l, L)}{l}$
The probability of internal pairs is the sum over all lengths of the product:
$p_{int}(L) = \sum_{l=1}^{n-k} p_{year}(l) \cdot p_{early}(l, L)$

14 Probability of internal pairs
The long version of the product:
$p_{int} = \sum_{l=1}^{n-k} \frac{l\,(k+1)}{n} \cdot \frac{\binom{n-1-l}{k-1}}{\binom{n-1}{k}} \cdot \frac{l - \min(l, L)}{l}$
By some purely arithmetic transformations we get:
$p_{int} = \frac{\binom{n-1-L}{k}}{\binom{n-1}{k}}$
By some further approximations we get:
$p_{int} = e^{-\frac{kL}{n-k}}$
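To get a feeling for the quality of the exponential approximation, the binomial expression and the exponential can be evaluated side by side. The sketch below is illustrative: the function names and the values n = 100, k = 5 are assumptions. The binomial form is read here as the probability that none of the L possible break positions between the two years is occupied when k breaks are placed at random on n-1 positions.

```python
import math

def p_int_binomial(n, k, lag):
    """Binomial form: no break on any of the `lag` positions between the two years."""
    return math.comb(n - 1 - lag, k) / math.comb(n - 1, k)

def p_int_exponential(n, k, lag):
    """Exponential approximation exp(-k L / (n - k))."""
    return math.exp(-k * lag / (n - k))

n, k = 100, 5
for lag in (1, 5, 10, 20):
    exact = p_int_binomial(n, k, lag)
    approx = p_int_exponential(n, k, lag)
    print(f"L={lag:2d}  binomial={exact:.3f}  exponential={approx:.3f}")
```

For these values the two expressions agree closely at small lags and drift apart slowly as L grows.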

15 Lag covariance for RD
The covariance is an exponential function of the time lag:
$C(L) = a \exp(-bL)$
with $a = \sigma_b^2$ (break strength $\sigma_b$) and $b = k/(n-k)$ (break number $k$).
As a byproduct, we obtain a convenient method to retrieve the strength and number of breaks directly from the data.
Input: $\sigma_b$ = 1.000, k = 5.000
Output: k = 4.984
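Inverting the two relations gives $\sigma_b = \sqrt{a}$ and $k = bn/(1+b)$. The sketch below shows one possible way to obtain a and b, a least-squares line through (L, ln C), applied to noise-free synthetic covariances. The fitting method, the function names, and the test values are assumptions; the presentation does not say how the fit was done. Real lag covariances are noisy and may become negative, in which case a log-linear fit does not apply directly.

```python
import math

def fit_exponential(lags, cov):
    """Fit C(L) = a * exp(-b * L) by a least-squares line through (L, ln C)."""
    ys = [math.log(c) for c in cov]          # requires all C > 0
    m = len(lags)
    mean_x = sum(lags) / m
    mean_y = sum(ys) / m
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lags, ys))
    sxx = sum((x - mean_x) ** 2 for x in lags)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope       # a, b

def break_parameters_rd(a, b, n):
    """Invert a = sigma_b^2 and b = k / (n - k)."""
    sigma_b = math.sqrt(a)
    k = b * n / (1.0 + b)
    return sigma_b, k

# Noise-free synthetic covariances for n = 100, k = 5, sigma_b = 1.
n, k_true, sigma_true = 100, 5, 1.0
lags = list(range(1, 21))
cov = [sigma_true ** 2 * math.exp(-k_true * L / (n - k_true)) for L in lags]

a, b = fit_exponential(lags, cov)
sigma_b, k = break_parameters_rd(a, b, n)
print(f"sigma_b = {sigma_b:.3f}, k = {k:.3f}")
```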

16 Brownian motion type
For Brownian motion type breaks the covariance depends only on the segment number of the earlier of the two years, because the two years have all random numbers $\varepsilon_b$ constituting the break signal at x(i) in common:
$Cov(x(i), x(j)) = S(i)\, \sigma_b^2, \quad i < j$
The segment number is a stochastic variable that grows, on average, linearly in time:
$S(i) = \frac{i-1}{n-1} k, \quad i \le n$
Consequently, the covariance also grows linearly with time:
$Cov(x(i), x(j)) = \frac{i-1}{n-1} k\, \sigma_b^2, \quad i < j \le n$

17 Time dependent Cov for BM
The covariance is a linear function in time:
$C(i) = a\,i + b$
with $a = \frac{k}{n-1} \sigma_b^2$ and $b = \left(1 - \frac{k}{n-1}\right) \sigma_b^2$.
Input: $\sigma_b$ = 1.000, k = 5.000
Output: $\sigma_b$ = 1.005, k = 4.920
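Adding the two coefficients gives $a + b = \sigma_b^2$, so $\sigma_b = \sqrt{a+b}$ and $k = a(n-1)/(a+b)$. The following sketch fits a straight line to noise-free synthetic values of C(i) and inverts the coefficients. The ordinary least-squares fit, the function names, and the test values are illustrative assumptions, not the procedure used in the presentation.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a * x + b; returns (a, b)."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    return a, mean_y - a * mean_x

def break_parameters_bm(a, b, n):
    """Invert a = k/(n-1) * sigma_b^2 and b = (1 - k/(n-1)) * sigma_b^2."""
    var_b = a + b                 # the k/(n-1) terms cancel in the sum
    k = a * (n - 1) / var_b
    return math.sqrt(var_b), k

# Noise-free synthetic covariances for n = 100, k = 5, sigma_b = 1.
n, k_true, sigma_true = 100, 5, 1.0
years = list(range(1, n))
slope_true = k_true / (n - 1) * sigma_true ** 2
offset_true = (1 - k_true / (n - 1)) * sigma_true ** 2
cov = [slope_true * i + offset_true for i in years]

a, b = fit_line(years, cov)
sigma_b, k = break_parameters_bm(a, b, n)
print(f"sigma_b = {sigma_b:.3f}, k = {k:.3f}")
```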

18 Conclusion
Brownian motion and random deviation break types can be distinguished by calculating the lag covariance C(L) and the time-dependent covariance C(i).
For random deviations, C(L) decreases with L.
For Brownian motion, C(i) increases with i.
The two other combinations remain constant.
As a byproduct we get an estimate of break size and number without running a full homogenization algorithm.

19 Platforms & Stairs
Venema et al. (2012) analyzed the statistics of the retrieved signal to decide whether breaks are of BM or RD type. They distinguish platforms from stairs.
[Figure: platform and stair configurations of three consecutive segment levels T1, T2, T3]

20 Platform probability for RD
For RD break types, T1, T2, T3 are iid random variables (this is not the case for BM). There are 6 possible rank orders, all with the same probability:
T1 < T2 < T3 (upward stair)
T1 < T3 < T2
T2 < T1 < T3
T2 < T3 < T1
T3 < T1 < T2
T3 < T2 < T1 (downward stair)
Upward and downward stairs each have probability 1/6. Every other combination is a platform (T2 is either the smallest or the largest element of the triple).
For RD break types the probability of platforms is 2/3.
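The counting argument can be reproduced by enumerating the six rank orders. This is a small illustrative sketch; the function name is hypothetical, and the ranks 1, 2, 3 stand in for the values of T1, T2, T3.

```python
from fractions import Fraction
from itertools import permutations

def is_platform(t1, t2, t3):
    """A platform: the middle level T2 is the smallest or the largest of the triple."""
    return t2 == min(t1, t2, t3) or t2 == max(t1, t2, t3)

# For iid levels, all six rank orders of (T1, T2, T3) are equally likely.
orders = list(permutations((1, 2, 3)))
platforms = [order for order in orders if is_platform(*order)]
stairs = [order for order in orders if not is_platform(*order)]

p_platform = Fraction(len(platforms), len(orders))
print(p_platform)   # 2/3
print(stairs)       # [(1, 2, 3), (3, 2, 1)]
```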

