Published by Rosamund Cain. Modified over 8 years ago.
1
Quality control and homogenization of the COST benchmark dataset
Petr Štěpánek, Pavel Zahradníček
Czech Hydrometeorological Institute, regional office Brno
e-mail: petr.stepanek@chmi.cz, zahradnicek@chmi.cz
2
Processing before any data analysis
Software: AnClim, ProClimDB
3
Data Quality Control: Finding Outliers
Two main approaches:
- using limits derived from interquartile ranges (time series)
- comparing values to values of neighbouring stations (spatial analysis)
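The first approach (limits from interquartile ranges) can be sketched as follows. This is only an illustration: the slide does not state the multiplier applied to the interquartile range, so `k=1.5` is an assumption, and the function names are hypothetical rather than taken from AnClim or ProClimDB.

```python
from statistics import quantiles

def iqr_limits(values, k=1.5):
    """Return (lower, upper) limits: Q1 - k*IQR and Q3 + k*IQR.

    k is an assumed multiplier; the presentation does not give one.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flag_outliers(values, k=1.5):
    """Return the indices of values falling outside the IQR-based limits."""
    lower, upper = iqr_limits(values, k)
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Example: monthly values for one calendar month over several years
series = [10.1, 9.8, 10.4, 10.0, 9.9, 10.2, 16.5, 10.3]
suspect = flag_outliers(series)
```

In practice the limits would probably be computed separately for each calendar month, so that the annual cycle does not inflate the interquartile range.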
4
Creating Reference Series
- for monthly data
- weighted / unweighted mean from neighbouring stations
- power of weight is 1 for temperature (1/d) and 3 for precipitation (1/d^3), i.e. inverse distance weighting (IDW)
- criteria used for station selection (or a combination of them):
  - best correlated / nearest neighbours (correlations computed from the first-differenced series)
  - limit correlation, limit distance
  - limit difference in altitudes
- neighbouring station series should be standardized to the test series AVG and / or STD / altitude
- comparison with the „expected“ value (calculated as a weighted mean of the standardized neighbours' values)
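A minimal sketch of the „expected“ value computation described on this slide: neighbour series are rescaled to the mean and standard deviation of the test series, then combined with inverse-distance weights 1/d^p (p = 1 for temperature, p = 3 for precipitation). The function names and the exact form of the standardization are assumptions made for illustration; the software may do this differently.

```python
from statistics import mean, stdev

def standardize_to_test(neighbour, test):
    """Rescale a neighbour series to the mean (AVG) and standard
    deviation (STD) of the test series. Assumed form of the
    standardization; altitude correction is not included here."""
    m_n, s_n = mean(neighbour), stdev(neighbour)
    m_t, s_t = mean(test), stdev(test)
    return [(v - m_n) / s_n * s_t + m_t for v in neighbour]

def idw_expected(neighbour_values, distances_km, power):
    """Inverse-distance-weighted mean with weights 1 / d**power."""
    weights = [1.0 / d ** power for d in distances_km]
    weighted_sum = sum(w * v for w, v in zip(weights, neighbour_values))
    return weighted_sum / sum(weights)

# Expected value for one month from two (already standardized) neighbours
expected_temp = idw_expected([10.0, 20.0], [1.0, 2.0], power=1)    # temperature
expected_precip = idw_expected([10.0, 20.0], [1.0, 2.0], power=3)  # precipitation
```

With power 3 the nearer station dominates much more strongly, which fits the higher spatial variability of precipitation.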
5
Example: Proposed list of stations used for creating reference series
6
„Outliers“: temperature, sur1, network 1
- detected 12 „outliers“
- 10 errors for station 150 (5 in year 1909)
- mean difference between measured outliers and expected values is about 6 °C
7
„Outliers“: precipitation, sur1, network 1
- detected 8 „outliers“
- mean difference between measured outliers and expected values is about 180 mm
- maximum difference is 313 mm (station 4307012, 8/1971)
8
Months, seasons, year
9
Creating Reference Series
- for monthly data
- weighted / unweighted mean from neighbouring stations
- criteria used for station selection (or a combination of them):
  - best correlated / nearest neighbours (correlations computed from the first-differenced series)
  - limit correlation, limit distance
  - limit difference in altitudes
- neighbouring station series should be standardized to the test series AVG and / or STD (temperature - elevation, precipitation - variance)
- missing data are then not such a big problem
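The station selection criterion based on correlations of first-differenced series could look roughly like this. The thresholds `min_corr` and `max_count` are placeholders (the slides name a "limit correlation" but give no value), and the helper names are hypothetical.

```python
from math import sqrt

def first_differences(series):
    """Differences between consecutive values; removes slow shifts
    so that breaks in either series affect the correlation less."""
    return [b - a for a, b in zip(series, series[1:])]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def select_neighbours(test, candidates, min_corr=0.7, max_count=5):
    """Return ids of the best correlated candidate stations.

    candidates maps a station id to a series of the same length as test.
    min_corr and max_count are assumed limits, not values from the slides.
    """
    d_test = first_differences(test)
    scored = [(pearson(d_test, first_differences(s)), sid)
              for sid, s in candidates.items()]
    kept = sorted((item for item in scored if item[0] >= min_corr), reverse=True)
    return [sid for _, sid in kept[:max_count]]

test_series = [1.0, 3.0, 2.0, 5.0, 4.0]
neighbours = select_neighbours(test_series, {
    "A": [11.0, 13.5, 12.0, 15.5, 14.0],
    "B": [5.0, 4.0, 3.0, 2.0, 1.5],
})
```

Limits on distance and altitude difference would be applied as additional filters on the candidate list before or after this step.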
10
Relative homogeneity testing
- test series: 40 years
- longer series: divided into several sections with a 10-year overlap
- tests: SNHT, bivariate test, t-test
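As a rough illustration of this slide, the sketch below splits a long series into 40-year sections overlapping by 10 years and computes the single-shift SNHT statistic T(k) = k*mean(z[:k])^2 + (n-k)*mean(z[k:])^2 on a standardized test-minus-reference series. How the final, shorter section is handled and the use of the sample standard deviation are assumptions; critical values for the test are not included.

```python
from statistics import mean, stdev

def overlapping_windows(first_year, last_year, length=40, overlap=10):
    """Split [first_year, last_year] into sections of `length` years
    that overlap by `overlap` years. The last section is simply
    truncated at last_year (an assumption)."""
    step = length - overlap
    windows = []
    start = first_year
    while True:
        end = min(start + length - 1, last_year)
        windows.append((start, end))
        if end == last_year:
            break
        start += step
    return windows

def snht(q):
    """Single-shift SNHT on a difference (or ratio) series q.

    Returns (T0, k), where T0 is the maximum of T(k) and k is the
    number of values before the most likely break."""
    n = len(q)
    m, s = mean(q), stdev(q)
    z = [(v - m) / s for v in q]
    total = sum(z)
    best_t, best_k = 0.0, None
    cumulative = 0.0
    for k in range(1, n):
        cumulative += z[k - 1]
        z1 = cumulative / k
        z2 = (total - cumulative) / (n - k)
        t = k * z1 ** 2 + (n - k) * z2 ** 2
        if t > best_t:
            best_t, best_k = t, k
    return best_t, best_k

sections = overlapping_windows(1900, 1999)
# Artificial difference series with a shift of about 1.0 after 10 values
t0, break_index = snht([0.0, 0.1] * 5 + [1.0, 1.1] * 5)
```

A break would be reported only when T0 exceeds the critical value for the given series length and significance level.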
11
Example of the detected breaks: temperature, sur1, network 1
- detected 63 breaks
- station no. 50, break 1928
- station no. 50, break 1975
[Figure panels: difference between test and reference series; test and reference series; test statistics]
12
Station no. 100, break 1983
13
Example of the detected breaks: precipitation, sur1, network 1
- detected 10 breaks
- station no. 4309900, break 1909
- station no. 4311803, break 1991
14
Adjusting monthly data
- using reference series based on distance
- power of weight is 0.5 for temperature and 1 for precipitation
- adjustment: from monthly differences / ratios 20 years before and after a change
- smoothing of monthly adjustments (low-pass filter for adjacent values)
[Figure panels: station no. 50, break 1928; station no. 100, break 1983]
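The adjustment step might be sketched as below for temperature (differences); for precipitation, ratios would take the place of differences. The sign convention (adjusting the earlier segment towards the later one) and the circular 3-point filter with weights 0.25, 0.5, 0.25 are assumptions, since the slide only mentions a low-pass filter over adjacent values.

```python
from statistics import mean

def monthly_adjustments(diff_before, diff_after):
    """Compute 12 monthly adjustments from test-minus-reference differences.

    diff_before / diff_after map a month (1..12) to the differences in
    the (up to) 20 years before / after the break. The result is meant
    to be added to the segment before the break (assumed convention)."""
    return [mean(diff_after[m]) - mean(diff_before[m]) for m in range(1, 13)]

def smooth_circular(adjustments, weights=(0.25, 0.5, 0.25)):
    """Low-pass filter over adjacent months, treating December and
    January as neighbours. The weights are an assumption."""
    n = len(adjustments)
    return [
        weights[0] * adjustments[(i - 1) % n]
        + weights[1] * adjustments[i]
        + weights[2] * adjustments[(i + 1) % n]
        for i in range(n)
    ]

before = {m: [0.5, 0.5] for m in range(1, 13)}
after = {m: [0.0, 0.0] for m in range(1, 13)}
raw = monthly_adjustments(before, after)
smoothed = smooth_circular(raw)
```

Smoothing keeps the annual course of the adjustments from being driven by sampling noise in individual months.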
15
Adjusting values: evaluation
After adjustment, the correlation must increase; if it does not, the series is not adjusted.
[Figure panels: Temperature; Precipitation]
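The acceptance rule on this slide can be expressed as a simple check. The slide does not say which correlation is meant; the sketch assumes the Pearson correlation between the test series and its reference series, and the function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def accept_adjustment(original, adjusted, reference):
    """Keep the adjusted series only if it correlates better with the
    reference series than the original does (assumed interpretation)."""
    return pearson(adjusted, reference) > pearson(original, reference)

reference = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
original = [3.0, 4.0, 5.0, 4.0, 5.0, 6.0]   # first three values shifted by +2
adjusted = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # shift removed
keep = accept_adjustment(original, adjusted, reference)
```

If the check fails, the original values are retained and the detected break is left uncorrected.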
16
Absolute values of adjustment for temperature, surg1, network 1
17
Iterative homogeneity testing
- several iterations of testing and evaluation of results
- several iterations of homogeneity testing and series adjusting (3 iterations should be sufficient)
- the question of homogeneity of the reference series is thus solved:
  - possible inhomogeneities should be eliminated by using averages of several neighbouring stations
  - if this is not true: in the next iteration the neighbours should already be homogenized
18
Example – homogenized temperature series Station no. 50 Station no. 100
19
Example – homogenized precipitation series Station no. 4309900, break 1909 Station no. 4311803, break 1991
20
http://www.climahom.eu