Benchmark database inhomogeneous data, surrogate data and synthetic data Victor Venema.

Presentation transcript:

Benchmark database inhomogeneous data, surrogate data and synthetic data Victor Venema

Victor Venema, COST HOME, May 2008, Budapest, Hungary

Goals of COST-HOME working group 1
• Literature survey
• Benchmark dataset
  - Known inhomogeneities
  - Test the homogenisation algorithms (HA)

Benchmark dataset
1) Real (inhomogeneous) climate records
• Most realistic case
• Investigate whether the various HA find the same breaks
• Good metadata
2) Synthetic data
• For example, Gaussian white noise
• Insert known inhomogeneities
• Test performance
3) Surrogate data
• Empirical distribution and correlations
• Insert known inhomogeneities
• Compare to synthetic data: test of assumptions

Creation of the benchmark: outline of the talk
1) Start with (in)homogeneous data
2) Multiple surrogate and synthetic realisations
3) Mask surrogate records
4) Add global trend
5) Insert inhomogeneities in station time series
6) Published on the web
7) Homogenize by COST participants and third parties
8) Analyse the results and publish

1) Start with homogeneous data
• Monthly mean temperature and precipitation
• Later also daily data (WG4), maybe other variables
• Homogeneous
• No missing data
• Longer surrogates are based on multiple copies
• Generated networks are 100 a (years) long

1) Start with inhomogeneous data
• Distribution
  - Years with breaks are removed
  - Mean of the section between breaks is adjusted to the global mean
• Spectrum
  - Longest period without any breaks in the stations
  - Surrogate is divided into overlapping sections
  - Fourier coefficients and phases are adjusted for every small section
  - No adjustments on large scales!

Surrogates from inhomogeneous data

2) Multiple surrogate realisations
• Multiple surrogate realisations
  - Temporal correlations
  - Station cross-correlations
  - Empirical distribution function
• Annual cycle removed before, added at the end
• Number of stations between 5 and 20
• Cross-correlation varies as much as possible
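As an illustration of what a surrogate that keeps temporal correlations and the empirical distribution can look like, here is a minimal sketch for a single station. It is not claimed to be the algorithm used for the benchmark: it randomises the Fourier phases (which keeps the amplitude spectrum, and so the autocorrelation) and then maps the original values back by rank (which restores the empirical distribution exactly, at the cost of a slightly altered spectrum). Station cross-correlations and the annual cycle are not handled. The function names `dft`, `inverse_dft_real` and `surrogate` are hypothetical, and the naive O(n²) transform is only meant for short series.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for short series."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft_real(coeffs):
    """Inverse DFT, returning only the real part of each value."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def surrogate(series, rng=None):
    """One surrogate: random Fourier phases, then rank remapping (assumed method)."""
    rng = rng or random.Random()
    n = len(series)
    spectrum = dft(series)
    randomised = list(spectrum)
    # Give each positive frequency a random phase and mirror it to the
    # negative frequency so the inverse transform stays real.
    for k in range(1, (n - 1) // 2 + 1):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        randomised[k] = abs(spectrum[k]) * cmath.exp(1j * phase)
        randomised[n - k] = randomised[k].conjugate()
    shuffled = inverse_dft_real(randomised)
    # Rank remapping: the smallest surrogate value is replaced by the
    # smallest original value, and so on.
    order = sorted(range(n), key=lambda i: shuffled[i])
    ordered_values = sorted(series)
    out = [0.0] * n
    for rank, i in enumerate(order):
        out[i] = ordered_values[rank]
    return out


if __name__ == "__main__":
    demo_rng = random.Random(0)
    original = [demo_rng.gauss(10.0, 2.0) for _ in range(48)]
    print(surrogate(original, rng=random.Random(1))[:5])
```

Calling `surrogate` repeatedly with different random generators gives multiple realisations of the same input series.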

5) Insert inhomogeneities in stations
• Independent breaks
• Determined at random for every station and time
• 5 breaks per 100 a
• Perturbations differ slightly from month to month
• Temperature
  - Additive
  - Size: Gaussian distribution, σ = 0.8 °C
• Rain
  - Multiplicative
  - Size: Gaussian distribution, mean = 1, σ = 10 %
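The independent breaks on this slide could be sketched roughly as follows. The slide gives the break frequency and the size distributions; everything else is an assumption made for illustration: the function name `insert_breaks`, a fixed count of exactly 5 breaks per series, uniformly random break positions, and the choice that each break shifts (or scales) all values from the break to the end of the series. The small month-to-month variation of the perturbation is left out.

```python
import random


def insert_breaks(series, n_breaks=5, sigma=0.8, multiplicative=False, rng=None):
    """Insert step-like breaks at random positions (illustrative assumptions).

    Additive breaks (temperature): shift ~ Gaussian(0, sigma).
    Multiplicative breaks (rain): factor ~ Gaussian(1, sigma).
    """
    rng = rng or random.Random()
    n = len(series)
    # Break positions are drawn without replacement; position 0 is
    # excluded so the first value is never perturbed.
    positions = sorted(rng.sample(range(1, n), n_breaks))
    out = list(series)
    breaks = []
    for pos in positions:
        if multiplicative:
            size = rng.gauss(1.0, sigma)
            for t in range(pos, n):
                out[t] *= size
        else:
            size = rng.gauss(0.0, sigma)
            for t in range(pos, n):
                out[t] += size
        breaks.append((pos, size))
    return out, breaks


if __name__ == "__main__":
    months = 100 * 12  # a 100 a monthly series
    temperature, t_breaks = insert_breaks([0.0] * months, sigma=0.8,
                                          rng=random.Random(1))
    rain, r_breaks = insert_breaks([1.0] * months, sigma=0.10,
                                   multiplicative=True, rng=random.Random(2))
    print(t_breaks)
    print(r_breaks)
```

Returning the list of `(position, size)` pairs alongside the perturbed series keeps the inserted truth available for later comparison with what a homogenisation algorithm detects.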

Example break perturbations station

Example break perturbations network

5) Insert inhomogeneities in stations
• Correlated break in network
• One break in 50 % of networks
• In 30 % of the stations simultaneously
• Position random
  - At least 10 % of data points on either side
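A correlated break with these frequencies might be sketched as below. The 50 %, 30 % and 10 % values come from the slide. The size of the correlated break is not given there, so the sketch reuses a Gaussian shift with σ = 0.8 and applies the same additive shift to every affected station; that choice, the function name `insert_correlated_break` and its parameter names are assumptions.

```python
import math
import random


def insert_correlated_break(network, sigma=0.8, p_network=0.5,
                            station_fraction=0.3, margin=0.1, rng=None):
    """Maybe insert one simultaneous break into part of a network.

    network: list of equally long station series (lists of floats).
    Returns the perturbed copy and a description of the break, or None
    if this network was not selected.
    """
    rng = rng or random.Random()
    out = [list(station) for station in network]
    # Only a fraction p_network of all networks receives a correlated break.
    if rng.random() >= p_network:
        return out, None
    n = len(out[0])
    # Keep at least `margin` of the data points on either side of the break.
    edge = math.ceil(margin * n)
    pos = rng.randint(edge, n - edge)
    n_affected = max(1, round(station_fraction * len(out)))
    affected = sorted(rng.sample(range(len(out)), n_affected))
    shift = rng.gauss(0.0, sigma)  # assumed: one common additive shift
    for s in affected:
        for t in range(pos, n):
            out[s][t] += shift
    return out, {"position": pos, "stations": affected, "shift": shift}


if __name__ == "__main__":
    demo = [[0.0] * 1200 for _ in range(10)]
    _, info = insert_correlated_break(demo, rng=random.Random(3))
    print(info)
```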

Example correlated break

5) Insert inhomogeneities in stations
• Outliers
• Size
  - Temperature: 99th percentile
  - Rain: 99.9th percentile
• Frequency
  - 50 % of networks: 1 %
  - 50 % of networks: 3 %
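One possible reading of the outlier settings is sketched here. The slide only states a percentile for the size and a frequency of 1 % or 3 %. The sketch assumes that the frequency is the fraction of data points turned into outliers, and that each outlier is set to the empirical percentile of the station's own series (upper tail only). Both readings, the nearest-rank percentile and the names `percentile` and `insert_outliers` are assumptions.

```python
import math
import random


def percentile(values, p):
    """Nearest-rank empirical percentile, with p in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[min(rank, len(ordered)) - 1]


def insert_outliers(series, p=99.0, frequency=0.01, rng=None):
    """Replace a random fraction of values by the p-th percentile (assumed reading)."""
    rng = rng or random.Random()
    size = percentile(series, p)
    n_outliers = round(frequency * len(series))
    positions = sorted(rng.sample(range(len(series)), n_outliers))
    out = list(series)
    for t in positions:
        out[t] = size
    return out, positions


if __name__ == "__main__":
    demo_rng = random.Random(0)
    temperature = [demo_rng.gauss(0.0, 1.0) for _ in range(1200)]
    # Per the slide, half of the networks use 1 % and the other half 3 %.
    frequency = demo_rng.choice((0.01, 0.03))
    perturbed, where = insert_outliers(temperature, p=99.0,
                                       frequency=frequency, rng=demo_rng)
    print(len(where), where[:5])
```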

Example outlier perturbations station

Example outliers network

5) Insert inhomogeneities in stations
• Local trends (only temperature)
• Linear increase or decrease in one station
• Duration: 30 or 60 a
• Maximum size: 0.2 to 1.5 °C
• Frequency: once in 10 % of the stations
• Also for rain?
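A local trend of this kind could be sketched as a linear ramp added to one station. The durations and the size range are taken from the slide. The rest is assumed for illustration: a uniform draw of the maximum size, a uniformly random start, monthly data, and the offset staying at its final value after the ramp ends (the slide does not say what happens afterwards). The function name `insert_local_trend` is hypothetical, and selecting the 10 % of stations that receive a trend is left to the caller.

```python
import random


def insert_local_trend(series, months_per_year=12, rng=None):
    """Add one linear local trend to a monthly series (illustrative assumptions)."""
    rng = rng or random.Random()
    n = len(series)
    duration = rng.choice((30, 60)) * months_per_year
    if duration > n:
        raise ValueError("series is shorter than the trend duration")
    # Maximum size between 0.2 and 1.5 degrees C, increasing or decreasing.
    size = rng.uniform(0.2, 1.5) * rng.choice((-1, 1))
    start = rng.randrange(0, n - duration + 1)
    end = start + duration - 1
    out = list(series)
    for t in range(start, n):
        # Ramp up linearly until `end`, then hold the full offset (assumed).
        fraction = min(1.0, (t - start + 1) / duration)
        out[t] += size * fraction
    return out, {"begin": start, "end": end, "size": size}


if __name__ == "__main__":
    perturbed, info = insert_local_trend([0.0] * 1200, rng=random.Random(4))
    print(info)
```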

Example local trends

6) Published on the web
• Inhomogeneous data will be published on the COST-HOME homepage
• Everyone is welcome to download and homogenize the data
• mitarbeiter/venema/themes/homogenisation

7) Homogenize by participants
• Return homogenised data
• Should be in COST-HOME file format (next slide)
• Return break detections
  - BREAK
  - OUTLI
  - BEGTR
  - ENDTR
• Multiple breaks at one date possible

7) Homogenize by participants
• COST-HOME file format: venema/themes/homogenisation/costhome_fileformat.pdf
• For benchmark & COST homogenisation software
• New since Vienna:
  - Station files include height
  - Many clarifications

Work in progress
• Preliminary benchmark: venema/themes/homogenisation/
• Write report on the benchmark dataset
• More input data
• Set deadline for the availability of the benchmark
• Deadline for the return of the homogenised data
• Agree on the details of the benchmark
• Daily data: other, realistic, fair inhomogeneities