Published by Juan Francisco Rojas Murillo. Modified over 6 years ago.
1
Guidance for managing international precipitation data
Markus Ziese (GPCC)
Agenda item number 5.2
2
Background and Agenda of GPCC
Global collection and analysis of in situ data of land-surface precipitation
Established in early 1989 at Deutscher Wetterdienst (DWD) at the request of WMO WCRP
More than 25 years of service reflect GPCC’s long-term experience
Current staffing, 10 individuals: 1 head, 3 permanent scientists, 1 project scientist, 1 programmer, 4 support staff
Contributing to GEWEX (Global Energy and Water Exchanges Project) and GCOS (Global Climate Observing System)
Many users worldwide; analyses used in IPCC-AR5, WMO Climate Statements, Copernicus Services and BAMS SoC
Data sources: SYNOP, CLIMAT, SYNOP from CPC, ECA&D, CRU, FAO, GHCN, national meteorological services, regional data collections
3
GPCC Data Base
Monthly totals, collected since 1989
Daily totals, collected since 2012
Number of stations in data bank: more than 115,000 (figure label: 2017)
4
GPCC Data Flow
Data delivered in different formats
All data in same format
Data stored in data bank
Extracted data for analyses
QC (shown three times in the flow diagram)
5
Data Bank Model
Provider based: Provider -> Station -> Parameter -> Data
Parameter based: Parameter -> Provider -> Station -> Data
Station based: Station -> Parameter -> Provider -> Data
6
Data Bank Model
Provider based
  Pro: Easy import, as data are fed in as delivered
  Contra: Station synchronizing needed at data post-processing
  Contra: Difficult comparison between providers
  Contra: Provider-specific metadata
Parameter based
  Pro: Easy import, as data are fed in as delivered
  Contra: Synchronizing metadata between parameters
  Contra: Difficult cross-validation between parameters
Station based
  Pro: Station synchronizing at data import
  Pro: Easy comparison between providers
  Pro: No multiple stations for post-processing
  Pro: One station history for all providers
  Contra: Large effort to import data due to metadata synchronizing
7
GPCC Data Bank
Station based data bank scheme
One station can get data from different data providers
Allows comparison and quality control between different sources
Table for one station: columns Provider A, Provider B, Provider C, Provider D; rows Date 1 to Date 4; each row shows three values, so not every provider delivers a value for every date
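The station based scheme can be illustrated with a small in-memory structure. This is only a sketch, not GPCC's actual data bank: the class name `StationBank`, the nested layout station -> date -> provider -> value, and the tolerance of 1 mm are assumptions made for illustration.

```python
class StationBank:
    """Hypothetical station based store: station -> date -> provider -> value (mm)."""

    def __init__(self):
        self._data = {}

    def add(self, station, date, provider, value_mm):
        # Values from different providers are kept side by side, never merged.
        by_date = self._data.setdefault(station, {})
        by_date.setdefault(date, {})[provider] = value_mm

    def values(self, station, date):
        # Copy so callers cannot modify the stored values.
        return dict(self._data.get(station, {}).get(date, {}))

    def disagreements(self, station, tolerance_mm=1.0):
        """Return dates where providers differ by more than tolerance_mm (assumed threshold)."""
        flagged = {}
        for date, by_provider in sorted(self._data.get(station, {}).items()):
            if len(by_provider) < 2:
                continue  # nothing to compare with a single provider
            spread = max(by_provider.values()) - min(by_provider.values())
            if spread > tolerance_mm:
                flagged[date] = dict(by_provider)
        return flagged
```

Because all providers sit under the same station and date, a cross-provider comparison is a single lookup rather than a join across separate provider collections.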
8
Quality Control
Delivered original data needs to be checked
Many causes for errors possible
Not every apparently wrong value is erroneous; it could be caused by an extreme event
QC must not delete correct data
Track changes during QC in data bank
Metadata needs to be checked
9
Quality Control
Automatic QC at aggregation of SYNOP data
Manual QC of aggregated SYNOP and CLIMAT data that offline checks flag as statistically conspicuous
Manual QC of metadata and temporal shift at reformatting of non-real-time data
Automatic QC of data, plus manual QC of statistically peculiar data, during import into data bank
10
Quality Control
Metadata:
  Station within borders of country?
  Station coordinates and elevation consistent with other sources (OSCAR, Google Maps, GEONAMES)?
  Station over land?
  Spelling of station name
During import:
  Data belongs to existing station in data bank?
  Metadata needs to be updated?
  New station?
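The consistency check of coordinates and elevation against another source could look roughly like the following. This is a sketch under stated assumptions: the function names, the spherical Earth radius of 6371 km, and both thresholds (5 km, 50 m) are illustrative choices, not values from the presentation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def metadata_consistent(delivered, reference, max_km=5.0, max_dz_m=50.0):
    """Compare (lat, lon, elevation_m) tuples; thresholds are assumed, not GPCC's."""
    lat1, lon1, elev1 = delivered
    lat2, lon2, elev2 = reference
    close_enough = great_circle_km(lat1, lon1, lat2, lon2) <= max_km
    similar_height = abs(elev1 - elev2) <= max_dz_m
    return close_enough and similar_height
```

A failed check would not reject the station automatically; in line with the slides, it would mark the metadata for manual inspection.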
11
Quality Control
One station, several spellings: Huddur, Huduur, Hudur, Hodur, Oddur, Xuddur, Xudur
Years shown on the slide: 1982, 1980, 1953, 1952, 1951, 1979
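Spelling variants like these can be surfaced with fuzzy string matching before a delivered name is treated as a new station. The sketch below uses `difflib.get_close_matches` from the Python standard library; the helper name `candidate_matches` and the cutoff of 0.7 are assumptions, and a match is only a candidate for manual review, not a confirmed identity.

```python
import difflib


def candidate_matches(name, known_names, cutoff=0.7):
    """Return known station names whose spelling is close to `name`.

    Comparison is case-insensitive; `cutoff` (0..1) is an assumed similarity threshold.
    """
    lowered = {known.lower(): known for known in known_names}
    hits = difflib.get_close_matches(name.lower(), list(lowered), n=5, cutoff=cutoff)
    return [lowered[hit] for hit in hits]
```

For example, `candidate_matches("Hudur", ["Huddur"])` proposes the existing station `Huddur`, so the delivered record can be checked against it instead of silently creating a duplicate.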
12
Station History
Metadata can change over time due to relocations, automatisation etc.
Store metadata as station history with validity date (valid from … till …)
  Coordinates & elevation
  Station number (national number, WMO-number)
e.g., Erfurt-Bindersleben:
  09554 from 1946/04/01 to 1990/10/02
  10554 since 1990/10/03
13
Station History
Use the metadata that are valid at the date of the value/observation
Location of station can change over time within the time series
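A station history with validity dates, and the lookup of the entry valid at an observation date, might be sketched as follows. The Erfurt-Bindersleben station numbers and dates are taken from the slides; the names `HistoryEntry` and `metadata_at` are hypothetical, and coordinates and elevation are left out because the slides do not give them.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class HistoryEntry:
    valid_from: date
    valid_to: Optional[date]  # None means "still valid"
    station_number: str


# Station numbers and validity dates as given on the slide.
ERFURT_BINDERSLEBEN = [
    HistoryEntry(date(1946, 4, 1), date(1990, 10, 2), "09554"),
    HistoryEntry(date(1990, 10, 3), None, "10554"),
]


def metadata_at(history, obs_date):
    """Return the history entry valid at obs_date, or None if no entry covers it."""
    for entry in history:
        starts_before = entry.valid_from <= obs_date
        ends_after = entry.valid_to is None or obs_date <= entry.valid_to
        if starts_before and ends_after:
            return entry
    return None
```

An observation from July 1985 resolves to station number 09554, while one from 3 October 1990 onward resolves to 10554, so each value is paired with the metadata that applied when it was measured.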
14
Tracking Changes in Data
Changes made during QC must be reproducible
Use “best values” for further processing
Keep a copy of original values in data bank
15
Tracking Changes in Data
Quality control leads to:
  Confirmed data: flagged values are caused by extreme events
  Corrected data: original value was wrong and was replaced by a better one, e.g., due to a factor-10 error
  Deleted data: original value was obviously wrong, but no information is available to correct it
16
Tracking Changes in Data
For one station - one provider - one date: at most two values can exist
  Original and confirmed original value
  Original and corrected value
  Original and deleted value
Wrong data are flagged to be ignored in analyses, but not physically deleted from data bank -> the original value can still be confirmed if additional, more reliable information arrives later
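The rule of keeping the original value next to at most one QC outcome can be sketched as a small record type. This is an illustration, not GPCC's schema: the names `QCStatus`, `Record` and `best_value`, and the extra `UNCHECKED` state for values that have not been through QC yet, are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class QCStatus(Enum):
    UNCHECKED = "unchecked"  # assumed state for values not yet reviewed
    CONFIRMED = "confirmed"
    CORRECTED = "corrected"
    DELETED = "deleted"


@dataclass
class Record:
    """One station, one provider, one date: the original plus one QC outcome."""
    original: float
    status: QCStatus = QCStatus.UNCHECKED
    corrected: Optional[float] = None

    def confirm(self):
        # Flagged value turned out to be real, e.g. an extreme event.
        self.status, self.corrected = QCStatus.CONFIRMED, None

    def correct(self, value):
        # Original stays stored; the corrected value is kept alongside it.
        self.status, self.corrected = QCStatus.CORRECTED, value

    def delete(self):
        # Only a flag: the original is ignored in analyses but not removed.
        self.status, self.corrected = QCStatus.DELETED, None

    def best_value(self):
        """Value to use in analyses; None means 'ignore this record'."""
        if self.status is QCStatus.DELETED:
            return None
        if self.status is QCStatus.CORRECTED:
            return self.corrected
        return self.original
```

Since deletion is only a status change, a record flagged as deleted can later be switched to confirmed when better information arrives, and `best_value()` returns the untouched original again.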
17
Experience from GPCC
Store data in a station-specific data bank
Bundle station history with observations
Store data separated according to provider to allow for cross-comparison
Apply advanced quality control procedures
Track changes during QC in data bank; do not delete original data physically
Use best available data at each time step
Respect and document copyrights and intellectual property rights (IPR) of data providers
18
Thank you Merci