Norwegian Meteorological Institute met.no QC2 Status 20080220



QC2 (prototype phase)

Input: QC1 Controlled Observations

QC2 Controls:
  - Space Control: Detect Outliers, Construct Variogram …
  + Dip Test
  + Statistical Checks
  + Set Flags …

QC2 Functions:
  - Interpolation: Simple, Spline, Kriging …
  + Distribute RR_24
  + Dip Correction
  + Generate QC Products
  + Set Flags …

Output: QC2 Controlled Observations (model values, corrected values, flags, products for HQC?)

Objectives:
  To check viability of method(s).
  Construct an element of the whole QC2 system.

RR_24

Step 1: Interpolated data are calculated for all points (including missing rows as well as missing data), using only RR_24 values with fd/c(12) = 1 as valid neighbours.

Step 2: For data points where fd/c(12) = 2 and which follow a run of missing data or missing rows, redistributed values are calculated from the interpolated data and the original accumulated value.

Step 3 (open): Criteria for setting corrected value = redistributed value. Associated controlinfo flags to set. Responsibility for setting useinfo flags. Specification of user interfaces.
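Step 2 can be sketched as follows. The slides only say that redistributed values are "based on interpolated data and original accumulated value"; splitting the total in proportion to the interpolated values, and falling back to an even split when every interpolated value is zero, are assumptions made here for illustration. The function name `redistribute` is hypothetical.

```python
def redistribute(accumulated, interpolated):
    """Split an accumulated RR_24 total over the days of a run.

    accumulated  -- the original accumulated value reported on the last day
    interpolated -- interpolated values for every day of the run, in order
    Assumption: each day receives a share proportional to its interpolated value.
    """
    n = len(interpolated)
    if n == 0:
        return []
    total = sum(interpolated)
    if total <= 0:
        # Assumption: with no signal from the neighbours, spread evenly.
        return [accumulated / n] * n
    return [accumulated * value / total for value in interpolated]


# A three-day run whose gauge reported 7.5 mm on the last day.
shares = redistribute(7.5, [1.0, 0.0, 2.0])
print(shares, sum(shares))
```

By construction the shares always add up to the accumulated value, which is the property the later rounding discussion relies on.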

Missing data and missing rows

Columns: stationid | obstime | original | pid | tbtime | typeid | s | l | corrected | controlinfo | useinfo | cfailed
[Row values not recoverable.]

Interpolation method applied: IDW interpolation

$$u(x) = \frac{\sum_{k} w_k(x)\, u_k}{\sum_{k} w_k(x)}, \qquad w_k(x) = \frac{1}{d(x, x_k)^{p}}$$

For now: p = 2; only neighbours with d < 50 km are included.

Equation graphic: from Wikipedia, the free encyclopedia. Original map courtesy of Ole Einar Tveitto (cartographer).
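A minimal sketch of inverse distance weighting with the settings quoted above (p = 2, neighbours closer than 50 km). It assumes station positions are already given as planar coordinates in kilometres; the slides do not say how distances are computed, and the function name `idw` and its arguments are hypothetical.

```python
import math


def idw(target, neighbours, p=2.0, max_dist_km=50.0):
    """Inverse distance weighted estimate at `target`.

    target     -- (x_km, y_km) of the point to estimate
    neighbours -- list of ((x_km, y_km), value) for valid neighbouring stations
    Returns None when no neighbour lies within max_dist_km.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in neighbours:
        d = math.hypot(x - target[0], y - target[1])
        if d >= max_dist_km:
            continue  # only neighbours with d < 50 km are included
        if d == 0.0:
            return value  # a co-located station determines the estimate
        w = 1.0 / d ** p
        weighted_sum += w * value
        weight_total += w
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total


stations = [((10.0, 0.0), 4.0), ((0.0, 20.0), 8.0), ((60.0, 0.0), 100.0)]
print(idw((0.0, 0.0), stations))  # the station 60 km away is ignored
```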

Example of interpolation for all Norwegian stations, November 2007

Detect and exclude outliers

Figure: IDW prediction for outlier.

QC2 Table

Columns: Stationid | Obstime | Paramid | Level | Modelid | Original (GIS model value) | Flag | Prob%
[Example row only partly recoverable: Original = -12.3, Flag = ?, Prob% = 75.]

Notes:
  Initial idea: hold all model data, e.g. interpolated data, in a new table in the db.
  Concerns: a large table added to the kvalobs db impacts the operational system; the table is derived from the original data and contains mainly redundant information; it may be subject to change when different algorithms are applied; history is difficult to track.
  Alternatives: store GIS data / QC2 derived data in a separate database; archive in a scientific file format, e.g. netCDF, HDF5, or a specific GIS format.

Full Automatic Application

RR_24 data taken from the operational kvalobs db. The method interpolates, fills out missing rows, detects runs and then redistributes (5-10 mins).

Columns: stationid | obstime | original | pid | tbtime | typeid | s | l | corrected | controlinfo | useinfo | cfailed
[Row values not recoverable.]
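The "fills out missing rows, detects runs" part can be sketched as a single pass over a station's daily series. This is an illustration only: the dictionary layout, the name `find_runs` and the use of -32767 as the missing-data marker (the value that appears in the original column of one example row later in the talk) are assumptions. A run is taken to be the consecutive days with a missing row or missing data that end on a row flagged fd = 2.

```python
from datetime import date, timedelta

MISSING = -32767.0  # assumed missing-data marker


def find_runs(obs, start, end):
    """Return a list of (run_dates, accumulated_value).

    obs -- dict mapping date -> (original, fd) for one station;
           a date absent from the dict is a missing row.
    """
    runs = []
    pending = []  # days with a missing row or missing data, not yet closed
    day = start
    while day <= end:
        row = obs.get(day)
        if row is None or row[0] == MISSING:
            pending.append(day)
        else:
            original, fd = row
            if fd == 2 and pending:
                # The accumulated value covers the pending days and this day.
                runs.append((pending + [day], original))
            pending = []
        day += timedelta(days=1)
    return runs


obs = {
    date(2007, 11, 1): (1.2, 1),
    # 2 November has no row at all
    date(2007, 11, 3): (MISSING, 0),
    date(2007, 11, 4): (9.0, 2),
    date(2007, 11, 5): (0.0, 1),
}
for run_dates, total in find_runs(obs, date(2007, 11, 1), date(2007, 11, 5)):
    print([d.isoformat() for d in run_dates], total)
```

Each detected run, together with the interpolated values for its days, is what a redistribution step would then consume.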

Figure legend: interpolations; original observations; redistributed; accumulated observation; missing data; "data run".

precipcollected_flag.pl

Some 2576 accumulated values (statistics in table below).
fd, aka Controlinfo(12), determines accumulation status.
Setting of c(12) relies on the "precipcollected_flag.pl" script …

Statistics for typeid = 302

c(12) | Number | Flag interpretation
0     | 2651   | Not controlled (mainly missing data)
1     | …      | Normal RR_…
2     | …      | Accumulated
3     | 683    | Times do not match
…     | …      | Corrected with model data
7     | 0      | …

How much data is already corrected?

Of 2576 accumulated values:
  1358 not corrected (plus all the missing rows)
  1218 corrected

Corrected data may be used to test the automatic method.
Proposal: maintain the original corrected data. Uncorrected data are candidates for replacement with the redistributed accumulation.

Typical Example Cases

Columns: STID | DATE | ORIG | INTP | CORR | REDIS | TYPEID | CONTROLINFO | USEINFO
[Five daily rows; values not recoverable.]

Typical Example Cases

Columns: STID | DATE | ORIG | INTP | CORR | REDIS | TYPEID | CONTROLINFO | USEINFO
[Five daily rows; values not recoverable.]

Keep the corrected value as is; the criteria for substituting the result of the automatic calculation are unclear.
Use such cases to test the automatic method.
To do: generate more exacting test data (i.e. generated from complete observations).

Comparison of automatic method with HQC corrections

Human and machine are in concert!

Typical Example Cases

Columns: STID | DATE | ORIG | INTP | CORR | REDIS | TYPEID | CONTROLINFO | USEINFO
[Five daily rows; values not recoverable.]

Database rows (dates and flag strings lost):
18030| :00:00 |-32767|34| :12:31 |302 | 0 | 0 | 3 | | |hqc
18030| :00:00 | 3|34| :57:23 |302 | 0 | 0 | 3 | | |
18030| :00:00 | 1|35| :57:23 |302 | 0 | 0 | 1 | | |
18030| :00:00 | 3|34| :57:23 |302 | 0 | 0 | 3 | | |
18030| :00:00 | 1|35| :57:23 |302 | 0 | 0 | 1 | | |

Inconsistency: the confidence level in the original value needs to be tracked when corrected is set to the intp or redis value (i.e. cases where corrected does not already exist).

… as compared to

Columns: STID | DATE | DATA | INTP | CORR | REDIS | TYPEID | CONTROLINFO | USEINFO
[Eight daily rows; values not recoverable.]

Consistent.

Measurement Accuracy

Columns: STID | DATE | ORIG | INTP | CORR | REDIS | TYPEID | CONTROLINFO | USEINFO
[Five daily rows; values not recoverable.]

Redistributed data and previously corrected data correspond.
Redistribution introduces values with a precision the measurement cannot support. Is this OK (homogenisation problems)?
Should 4.79 → 5.0; 0 → 0; 2.07 → 2; 0.63 → 0.5?

Another sanity check (1)

Columns: STID | DATE | ORIG | INTP | CORR | REDIS | TYPEID | CONTROLINFO | USEINFO
[Five daily rows; values not recoverable.]

Another sanity check (2)

… outlier …

Columns: STID | DATE | ORIG | INTP | CORR | REDIS | TYPEID | CONTROLINFO | USEINFO
[Nine daily rows; values not recoverable.]

=> Use interpolated data directly for the Corrected Value.

Typical Example Cases, contd.

Columns: STID | DATE | DATA | INTP | CORR | REDIS | TYPEID | CONTROLINFO | USEINFO
[Five daily rows; values not recoverable.]

Data not previously corrected

Columns: STID | DATE | DATA | INTP | CORR | REDIS | TYPEID | CONTROLINFO | USEINFO
[Four daily rows; values not recoverable.]

precipcollected_flag exceeding itself since 25/9/2007?

Columns: STID | DATE | DATA | INTP | CORR | REDIS | TYPEID | CONTROLINFO | USEINFO
[Eight daily rows; values not recoverable.]

Missing rows

Before:

Columns: STID | DATE | DATA | INTP | CORR | REDIS | TYPEID | CONTROLINFO | USEINFO
[Ten rows, with gaps of one or more weeks between dates; values not recoverable.]

Missing rows

After:

Columns: STID | DATE | DATA | INTP | CORR | REDIS | TYPEID | CONTROLINFO | USEINFO
[Fourteen consecutive daily rows; values not recoverable.]

Typeid, ControlInfo, etc. (the whole data row!) …, Useinfo to set for this case …

Discussion points … preliminary decisions/actions added

(1) Specification of interpolation algorithm?
    Action: Paul, Matthias and Ole Einar to meet and discuss.

(2) Criteria for setting corrected value = redistributed value (inter value)? Associated controlinfo flags to set? Responsibility for setting useinfo flags?
    Decision: According to QC1 flags. Standard deviation of neighbours. If no flag set from point (6).
    Action: Lars will review examples included in this talk and advise on new flag settings.

(3) Redistribution introduces data with unfeasible measurement accuracy. Any consequences?
    Decision: Round data to one decimal place, but keep the sum equal to the accumulated value.

(4) Storage of derived QC2 information?
    Decision: Store in external data files, i.e. netCDF as a first case. Include an estimate of variability in the measurements / uniformity of data …

(5) Scope of control, e.g. typeid 302, 402 …
    Decision: Run for both 302 and 402 … flagging will be the same in both cases.

(6) Handle localised weather … comparison with satellite and radar data?
    Decision: To provide an indicator of how uniform the rainfall distribution is. First task is to build in estimates of the variability from the space control, use of normal values, gradients of ratios etc.

(7) For the last six months precip_flag has been working well, and the c(12) = 2 criterion can be used … what about older data?
    Decision: Priority is current data. Eventually process historic data too.
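The decision under point (3), rounding to one decimal place while keeping the sum equal to the accumulated value, can be met with a largest-remainder scheme. The slides do not name a method, so this is one possible sketch rather than the adopted algorithm. It assumes non-negative values and an accumulated total that is itself given to one decimal place; the name `round_preserving_sum` is hypothetical.

```python
import math


def round_preserving_sum(values, decimals=1):
    """Round each value to `decimals` places so that the rounded values
    still add up to the (rounded) total of the inputs.

    Largest-remainder method: round everything down, then hand the
    remaining units to the values with the largest fractional parts.
    """
    scale = 10 ** decimals
    scaled = [v * scale for v in values]
    floors = [math.floor(s) for s in scaled]
    target = round(sum(values) * scale)  # total expressed in units of 0.1
    shortfall = target - sum(floors)
    by_remainder = sorted(
        range(len(values)),
        key=lambda i: scaled[i] - floors[i],
        reverse=True,
    )
    for i in by_remainder[:shortfall]:
        floors[i] += 1
    return [f / scale for f in floors]


# The four redistributed values quoted on the Measurement Accuracy slide.
rounded = round_preserving_sum([4.79, 0.0, 2.07, 0.63])
print(rounded, sum(rounded))
```

Working in integer tenths avoids accumulating floating-point error while the units are handed out; only the final division converts back to millimetres.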

User Interface?

Prototype code is currently implemented in C++, with algorithms built in for performance.

Run the process on a schedule and/or on demand.
  Set of values that can be configured by a user/scheduler? Time interval, rules for flags to set, i.e. use a scripting language or configuration file to set the controls.
  Run by an operator who reviews the results, then clicks to submit the change.

Use of the QC1 Perl algorithm concept?
Run the algorithm on an arbitrary set of 3D data? What are the boundaries?

Priority: develop an operational version with only essential user controls; a rich interface can be built on top later.