Data Quality Processes in MMEA platform 6.11.2013.


1 Data Quality Processes in MMEA platform 6.11.2013

2 Topics
- Quality control processing chain overview
- Real time vs. non-real time QC/AD
- Current state of QC/AD in the MMEA platform
- Planned work, Syke water quality case

3 Quality control processing chain overview

4 Real time vs. non-real time QC and AD
- Real time QC and AD
  - Usually computationally inexpensive tasks
  - Range checks, missing data detection, etc.
  - Complex event processing with Esper
- Non-real time QC and AD
  - Missing value imputation, trend analysis, modeling, etc.
  - Large datasets, computationally heavy tasks
  - Batch jobs
  - QC/AD Library
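The inexpensive real-time checks named on this slide (range checks, missing data detection) could look roughly like the following Java sketch. All class and method names are hypothetical and are not taken from the MMEA platform; the range limits in the example are made-up values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical names throughout; not taken from the MMEA QC/AD library. */
final class RealTimeChecks {

    enum Flag { OK, MISSING, OUT_OF_RANGE }

    /** Flags one observation; a null or NaN value counts as missing. */
    static Flag check(Double value, double min, double max) {
        if (value == null || value.isNaN()) {
            return Flag.MISSING;
        }
        if (value < min || value > max) {
            return Flag.OUT_OF_RANGE;
        }
        return Flag.OK;
    }

    /** Flags each observation in arrival order. */
    static List<Flag> checkAll(List<Double> values, double min, double max) {
        List<Flag> flags = new ArrayList<>();
        for (Double v : values) {
            flags.add(check(v, min, max));
        }
        return flags;
    }

    public static void main(String[] args) {
        // Made-up water level readings with an assumed valid range of 0..5.
        List<Double> waterLevels = Arrays.asList(1.2, null, 9.7, Double.NaN, 2.4);
        System.out.println(checkAll(waterLevels, 0.0, 5.0));
    }
}
```

Each check is a constant-time comparison per observation, which is why this kind of test fits the real-time side of the split.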

5 QC/AD Library
- A reusable set of Java classes for data quality control computations and anomaly detection.
- The library is independent of MMEA-specific schemas or components.
- Supports Java generics (computation parameters and return types can be simple primitive data types, but also complex ones, such as objects).
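The slides do not show the library's API, so the following is only a sketch of how Java generics allow both simple and complex parameter and return types in one check interface. `QualityCheck`, `Interval`, `RangeCheck` and `InRangeFraction` are hypothetical names invented for this illustration.

```java
import java.util.Arrays;
import java.util.List;

/** A check with typed input I, parameters P and result R (all names hypothetical). */
interface QualityCheck<I, P, R> {
    R run(I input, P params);
}

/** A complex parameter type: an inclusive numeric interval. */
final class Interval {
    final double min;
    final double max;

    Interval(double min, double max) {
        this.min = min;
        this.max = max;
    }
}

/** Simple input and result: one value in, a boolean verdict out. */
final class RangeCheck implements QualityCheck<Double, Interval, Boolean> {
    @Override
    public Boolean run(Double value, Interval range) {
        return value >= range.min && value <= range.max;
    }
}

/** Complex input: the fraction of a whole series that lies inside the interval. */
final class InRangeFraction implements QualityCheck<List<Double>, Interval, Double> {
    private final RangeCheck single = new RangeCheck();

    @Override
    public Double run(List<Double> series, Interval range) {
        if (series.isEmpty()) {
            return 0.0;
        }
        long inside = series.stream().filter(v -> single.run(v, range)).count();
        return (double) inside / series.size();
    }
}

final class GenericCheckDemo {
    public static void main(String[] args) {
        Interval range = new Interval(0.0, 5.0);
        System.out.println(new RangeCheck().run(2.5, range));
        System.out.println(new InRangeFraction().run(Arrays.asList(1.0, 2.0, 9.0, 4.0), range));
    }
}
```

Because the type parameters carry no platform-specific types, a design along these lines would also stay independent of MMEA schemas, as the slide requires.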

6 Complex event processing with Esper
- Detects patterns in data streams.
- Queries are written in EPL ('Event Processing Language'), which resembles SQL.
- Data streams are run against the queries.
- A listener is attached to a query and reacts when a matching pattern is found.
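The query-plus-listener arrangement described on this slide can be illustrated without Esper itself. The sketch below is plain Java and does not use or imitate Esper's API: `StreamQuery` and `Reading` are hypothetical classes, and the predicate stands in for what an EPL statement of roughly the form `select * from Reading where value > 100` would express.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** One sensor reading; a hypothetical event type. */
final class Reading {
    final String sensor;
    final double value;

    Reading(String sensor, double value) {
        this.sensor = sensor;
        this.value = value;
    }
}

/** Stands in for a registered query: a condition plus the listeners attached to it. */
final class StreamQuery<E> {
    private final Predicate<E> condition;
    private final List<Consumer<E>> listeners = new ArrayList<>();

    StreamQuery(Predicate<E> condition) {
        this.condition = condition;
    }

    void addListener(Consumer<E> listener) {
        listeners.add(listener);
    }

    /** Runs one event against the query and notifies listeners on a match. */
    void send(E event) {
        if (condition.test(event)) {
            for (Consumer<E> listener : listeners) {
                listener.accept(event);
            }
        }
    }
}

final class StreamQueryDemo {
    public static void main(String[] args) {
        // The threshold of 100 is a made-up value for illustration.
        StreamQuery<Reading> highLevel = new StreamQuery<>(r -> r.value > 100.0);
        highLevel.addListener(r -> System.out.println("anomaly: " + r.sensor + " = " + r.value));
        highLevel.send(new Reading("level-1", 42.0));   // no match, listener stays silent
        highLevel.send(new Reading("level-1", 130.5));  // match, listener is called
    }
}
```

A real complex event processing engine also supports time windows and patterns spanning several events; this sketch covers only the single-event filter case.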

7 Current state of QC/AD in the MMEA platform
- Detection of anomalies from water level and pollen concentration forecasts could be implemented in the near future.
- The University of Oulu has been developing models that could be integrated with the platform.
- Planned Syke water quality case.

8 QC/AD in the MMEA platform [Diagram with components labeled QC0, QC1, QC2 and Mediator]

9 Anomaly detection example [Diagram; one component is labeled Poller]

10 ComputationService
- Prototype was developed earlier this year.
- Runs in Tomcat.
- Web service interfaces for managing tasks:
  – Starting computation jobs
  – Terminating running jobs
  – Polling for job status
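The three task management operations listed here (start, terminate, poll status) can be sketched in plain Java with an executor. This is not the ComputationService prototype: `JobManager` and its methods are hypothetical, and the web service layer and Tomcat deployment are left out entirely.

```java
import java.util.Map;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical in-process job manager; names are not from the MMEA platform. */
final class JobManager {

    enum Status { RUNNING, DONE, CANCELLED, FAILED, UNKNOWN }

    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final Map<Long, Future<?>> jobs = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    /** Starts a computation job and returns its id. */
    long start(Runnable task) {
        long id = nextId.getAndIncrement();
        jobs.put(id, pool.submit(task));
        return id;
    }

    /** Requests termination; returns false if the job is unknown or already finished. */
    boolean terminate(long id) {
        Future<?> job = jobs.get(id);
        return job != null && job.cancel(true);
    }

    /** Polls job status; RUNNING also covers jobs still waiting for a free thread. */
    Status status(long id) {
        Future<?> job = jobs.get(id);
        if (job == null) {
            return Status.UNKNOWN;
        }
        if (job.isCancelled()) {
            return Status.CANCELLED;
        }
        if (!job.isDone()) {
            return Status.RUNNING;
        }
        try {
            job.get(); // returns at once because the job is done
            return Status.DONE;
        } catch (CancellationException e) {
            return Status.CANCELLED;
        } catch (ExecutionException e) {
            return Status.FAILED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Status.FAILED;
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) {
        JobManager manager = new JobManager();
        long id = manager.start(() -> System.out.println("computing..."));
        while (manager.status(id) == Status.RUNNING) {
            Thread.yield(); // poll until the job finishes
        }
        System.out.println(manager.status(id));
        manager.shutdown();
    }
}
```

Termination here relies on thread interruption, so a job only stops early if its code responds to interrupts; a real service would need to define that contract for its computation tasks.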

11 Planned work, Syke water quality case
- Integration of the SYKE water quality measurement service into the MMEA platform.
- A user can ask the MMEA platform for the phosphorus and suspended solids contents in water for a specified area.
- The quality of the data will be controlled and a quality estimate will be returned to the user.

12 Planned work, Syke water quality case
QC tests:
– Missing data
– Missing value
– Variation
– Range
– Outlier detection
– Trend analysis
– Comparison with other relevant meteorological or hydrological data
Other computations:
– The query results, the phosphorus and suspended solids contents in water, are computed from turbidity information.
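The slides do not say which method the outlier detection test would use. As one possible illustration, the sketch below flags values whose z-score exceeds a threshold; the z-score rule, the class name `ZScoreOutliers`, the threshold and the sample values are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical z-score outlier test; the planned QC test may use another method. */
final class ZScoreOutliers {

    /** Returns the indices of values lying more than threshold standard deviations from the mean. */
    static List<Integer> find(double[] values, double threshold) {
        List<Integer> outliers = new ArrayList<>();
        if (values.length == 0) {
            return outliers;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / values.length;

        double squared = 0.0;
        for (double v : values) {
            squared += (v - mean) * (v - mean);
        }
        double std = Math.sqrt(squared / values.length); // population standard deviation
        if (std == 0.0) {
            return outliers; // constant series: nothing can be an outlier
        }
        for (int i = 0; i < values.length; i++) {
            if (Math.abs(values[i] - mean) / std > threshold) {
                outliers.add(i);
            }
        }
        return outliers;
    }

    public static void main(String[] args) {
        // Made-up turbidity-like readings; the last one is far from the rest.
        double[] readings = {10.0, 11.0, 9.0, 10.0, 50.0};
        System.out.println(find(readings, 1.5));
    }
}
```

On short series a single extreme value inflates the standard deviation and can mask itself, so a median-based rule may be a better fit for small samples.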

