Status Report of EDI on the CAA

1 Status Report of EDI on the CAA

2 CAA Public website
All pages fully accessible by anyone

3 CAA Test Area
Services are currently accessible only by ESTEC/ESAC.
Individual instrument team members will be allowed access on a need basis.
Some of the CAA command-line services are available to all.

4 EDI Datasets for Downloading

5 CAA EDI Graphics

6 CAA EDI Graphics

7 Dataset Inventory Analysis
There are several similar datasets where the main difference is that data vectors or matrices are converted into another system. For instance, scientific data are given in different scientific units or coordinate systems.
The consistency of such datasets has never been investigated, although there is some evidence that errors can occur.
Such similar datasets can often be expected to have the same number of records.
Exception: if a dataset is given in raw units, the corresponding datasets in scientific units may have fewer records if poor records were deleted.
However, if poor data have been replaced with FILLVAL, the same number of records should exist.
In addition, such datasets should have similar values for:
  Version numbers
  Generation and ingestion dates
Note: these metadata can have significant differences, so there is no automatic
CAA has developed a tool that collects such metadata and gives them in a text file.

8 Inventory Output
The beta version of the inventory tool is available (currently) at
When the tool is ready, the inventory will be executed at a TBD frequency (only for the newly ingested files).

9 Example: RAPID - ESPCT6
Inventory analysis for C1 "Electron, omni-directional distribution" (C1_CP_RAP_ESPCT6).
Compared datasets:
  1: C1_CP_RAP_ESPCT6
  2: C1_CP_RAP_ESPCT6_R
Generation date T21:34:36Z
Analysis is based on the database content at T11:31:06Z
Data coverage T00:00:00Z/ T23:59:59Z
Columns description:
  Date: YYMMDD
  OK?: OK: comparison OK, ERR: error
  ERR: if ERR, the reason for error:
    F: not all files exist
    R: number of records don't match
    T: timestamps don't match
    V: versions don't match
  Rx: number of records in file x
  Vx: version of file x
  Gx: generation time
  Ix: ingestion time
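
The OK?/ERR columns described above can be illustrated with a small Python sketch. It is not the CAA inventory tool: the `DailyFile` record, the `compare_day` function, and the choice to compare only the first and last timestamps of a file are assumptions made for illustration, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DailyFile:
    """Hypothetical per-day metadata for one dataset file."""
    records: int      # Rx: number of records in the file
    version: int      # Vx: version of the file
    first_time: str   # first timestamp in the file (ISO string)
    last_time: str    # last timestamp in the file (ISO string)


def compare_day(f1: Optional[DailyFile], f2: Optional[DailyFile]) -> Tuple[str, str]:
    """Return (OK?, ERR) for one day, using the F/R/T/V reason codes."""
    if f1 is None or f2 is None:
        return "ERR", "F"            # not all files exist
    reasons = ""
    if f1.records != f2.records:
        reasons += "R"               # numbers of records differ
    if (f1.first_time, f1.last_time) != (f2.first_time, f2.last_time):
        reasons += "T"               # timestamps differ (assumed: first/last only)
    if f1.version != f2.version:
        reasons += "V"               # versions differ
    return ("ERR", reasons) if reasons else ("OK", "")


# Made-up values, for illustration only
a = DailyFile(21600, 3, "2001-02-01T00:00:02Z", "2001-02-01T23:59:58Z")
b = DailyFile(21598, 3, "2001-02-01T00:00:02Z", "2001-02-01T23:59:58Z")
print(compare_day(a, b))     # ('ERR', 'R')
print(compare_day(a, None))  # ('ERR', 'F')
```

Several reason codes can be combined for the same day, for example "RV" when both the record counts and the versions differ.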

10 Example: RAPID ESPCT6

11 Example of Timing errors
UT time stamps can differ by up to 3 milliseconds

12 Automation Tool
Purpose of the tool:
  Keeping track of dataset updates that may cause re-deliveries of other products
  Avoiding the risk of having products that are out of date
The tool consists of a number of distinct components:
  identification of intervals in need of (re-)processing
  scheduling pipeline: to issue jobs across the CAA machines
  standard wrapper and support routines for execution of pipelines
  common logging, pre-validation and submission system
Instrument teams may benefit from this service, particularly the first part, which identifies the intervals that are in need of (re-)processing.

13 Automation Tool: Example
# Check FGM since given date
yesterday
# C1_CP_FGM_FULL,
Output: C1_CP_FGM_FULL
# T18:45:42Z/ T03:48:59Z
# T03:00:11Z/ T12:35:48Z
# T02:40:53Z/ T08:57:25Z
# T00:09:44Z/ T09:47:42Z
# T04:31:36Z/ T06:05:48Z
The check is being made on all data from to the most recent day.
primary dataset = time specification -> ingestion date of for the whole mission
dependent dataset = C1_CP_FGM_FULL (a minimum ingestion date is specified but is not really needed in this case, since it is the same as the primary dataset; it was included to avoid picking up the FGM data that were re-ingested with detached headers, because the data were unchanged and a reprocessing of the entire mission was not wanted).
Result = looking for intervals where the dependent dataset has been ingested more recently than the primary, so in this case it is finding all intervals where C1_CP_FGM_FULL has been ingested since
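
The "Result" rule above (flag intervals where the dependent dataset has been ingested more recently than the primary) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the CAA automation tool: the `Entry` record and the functions `merge_intervals` and `needs_reprocessing` are hypothetical names, and each dataset is assumed to be available as a list of (start, stop, ingestion time) entries.

```python
from datetime import datetime
from typing import List, NamedTuple, Tuple

Interval = Tuple[datetime, datetime]


class Entry(NamedTuple):
    """Hypothetical inventory entry: data interval plus ingestion time."""
    start: datetime
    stop: datetime
    ingested: datetime


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted list."""
    merged: List[Interval] = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def needs_reprocessing(primary: List[Entry], dependent: List[Entry]) -> List[Interval]:
    """Overlaps where a dependent entry was ingested after the primary entry."""
    flagged: List[Interval] = []
    for p in primary:
        for d in dependent:
            lo, hi = max(p.start, d.start), min(p.stop, d.stop)
            if lo < hi and d.ingested > p.ingested:
                flagged.append((lo, hi))
    return merge_intervals(flagged)
```

A fixed cut-off date, as in the FGM example, can be modelled as a single primary entry that spans the whole mission and carries the cut-off as its ingestion time.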

14 Automation Tool: Example, cont …
If the primary and dependent specifications are swapped, it would then list all C1_CP_FGM_FULL intervals that had not been ingested since
# Check FGM not ingested since given date
yesterday
C1_CP_FGM_FULL
# Output:
C1_CP_FGM_FULL T00:00:00Z/ T18:45:42Z
C1_CP_FGM_FULL T03:48:59Z/ T03:00:11Z
C1_CP_FGM_FULL T12:35:48Z/ T02:40:53Z
C1_CP_FGM_FULL T08:57:25Z/ T00:09:44Z
C1_CP_FGM_FULL T09:47:42Z/ T04:31:36Z
C1_CP_FGM_FULL T06:05:48Z/ T00:00:00Z
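
Judging from the interval boundaries, the swapped output looks like the complement of the previous list within the checked time span. A minimal Python sketch of that complement operation, assuming a sorted or unsorted list of (start, stop) pairs and a hypothetical function name `complement`:

```python
from datetime import datetime
from typing import List, Tuple

Interval = Tuple[datetime, datetime]


def complement(intervals: List[Interval],
               span_start: datetime,
               span_stop: datetime) -> List[Interval]:
    """Return the parts of [span_start, span_stop) not covered by intervals."""
    gaps: List[Interval] = []
    cursor = span_start
    for lo, hi in sorted(intervals):
        if lo > cursor:
            # Uncovered stretch between the cursor and the next interval
            gaps.append((cursor, min(lo, span_stop)))
        cursor = max(cursor, hi)
        if cursor >= span_stop:
            break
    if cursor < span_stop:
        gaps.append((cursor, span_stop))
    return gaps
```

The same operation applied to the list of existing files would give the kind of gap list used when searching for missing files.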

15 RAPID Example
# Check if dataset C1_CP_RAP_EPITCH needs updating
yesterday
C1_CP_RAP_EPITCH
C1_CP_FGM_FULL,
C1_CP_RAP_EPITCH T18:45:42Z/ T03:48:59Z
C1_CP_RAP_EPITCH T03:00:11Z/ T12:35:48Z
C1_CP_RAP_EPITCH T00:00:00Z/ T09:47:42Z
C1_CP_RAP_EPITCH T23:59:59Z/ T06:05:48Z

16 RAPID Example, cont …
Output can optionally be given as intervals split/aligned, e.g. by day:
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
...
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
T00:00:00Z/ T00:00:00Z
An option is also provided to give the next available version number for each interval.
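
Splitting or aligning an interval by day could look roughly like the Python sketch below. The function name `split_by_day` and its `align` flag are hypothetical, timestamps are treated as naive UTC datetimes, and the behaviour of the actual CAA option may differ.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def split_by_day(start: datetime, stop: datetime,
                 align: bool = False) -> List[Tuple[datetime, datetime]]:
    """Split [start, stop) at midnight boundaries.

    With align=True the interval is first widened to whole days, so that
    every returned piece runs from midnight to midnight.
    """
    if align:
        start = datetime(start.year, start.month, start.day)
        midnight = datetime(stop.year, stop.month, stop.day)
        stop = midnight if stop == midnight else midnight + timedelta(days=1)
    pieces = []
    cursor = start
    while cursor < stop:
        next_midnight = datetime(cursor.year, cursor.month, cursor.day) + timedelta(days=1)
        end = min(next_midnight, stop)
        pieces.append((cursor, end))
        cursor = end
    return pieces
```

With `align=True` every piece is a whole day, which is convenient when reprocessing jobs are issued per daily file.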

17 Search of Missing Files
# Find missing FGM_FULL files
yesterday yesterday
Output:
C1_CP_FGM_FULL T00:00:00Z/ T00:10:02Z
C1_CP_FGM_FULL T12:10:14Z/ T21:16:07Z
C1_CP_FGM_FULL T19:32:41Z/ T04:39:09Z
C1_CP_FGM_FULL T06:05:48Z/ T00:00:00Z
C3_CP_FGM_FULL T00:00:00Z/ T00:10:02Z
C3_CP_FGM_FULL T05:10:27Z/ T14:18:08Z
C3_CP_FGM_FULL T07:30:17Z/ T16:35:23Z
C3_CP_FGM_FULL T15:34:55Z/ T00:40:56Z
C3_CP_FGM_FULL T12:17:51Z/ T21:23:44Z
C3_CP_FGM_FULL T00:38:55Z/ T09:47:08Z
C3_CP_FGM_FULL T01:32:09Z/ T10:38:51Z
C3_CP_FGM_FULL T06:05:48Z/ T00:00:00Z

18 EDI Delivery/Ingestion Activity
The plots are regenerated daily around midnight.
Monthly and 6-month plots.
The top two panels are taken from the database:
  Top: number of files ingested into the database
  2nd from top: average time used to validate/add one file into the database
The bottom five show the instantaneous situation at the time of plot production:
  3rd: number of files that failed validation, e.g. wrong version number
  4th and 5th: number of CEF and non-CEF files in the delivery area
  6th and 7th: number of CEF and non-CEF files waiting for validation

19 Status of File Transfer to CSA

20 EDI inventory
Notes:
If EGD exists, there is a chance for PP/SPIN/MP
If EGD does not exist, no chance for PP/SPIN/MP
QZC and CRF should always exist in EF-mode, so they should have the same coverage as CLIST/EF-mode
PP and SPIN should have identical coverage
MP should have a wider coverage than PP/SPIN
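
The notes above can be read as consistency rules over dataset coverage. The Python sketch below is one possible reading, not an existing CAA check: coverage is assumed to be a set of day labels per dataset, the keys (`EGD`, `PP`, `SPIN`, `MP`, `QZC`, `CRF`, `CLIST_EF`) are hypothetical short labels, and "wider coverage" is interpreted as "covers at least the same days".

```python
from typing import Dict, List, Set


def check_edi_inventory(coverage: Dict[str, Set[str]]) -> List[str]:
    """Return a list of rule violations; an empty list means consistent."""

    def days(name: str) -> Set[str]:
        return coverage.get(name, set())

    problems: List[str] = []
    # PP/SPIN/MP can only exist where EGD exists
    for name in ("PP", "SPIN", "MP"):
        orphan = days(name) - days("EGD")
        if orphan:
            problems.append(f"{name} present without EGD on {sorted(orphan)}")
    # QZC and CRF should match the CLIST/EF-mode coverage
    for name in ("QZC", "CRF"):
        if days(name) != days("CLIST_EF"):
            problems.append(f"{name} coverage differs from CLIST/EF-mode")
    # PP and SPIN should have identical coverage
    if days("PP") != days("SPIN"):
        problems.append("PP and SPIN coverage differ")
    # MP should cover at least everything PP/SPIN cover (assumed reading)
    if not days("MP") >= (days("PP") | days("SPIN")):
        problems.append("MP does not cover all PP/SPIN days")
    return problems
```

Day-level granularity is a simplification; the same rules could be applied to time intervals instead of day labels.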

21 EDI inventory
Inventory plots are visible in annex 2

