Training Course on EDIT 2013 For Users
Outline of the module
- Introduction
- Using EDIT - integration with other tools
- Objects in EDIT for Users
- EDIT Graphical User Interface
A - Introduction
EDIT is a tool for data validation and data editing/imputation.
What is data validation? - An activity aimed at verifying whether the value of a data item comes from the given set of acceptable values.
What is data editing? - The activity aimed at identifying erroneous entries and correcting them if necessary. Example: the response is missing or incorrect.
How does EDIT work?
- Define a format. A format contains a description of the data in a dataset; a dataset is a set of data structured according to a specific format.
- Define a program containing rules and file operations to be executed on the dataset(s).
For users:
- Upload dataset(s) from external sources (e.g. CSV files).
- Execute the job.
- Get the report containing errors (if any).
EDIT roles
'User' - Executes programs on datasets and accesses the reports.
'Programmer' - Manages the metadata needed by the user to execute programs:
- Implements 'formats';
- Implements 'validation rules' by means of 'programs';
- Defines other operations on files by means of 'programs';
- Sets up the configuration (if needed) relating to automatic processing, validation flows, connection templates, etc.
'Administrator' - Manages users, domains and permissions.
'User' functionalities
'Change Password' - Allows users to change their password;
'Dataset Import/Export' - Allows users to import and export data to and from EDIT as well as monitor any ongoing import/export processes;
'Job Execution' - Allows users to execute programs on imported datasets and view/export the results of the execution.
The 'user workflow'
Data Import -> Job Execution -> Job Results -> Data Export
The link between 'user workflow' and 'user interface'
Accepted dataset(s) formats
- SDMX-ML
- GESMES
- CSV
- FLR
GESMES (BOP ITS, BOP FDI)
UNA:+.? '
UNB+UNOC:3+FR2+4D0+100929:1637+IREF000243++GESMES/TS'
UNH+MREF000001+GESMES:2:1:E6'
BGM+74'
NAD+Z02+ECB'
NAD+MR+4D0'
NAD+MS+FR2'
IDE+10+EUROSTAT_BOP_01 reporting'
DSI+BOP_FDI_A'
STS+3+7'
DTM+242:201009291637:203'
DTM+Z02:20072009:702'
IDE+5+EUROSTAT_BOP_01'
GIS+AR3'
GIS+1:::-'
ARR++A:FR:N:2:330:N:4A:E:9999:9999:20072009:702:0:A:F+0:A:F+0:A:F'
ARR++A:FR:N:2:330:N:4F:E:9999:9999:20072009:702:0:A:F+0:A:F+0:A:F'
ARR++A:FR:N:2:330:N:7Z:E:9999:9999:20072009:702:0:A:F+0:A:F+0:A:F'
ARR++A:FR:N:2:330:N:A1:E:1100:9999:20072009:702:5824:A:F+5930:A:F+4204:A:F'
ARR++A:FR:N:2:330:N:A1:E:1495:9999:20072009:702:5828:A:F+5932:A:F+4206:A:F'

CSV (with or without header) (SBS, CVTS, TOURISM)
9H; 2008; LT; 2; B-N_X_K642; 11930; 16236; ; ; ; ; UNIT; ; ; ; ; ; TT0; ; ; ; ; D08
9H; 2008; LT; 3; B-N_X_K642; 11930; 1001; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08
9H; 2008; LT; 4; B-N_X_K642; 11930; 529; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08
9H; 2008; LT; 30; B-N_X_K642; 11930; 17766; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08
9H; 2008; LT; 2; B-E; 11930; 1138; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08
9H; 2008; LT; 3; B-E; 11930; 104; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08
9H; 2008; LT; 4; B-E; 11930; 61; ; ; ; ; UNIT; ; ; ; ; ; TT; ; ; ; ; D08

FLR example 1
001E20100121814 00 804.822
001E20100121816 93 5295.54
001E20100121814 99 6166.24
001E20100125290334 581.371

FLR example 2
2010010011 010252000405595911005909580E 01ZZZZZ 2691.966 2734482.0 0.0
2010010011 010252000405595911004009600E 01ZZZZZ 237.543 341202.0 0.0
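To make the layout of the GESMES and CSV samples above more concrete, the following Python sketch splits one ARR segment and one CSV row into their fields. It is an illustration only and not part of EDIT: the function names `split_gesmes_segment` and `split_csv_row` are hypothetical. It assumes the separators declared in the UNA line of the sample (':' between components, '+' between data elements, an apostrophe as segment terminator) and does not handle the '?' release character.

```python
import csv
import io


def split_gesmes_segment(segment: str) -> list[list[str]]:
    """Split one segment into data elements ('+') and their components (':').

    Simplification: the release character '?' is not handled.
    """
    body = segment.strip().rstrip("'")  # drop the segment terminator
    return [element.split(":") for element in body.split("+")]


def split_csv_row(line: str) -> list[str]:
    """Split one semicolon-separated row and trim the padding spaces."""
    reader = csv.reader(io.StringIO(line), delimiter=";", skipinitialspace=True)
    return [field.strip() for field in next(reader)]


# Sample lines copied from the examples above.
arr = "ARR++A:FR:N:2:330:N:A1:E:1100:9999:20072009:702:5824:A:F+5930:A:F+4204:A:F'"
elements = split_gesmes_segment(arr)

row = split_csv_row(
    "9H; 2008; LT; 2; B-N_X_K642; 11930; 16236; ; ; ; ; UNIT; ; ; ; ; ; TT0; ; ; ; ; D08"
)

print(elements[0])   # segment tag
print(elements[3])   # second group after the key
print(row[:7])       # first seven CSV fields
```

Empty CSV fields come back as empty strings, so the position of every column is preserved even when most of them are blank.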
B - Using EDIT - integration with other tools
Ways of using EDIT
- As a web-based application - called by other applications;
- Standalone - running on a PC;
- Client-server - running in a Data Centre.
EDIT as Web-based application
https://webgate.ec.europa
- Web-based interface
- Unified interface for both the standalone version and the server deployment
- EUROSTAT look & feel
- Light interface, simplified workflows
- ECAS account is needed
PS: Web-based access is not intended for confidential data.
EDIT running standalone
- Downloadable package;
- Standalone installation supported on Windows XP and Windows 7;
- Simple installation wizard;
- Full functionality;
- Standard authentication is required.
Client - server mode for EDIT
- EDIT runs on a UNIX machine;
- In the current setup, EDIT is installed at Eurostat and other DGs;
- Contains all registered domains (= user-specific workspaces), embedded by default;
- ECAS credentials are needed for external users.
EDAMIS integration
- EDAMIS allows transmitting data files through a single entry point;
- EDAMIS can send data to EDIT by placing the files in a configurable location;
- EDIT detects metadata based on the EDAMIS naming convention;
- EDIT performs the processing in unattended mode.
SDMX integration
- The Statistical Data and Metadata Exchange (SDMX) initiative is sponsored by seven institutions (the BIS, the ECB, Eurostat, the IMF, the OECD, the UN and the World Bank);
- SDMX describes and standardises the way statistical data and metadata are exchanged;
- EDIT can import SDMX-ML datasets.
C - Objects in EDIT for Users
- Datasets
- Programs, Jobs
1 - Datasets
- A dataset is a collection of data rows structured according to a format;
- It is a two-dimensional table composed of rows and columns:
  - Columns correspond to the fields defined in the format;
  - Records - no limit on size or number.
Dataset example – AES (Adult Education Survey)
Example: 'Format' – 'Dataset'
The same format – different datasets
2 - Programs, Jobs
- Program - a set of operations to be performed on a dataset defined by a specific format;
- No specific dataset is associated with a program; only formats (dataset definitions) are specified;
- Job - the association between a 'Program' and concrete 'Dataset(s) Instance(s)';
- Possible types of rules/checks: single and multiple column(s), vertical and hierarchical.
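The relationship between formats, datasets, programs and jobs described above can be sketched in Python. This is an illustrative model only and does not reflect how EDIT is implemented: all class, field and rule names (`Format`, `Dataset`, `Rule`, `Program`, `Job`, `execute`, `COUNTRY_FILLED`) and the sample rows are hypothetical. The point it shows is that a program refers only to a format, and a job binds that program to one concrete dataset.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict[str, str]


@dataclass(frozen=True)
class Format:
    name: str
    fields: tuple[str, ...]


@dataclass
class Dataset:
    name: str
    format: Format
    rows: list[Row]


@dataclass(frozen=True)
class Rule:
    name: str
    message: str
    check: Callable[[Row], bool]  # returns True when the row passes


@dataclass
class Program:
    name: str
    format: Format  # a program refers to a format, never to a concrete dataset
    rules: list[Rule]


@dataclass
class Job:
    program: Program
    dataset: Dataset

    def execute(self) -> list[tuple[str, int]]:
        """Return (rule name, row index) for every failed check."""
        if self.dataset.format != self.program.format:
            raise ValueError("dataset format does not match program format")
        failures = []
        for index, row in enumerate(self.dataset.rows):
            for rule in self.program.rules:
                if not rule.check(row):
                    failures.append((rule.name, index))
        return failures


# Invented sample data: one format, one program, two datasets, two jobs.
aes_format = Format("AES", ("COUNTRY", "SEX"))
program = Program(
    "AES checks",
    aes_format,
    [Rule("COUNTRY_FILLED", "COUNTRY must not be empty", lambda row: row["COUNTRY"] != "")],
)
dataset_a = Dataset("sample A", aes_format, [{"COUNTRY": "LT", "SEX": "TOTAL"}])
dataset_b = Dataset("sample B", aes_format, [{"COUNTRY": "", "SEX": "MALES"}])

failures_a = Job(program, dataset_a).execute()
failures_b = Job(program, dataset_b).execute()
print(failures_a, failures_b)
```

The same program object is reused for both jobs, which mirrors the idea that one program can be run on any dataset sharing its format.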
Job: error reports
It contains:
- View dataset - information about the job
- View statistics - summary of errors/statistics
- View detailed statistics report - downloadable Excel file containing the summary of errors
- Export - new dataset containing errors can be exported here
Error report
The error report is made up of the errors found in the imported dataset. Among other information, it contains:
- Rule name: the name of the program rule that failed;
- No of Failures: the number of individual rows in which the error appeared during job execution;
- Rule Message: the rule's error message as defined in the program.
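As a rough illustration of how the three pieces of information above relate to individual failures, the following Python sketch aggregates a list of failed checks into one summary line per rule. It is not EDIT's reporting code; the function name `summarise_failures`, the rule names and the sample failures are all hypothetical.

```python
from collections import Counter


def summarise_failures(failures, messages):
    """Build one summary entry per rule from (rule name, row index) pairs."""
    counts = Counter(rule_name for rule_name, _row_index in failures)
    return [
        {
            "Rule name": rule_name,
            "No of Failures": count,
            "Rule Message": messages[rule_name],
        }
        for rule_name, count in sorted(counts.items())
    ]


# Invented sample input: which rule failed on which row.
failures = [("SEX_CODE", 0), ("SEX_CODE", 3), ("YEAR_RANGE", 3)]
messages = {
    "SEX_CODE": "Value must match one of the codes",
    "YEAR_RANGE": "Year is outside the expected range",
}

report = summarise_failures(failures, messages)
for line in report:
    print(line)
```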
Error report – view dataset
Error report - View statistics
Error report – View detailed statistics report
D - EDIT Graphical User Interface
EDIT - Log in (through ECAS)
Web-based access – not intended for confidential data
EDIT Home page
- EDIT 2013 User Manual
- EDIT Concepts (Tabs)
- Your role in EDIT
Import dataset – file import
- Locate your dataset
- Name your dataset
- Dataset predefined information
Advanced configuration (part I)
- If it is empty, click here to configure
- Threshold (can be changed)
- Choose among GESMES / CSV / FLR / SDMX
- Fill in the information accordingly
- Properties can be saved (to be reused)
Advanced configuration (part II)
- Select Format
- Reuse saved selection
- Click here when you are ready to import
Importing a CSV file with a header
Header definition meaningless
- Fill in the parameters according to the specificities of the dataset
- The order of the variables in the dataset has to be exactly the same as in the selected fields
Header definition meaningful
- Fill in the parameters according to the specificities of the dataset
- The order of the variables in the dataset DOES NOT need to match that of the selected fields box
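The difference between the two header settings can be illustrated with a small Python sketch. This is an analogy built on Python's standard `csv` module, not a description of how EDIT reads files; the field list `SELECTED_FIELDS`, the function names and the sample file are hypothetical. With a meaningless header the columns are matched by position, so a file whose columns are in a different order is silently mis-mapped; with a meaningful header the columns are matched by name.

```python
import csv
import io

# Hypothetical list of fields selected for the import, in format order.
SELECTED_FIELDS = ["COUNTRY", "YEAR", "SEX"]


def read_positional(text, skip_header):
    """Header meaningless: map columns to the selected fields by position."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    if skip_header:
        next(reader, None)  # discard the header line without interpreting it
    return [dict(zip(SELECTED_FIELDS, row)) for row in reader]


def read_by_header(text):
    """Header meaningful: map columns to the selected fields by name."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [{name: row[name] for name in SELECTED_FIELDS} for row in reader]


# Sample file whose column order differs from SELECTED_FIELDS.
sample = "SEX;YEAR;COUNTRY\nTOTAL;2008;LT\n"

by_position = read_positional(sample, skip_header=True)
by_name = read_by_header(sample)
print(by_position)  # COUNTRY and SEX end up swapped
print(by_name)      # values land in the right fields
```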
Import - failed
- Import status: FAILED
- Click on the triangle to see information about rejected cases
Import - successful with warnings
- Import status: completed
- Click on the triangle to see information about rejected cases
Import - successful
- Import status: completed
- Click here for imported dataset
Imported dataset
- Hide columns from the displayed dataset
Search Dataset(s)
- Fill in the information as needed and click on search
- Export in CSV/FLR
- View dataset
- Delete dataset
Import / Export dataset(s)
- Search criteria
- View, delete and download datasets
Create a Job – (I) choose a Job
- Search criteria
- Available programs
- Launch, view (program) or export a specific job
Create a Job – (II) select parameters
- Choose the appropriate dataset
- Rename (or not) the error log
- Click Execute Job
Create a Job – (III) Job results
- Search criteria
- View, delete & copy job(s)
Job detailed information
- View Job details
- View error report / statistics / detailed statistics / export error report
- View / export specific dataset
View error report
- Information that can be hidden
- Displaying error cases in the form of an error dataset
Validation rule which failed in the previous cases
RECORD SEX {
  CONDITION in(SEX,"MALES","FEMALES","TOTAL");
  ERRMSG 'Value must match one of the codes' SEVERITY 'E' (SEX) ;
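The rule above reads as a membership check on the SEX field. For readers more familiar with general-purpose languages, a rough Python analogue might look like the sketch below. It is an interpretation of the rule text, not EDIT's rule engine: the function name `check_sex`, the structure of the returned error entries and the sample rows are hypothetical, and the severity code 'E' is simply carried over from the rule as written.

```python
ALLOWED_SEX = {"MALES", "FEMALES", "TOTAL"}


def check_sex(rows):
    """Return one error entry for every row whose SEX value is not an allowed code."""
    errors = []
    for row_number, row in enumerate(rows, start=1):
        value = row.get("SEX")
        if value not in ALLOWED_SEX:
            errors.append(
                {
                    "row": row_number,
                    "field": "SEX",
                    "value": value,
                    "severity": "E",
                    "message": "Value must match one of the codes",
                }
            )
    return errors


# Invented sample rows: the second and third violate the rule.
rows = [{"SEX": "MALES"}, {"SEX": "M"}, {"SEX": ""}, {"SEX": "TOTAL"}]
errors = check_sex(rows)
for error in errors:
    print(error)
```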
View Job statistics
View Job detailed statistics report – a downloadable file
Export error report (using CSV/FLR)
- Choose file type & fill in the required parameters accordingly
- Hit export
View program (available in Create Job)
View Job statistics
Search Job
- Fill in the appropriate parameters
- Get the corresponding Job list
Run validation flow – step I
- A validation flow may be available
Run validation flow – step II
- Locate dataset
- Name dataset
- Click start
Validation flow in progress
Run validation flow – step III
- Job Details, where error reports are accessible (same window as in View Job Details)
Useful links
- To EDIT page: http://ec.europa.eu/eurostat/edit
- To VIPv page: CIRCAbc -> Eurostat -> VIP Validation Project
- Generic data validation and editing service: mailto:ESTAT-VALIDATION@ec.europa.eu
- EDIT as web client: https://webgate.ec.europa.eu/eurostat/edit
- CIRCAbc for:
  - EHSIS: https://circabc.europa.eu/w/browse/0b5ab24d-68a0-419f-a6bd-e41eb84f33fb
  - BoP: https://circabc.europa.eu/w/browse/01940df9-91ec-407b-9ba4-0f5c47086e0c
  - BoP: https://circabc.europa.eu/w/browse/ef8b542b-35a8-401c-9dd4-37f61e49f34d
Thank you for your attention! Questions?