XIS: XML Input System
Statistics Denmark, 11 May 2004
What is XIS?
  - A generic system for validation, storage and publication of input data
Logic
  - Several sources for the same survey, stored in the same table structure
  - Relational database tables (Oracle/SQL) as the interface to production units
  - XML validation
  - XML transformation
  - Component based
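
The first point, several sources feeding one table structure, can be illustrated with a minimal sketch. It assumes a hypothetical two-column layout (`code`, `value`) and invented CSV and XML inputs; none of these names come from XIS itself. The idea shown is only that each channel gets its own small reader, and every reader returns rows in the same shape.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical common row layout; the real XIS table structure is not shown in the slides.
FIELDS = ("code", "value")


def rows_from_csv(text):
    """Read a CSV report (with a header line) into rows of the common layout."""
    reader = csv.DictReader(io.StringIO(text))
    return [tuple(row[field] for field in FIELDS) for row in reader]


def rows_from_xml(text):
    """Read an XML report with repeated <item> elements into the same layout."""
    root = ET.fromstring(text)
    return [tuple(item.findtext(field) for field in FIELDS)
            for item in root.iter("item")]


# Invented sample inputs carrying the same report through two channels.
csv_input = "code,value\nA1,100\nB2,250\n"
xml_input = (
    "<report>"
    "<item><code>A1</code><value>100</value></item>"
    "<item><code>B2</code><value>250</value></item>"
    "</report>"
)

csv_rows = rows_from_csv(csv_input)
xml_rows = rows_from_xml(xml_input)
```

Because both readers return identical rows, whatever loads them into the database does not need to know which channel the report arrived through.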
Electronic Input Sources
  - Web questionnaires
  - EDI (file transfer) by email
  - Key telephone
  - OCR from paper scanning
  - FTP
  - Diskettes, tape or CD from administrative registers
Questionnaires
  - Approximately 150 different questionnaires
  - Approximately 70 are for private enterprises
  - Annual, semi-annual, quarterly or monthly reporting
  - The large majority are simple questionnaires without routing
  - A few have complex routing and complex validation
Quantities
  - Total number of reports: approx. 450,000 per year
  - Approx. 350,000 reports from private enterprises
  - Intrastat: approx. 22,000 each month = approx. 150,000 per year
Architecture
  [Diagram; labels: Private enterprise, Virk.dk, Email server, Scanner, XIS, PU]
System Architecture
  [Diagram; labels: Adm. System, Key Telephone, Virk.dk, Email, Blaise, OCR Scanning, Diskettes/Tape/CD, XML, XML/CSV, V, T, INPUT DB, Control Message or Log]
Design principles
  - Flexibility: needs are changing
  - Changeability: questionnaires change all the time
  - Clear and simple interfaces: simple integration
  - Components and standards: evolution step by step
  - Stability and correctness: it is production
  - Confidentiality: bureau of statistics
  - Automation: resources
  - Transparency: user control
Overview
  [Diagram; labels: Respondent, ADM. DB, XML SYSTEM, INPUT SYSTEM, INPUT DB, INHOUSE DB, Data Editing, Web Service, PRF DB]
4 Database Model
  [Diagram; labels: Input Metadata, TIMES, Macro Metadata, INPUT DB, STAT. REG., SUM DB, STAT BANK]
Input Database Architecture
  [Diagram; labels: Metadata D261210, Count 1 X010101, Count 2 X020202, Count 3 X030303, Count 4 X030322]
Tracking
  Administrative metadata / envelope data:
  - Form, e.g. Intrastat (130501)
  - Period, e.g. 2004M3
  - Respondent: the legally obligated party
  - Reporter: the supplier of the information
  - Date, e.g. 2004-03-18 14:32:10
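
The envelope fields listed above could be held in a small record type. This is a sketch only: the class name `Envelope`, the field names, the placeholder identifiers and the period pattern (four-digit year, one frequency letter, a number) are assumptions generalised from the single example `2004M3`, not the XIS definition.

```python
from dataclasses import dataclass
from datetime import datetime
import re


@dataclass(frozen=True)
class Envelope:
    """Hypothetical container for the administrative metadata of one report."""
    form: str          # form code, e.g. "130501" for Intrastat
    period: str        # reporting period, e.g. "2004M3"
    respondent: str    # the legally obligated party
    reporter: str      # the supplier of the information
    received: datetime


def parse_period(period):
    """Split a period such as '2004M3' into (year, frequency letter, number).

    The pattern is an assumption based on the one example in the slides.
    """
    match = re.fullmatch(r"(\d{4})([A-Z])(\d{1,2})", period)
    if match is None:
        raise ValueError(f"unrecognised period: {period!r}")
    return int(match.group(1)), match.group(2), int(match.group(3))


envelope = Envelope(
    form="130501",
    period="2004M3",
    respondent="RESPONDENT-ID",  # placeholder, not a real identifier
    reporter="REPORTER-ID",      # placeholder, not a real identifier
    received=datetime.strptime("2004-03-18 14:32:10", "%Y-%m-%d %H:%M:%S"),
)
period_parts = parse_period(envelope.period)
```

Keeping respondent and reporter as separate fields mirrors the distinction on the slide: the party obliged to report is not necessarily the one who delivers the data.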
Prefill
  - Central business register number
  - Unit of reporting
  - Period, deadline, status etc.
  - Fields in form
  - Questions
  - Description of errors
  - Notifications by email
Communication
  - Publishing
  - Reporting
  - Error reporting
  - Re-reporting
  - Etc.
Technicalities
  - Oracle Database 9i, release 9.2
  - Software AG XML Mediator
  - Generic database creator based on XSD:
      - Nesting -> new table
      - Repeating field -> new table
      - Unique tag names -> unique table names
  - Generic XML loader
  - Generic XML creator based on SQL views
  - Cryptomathic, S/MIME, digital signatures, X.509
  - POP3 and SMTP
  - Secure FTP
  - Web services, SOAP
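
The three mapping rules of the generic database creator can be sketched in a few lines. This is not the XIS implementation: the schema below is invented, the function name `derive_tables` is hypothetical, and the sketch only handles inline `xs:complexType`/`xs:sequence` definitions. Key columns linking child tables to their parent are left out for brevity.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema, invented for illustration only.
SAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Report">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Period" type="xs:string"/>
        <xs:element name="Item" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Code" type="xs:string"/>
              <xs:element name="Value" type="xs:decimal"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="Remark" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def derive_tables(xsd_text):
    """Return {table_name: [column, ...]} derived from an XSD string."""
    root = ET.fromstring(xsd_text)
    tables = {}

    def new_table(name):
        # Unique tag names -> unique table names: a duplicate is rejected.
        if name in tables:
            raise ValueError(f"tag name {name!r} is not unique")
        tables[name] = []
        return tables[name]

    def visit(element, parent_columns):
        name = element.get("name")
        complex_type = element.find(f"{XS}complexType")
        repeating = element.get("maxOccurs", "1") != "1"
        if complex_type is not None:
            # Nesting -> new table holding the nested element's simple fields.
            columns = new_table(name)
            for child in complex_type.findall(f"{XS}sequence/{XS}element"):
                visit(child, columns)
        elif repeating:
            # Repeating simple field -> new table with a single value column.
            new_table(name).append(name)
        else:
            # Plain simple field -> column of the enclosing table.
            parent_columns.append(name)

    for top in root.findall(f"{XS}element"):
        visit(top, [])
    return tables


tables = derive_tables(SAMPLE_XSD)
```

For the sample schema this yields three tables: `Report` with the column `Period`, `Item` with `Code` and `Value`, and `Remark` with a single value column. Requiring unique tag names keeps the mapping from element to table unambiguous, which is presumably why the slide lists it as a rule.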
Status
  - Receiving data since June 2003
  - Prefilling since April 2004
Plans
  - Forms administration
  - Metadata
  - Statistics
  - Data from public administrative registers
Thanks