Published by Augustine Skinner. Modified over 9 years ago.
1
Developing Statistical Information Systems and XML Information Technologies: Possibilities and Practicable Solutions
Heikki Rouhuvirta, Statistical Methodology R&D (heikki.rouhuvirta@stat.fi)
Geneva, 8-10 May 2007
2
01.04.2007 Heikki Rouhuvirta
Approaches to Statistics Production: Sources to statistics – Data Processing; Sources to statistics – Statistical Methodology; Statistics as Information
3
IT in Statistics Production (flow diagram; labels only)
Sources: registers; inquiries; other statistical data
Processing: dirty data; compilation / combining of data; logical verifications; imputation etc.; quality control and approval of data for the purpose of statistics compilation; statistical data; processing into statistical concepts; protection of unit-level data; further processing
Outputs: analyses; reporting; release
Other label: Datum
4
Methodological processing of statistical data in statistics production
5
Statistical Information
6
Challenge: create solutions that unite the foregoing points of view
- the solutions offer the services that statistics production needs
- the solutions are easily recognisable to the user and offer an adequate informative basis for each individual task
- with the solutions, the whole set of tasks is manageable for the statistician
Key to the solution: exploitation of XML technology
7
XML Specification for Statistical Information: Common Structure of Statistical Information (CoSSI); Basics of XML
8
… the result from a statistics standpoint …
9
Statistics Production and Statistical Information (diagram; labels only)
Stages of processing: 0. Defining; 1. Collecting; 2. Editing; 3. Producing public statistics; 4. Using
Formats: basic format (data matrix and description); condensed format (table and description); descriptions in different documents
Models: matrix model including statmeta; table model including statmeta; statistical metadata model
Model of data organisation: matrix module; table module; statmeta module
Other labels: condensing; interpreting
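The idea of keeping a data matrix and its statistical metadata as separate modules of one document can be illustrated with a minimal sketch. The element and attribute names used here (`statdata`, `statmeta`, `matrix`, `variable`, `row`, `cell`, `conceptDefinition`) are hypothetical placeholders and are not taken from the CoSSI specification; the variables and values are invented for illustration.

```python
import xml.etree.ElementTree as ET


def build_document(variables, rows):
    """Build one XML document holding a data matrix and its metadata.

    All element names are hypothetical placeholders, not CoSSI names.
    """
    root = ET.Element("statdata")

    # Statmeta module: one description per variable.
    statmeta = ET.SubElement(root, "statmeta")
    for code, definition in variables.items():
        var = ET.SubElement(statmeta, "variable", code=code)
        ET.SubElement(var, "conceptDefinition").text = definition

    # Matrix module: one <row> per unit, one <cell> per variable.
    matrix = ET.SubElement(root, "matrix")
    for values in rows:
        row = ET.SubElement(matrix, "row")
        for code, value in zip(variables, values):
            ET.SubElement(row, "cell", var=code).text = str(value)
    return root


# Invented example data.
variables = {"age": "Age in full years", "income": "Annual income in euros"}
rows = [(34, 28000), (51, 41500)]
doc = build_document(variables, rows)
xml_text = ET.tostring(doc, encoding="unicode")
```

Because each cell refers to its variable by code, an application reading the matrix can look up the matching description in the metadata module without any application-specific structure in the document itself.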
10
… case studies of XML in statistics production …
11
XML Database and Statistical Information
12
Retrieval of Statistical Metadata for a Variable - Simple User Interface
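Retrieving the metadata of a single variable from an XML document could look roughly like the sketch below. It does not reproduce the user interface or the XML database shown in the presentation; the sample document, its tag names and the `metadata_for` helper are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; tag names are not taken from CoSSI.
SAMPLE = """
<statmeta>
  <variable code="income">
    <name>Income</name>
    <conceptDefinition>Annual income in euros</conceptDefinition>
    <measurementUnit>EUR</measurementUnit>
  </variable>
  <variable code="age">
    <name>Age</name>
    <conceptDefinition>Age in full years</conceptDefinition>
    <measurementUnit>years</measurementUnit>
  </variable>
</statmeta>
"""


def metadata_for(root, code):
    """Return the metadata fields of one variable as a dict, or None."""
    for var in root.iter("variable"):
        if var.get("code") == code:
            return {child.tag: (child.text or "").strip() for child in var}
    return None


root = ET.fromstring(SAMPLE.strip())
info = metadata_for(root, "age")
```

Here `info` maps each metadata field of the requested variable to its text, and a code that does not occur in the document yields `None`.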
13
Browsing the Documents in the XML Database
14
Saving Documents to XML Database
15
Event log of XML Database
/db/logs/contents.xml...
STORE /db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4.xml
STORE /db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4_001.gif
STORE /db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4_002.gif
STORE /db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4_002.png
STORE /db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4_eq_00.gif
UPDATE /db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu1.xml
Collection and resource listing:
/db
/system admin dba
/config admin dba
users.xml admin dba rwurwu---
/Tilastot admin dba
/logs admin dba
contents.xml admin dba rwurwur--
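Entries of the kind shown in this event log can be summarised with a few lines of code. The sketch below assumes one entry per line in the form `OPERATION path`, which is inferred from the slide and not from any documentation of the database; the three sample lines are copied from the log above, and `parse_events` is a hypothetical helper.

```python
from collections import Counter

# Sample entries copied from the event log shown on the slide.
LOG = """\
STORE /db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4.xml
STORE /db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4_001.gif
UPDATE /db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu1.xml
"""


def parse_events(text):
    """Yield (operation, path) pairs; lines in any other shape are skipped."""
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2 and parts[0] in {"STORE", "UPDATE"}:
            yield parts[0], parts[1]


events = list(parse_events(LOG))
counts = Counter(op for op, _ in events)
xml_paths = [path for _, path in events if path.endswith(".xml")]
```

`counts` gives the number of entries per operation, and `xml_paths` keeps only the XML documents, leaving out the stored image files.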
16
Tabulation Application Architecture in SAS
17
Tabulation Wizard User Interface in SAS EG
18
SAS Data Editing Process
19
Statistical data: Logical schema of an XML file
20
Archiving and Backing Up to XML
21
Example of XQuery/SQL
22
Content of XML file
23
Production and Dissemination of Tables in Publishing Process
24
XML Publication Editor - User Interface
25
Retrieval of Statistical Information
26
… and statistical information in tables
27
Table 1. Statistical metadata in an informative statistical table (I)
Schematic table (shown twice on the slide): column headings Variable 1, Variable 2, Variable 3; row labels Class value 1, Class value 2; cells Statistical figure 1 to Statistical figure 8.
Metadata attached to the parts of the table:
- Statistical metadata: title, subtitle, footnote, metadata reference (quality declaration)
- Document metadata elements: subject, keywords, content description, date, identifier
- Statistical metadata elements: name, specification, concept definition, concept definition description, operational definition, operational definition description, calculation name, calculation formula, calculation description, measurement unit, measurement description
- Statistical metadata elements: code, name, description
- Document metadata elements: classification id, type, author, date
- Statistical metadata elements: note
- Register metadata elements: name, concept definition, formation instruction, law, interpretation of law, law cases, etc.
28
Table 1. Statistical metadata in an informative statistical table (II)
Schematic table (shown twice on the slide): column headings Variable 1, Variable 2, Variable 3; row labels Class value 1, Class value 2; cells Statistical figure 1 to Statistical figure 8.
Quality declaration
Quality indicators: Coefficient of Variation, Value = 0.92
Quality indicators: Coefficient of Variation, Value = 0.87
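The slides do not say how the coefficient of variation attached to the table is calculated. A minimal sketch under the common definition (sample standard deviation divided by the mean) is given below; the function name and the input values are invented for illustration and do not reproduce the values 0.92 and 0.87 shown on the slide.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean


estimates = [10.0, 12.0, 14.0]  # invented values, not from the presentation
cv = coefficient_of_variation(estimates)
```

In survey practice the indicator is often computed as the standard error of an estimate divided by the estimate itself, so the actual formula behind a published quality declaration may differ from this sketch.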
29
Table 1. Statistical metadata in an informative statistical table (III)
Schematic table (shown twice on the slide): column headings Variable 1, Variable 2, Variable 3; row labels Class value 1, Class value 2; cells Statistical figure 1 to Statistical figure 8.
Quality declaration
Quality indicators: Coefficient of Variation, Value = 0.92
Quality indicators: Coefficient of Variation, Value = 0.87
30
Conclusions: XML-Based Service Environment in Statistics Production

The statistics production solution briefly described above indicates the kinds of services that a statistical information system could offer in future, both to statisticians and to users of statistical data. Its foundation is an XML-based information architecture and standard applications that exploit it.

Basing the information architecture on XML allows the use of standard and standard-like specifications, but the special characteristics of statistical information must be taken into account when applying and implementing them. If, for instance, the possibilities of a semantic structural specification are not exploited in the structural analysis and the final structure of statistical data, the solutions become complicated from the point of view of information management and ineffective in practice.

From the perspective of application development, it seems especially important that the information architecture itself contains no application-specific data specifications, because a single monolithic application serving both statistics production and information service provision is unlikely. A semantically relevant structure helps the statistician and the user of statistics to check the correctness of the contents.
31
Thank you for your attention!