Interactive session: Metadata
Maia Ennok, Head of Data Warehouse Service, Statistics Estonia

Schema
Maia Ennok ESSNet Data Warehouse 05/24/12

GSBPM phases and sub-processes:
1 Specify Needs: Establish output objectives; Identify concepts; Check data availability; Prepare business case
2 Design: Design outputs; Design variable descriptions; Design data collection methodology; Design frame and sample methodology; Design statistical processing methodology; Design production systems and workflow
3 Build: Build data collection instrument; Build or enhance process components; 3.3 Configure workflows; Test production systems; Test statistical business process; 3.6 Finalize production systems
4 Collect: Select sample; Set up collection; Run collection; Finalize collection
5 Process: Integrate data; Classify and code; Review, validate and edit; Impute; Derive new variables and statistical units; Calculate weights; Calculate aggregates; Finalize data files
6 Analyse: Prepare draft outputs; Validate outputs; Scrutinize and explain; Apply disclosure control; Finalize outputs
7 Disseminate: Update output systems; Produce dissemination products; Manage release of dissemination products; Promote dissemination products; Manage user support

SDWH layers: Access Layer; Interpretation and Analysis Layer; Integration Layer; Source Layer; off SDWH; Extra Layer

Task
Put metadata subsets onto the schema (write examples).
Same groups as in the previous interactive session: SBS, STS, SBR, ET.
GSBPM phases.
SDWH layers; an Extra Layer, with a description, if we missed a layer; off SDWH.
Metadata subsets in different colors (Statistical, Process, Technical, Quality, Authorisation); an extra metadata subset, with a description, if we missed a subset.
Presentations with metadata subsets, examples and answers to the following questions.
Questions:
What is, in your opinion, the key element of the S-DWH? VARIABLE vs. DATASET
What is the absolute minimum set of metadata that must be defined for that element?
What should be the main function of the S-DWH (process support/driver, output/dissemination)?
What is the function of the metadata layer?

Generic Statistical Business Process Model (GSBPM)
SDWH Layers
I. The source layer is the level at which we locate all activities related to storing and managing internal (survey) or external (archive) raw data sources.
II. The integration layer performs the typical Extraction, Transformation and Loading (ETL) functions, which must be carried out automatically or semi-automatically.
III. The interpretation and data analysis layer is specialized for interactive, unstructured activities.
IV. The access layer is addressed to a wide range of users or software tools for the final presentation of the information sought.

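The four layers listed above can be written down as an ordered enumeration, which may help when noting where an example from the group work belongs. This is only a minimal sketch: the names `SdwhLayer`, `LAYER_ROLE` and `layers_between` are hypothetical, and the bottom-up ordering (source first, access last) is an assumption taken from the I to IV numbering on the slide.

```python
from enum import IntEnum


class SdwhLayer(IntEnum):
    """The four S-DWH layers, numbered I to IV as on the slide."""
    SOURCE = 1
    INTEGRATION = 2
    INTERPRETATION_AND_ANALYSIS = 3
    ACCESS = 4


# Short role descriptions, paraphrased from the slide text.
LAYER_ROLE = {
    SdwhLayer.SOURCE: "store and manage internal (survey) and external (archive) raw data",
    SdwhLayer.INTEGRATION: "automatic or semi-automatic extraction, transformation and loading",
    SdwhLayer.INTERPRETATION_AND_ANALYSIS: "interactive, unstructured analysis activities",
    SdwhLayer.ACCESS: "final presentation of information to users and tools",
}


def layers_between(start: SdwhLayer, end: SdwhLayer) -> list:
    """Return the layers from start to end inclusive, in slide order (assumed flow)."""
    return [layer for layer in SdwhLayer if start <= layer <= end]


if __name__ == "__main__":
    for layer in layers_between(SdwhLayer.SOURCE, SdwhLayer.ACCESS):
        print(layer.value, layer.name, "-", LAYER_ROLE[layer])
```

The "off SDWH" and Extra Layer boxes from the schema are left out on purpose, since the slide gives no definition for them.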
Metadata subsets
Statistical metadata are data about statistical data. This definition covers all kinds of documentation that refer to any type of statistical data, and it applies to metadata about data stored in an S-DWH as well as in any other type of data store. Examples: variable definition; register description; code list.
Process metadata are metadata that describe the expected or actual outcome of one or more processes using evaluable and operational metrics. Examples: operator's manual (active, structured, reference); parameter list (active, structured, reference); log file (passive, structured, reference/structural).
Technical metadata are metadata that describe or define the physical storage or location of data. Examples: server, database, table and column names and/or identifiers; server, directory and file names and/or identifiers.
Quality metadata are any kind of metadata that contribute to the description or interpretation of the quality of data. Examples: quality declarations for a survey or register (passive, free-form, reference); documentation of methods that were used during a survey (passive, free-form, reference); most log lists (passive, structured, reference/structural).
Authorisation metadata are administrative data that are used by programmes, systems or subsystems to manage users' access to data. Examples: user lists with privileges; cross references between resources and users.

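As an illustration of the five subsets defined above, the slide's own examples can be recorded in a small catalogue and filtered by subset. This is a sketch under stated assumptions, not a prescribed S-DWH metadata model: `MetadataSubset`, `MetadataExample`, `EXAMPLES` and `by_subset` are hypothetical names, and the `traits` field simply stores the bracketed annotations (active/passive, structured/free-form, reference/structural) where the slide gives them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class MetadataSubset(Enum):
    STATISTICAL = "statistical"
    PROCESS = "process"
    TECHNICAL = "technical"
    QUALITY = "quality"
    AUTHORISATION = "authorisation"


@dataclass(frozen=True)
class MetadataExample:
    name: str
    subset: MetadataSubset
    traits: Tuple[str, ...] = ()  # annotations from the slide, empty if none given


S, P, T, Q, A = MetadataSubset  # short aliases for the catalogue below

# Examples copied from the slide; nothing here is new content.
EXAMPLES = [
    MetadataExample("variable definition", S),
    MetadataExample("register description", S),
    MetadataExample("code list", S),
    MetadataExample("operator's manual", P, ("active", "structured", "reference")),
    MetadataExample("parameter list", P, ("active", "structured", "reference")),
    MetadataExample("log file", P, ("passive", "structured", "reference/structural")),
    MetadataExample("server, database, table and column names and/or identifiers", T),
    MetadataExample("server, directory and file names and/or identifiers", T),
    MetadataExample("quality declaration for a survey or register", Q,
                    ("passive", "free-form", "reference")),
    MetadataExample("documentation of methods used during a survey", Q,
                    ("passive", "free-form", "reference")),
    MetadataExample("log list", Q, ("passive", "structured", "reference/structural")),
    MetadataExample("user list with privileges", A),
    MetadataExample("cross reference between resources and users", A),
]


def by_subset(subset: MetadataSubset) -> List[MetadataExample]:
    """Return all catalogued examples that belong to the given subset."""
    return [example for example in EXAMPLES if example.subset is subset]


if __name__ == "__main__":
    for subset in MetadataSubset:
        print(subset.value, "->", [example.name for example in by_subset(subset)])
```

Groups could extend such a catalogue with their own examples for SBS, STS, SBR or ET before placing them on the schema.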
Mapping the BPM notation onto an SDWH layered architecture
Schema with metadata subsets
Metadata subsets: Statistical, Process, Technical, Quality, Authorisation