CountryData SDMX for Development Indicators MDG DSD vs. the Di Database: Using the Mapping Tool
MDG Data Structure Definition (DSD) background: Developed by the SDMX Task Team of the IAEG on MDGs. Supports exchange of MDG indicator data between international agencies (UN, UNICEF, UNESCO, …). Implemented in SDMX 2.0. Latest version (2.4) finalised in Feb 2013.
DevInfo (Di) background: Data dissemination software supported and promoted by UNICEF. DevInfo7 (Di7) launched in Nov 2012. SDMX 2.1 and 2.0 compliant. Web-based software. 9 out of 11 project countries use DevInfo. Stable version compared to previous releases.
Simple relation between Di & DSD
  Di Database: Area; Indicator; Unit; Subgroup (e.g. Sex, Age, Location); Source; Time Period; Footnotes
  MDG DSD: Frequency (Default = “Annual”); Reference Area; Series; Units of measurement; Unit multiplier (Default = 0); Location (Default = “Total”); Age group (Default = “All Ages”); Sex (Default = “Both Sexes”); Source Type (Default = “NA”); Source details; Time Period; Time period details; Nature of data points (Default = “C”); Footnotes
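The relation above can be sketched in Python as a record conversion that fills the DSD-only dimensions with the defaults listed on this slide. This is only an illustration: the key names (`frequency`, `reference_area`, and so on), the pairing of Di fields with DSD fields, and the `to_dsd` function are assumptions, not DSD concept identifiers or part of the Di software. The example record is invented.

```python
# Defaults taken from the slide; the dictionary keys are illustrative names only.
DSD_DEFAULTS = {
    "frequency": "Annual",
    "unit_multiplier": 0,
    "location": "Total",
    "age_group": "All Ages",
    "sex": "Both Sexes",
    "source_type": "NA",
    "nature": "C",
}


def to_dsd(di_record):
    """Convert one Di data value (a dict) into a DSD-style record."""
    subgroup = di_record.get("subgroup", {})
    dsd = dict(DSD_DEFAULTS)
    # Assumed pairing of Di fields with DSD fields.
    dsd.update({
        "reference_area": di_record["area"],
        "series": di_record["indicator"],
        "unit": di_record["unit"],
        "source_details": di_record["source"],
        "time_period": di_record["time_period"],
        "footnotes": di_record.get("footnote", ""),
    })
    # Sub-dimension values present in the Di subgroup override the defaults.
    for di_key, dsd_key in (("Sex", "sex"), ("Age", "age_group"), ("Location", "location")):
        if di_key in subgroup:
            dsd[dsd_key] = subgroup[di_key]
    return dsd


# Invented example record: only Sex is given in the subgroup.
example = to_dsd({
    "area": "Country A",
    "indicator": "Infant Mortality",
    "unit": "Number",
    "subgroup": {"Sex": "Female"},
    "source": "Survey X",
    "time_period": "2010",
})
```

Here `example["sex"]` comes from the Di subgroup, while `example["location"]` and `example["age_group"]` fall back to the defaults. The sketch works with labels; the tool itself maps to codelist codes.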
Mapping to the DSD: The DSD's dimensional structure means values are mandatory for LOCATION, SEX and AGE GROUP. Due to the nature of this domain (i.e. MDGs), it is not always obvious which values should be used in these dimensions. For example, what is SEX for “Births attended by skilled personnel”: Not Applicable? Total? Female?
Mapping to the DSD: Inconsistent mappings lead to duplications and other anomalies. In CountryData, mappings for indicators/time series are agreed before data exchange (see mapping for MDGs from 1st workshop). However, this is just one side of the story…
Mapping to the DSD: Understanding the structure and contents of the origin database is fundamental to the mapping process. Mapping to the DSD imposes certain ‘restrictions’ on the data that do not apply in the database (and vice versa).
Mapping to the DSD: The mapping tool in the Di software is designed to work with the Di database as simply as possible: the tool is based on mapping between the codelists of the DSD and the origin database; certain situations require some further manual effort to map a time series; and sometimes a “fix” is required to the database where the data simply isn’t valid or is duplicated. It is therefore good to review the Di structure to understand where these issues usually occur.
Di Data Architecture
  Area: hierarchical dimension
  IUS = Indicator, Unit and Subgroup: time series data are stored with the combination of these 3 dimensions (Indicator; Unit; Subgroup, itself a combination of one or more sub-dimensions)
  Source & Time Period: together with IUS, “uniquely” define each data value
  Footnote: “free text” field stored with the data value
Di INDICATOR (IUS: Indicator Unit Subgroup): Indicator, for example: Infant Mortality, AIDS Death, Malaria Death. Similar to SERIES in the DSD. Contains only Indicator-specific values.
Di UNIT (IUS: Indicator Unit Subgroup): Unit, for example: Percentage, Number, USD, Square KM. Similar to UNIT of Measurement in the DSD. Contains only Unit-specific values.
Di SUBGROUP (IUS: Indicator Unit Subgroup): The Subgroup dimension is a combination of one or more sub-dimensions. The “Age”, “Sex”, “Location” and “Other” sub-dimensions are set initially in the database. Specific values can be created under each sub-dimension. These relate to SEX, AGE GROUP and LOCATION in the DSD.
Di SUBGROUP (IUS: Indicator Unit Subgroup): Formation Logic
  Sub-Dimension | Sub-Dimension values
  Age           | < 1 Year, < 5 Year, 5 – 10 Year
  Sex           | Male, Female
  Location      | Urban, Rural, Total
  Other         | Rice, Wheat
  SUBGROUP (Combination): <1 Year Male <5 Year Female Rural Urban
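The formation logic can be sketched as joining at most one value per sub-dimension into a single subgroup label. This is a minimal illustration, assuming the label is built in the fixed order Age, Sex, Location, Other; the function name `subgroup_label` is hypothetical and not part of DevInfo.

```python
# Sub-dimensions in the order assumed for building the label.
SUB_DIMENSION_ORDER = ("Age", "Sex", "Location", "Other")


def subgroup_label(values):
    """Join the chosen sub-dimension values into one subgroup label."""
    unknown = set(values) - set(SUB_DIMENSION_ORDER)
    if unknown:
        raise ValueError(f"Unknown sub-dimension(s): {sorted(unknown)}")
    # Keep only the sub-dimensions that were supplied, in the assumed order.
    parts = [values[name] for name in SUB_DIMENSION_ORDER if name in values]
    return " ".join(parts)


label = subgroup_label({"Sex": "Male", "Age": "<1 Year"})
```

A subgroup that uses a single sub-dimension, such as `{"Location": "Urban"}`, simply yields that value as its label.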
Di SOURCE
Di TIME PERIOD
Di Mapping Tool: Introduction. Once data exist in the Di7 web-based software, they can be mapped and published in a form that conforms to the MDG DSD. This is all done online in the Di7 web-based repository, through the administration profile, so let’s begin…
Getting Started…
Scroll down to ‘Registry’ menu
Log onto administrative profile
Full access to ‘Registry’ features
Prepare the database for mapping
Prepares the SDMX artefacts
Ready to ‘Upload’ the DSD
Choose a DSD from your folders
DSD Upload is a success…
Now you are ready to map…
1st Step: Codelist mapping
1st Step: DSD Codelists
  SEX CodeList
    NA                     Not applicable
    F                      Female
    M                      Male
    T                      Both sexes
  UNIT CodeList
    NA                     Not applicable
    CUR_LCU                Local currency
    USD
    NUMBER                 Number
    RATIO                  Ratio
    PERCENT                Percent
    KM2                    Square kilometers
    T                      Metric Tons
    PER_100_LIVE_BIRTHS    Per 100 live births
    PER_100_POP            Per 100 population
    PER_1000_LIVE_BIRTHS   Per 1,000 live births
    PER_1000_POP           Per 1,000 population
    PER_100000_LIVE_BIRTHS Per 100,000 live births
    PER_100000_POP         Per 100,000 population
  AGE CodeList
    NA                     Not applicable
    000_099_Y              All age ranges
    000_006_M              under 6 month olds
    000_005_Y              under 5 year olds
    000_001_Y              under 1 year olds
    000_018_Y              under 18 year olds
    000_006_Y              under 6 year olds
    010_005_Y              10-14 year olds
    015_005_Y              15-19 year olds
    015_010_Y              15-24 year olds
    015_035_Y              15-49 year olds
    006_054_M              6-59 months old
    006_009_Y              6-14 year olds
    005_013_Y              5-17 year olds
    015_050_Y              15-64 year olds
  Location CodeList
    T                      Total (national level)
    U                      Urban
    R                      Rural
  Indicator CodeList
    SH_HIV_INCD            HIV incidence rate
    SH_MLR_MORT            Notified cases of malaria
    SE_ADT_1524            Literacy rate
    SE_PRM_CMPL            Primary completion rate
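Codelist mapping pairs each value in the Di database with a code from the matching DSD codelist. The sketch below illustrates the idea for SEX, using the DSD codes listed above. The Di-side values and the `map_codelist` helper are assumptions for illustration; this is not how the Di7 tool is implemented.

```python
# Hypothetical Di sex values mapped to codes from the DSD SEX codelist.
SEX_MAPPING = {"Male": "M", "Female": "F"}


def map_codelist(di_values, mapping):
    """Split Di values into mapped codes and values still needing attention."""
    mapped = {value: mapping[value] for value in di_values if value in mapping}
    unmapped = [value for value in di_values if value not in mapping]
    return mapped, unmapped


# "Unknown" stands in for a Di value that has no agreed DSD code yet.
mapped, unmapped = map_codelist(["Male", "Female", "Unknown"], SEX_MAPPING)
```

Values that end up in `unmapped` are the ones that would need a decision (or a default) before the time series can be published.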
1st Step: (A) Map Indicator codes
1st Step: (B) Map Unit codes
1st Step: (C) Map Subgroup codes
1st Step: (C) Choose Subgroup list
1st Step: (C) Map Age subgroup
1st Step: (C) Map Sex & Location
1st Step: (D) Map Area
1st Step: Save codelist mappings
1st Step: Ignore warning
1st Step: Confirm mapping saved
Exercise 1: Codelist mapping. Use unstats.un.org/unsd/demodiweb[1-6] Username = Password = Map the codelists (where possible) for Unit, Age, Sex, Location and Area, and for just one indicator, “Antenatal care coverage for at least one visit”.
1st Step: Complete
2nd Step: Confirm IUS mapping
2nd Step: Save IUS Mappings
Exercise 2: Mapping time series. Use unstats.un.org/unsd/demodiweb[1-6] Username = Password = Map the time series for:
  1. “Antenatal care coverage for at least four visits”
  2. “Employment to population ratio”
  3. “Literacy rate of year-olds”
  4. “Death rate associated with malaria”
  5. “Proportion of population using solid fuels”
2nd Step: Complete
Final Step: Register the mappings
Final Step: Select mappings
Final Step: Generate SDMX-ML
Final Step: Complete
Exercise 3: Publish time series. Use unstats.un.org/unsd/demodiweb[1-6] Username = Password = Publish/register the time series for:
  1. “Antenatal care coverage for at least four visits”
  2. “Employment to population ratio”
  3. “Literacy rate of year-olds”
  4. “Death rate associated with malaria”
  5. “Proportion of population using solid fuels”
Why the 2nd step? The default values for SEX, LOCATION or AGE GROUP may not be applicable to all mappings. The codelist mapping may provide only a partial mapping of the time series (i.e. more information is required). These changes are made in the 2nd step.
Where are the default values?
Admin panel: Application settings. [Insert screenshot/details of admin panel and default value storage…]
Application settings has all mapping default values
Manual mapping of SUBGROUP: Where a subgroup value is missing, the default values will apply, for example… [Columns: Indicator, Unit, ?] Default values: Location = T, …
Manual mapping of SUBGROUP: [Columns: Indicator, Unit, Subgroup for Age and Sex?] Default values: …, Age Group = 000_099_Y, Sex = T. So subgroup coverage affects the number of manual changes that have to be made…
Manual mapping of SUBGROUP: [Columns: Indicator, Unit, Subgroups?] Default values: Location = T, Age Group = 000_099_Y, Sex = T (Both sexes). A common example of where the default subgroup mappings do not apply.
Manual mapping of SUBGROUP: [Columns: Indicator, Unit, Subgroup for Sex?] Default values: …, …, Sex = T. A common example of where the default subgroup mappings do not apply.
Manual mapping of SUBGROUP: [Columns: Indicator, Unit, Subgroup for Location, Age and Sex?] Default values: Location = T, Age Group = 000_099_Y, Sex = T.
Manual mapping of SUBGROUP: If the subgroups are sorted more simply, this also helps with the mapping. [Columns: Indicator, Unit, ?] Default values: Location = T, …
Manual mapping of SUBGROUP: A common example of where the default subgroup mappings do not apply. [Columns: Indicator, Unit, Subgroup for Location, Age and Sex?] Default values: Location = T, Age Group = 000_099_Y, Sex = T.
Back to mapping…
2nd Step: Amend Indicator. Ticking the check box next to a mapping “fixes” the mapped DSD values. If the box is unchecked again and the mappings are saved, the DSD values revert to those from the codelist mapping/default values (i.e. any manual changes are undone).
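The check-box behaviour described above can be pictured as a simple precedence rule. The sketch below is an assumption-based illustration, not the tool's code: `effective_mapping` and its arguments are hypothetical names, and the dimension keys are chosen for readability.

```python
def effective_mapping(codelist_values, manual_values, fixed):
    """Return the DSD values that apply to one time series."""
    if fixed:
        # Box ticked: manual values override the codelist/default values.
        return {**codelist_values, **manual_values}
    # Box unticked: manual changes are dropped on save.
    return dict(codelist_values)


# Values from the codelist mapping/defaults, and one manual change to SEX.
defaults = {"sex": "T", "age_group": "000_099_Y", "location": "T"}
manual = {"sex": "F"}

ticked = effective_mapping(defaults, manual, fixed=True)
unticked = effective_mapping(defaults, manual, fixed=False)
```

With the box ticked, SEX keeps the manual value `F`; with it unticked, SEX reverts to the default `T`, and the other dimensions are unchanged in both cases.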
Final Step: Register new mappings
Exercise 4: Amend time series. Use unstats.un.org/unsd/demodiweb[1-6] Map/amend/publish the time series for:
  1. “Antenatal coverage rate”
  2. “Children orphaned by AIDS”
  3. “Children under-five sleeping under insecticide-treated net (ITN)”
  4. “Proportion of births attended by skilled health personnel”
  5. “Share of women in wage employment in the non-agricultural sector”
  6. “Proportion of urban population living in slums”
More complex mappings under the 1st and 2nd mapping steps: The most common changes made to mappings are between subgroups and the Sex, Age Group and Location dimensions. But sometimes manual changes are required between the Di and DSD indicator and unit codes, because either more than one Di code relates to a single DSD code, or more than one DSD code relates to a single Di code.
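The two complex cases can be told apart by grouping the agreed (Di code, DSD code) pairs in both directions. The following is a minimal sketch with placeholder codes; `classify_mappings` and every code in `pairs` are hypothetical and serve only to illustrate the distinction.

```python
from collections import defaultdict


def classify_mappings(pairs):
    """Group (di_code, dsd_code) pairs and flag the two complex cases."""
    di_to_dsd = defaultdict(set)
    dsd_to_di = defaultdict(set)
    for di_code, dsd_code in pairs:
        di_to_dsd[di_code].add(dsd_code)
        dsd_to_di[dsd_code].add(di_code)
    # More than one Di code relating to a single DSD code.
    many_di_to_one_dsd = {d: sorted(s) for d, s in dsd_to_di.items() if len(s) > 1}
    # More than one DSD code relating to a single Di code.
    one_di_to_many_dsd = {d: sorted(s) for d, s in di_to_dsd.items() if len(s) > 1}
    return many_di_to_one_dsd, one_di_to_many_dsd


# Placeholder codes, not taken from any real database or codelist.
pairs = [
    ("di_indicator_1", "DSD_SERIES_A"),
    ("di_indicator_2", "DSD_SERIES_A"),
    ("di_indicator_3", "DSD_SERIES_B"),
    ("di_indicator_3", "DSD_SERIES_C"),
]
many_to_one, one_to_many = classify_mappings(pairs)
```

In the workflow of this deck, the first case (many Di codes to one DSD code) is handled in the 1st step, and the second case (one Di code to many DSD codes) needs a manual change per time series in the 2nd step.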
Many-to-one mapping for Indicator codelist (Example 1) [Column: Indicator]
Many-to-one mapping for Indicator codelist (Example 2) [Column: Indicator]
Many-to-one mapping for Indicator codelist (Example 3)
Many-to-one mapping for Unit codelist (Example 1) [Column: Unit]
Manual mapping of INDICATOR: [Columns: Indicator, Unit] Manual change: ?
Manual mapping of INDICATOR: [Columns: Indicator, Unit] Manual change: ?
Manual mapping of UNIT: [Columns: Indicator, Unit] Manual change: Unit = “Ratio”
Back to mapping…
1st Step: many Di codes to 1 DSD code
2nd Step: 1 Di code to many DSD codes
Final Step: Register new mappings
Exercise 5: Complex time series. Use unstats.un.org/unsd/demodiweb[1-6] Map/amend/publish the time series for:
  1. “Contraceptive prevalence rate”
  2. “Primary completion rate”
  3. “Gender parity index in primary education”
  4. “Seats held by men in national parliament”
  5. “Seats held by women in national parliament”
  6. “Telephone lines”
Other issues encountered with generating SDMX from DevInfo: The MDG DSD requires any data point to be uniquely described by its dimensions. However, DevInfo allows data to be stored in overlapping time intervals and with multiple sources. These issues need to be resolved to conform to the “uniqueness” required by the MDG DSD.
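A uniqueness check of this kind can be sketched by grouping observations on their dimension values and reporting any key that occurs more than once. The dimension list in `KEY_FIELDS` is an assumption for illustration (the slide's own list is not reproduced in this text), and the sample observations are invented.

```python
from collections import defaultdict

# Assumed set of identifying dimensions; illustrative names only.
KEY_FIELDS = ("area", "series", "unit", "location", "age_group", "sex", "time_period")


def find_duplicates(observations):
    """Return the keys that are shared by more than one observation."""
    groups = defaultdict(list)
    for obs in observations:
        key = tuple(obs[field] for field in KEY_FIELDS)
        groups[key].append(obs)
    return {key: rows for key, rows in groups.items() if len(rows) > 1}


base = {"area": "Country A", "series": "SE_PRM_CMPL", "unit": "PERCENT",
        "location": "T", "age_group": "000_099_Y", "sex": "T", "time_period": "2010"}
observations = [
    {**base, "source": "Survey X"},   # same key, two different sources
    {**base, "source": "Census Y"},
    {**base, "time_period": "2011", "source": "Survey X"},
]
duplicates = find_duplicates(observations)
```

Because source is not part of the assumed key, the first two observations collide; this is the multiple-sources situation described above.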
Multiple sources: Allowable in DevInfo but not in the DSD.
Overlapping time: Overlapping periods are a problem only where they begin in the same year, because the mapping tool takes the first year of the period as the value for the “Time Period” dimension.
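The effect of taking the first year can be illustrated with a short sketch. It assumes Di time periods are strings that begin with a four-digit year (for example "2005" or "2005-2007"); the actual formats in a given database may differ, and both function names are hypothetical.

```python
def dsd_time_period(di_period):
    """Return the first year of a Di time period such as "2005-2007"."""
    return di_period.strip()[:4]


def colliding_periods(di_periods):
    """Group distinct Di periods that collapse to the same start year."""
    by_year = {}
    for period in di_periods:
        by_year.setdefault(dsd_time_period(period), set()).add(period)
    return {year: sorted(periods) for year, periods in by_year.items() if len(periods) > 1}


collisions = colliding_periods(["2005", "2005-2007", "2006-2008", "2010"])
```

Here "2005" and "2005-2007" both become 2005 and so collide, whereas "2006-2008" overlaps with "2005-2007" but starts in a different year and causes no conflict.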
Targets in the database: Targets are also an issue when found in the database, since they should not be exchanged as observed values.
Target in database (Example 1): Sometimes stored as a subgroup, which can be ignored at the 2nd stage…
Target in database (Example 2): But at other times a target can be found as a time period among observed values…
Use of filters at registration: To deal with multiple sources for a given time period, overlapping time periods beginning in the same year, and targets presented alongside observed values, the mapping tool provides a feature to filter out of a generated SDMX message the data associated with specific time periods and source references.
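Conceptually, the filter removes every record whose time period or source has been selected for exclusion. The sketch below illustrates that idea only: `apply_filters`, the record layout, and the sample values are all invented for the example and do not reflect the tool's internals.

```python
def apply_filters(records, excluded_periods=(), excluded_sources=()):
    """Drop records whose time period or source has been filtered out."""
    periods = set(excluded_periods)
    sources = set(excluded_sources)
    return [
        record for record in records
        if record["time_period"] not in periods and record["source"] not in sources
    ]


# Invented sample data: two sources for 2010, plus a 2015 entry standing in for a target.
records = [
    {"time_period": "2010", "source": "Survey X", "value": 41.0},
    {"time_period": "2010", "source": "Census Y", "value": 43.5},
    {"time_period": "2015", "source": "Survey X", "value": 50.0},
]
kept = apply_filters(records, excluded_periods=["2015"], excluded_sources=["Census Y"])
```

After filtering, only the 2010 value from "Survey X" remains, so the generated message would contain one value per time period.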
Back to mapping…
Final Step: Filter by time/ source
Final Step: Select source filter
Final Step: Select time filter
Final Step: Register new mappings
Final Step: Complete
Exercise 6: Filter time series. Use unstats.un.org/unsd/demodiweb[2-6] Map/amend/publish the time series for:
  1. “Under-five mortality rate”
  2. “Maternal mortality ratio (MMR)”
  3. “Net enrolment ratio in primary education (NER)”
  4. “Orphans primary school enrolment”
  5. “Tuberculosis prevalence rate”
  6. “Proportion of the population using improved sanitation facilities”
DSD Maintenance: The mapping and registry tool allows users to edit and delete the DSD as well as upload it. When the DSD is updated, it is recommended to edit the DSD rather than delete it, because deleting the DSD removes all the mappings and subscriptions used for that DSD.
DSD Maintenance…
DSD Header…