DATA ARCHIVING - POPULATION AND HOUSING CENSUS
PRESENTED BY Richard A. P. Phiri
Demographic and Social Statistics Division
NATIONAL STATISTICAL OFFICE - MW
OUTLINE
- Rationale
- Background
- Census Processing
- Archiving
- Dissemination and Policies of Micro-data
Rationale
- Censuses and surveys are conducted and disseminated: a growing supply of data, but unsatisfied demand and under-use
- Common obstacles and problems for researchers:
  - Quality and technical capacity
  - Accessibility and timeliness
  - Limited impact
  - Coherence
  - Lack of metadata / documentation
  - Poorly organized archives
  - Management / political / legal / ethical issues
Background
- Malawi conducted its most recent census in June 2008 and launched the results in November 2009
- The usual census attributes were collected
- Data has been further analyzed by theme: THEMATIC REPORTS
- These characteristics or attributes enable the assessment of living conditions, which have profound effects on the health and welfare of individuals within the household (HH)
CENSUS PROCESSING
Census Processing
- DRS Optical Mark Scanners
- Administered questionnaires were scanned
Census Processing – Cont’d Storage of successfully scanned questionnaires
Census Processing – Cont’d
- Successfully scanned and edited data images were preserved
- Converted to CSPro
- Files concatenated
- Back-up of the dataset:
  - Demography division server
  - Technical services division server
  - Senior NSO officers
- Tabulations of data
- Reports
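The concatenate-and-back-up steps on this slide can be sketched as follows. This is a minimal illustration, not NSO's actual procedure and not CSPro's own tooling: it assumes plain-text data files that each end with a newline, and every file and directory name in it is hypothetical. Each backup copy is checked against the source with a SHA-256 checksum so that a damaged copy is noticed immediately.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def concatenate_files(parts, combined):
    """Write the part files, in the order given, into one combined file."""
    with open(combined, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(combined)


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, backup_dirs):
    """Copy `source` into each directory and verify every copy by checksum."""
    expected = sha256_of(source)
    copies = []
    for directory in backup_dirs:
        Path(directory).mkdir(parents=True, exist_ok=True)
        target = Path(directory) / Path(source).name
        shutil.copy2(source, target)
        if sha256_of(target) != expected:
            raise IOError(f"backup copy differs from source: {target}")
        copies.append(target)
    return copies


if __name__ == "__main__":
    # Demo with throwaway files; all names below are hypothetical.
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        parts = []
        for name in ("district_01.dat", "district_02.dat"):
            part = tmp / name
            part.write_text(f"records from {name}\n")
            parts.append(part)
        combined = concatenate_files(parts, tmp / "census_2008.dat")
        copies = back_up(
            combined,
            [tmp / "demography_server", tmp / "technical_services_server"],
        )
        print(f"{len(copies)} verified backup copies")
```

Keeping the checksum alongside each copy also makes it possible to re-verify the backups later, for example after a server migration.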
IHSN METADATA TOOLKIT
Data Archiving
- Through PARIS21, NSO expressed interest in archiving
- Technical assistance
- Capacity building team was sent
- Workshops
- Involved 12 participants from different NSO divisions:
  - Demography and Social Statistics
  - Agriculture
  - Economics
- IHSN Metadata Toolkit / software
Data Archiving – cont’d
- Subsequent workshops in 2009
- Working workshop scheduled for May, coincided with Census analysis
- Archiving workshop outputs:
  - Census, 1998 and 2008
  - Integrated Household Survey
  - Welfare Monitoring Survey
  - Malawi Demographic and Health Survey, 2000 and 2004
Infrastructure Improvement
- Laptops and desktop PCs acquired with Census funding
- Norwegian and Malawi Government support
- Server: centralised storage of Census data images and micro-data
- ICT team: supports the running of the server
- Training of the ICT team: India and South Africa
CENSUS ARCHIVING
Study Description
- Identification
  - Title, Subtitle, Abbreviation
  - Study type, Id No
- Version
- Overview
  - Abstract: Population and housing censuses have regularly been conducted in Malawi since the colonial era. However, the most comprehensive censuses have only been undertaken during the post-colonial period. Censuses have been conducted along with the Integrated Household Survey, Demographic and Health Survey (DHS) and the annual Welfare Monitoring Survey. Besides providing benchmark data on demographic and socio-economic characteristics of the Malawi population, censuses are unique sources of information for small geographical areas and sub-national groups, which is vital for planning and decision-making at lower levels of the country's administrative structures. The 2008 PHC was conducted from June 2008 by the National Statistical Office (NSO), which deployed rigorously trained teams of enumerators and supervisors to ensure good household coverage. The 2008 PHC report presents basic results on the population size, composition and distribution; population characteristics; population dynamics; and household and housing characteristics. ……
Study Description – cont’d
- Overview – cont’d
  - Kind of data (Census/enumerated data)
  - Unit of analysis: Individuals, Households, Housing
- Scope and Coverage
  - Modules or topics: Individual, Housing units, Migration
  - Geographic
Study Description – cont’d
Sampling, Data Collection and Processing
Data and Accessibility
- Data Appraisal
  - Based on the Census data evaluation and assessment report
- Accessibility
  - Official request through the Commissioner
  - Agreement to use the data for research purposes
- Confidentiality
  - NSO highly values the confidentiality of the census and survey data
  - Oath of secrecy required of data users, data collectors and NSO officers
  - The data user agrees to use the data for research purposes
Copyright and Contacts
- Copyright
  - Standard legal statement
  - Liability limitation regarding data use
- Contacts
  - For long-term validity, preferably by title
  - Affiliation, Uniform Resource Identifier (URI)
Dataset and Variable groups preparation
- Convert data to IHSN-compatible formats: SPSS, STATA, Statistica, Excel, etc.
- Import data
- Data file description: Name, Contents, Producer, Data version, Processing checks, Missing data, Notes
- Key variables and relations
- Variable groups
Dataset and Variable groups preparation
- IHSN-compatible formats: SPSS, STATA, Statistica, Excel, etc.
- Converted 2008 census micro-data to SPSS
- Import data
- Data file description: Name, Contents, Producer, Data version, Processing checks, Missing data, Notes
- Key variables and relations
Dataset Anonymization
- No tool/software for anonymizing the micro-data
- Planning to develop an in-house tool
- 100% micro-data on NSO website: not in the public domain
- Micro-data up to TA (Traditional Authority) level: direct identifiers (EA, Village, Occupation, etc.) collapsed or removed; not yet available, still checking the efficiency
- 10% micro-data on IPUMS website
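The two release steps named on this slide, removing direct identifiers and publishing a 10% subset, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the in-house tool the slide refers to: records are plain dictionaries, the field names (`ea`, `village`, `occupation`, `household_id`) are hypothetical, and sampling whole households rather than individual persons is an assumption made here so that household members stay together.

```python
import random

# Hypothetical names for the direct-identifier fields listed on the slide.
DIRECT_IDENTIFIERS = ("ea", "village", "occupation")


def drop_identifiers(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records without the direct-identifier fields."""
    return [
        {field: value for field, value in rec.items() if field not in identifiers}
        for rec in records
    ]


def sample_households(records, fraction=0.10, seed=2008, key="household_id"):
    """Keep every record belonging to a random `fraction` of households.

    A fixed seed makes the draw reproducible, so the same subset can be
    regenerated later from the full file.
    """
    households = sorted({rec[key] for rec in records})
    rng = random.Random(seed)
    n_keep = round(len(households) * fraction)
    kept = set(rng.sample(households, n_keep))
    return [rec for rec in records if rec[key] in kept]


if __name__ == "__main__":
    # Tiny made-up file: 50 households with 3 persons each.
    full = [
        {"household_id": h, "person": p, "ea": "001", "village": "A",
         "occupation": "farmer", "age": 20 + p}
        for h in range(50) for p in range(3)
    ]
    released = sample_households(drop_identifiers(full))
    print(len(released), "records released")
```

Dropping fields alone does not guarantee anonymity; rare combinations of the remaining variables can still identify a household, which is why the effectiveness check mentioned on the slide matters.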
External resources
- Resource description
  - Label, e.g. 2008 PHC questionnaire, Manual, Report
  - URI, or import the resource to the Nesstar server
- Identification
  - Type, title, subtitle, author(s), date
  - Country, language, format, ID Number
- Contributor(s), Publisher(s) and Rights
- Content
  - Description, Abstract, Table of contents
  - Subjects, e.g. Modules, Key topics
DISSEMINATION AND POLICIES
Dissemination
- CD-ROMs disseminated during the launch of the Census results
- Census and any other archived survey data available upon request
- No dataset available on-line (including other surveys)
- NB: 10% of the micro-data provided to IPUMS
Policies
- Policies on archived data and confidentiality: not finalized
- Planned policies:
  - Following the launch of Census/Survey results, comprehensively archive micro-data and metadata (agricultural, economic, demographic and social) in a data infrastructure that prevents loss or damage
  - Encourage data sharing while acknowledging legal rights and complying with the release/sharing conditions
Policies – cont’d
- Encourage data sharing – cont’d
  - Users register to access the micro-data and metadata
  - Users of disseminated data or micro-data to acknowledge and abide by the NSO data user agreement and any other general conditions
  - Micro-data with scrambled IDs or non-corresponding geocodes
  - 10% of the micro-data
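One common way to produce scrambled IDs is a keyed hash: each original ID is replaced by a pseudonym that stays the same throughout the file, so records still link to their household, but the original ID cannot be recovered without the secret key. The sketch below is an assumption about how this could be done, not NSO's method; the field name `household_id` and the key value are hypothetical, and the key would have to be stored securely and never released with the data.

```python
import hashlib
import hmac


def scramble_id(original_id, secret_key):
    """Map an ID to a 16-character pseudonym using HMAC-SHA256."""
    message = str(original_id).encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256)
    return digest.hexdigest()[:16]


def scramble_records(records, secret_key, id_fields=("household_id",)):
    """Return copies of the records with the chosen ID fields scrambled."""
    scrambled = []
    for rec in records:
        new_rec = dict(rec)  # leave the original record untouched
        for field in id_fields:
            if field in new_rec:
                new_rec[field] = scramble_id(new_rec[field], secret_key)
        scrambled.append(new_rec)
    return scrambled


if __name__ == "__main__":
    key = b"replace-with-a-long-random-secret"  # hypothetical key
    people = [
        {"household_id": 10452, "age": 34},
        {"household_id": 10452, "age": 7},
        {"household_id": 10453, "age": 61},
    ]
    for row in scramble_records(people, key):
        print(row)
```

The first two printed rows share one pseudonym and the third has a different one, so household structure survives while the original numbering does not.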
Challenges
- Lack of an anonymization system for scrambling the unique identifiers / IDs
- Advanced training
- Technological tools
  - Sophisticated scanners (i.e. ZYLAB scanners for unarchived censuses and surveys)
- Insufficient computers and personnel
- Computer viruses
- Data archiving software or licence
- NSO intends to use the IHSN toolkit in future surveys and censuses
THANK YOU VERY MUCH (ZIKOMO KWAMBIRI) FOR YOUR ATTENTION!!