Harmonizing the World’s Census Microdata: The IPUMS Project
Matt Sobek
Minnesota Population Center
What is IPUMS-International?
  Census data: 1960 to present
  Samples: 1 to 10%, nationally representative
  Microdata: individual-level
  Extract system: select variables, pooled data
  Downloadable: anonymized
  Integrated: consistent codes across time and place
Map of IPUMS Partners: 83 countries
  Dark green = disseminating data
  Light green = partners, not yet disseminating
Current Countries in IPUMS: 44 countries, 130 samples, 279 million persons
  Africa: Egypt, Ghana, Guinea, Kenya, Rwanda, South Africa, Uganda
  Asia: Armenia, Cambodia, China, India, Iraq, Israel, Jordan, Kyrgyz Rep., Malaysia, Mongolia, Palestine, Philippines, Vietnam
  Americas: Argentina, Bolivia, Brazil, Canada, Chile, Colombia, Costa Rica, Ecuador, Mexico, Panama, United States, Venezuela
  Europe: Austria, Belarus, France, Greece, Hungary, Italy, Netherlands, Portugal, Romania, Slovenia, Spain, United Kingdom
IPUMS Microdata
  Variables shown: relation to head, marital status, literacy, occupation
Aggregate Data
Data Standardization
Data Integration: Marital Status
  Samples compared: China 1982, Colombia 1973, Kenya 1989, Mexico 1970, U.S.A. 1990
XML Harmonization Table
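The idea behind a harmonization table can be sketched in a few lines of Python. This is an illustration only: a plain dictionary stands in for the XML table, and every code value, label, and function name below is hypothetical, not taken from the IPUMS codebooks. It shows how sample-specific marital status codes might be translated into one shared scheme so that records from different censuses can be pooled.

```python
# Hypothetical harmonized marital-status codes (NOT the actual IPUMS scheme).
HARMONIZED = {
    1: "single/never married",
    2: "married/in union",
    3: "separated/divorced",
    4: "widowed",
    9: "unknown",
}

# Hypothetical source codes for two samples; real census codebooks differ.
# Each inner dict maps a sample's own code to the shared code above.
TRANSLATION = {
    ("mexico", 1970): {1: 2, 2: 2, 3: 1, 4: 4, 5: 3},
    ("kenya", 1989): {1: 1, 2: 2, 3: 4, 4: 3},
}


def harmonize(country, year, source_code):
    """Translate a sample-specific code to the shared code.

    Source codes missing from the table fall back to 9 ("unknown").
    """
    table = TRANSLATION[(country, year)]
    return table.get(source_code, 9)


# Pool records from different samples under one coding scheme.
records = [("mexico", 1970, 3), ("kenya", 1989, 3), ("kenya", 1989, 7)]
pooled = [harmonize(*record) for record in records]
labels = [HARMONIZED[code] for code in pooled]
```

Keeping one translation table per sample, rather than recoding the source files, leaves the original codes untouched and makes each mapping decision easy to review.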
Census Questionnaire (Mexico 2000): Water Access
Text of Census Questionnaire (Mexico 2000)
XML-Tagged Census Questionnaire (Mexico 2000): Water access
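As a rough sketch of why tagging questionnaire text is useful, the Python below looks up the question text linked to a variable. The element names, the `var` attribute, the variable names, and the placeholder question text are all assumptions made for illustration; the actual IPUMS markup is not shown in these slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are invented for this
# example, and the question text is a placeholder, not the census wording.
QUESTIONNAIRE = """
<questionnaire country="Mexico" year="2000">
  <question var="WATER">
    <text>Placeholder text for the water access question.</text>
  </question>
  <question var="LIT">
    <text>Placeholder text for the literacy question.</text>
  </question>
</questionnaire>
"""


def question_text(xml_source, variable):
    """Return the text of the question tagged with `variable`, or None."""
    root = ET.fromstring(xml_source.strip())
    for question in root.iter("question"):
        if question.get("var") == variable:
            return question.findtext("text").strip()
    return None


water_question = question_text(QUESTIONNAIRE, "WATER")
```

With tags like these, the same lookup could serve every sample, so a user viewing a variable could be shown the exact question that produced it.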
Variable Description (Literacy)
Availability of Selected Person Variables (Number of samples)
Availability of Selected Household Variables (Number of samples)
Number of Geographic Units Identified
Size of Geographic Units by Country (Median population in 000s)
Urban Definitions (N of countries) (Functional criteria include infrastructure, businesses, agriculture, etc.)
[Diagram: flow of census data between the National Stats Office and IPUMS. Labels: Questionnaire, Data collection, Data processing, Full microdata, Samples drawn, Public samples, Aggregate statistics, Tabulator, Sampling, Donation, Confidentiality, IPUMS samples, Harmonization]