IPUMS-International Integration Process
Matt Sobek
Minnesota Population Center
June 2011 Data Release
Stages: Input material, Pre-processing, Standardization, Integration
Tracks: DATA, METADATA
Steps (in slide order): Data files; Data dictionary; Questionnaires; Enumeration instructions; Sample information; Reformat data; Donation; Draw sample; Confidentiality; Translate to English; Images to editable files; IPUMS data dictionary; Code clean-up; Verify data; Tag enumeration text; Document source variables; Harmonize codes; Variable programming; Constructed variables; GIS boundary files; Variable descriptions; Sample design
Confidentiality Measures
- Swap a small percentage of cases between geographic areas.
- Suppress low-level geographic variables.
- Recode geographic units to ensure small localities cannot be identified (typically those with fewer than 20,000 persons).
For recent censuses:
- Recode cells representing very small numbers of persons in the population (into a residual or combined with a larger category).
- Top- or bottom-code continuous variables with a thin tail.
- Suppress specific categories of variables as requested by the NSO.
- Suppress entire variables as requested by the NSO.
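Two of the measures above, top-coding and recoding small cells into a residual category, can be sketched in a few lines of Python. This is a minimal illustration, not the IPUMS implementation: the function names, the cap, the minimum cell size, and the sample values are all hypothetical, and cell sizes are counted in the supplied list rather than in the population.

```python
from collections import Counter


def top_code(values, cap):
    """Top-code: replace every value above `cap` with `cap`."""
    return [min(v, cap) for v in values]


def collapse_small_cells(codes, min_count, residual="other"):
    """Recode categories with fewer than `min_count` cases into `residual`."""
    counts = Counter(codes)
    return [c if counts[c] >= min_count else residual for c in codes]


# Hypothetical values with a thin upper tail.
incomes = [1200, 3400, 980, 250000, 4100]
print(top_code(incomes, cap=10000))
# [1200, 3400, 980, 10000, 4100]

# Hypothetical category codes; "C" occurs only once.
categories = ["A", "A", "B", "A", "C", "B", "A"]
print(collapse_small_cells(categories, min_count=2))
# ['A', 'A', 'B', 'A', 'other', 'B', 'A']
```

Bottom-coding would follow the same pattern with `max` and a floor value.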
Harmonize Codes: Translation Matrix for Marital Status
Samples shown: China 1982, Colombia 1973, Kenya 1989, Mexico 1970, U.S.A. 1990
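A translation matrix of this kind can be represented as one lookup table per sample, mapping each source code to a shared harmonized code. The sketch below shows the idea only. The harmonized codes, the source codes, and the sample names `sample_a` and `sample_b` are hypothetical placeholders, not the actual codes of any census listed above.

```python
# Harmonized codes (hypothetical, for illustration only).
HARMONIZED_LABELS = {
    1: "single",
    2: "married",
    3: "widowed",
    4: "divorced/separated",
    9: "unknown",
}
UNKNOWN = 9

# One "column" of the matrix per sample: source code -> harmonized code.
TRANSLATION = {
    "sample_a": {1: 1, 2: 2, 3: 3, 4: 4},
    # In this hypothetical sample, two source codes both mean "married"
    # and therefore collapse to the same harmonized code.
    "sample_b": {1: 2, 2: 2, 3: 1, 4: 3, 5: 4},
}


def harmonize(sample, source_code):
    """Translate a sample-specific code; unmapped codes become UNKNOWN."""
    return TRANSLATION[sample].get(source_code, UNKNOWN)


print(HARMONIZED_LABELS[harmonize("sample_a", 3)])  # widowed
print(HARMONIZED_LABELS[harmonize("sample_b", 1)])  # married
print(HARMONIZED_LABELS[harmonize("sample_b", 7)])  # unknown
```

Keeping the mapping as data rather than as branching logic means each sample's column can be reviewed and corrected independently.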
Constructed “Pointer” Variables (Colombia 1985) (Simple household)

Pernum  Relate  Age  Sex     Marst    Chborn
1       head    46   male    married  n/a
2       spouse  44   female  married  3
3       aunt    77   female  widow    7
4       child   15   female  single   0
5       child   13   female  single   n/a
6       child   11   male    single   n/a

Pointer columns added to each record: Spouse’s Location, Mother’s Location, Father’s Location
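For a simple household like this one, pointer variables can be derived from the relationship codes alone. The sketch below is a deliberately simplified rule set, not the actual IPUMS algorithm: it assumes every "child" is the child of both the head and the head's spouse, and it uses 0 for "no such person in the household". The field names `sploc`, `momloc`, and `poploc` and the function `add_pointers` are assumptions made for this example.

```python
household = [
    {"pernum": 1, "relate": "head",   "age": 46, "sex": "male"},
    {"pernum": 2, "relate": "spouse", "age": 44, "sex": "female"},
    {"pernum": 3, "relate": "aunt",   "age": 77, "sex": "female"},
    {"pernum": 4, "relate": "child",  "age": 15, "sex": "female"},
    {"pernum": 5, "relate": "child",  "age": 13, "sex": "female"},
    {"pernum": 6, "relate": "child",  "age": 11, "sex": "male"},
]


def add_pointers(persons):
    """Attach spouse/mother/father pointers (person numbers; 0 = none)."""
    head = next((p for p in persons if p["relate"] == "head"), None)
    spouse = next((p for p in persons if p["relate"] == "spouse"), None)

    for p in persons:
        p["sploc"] = p["momloc"] = p["poploc"] = 0

    # Head and spouse point at each other.
    if head and spouse:
        head["sploc"] = spouse["pernum"]
        spouse["sploc"] = head["pernum"]

    # Simplifying assumption: children belong to both head and spouse.
    for p in persons:
        if p["relate"] != "child":
            continue
        for parent in (head, spouse):
            if parent is None:
                continue
            key = "momloc" if parent["sex"] == "female" else "poploc"
            p[key] = parent["pernum"]
    return persons


for p in add_pointers(household):
    print(p["pernum"], p["relate"], p["sploc"], p["momloc"], p["poploc"])
# 1 head 2 0 0
# 2 spouse 1 0 0
# 3 aunt 0 0 0
# 4 child 0 2 1
# 5 child 0 2 1
# 6 child 0 2 1
```

Complex households (multiple couples, stepchildren, grandchildren) need additional evidence such as age, marital status, and record order, which this sketch does not attempt.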
Census Questionnaire Image (Mexico 2000): Water Access
Text of Census Questionnaire (Mexico 2000)
XML-Tagged Census Questionnaire (Mexico 2000): Water access
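Once questionnaire text is tagged, the passage belonging to a given topic can be pulled out programmatically. The sketch below shows one way this might look. The element name `question`, the attribute `var`, and the bracketed placeholder text are all hypothetical; they are not the actual IPUMS tag set or the wording of the Mexico 2000 form.

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged questionnaire; bracketed text stands in for the
# real question wording, which is not reproduced here.
TAGGED = """
<questionnaire country="Mexico" year="2000">
  <question var="water_access">[question text on water access]</question>
  <question var="other_item">[text of another question]</question>
</questionnaire>
"""


def text_for(xml_string, var):
    """Return the text of every <question> tagged with the given variable."""
    root = ET.fromstring(xml_string.strip())
    return [q.text for q in root.iter("question") if q.get("var") == var]


print(text_for(TAGGED, "water_access"))
# ['[question text on water access]']
```

Tagging by variable lets the same source text be attached to the matching variable description without manual copying.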