1
Census Processing Procedures
Matt Sobek
Minnesota Population Center
Funded by the National Science Foundation
2
IPUMS Work Process
1. Inventory
2. English Translation
3. Data Restructuring
4. Sample Creation
5. Confidentiality Measures
6. Data Harmonization
7. Data Improvement
8. Dissemination
3
1. Inventory
For each sample:
   data
   data dictionary
   census questionnaire and instructions
   sample design
   census design
   published tabulations, post-enumeration surveys, demographic analyses (when available)
4
2. Translation
   Census questionnaire
   Census instructions
   Data dictionary codes and labels
5
3. Data Restructuring
a) Create labels/set-up file
6
Labels File, Costa Rica 2000
7
3. Data Restructuring
a) Create labels/set-up file
b) Analyze data
   Basic record structure
   Unique IDs or other means of distinguishing household membership
8
3. Data Restructuring
a) Create labels/set-up file
b) Analyze data
   Basic record structure
   Unique IDs or other means of distinguishing household membership
c) Reformat the data
   Convert to household-person hierarchical structure
9
Reformat Rectangular Sample (Brazil 1980)
(Person records only; household data duplicated on person records)
Rectangular input:
   geography  housing  person (head)
   geography  housing  person (child)
   geography  housing  person (child)
   geography  housing  person (head)
   geography  housing  person (spouse)
   geography  housing  person (child)
   geography  housing  person (child)
Hierarchical output:
   geography  housing
      person (head)
      person (child)
   geography  housing
      person (head)
      person (spouse)
      person (child)
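The rectangular-to-hierarchical conversion can be sketched as follows. This is a minimal illustration, not the IPUMS program: the field names (`hh_id`, `geo`, `housing`, `relate`) are hypothetical, and it assumes every person record carries a household identifier and that members of a household are adjacent in the file.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical field names; the real Brazil 1980 layout is not shown on the slide.
HOUSEHOLD_FIELDS = ("hh_id", "geo", "housing")

def to_hierarchical(person_records, household_fields=HOUSEHOLD_FIELDS):
    """Turn rectangular person records into household (H) and person (P) records.

    Assumes records belonging to the same household are adjacent in the input.
    """
    out = []
    for hh_id, group in groupby(person_records, key=itemgetter("hh_id")):
        members = list(group)
        # Household data are duplicated on every person record, so take them once.
        household = {f: members[0][f] for f in household_fields}
        out.append({"rectype": "H", **household})
        for pernum, person in enumerate(members, start=1):
            person_only = {k: v for k, v in person.items() if k not in household_fields}
            out.append({"rectype": "P", "hh_id": hh_id, "pernum": pernum, **person_only})
    return out

rectangular = [
    {"hh_id": 1, "geo": "A", "housing": "owned", "relate": "head"},
    {"hh_id": 1, "geo": "A", "housing": "owned", "relate": "child"},
    {"hh_id": 2, "geo": "B", "housing": "rented", "relate": "head"},
]
hierarchical = to_hierarchical(rectangular)
```

Each household record is written once, followed by its person records numbered from 1, so the duplicated geography and housing fields no longer appear on the person records.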
10
Reformat Dwelling-Household-Person Sample (Chile 1992)
(Separate dwelling and household records)
Input:
   dwelling
      household
         person (head)
         person (spouse)
         person (child)
      household
         person (head)
         person (child)
   dwelling
      household
         person (head)
         person (spouse)
Output:
   dwelling  household
      person (head)
      person (spouse)
      person (child)
   dwelling  household
      person (head)
      person (child)
   dwelling  household
      person (head)
      person (spouse)
11
Reformat Dwelling-Person Sample (Colombia 1993)
(Multi-household dwellings; no separate household record)
Input:
   dwelling 001
      head
      spouse
      child
      head
   dwelling 002
      head
      child
Output:
   household 00101
      head
      spouse
      child
   household 00102
      head
   household 00201
      head
      child
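One way to derive household records for a dwelling-person file is sketched below. It assumes that each head record starts a new household within the dwelling, and that the household identifier is the dwelling number followed by a two-digit sequence number, as in the example identifiers 00101, 00102, and 00201. The function name and the list-of-relationships input are illustrative only.

```python
def split_dwelling(dwelling_id, relationships):
    """Split one dwelling's person records into households.

    Assumption: every 'head' record opens a new household.
    Returns a dict keyed by household id (dwelling id + 2-digit sequence).
    """
    households = []
    for rel in relationships:
        if rel == "head" or not households:
            households.append([])
        households[-1].append(rel)
    return {f"{dwelling_id}{n:02d}": members
            for n, members in enumerate(households, start=1)}

dwelling_001 = split_dwelling("001", ["head", "spouse", "child", "head"])
dwelling_002 = split_dwelling("002", ["head", "child"])
```

A dwelling whose first record is not a head still gets one household, so no person record is dropped.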
12
Merge Separate Household and Person Files (Brazil 2000)
Household File:
   serial 001  geog & housing
   serial 002  geog & housing
   serial 003  geog & housing
Person File:
   serial 001  head
   serial 001  spouse
   serial 002  head
   serial 002  child
   serial 003  head
Merged:
   serial 001  household
   serial 001  head
   serial 001  spouse
   serial 002  household
   serial 002  head
   serial 002  child
   serial 003  household
   serial 003  head
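A merge of this kind can be sketched as a join on the serial number. The record layout below (dicts with a `serial` key) is an assumption for illustration; the sketch also reports serials that appear in only one of the two files, since those would need attention before the data are used.

```python
def merge_files(household_file, person_file):
    """Interleave household and person records by serial number.

    Returns (merged records, households with no persons, person serials with no household).
    """
    persons_by_serial = {}
    for p in person_file:
        persons_by_serial.setdefault(p["serial"], []).append(p)

    merged, empty_households = [], []
    for hh in sorted(household_file, key=lambda r: r["serial"]):
        members = persons_by_serial.pop(hh["serial"], [])
        if not members:
            empty_households.append(hh["serial"])
        merged.append(("H", hh))
        merged.extend(("P", p) for p in members)
    # Serials left over belong to persons without a household record.
    orphan_serials = sorted(persons_by_serial)
    return merged, empty_households, orphan_serials

household_file = [{"serial": s, "data": "geog & housing"} for s in ("001", "002", "003")]
person_file = [
    {"serial": "001", "relate": "head"},
    {"serial": "001", "relate": "spouse"},
    {"serial": "002", "relate": "head"},
    {"serial": "002", "relate": "child"},
    {"serial": "003", "relate": "head"},
]
merged, empty_households, orphan_serials = merge_files(household_file, person_file)
```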
13
Reformat Individual-level Data (Mexico 1960)
(Individuals only; not organized in households)
[Figure: input records carrying geog, person, and housing fields; output household and person records]
14
3. Data Restructuring
a) Create labels/set-up file
b) Analyze data
   Basic record structure
   Unique IDs or other means of distinguishing household membership
c) Reformat the data
   Convert to household-person hierarchical structure
d) Identify and flag errors in structure
15
Flags Identifying Structural Issues, Chile 1970
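As a rough illustration of what structural flags can look like, the sketch below checks one household for a few common problems. The specific checks and flag names are hypothetical; they are not the flags used for Chile 1970.

```python
def structure_flags(household, persons):
    """Return 0/1 flags describing structural problems in one household.

    The flag names and the `hhsize` / `relate` fields are illustrative assumptions.
    """
    heads = sum(1 for p in persons if p["relate"] == "head")
    reported_size = household.get("hhsize")
    return {
        "no_persons": int(len(persons) == 0),
        "no_head": int(len(persons) > 0 and heads == 0),
        "multiple_heads": int(heads > 1),
        # Reported household size disagrees with the number of person records.
        "size_mismatch": int(reported_size is not None and reported_size != len(persons)),
    }

flags = structure_flags({"hhsize": 3}, [{"relate": "head"}, {"relate": "head"}])
```

Flagging rather than deleting keeps the problem records available, so later steps such as sampling can decide how to treat them.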
16
4. Sample Creation
a) Formerly, systematic samples
   We developed a household-substitution technique to exclude corrupt records during sampling
17
Sampling Procedure – Colombia 1973
Columns: No, HHSize, Flag bad (x), 10th (x), Take
No   HHSize      No   HHSize
1    4           21   14
2    2           22   6
3    9           23   9
4    4           24   6
5    3           25   5
6    1           26   1
7    5           27   2
8    6           28   1
9    4           29   7
10   5           30   5
11   3           31   4
12   1           32   3
13   5           33   4
14   2           34   6
15   2           35   6
16   4           36   13
17   10          37   4
18   2           38   7
19   6           39   6
20   2           40   5
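A systematic 1-in-10 sample with household substitution can be sketched as below. The substitution rule used here (take the nearest preceding unflagged household in the same block of ten) is an assumption made for illustration; the slide does not spell out the actual rule. The flagged household numbers in the example are also hypothetical.

```python
def systematic_sample(households, interval=10):
    """Take every `interval`-th household, substituting when the target is flagged bad.

    Assumed rule: if the target household is corrupt, fall back to the nearest
    preceding good household within the same block of `interval` households.
    """
    sample = []
    for lo in range(0, len(households) - interval + 1, interval):
        block = households[lo:lo + interval]
        # reversed() visits the target first, then its closest predecessors.
        for hh in reversed(block):
            if not hh["bad"]:
                sample.append(hh["no"])
                break
    return sample

# 40 households; numbers 20 and 23 are flagged bad (hypothetical flags).
households = [{"no": n, "bad": n in {20, 23}} for n in range(1, 41)]
sample = systematic_sample(households)
```

Here household 20 is a sampling target but is flagged, so household 19 is taken in its place; household 23 is flagged but is not a target, so it has no effect on the sample.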
18
4. Sample Creation
a) Formerly, systematic samples
   We developed a household-substitution technique to exclude corrupt records during sampling
b) Stratified samples
   Variables for variance estimation
   Develop strata for each sample using geography, ethnicity, household size, household type, and socioeconomic status; adjusted as necessary for each census
19
5. Confidentiality
5 measures, as required:
   Limit geographic specificity
   Swap across geographic units
   Randomize order within geographies
   Merge small variable categories
   Top-code sensitive numeric variables
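Three of these measures are simple enough to sketch in a few lines. The thresholds, field names, and the "other" label below are illustrative assumptions, not IPUMS settings.

```python
import random
from collections import Counter

def top_code(values, cap):
    """Replace every value above `cap` with `cap`."""
    return [min(v, cap) for v in values]

def merge_small_categories(values, min_count, other="other"):
    """Collapse categories with fewer than `min_count` cases into `other`."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]

def randomize_within(records, geo_key, seed=0):
    """Shuffle record order inside each geographic unit; units keep first-seen order."""
    rng = random.Random(seed)
    groups = {}
    for rec in records:
        groups.setdefault(rec[geo_key], []).append(rec)
    out = []
    for members in groups.values():
        rng.shuffle(members)
        out.extend(members)
    return out

incomes = top_code([1200, 5000, 250000], cap=100000)
religions = merge_small_categories(["a", "a", "a", "b", "c"], min_count=2)
records = [{"geo": g, "id": i} for i, g in enumerate(["X", "X", "Y", "X", "Y"])]
shuffled = randomize_within(records, "geo")
```

Randomizing the order within a geography removes any clue to location that the original file order might carry, while top-coding and category merging remove rare, identifying values.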
20
6. Harmonization
a) Data translation matrices
21
Translation Matrix – Marital Status
25
6. Harmonization
a) Data translation matrices
b) Specialized variable programming
   Where one-to-one recoding of the translation matrix is insufficient
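The one-to-one recoding that a translation matrix expresses can be sketched as a lookup table per sample. The sample names, source codes, and harmonized codes below are invented for illustration and should not be read as the actual marital status matrix.

```python
# Harmonized codes and labels (illustrative only).
MARST_LABELS = {1: "single", 2: "married", 3: "separated/divorced", 4: "widowed", 9: "unknown"}

# One column per sample: source code -> harmonized code (hypothetical samples and codes).
TRANSLATION = {
    "sample_a": {1: 2, 2: 1, 3: 4, 4: 3},
    "sample_b": {"S": 1, "C": 2, "V": 4, "D": 3},
}

def harmonize(sample, source_code, matrix=TRANSLATION, unknown=9):
    """Recode one source value; anything not listed for the sample becomes `unknown`."""
    return matrix[sample].get(source_code, unknown)

codes = [harmonize("sample_a", 1), harmonize("sample_b", "V"), harmonize("sample_b", "?")]
labels = [MARST_LABELS[c] for c in codes]
```

A lookup like this covers only cases where each source code maps to exactly one harmonized code; variables that depend on several source fields need separate programming.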
26
7. Data Improvement
a) Constructed variables
   Family structure and other derived variables
   Location of mother, father and spouse
27
IPUMS “Pointer” Variables (Colombia 1985)
(Simple household)
                                                  Location
Pernum  Relate  Age  Sex     Marst    Chborn      Spouse's  Mother's  Father's
1       head    46   male    married  n/a         2         0         0
2       spouse  44   female  married  3           1         0         0
3       aunt    77   female  widow    7           0         0         0
4       child   15   female  single   0           0         2         1
5       child   13   female  single   n/a         0         2         1
6       child   11   male    single   n/a         0         2         1
28
IPUMS “Pointer” Variables (Colombia 1985)
(Complex household)
                                                          Location
Pernum  Relationship  Age  Sex     Marst      Chborn      Spouse's  Father's  Mother's
1       head          53   female  separated  6           0         0         0
2       child         28   male    single     n/a         0         0         1
3       child         22   male    single     n/a         0         0         1
4       child         21   male    single     n/a         0         0         1
5       child         25   female  married    2           6         0         1
6       child-in-law  28   male    married    n/a         5         0         0
7       grandchild    3    male    single     n/a         0         6         5
8       grandchild    1    male    single     n/a         0         6         5
9       non-relative  32   female  separated  2           0         0         0
10      non-relative  10   male    single     n/a         0         0         9
11      non-relative  5    female  single     n/a         0         0         9
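The idea behind the pointer variables can be sketched for the simplest case. The function below links only head, spouse, and children of the head; it is an assumption-laden illustration, not the IPUMS algorithm, which must also handle cases such as grandchildren and non-relatives using age, marital status, and children ever born. The local names follow the IPUMS variables SPLOC, MOMLOC, and POPLOC.

```python
def simple_pointers(persons):
    """Return (pernum, sploc, momloc, poploc) for each person.

    Covers head, spouse, and child-of-head records only; everyone else gets 0.
    """
    head = next((p for p in persons if p["relate"] == "head"), None)
    spouse = next((p for p in persons if p["relate"] == "spouse"), None)
    parents = [p for p in (head, spouse) if p is not None]
    result = []
    for p in persons:
        sploc = momloc = poploc = 0
        if p is head and spouse:
            sploc = spouse["pernum"]
        elif p is spouse and head:
            sploc = head["pernum"]
        elif p["relate"] == "child":
            # A child of the head points to the head and spouse, by sex.
            for parent in parents:
                if parent["sex"] == "female":
                    momloc = parent["pernum"]
                else:
                    poploc = parent["pernum"]
        result.append((p["pernum"], sploc, momloc, poploc))
    return result

simple_household = [
    {"pernum": 1, "relate": "head", "sex": "male"},
    {"pernum": 2, "relate": "spouse", "sex": "female"},
    {"pernum": 3, "relate": "aunt", "sex": "female"},
    {"pernum": 4, "relate": "child", "sex": "female"},
    {"pernum": 5, "relate": "child", "sex": "female"},
    {"pernum": 6, "relate": "child", "sex": "male"},
]
pointers = simple_pointers(simple_household)
```

For this household the head and spouse point to each other (2 and 1), the aunt has no links, and each child points to person 2 as mother and person 1 as father.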
29
7. Data Improvement
a) Constructed variables
   Family structure and other derived variables
   Location of mother, father, and spouse
b) Data editing and missing data allocation
30
Missing Data Allocation – Occupation Script (USA pre-1940 samples)
OCC allocated when 975, 996, 998;
sex (2 categories) 1; 2;
empstat (3 categories) 10-19; 20-29; 30-39;
classwkr (3 categories) 10-19; 20-29; 99;
age (6 categories) 10-19; 20-29; 30-39; 40-49; 50-59; 60-126;
race (3 categories) 100-199; 200-299; 300-899;
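A script like this defines the cells within which a missing occupation is filled from a similar person. The sketch below reads it as a sequential hot deck, which is an assumption: the slide gives the missing codes and the predictor categories but not the donor-selection rule. The category ranges are taken from the script; the function names and the `occ_allocated` flag are illustrative.

```python
# Predictor categories from the script, as inclusive (low, high) ranges.
BINS = {
    "sex": [(1, 1), (2, 2)],
    "empstat": [(10, 19), (20, 29), (30, 39)],
    "classwkr": [(10, 19), (20, 29), (99, 99)],
    "age": [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59), (60, 126)],
    "race": [(100, 199), (200, 299), (300, 899)],
}
MISSING_OCC = {975, 996, 998}

def bin_index(value, bins):
    """Return the index of the range containing `value`, or None if no range matches."""
    for i, (lo, hi) in enumerate(bins):
        if lo <= value <= hi:
            return i
    return None

def allocate_occ(records):
    """Sequential hot deck (assumed): fill missing OCC from the last donor in the same cell."""
    last_donor = {}
    out = []
    for rec in records:
        cell = tuple(bin_index(rec[var], bins) for var, bins in BINS.items())
        rec = dict(rec, occ_allocated=0)
        if rec["occ"] in MISSING_OCC:
            if cell in last_donor:
                rec["occ"] = last_donor[cell]
                rec["occ_allocated"] = 1
        else:
            last_donor[cell] = rec["occ"]
        out.append(rec)
    return out

base = {"sex": 1, "empstat": 10, "classwkr": 22, "age": 34, "race": 100}
records = [
    dict(base, occ=100),             # donor
    dict(base, age=37, occ=996),     # same cell (age 30-39), receives 100
    dict(base, sex=2, occ=975),      # different cell, no donor yet
]
allocated = allocate_occ(records)
```

Records with no earlier donor in their cell keep their missing code here; a production procedure would need a fallback, such as coarser cells.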
31
8. Dissemination
a) Metadata
   Static pages
   Dynamic pages
   Translation matrices
   Control files
32
8. Dissemination
a) Metadata
   Static pages
   Dynamic pages
   Translation matrices
   Control files
b) Dissemination Programming
   Documentation system
   Extract interface (front end)
   Extract engine (back end)
   On-line data analysis
33
End
Matt Sobek
sobek@pop.umn.edu