Original data. (various) Reformat data: structural issues, draw sample, confidentiality (general tools). Data dictionary. (txt/pdf)

Presentation transcript:

1 Phases: Original materials; Prepare samples; Integration; Create IPUMSI
Original data. (various)
Reformat data: structural issues; draw sample; confidentiality (general tools)
Data dictionary. (txt/pdf)
Enumeration forms and instructions. (pdf)
Sample designs, census info, etc. (pdf)
Collect metadata for input variables: codes; labels (original language); labels (English); frequencies (Excel, with Perl)
Convert to editable files: translate into English; standardized layout; standardized formatting (Word)
Collate sample information. (Word, tagged)
Collect codes, labels, and frequencies for ALL input variables. (automated)
Tag enumeration text to link it specifically with input variables.
Create translation tables: clean-up recoding only; virtually no special programming (Excel)
Variable descriptions: basic definition of variable; universe; cross references to enumeration text (Word)
Create source variables
Data improvements: allocation; logical editing; pointers
Scripts for special programming (text)
Assemble codes, labels, and frequencies from source variables for harmonized trans tables. (automated)
Collect relevant enumeration text for harmonized variables. (automated)
Create translation tables: recoding matrix (Excel)
Variable descriptions: definition of variable; universe; comparability; general & detailed (Word)
Project-wide control files: countries; samples; variables (Excel)
Create files for public delivery. (pdf & generated HTML)
Create IPUMSI data: creation: Java; reporting: Java; testing: SPSS; extraction: Java
IPUMSI web site. (Java & HTML)
Export IPUMSI metadata for use by major MPC programs. (transfer responsibility to IT)

2 User experience of IPUMSI web site
Registration: more rigorous vetting; more automated registration processes
Access: user preferences; registration expires (1 yr)
Documentation: Integrated variable list. Integrated variable description. Sample designs, etc. Enumeration files in their entirety. Codes and labels, with frequencies. Enumeration text specific to the variable. (assembled by Java)
Source variable list. Source variable description (Java assembles tagged enumeration text). Translation table. Special programming. Source variable metadata: frequencies, labels, and original-language labels.
Data: Select samples. Select variables: integrated (general or detailed); source. Select features: case selection; household aggregation; attached characteristics. Download extract: data; syntax; enhanced codebook

3 How a case gets from the manuscript census into the IPUMS. An example from the 1860 census: John C. Breckinridge of Kentucky. Vice President of the U.S., 1857-1861. Secretary of War, C.S.A., 1865. Later charged with treason, fled to Cuba.

4 Original enumeration form from the 1860 U.S. Census

5 Data entry screen in Minnesota (ca. 1997)

6 Household and person record ready for checking (ca. 1999)

7 Coding dictionary for the occupation variable (ca. 2000)

8 Checked and coded data, ready for release (ca. 2001). Labeled fields: Year, Industry, Page, Wealth, Age, Relationship, Occupation

9 Enumeration form: original file

10 Variable labels file

11 Data file: before reformatting

12 Data file: after reformatting

13 Reformat Rectangular Sample (Brazil 1980)
(Person records only; household data duplicated on person records)
Input records: geography+housing+person (head); geography+housing+person (child); geography+housing+person (child); geography+housing+person (head); geography+housing+person (spouse); geography+housing+person (child); geography+housing+person (child)
Reformatted records: geography+housing; person (head); person (child); geography+housing; person (head); person (spouse); person (child)
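The reformat on this slide can be sketched in a few lines. This is only an illustration, not the project's actual tooling: the field names (`geography`, `housing`, `relate`), the `rectype` and `serial` keys, and the rule that every head record opens a new household are all assumptions made for the example.

```python
def split_rectangular(rows):
    """Turn person-only rows (household data repeated on every row) into
    one household record followed by that household's person records.

    Assumption for this sketch: a row whose 'relate' is 'head' starts a
    new household. A real sample may carry an explicit household number.
    """
    out = []
    serial = 0  # hypothetical running household serial number
    for row in rows:
        if row["relate"] == "head":
            serial += 1
            # Household data is written once, taken from the head's row.
            out.append({"rectype": "H", "serial": serial,
                        "geography": row["geography"],
                        "housing": row["housing"]})
        if serial == 0:
            raise ValueError("first row must be a household head")
        out.append({"rectype": "P", "serial": serial,
                    "relate": row["relate"]})
    return out
```

Taking the household data from the head's row is a design shortcut; it relies on the duplicated household fields being identical across all members of a household.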

14 Reformat Dwelling-Household-Person Sample (Chile 1992)
(Separate dwelling and household records)
Input records: dwelling; household; person (head); person (spouse); person (child); household; person (head); person (child); dwelling; household; person (head); person (spouse)
Reformatted records: dwelling+household; person (head); person (spouse); person (child); dwelling+household; person (head); person (child); dwelling+household; person (head); person (spouse)

15 Reformat Dwelling-Person Sample (Colombia 1993)
(Multi-household dwellings; no separate household record)
Input records: dwelling 001: head, spouse, child, head; dwelling 002: head, child
Reformatted records: household 00101: head, spouse, child; household 00102: head; household 00201: head, child
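One way to read the household numbering on this slide is that the household identifier is the dwelling number followed by a two-digit household sequence within the dwelling. The sketch below assumes that reading, and also assumes that each head record opens a new household; the function name and the tuple layout are hypothetical.

```python
def assign_households(persons):
    """persons: list of (dwelling_id, relate) tuples in file order.
    Returns a list of (household_id, relate) tuples.

    Assumptions for this sketch: each 'head' opens a new household in
    its dwelling, and household_id = dwelling_id + 2-digit sequence.
    """
    result = []
    current_dwelling = None
    hh_number = 0
    for dwelling_id, relate in persons:
        if dwelling_id != current_dwelling:
            # New dwelling: restart the household counter.
            current_dwelling = dwelling_id
            hh_number = 0
        if relate == "head":
            hh_number += 1
        if hh_number == 0:
            raise ValueError(f"dwelling {dwelling_id} does not start with a head")
        result.append((f"{dwelling_id}{hh_number:02d}", relate))
    return result


# The records shown on the slide.
slide_records = [("001", "head"), ("001", "spouse"), ("001", "child"),
                 ("001", "head"), ("002", "head"), ("002", "child")]
household_ids = [hh for hh, _ in assign_households(slide_records)]
# household_ids == ["00101", "00101", "00101", "00102", "00201", "00201"]
```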

16 Merge Separate Household and Person Files (Brazil 2000)
Household File: serial 001 geog & housing; serial 002 geog & housing; serial 003 geog & housing
Person File: serial 001 head; serial 001 spouse; serial 002 head; serial 002 child; serial 003 head
Merged file: serial 001 household; serial 001 head; serial 001 spouse; serial 002 household; serial 002 head; serial 002 child; serial 003 household; serial 003 head
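The merge on this slide amounts to interleaving two files on a shared serial number. A minimal sketch follows, assuming each record is a dict with a `serial` key; the function name and the `"H"`/`"P"` record-type tags are hypothetical and not taken from the slides.

```python
def merge_files(households, persons):
    """Interleave household and person records that share a 'serial' key.

    Returns a list of (rectype, record) pairs: each household ("H")
    followed by its persons ("P") in their original file order.
    """
    by_serial = {}
    for person in persons:
        by_serial.setdefault(person["serial"], []).append(person)

    merged = []
    for household in sorted(households, key=lambda h: h["serial"]):
        merged.append(("H", household))
        for person in by_serial.pop(household["serial"], []):
            merged.append(("P", person))

    # Anything left over is a person whose serial has no household record.
    if by_serial:
        raise ValueError(f"persons without a household: {sorted(by_serial)}")
    return merged
```

Raising on orphaned person records is a choice made for the sketch; a production merge might instead log them for review.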

17 Reformat Individual-level Data (Mexico 1960)
(Individuals only; not organized in households)
[Diagram: records labeled geog, person, and housing, shown alongside household and person records]

18 Enumeration form: editable file, in English

19 Variable description

20 Sample design

21 IPUMS “Pointer” Variables (Colombia 1985) (Simple household)
Location columns: Spouse’s, Mother’s, Father’s
Pernum  Relate  Age  Sex     Marst    Chborn  Spouse’s  Mother’s  Father’s
1       head    46   male    married  n/a     2         0         0
2       spouse  44   female  married  3       1         0         0
3       aunt    77   female  widow    7       0         0         0
4       child   15   female  single   0       0         2         1
5       child   13   female  single   n/a     0         2         1
6       child   11   male    single   n/a     0         2         1
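For a simple household like this one, the pointer values can be approximated from the relationship and sex columns alone. The sketch below is a deliberately simplified rule and not the IPUMS procedure, which the slides do not spell out: it links head and spouse to each other, and treats the head and the head's spouse as the parents of every record coded `child`. All function and key names are hypothetical.

```python
def simple_pointers(persons):
    """persons: list of dicts with 'pernum', 'relate', and 'sex'.
    Returns {pernum: (spouse_loc, mother_loc, father_loc)}; 0 means the
    relative is not present in the household.

    Simplification: only head, spouse, and child-of-head links are made,
    and the head's spouse is assumed to be a parent of the head's children.
    """
    head = next((p for p in persons if p["relate"] == "head"), None)
    spouse = next((p for p in persons if p["relate"] == "spouse"), None)
    parents = [p for p in (head, spouse) if p is not None]
    mother = next((p["pernum"] for p in parents if p["sex"] == "female"), 0)
    father = next((p["pernum"] for p in parents if p["sex"] == "male"), 0)

    pointers = {}
    for p in persons:
        spouse_loc = mother_loc = father_loc = 0
        if p["relate"] == "head" and spouse is not None:
            spouse_loc = spouse["pernum"]
        elif p["relate"] == "spouse" and head is not None:
            spouse_loc = head["pernum"]
        elif p["relate"] == "child":
            mother_loc, father_loc = mother, father
        pointers[p["pernum"]] = (spouse_loc, mother_loc, father_loc)
    return pointers
```

A rule this simple cannot handle households where the links are not to the head, such as a married child living with a child-in-law and grandchildren; those cases need additional evidence such as age, marital status, and record order.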

22 IPUMS “Pointer” Variables (Colombia 1985) (Complex household)
Location columns: Spouse’s, Father’s, Mother’s
Pernum  Relationship  Age  Sex     Marst      Chborn  Spouse’s  Father’s  Mother’s
1       head          53   female  separated  6       0         0         0
2       child         28   male    single     n/a     0         0         1
3       child         22   male    single     n/a     0         0         1
4       child         21   male    single     n/a     0         0         1
5       child         25   female  married    2       6         0         1
6       child-in-law  28   male    married    n/a     5         0         0
7       grandchild    3    male    single     n/a     0         6         5
8       grandchild    1    male    single     n/a     0         6         5
9       non-relative  32   female  separated  2       0         0         0
10      non-relative  10   male    single     n/a     0         0         9
11      non-relative  5    female  single     n/a     0         0         9

23 Project control file: variables

24 Translation table

25 Translation Matrix – Marital Status How we integrate variables across countries and time

26 Translation Matrix – Marital Status location of data in the original samples

27 Translation Matrix – Marital Status marital codes used in the 1973 Colombian census

28 Translation Matrix – Marital Status different original codes for “widowed” across the censuses

29 Translation Matrix – Marital Status final IPUMS coding scheme for marital status
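The idea behind the translation matrix, where each sample's original codes map to one shared coding scheme, can be sketched as a lookup table keyed by sample and original code. Every sample name and code value below is invented for illustration; none of them are the actual census or IPUMS codes, which appear only in the slide images.

```python
# Hypothetical translation table: sample -> {original code -> harmonized label}.
# Sample names and codes are placeholders, not real census values.
TRANSLATION = {
    "sample_a": {"1": "single", "2": "married", "3": "widowed"},
    "sample_b": {"S": "single", "C": "married", "V": "widowed"},
}


def harmonize(sample, original_code, table=TRANSLATION, unknown="unknown"):
    """Map one sample's original marital-status code to the shared scheme.

    Codes missing from the table fall back to `unknown` so that gaps in
    the translation table are visible instead of raising mid-run.
    """
    return table[sample].get(original_code, unknown)


# Different original codes for "widowed" resolve to the same category.
same_category = harmonize("sample_a", "3") == harmonize("sample_b", "V")
```

Keeping the mapping as data and not as per-sample code mirrors the slides' point that recoding is handled in tables with little special programming.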

30 Source variable translation table

31 Tagged enumeration form

