Design and Use of the IPUMS-International Data Series http://international.ipums.org Matt Sobek Minnesota Population Center sobek@pop.umn.edu
IPUMS-International Overview Processing Dissemination system Strengths and limitations Users
What is IPUMS-International? Census data – 1960 to present Samples – 1 to 10%, nationally representative Microdata – individual-level Integrated – consistent codes across time and place Downloadable – anonymized Extract system – select variables – pooled data
Map of IPUMS Partners 83 countries Dark green = disseminating data Light green = partners, not yet disseminating
Current Countries in IPUMS
Africa: Egypt, Ghana, Guinea, Kenya, Rwanda, South Africa, Uganda
Asia: Armenia, Cambodia, China, India, Iraq, Israel, Jordan, Kyrgyz Rep., Malaysia, Mongolia, Palestine, Philippines, Vietnam
Americas: Argentina, Bolivia, Brazil, Canada, Chile, Colombia, Costa Rica, Ecuador, Mexico, Panama, United States, Venezuela
Europe: Austria, Belarus, France, Greece, Hungary, Italy, Netherlands, Portugal, Romania, Slovenia, Spain, United Kingdom
44 countries, 130 samples, 279 million persons
Countries in IPUMS Archive Bangladesh Botswana Cuba Czech Republic Dominican Rep. El Salvador Ethiopia Fiji Germany Guatemala Haiti Honduras Indonesia Liberia Madagascar Malawi Mali Mauritius Nepal Nicaragua Pakistan Paraguay Peru Puerto Rico Senegal Saint Lucia Sierra Leone Sudan Switzerland Tanzania Thailand Turkmenistan Uruguay Zambia
IPUMS Microdata Relation to head Marital status Literacy Occupation
Availability of Selected Person Variables (Number of samples)
Availability of Selected Household Variables (Number of samples) 536 Integrated variables 10,600 Unharmonized variables
User Access Application Scholarly and educational purposes Key: it must not be redistributed Once approved, access to all data Free
Making the IPUMS Pre-processing Integration Dissemination
Making the IPUMS
Pre-processing: language translation, reformatting, error correction, sampling, confidentiality
Integration: metadata, data harmonization, constructed variables
Census Questionnaire (Mexico 2000) Water Access
Text of Census Questionnaire (Mexico 2000)
XML-Tagged Census Questionnaire (Mexico 2000) Water access
Data Integration – Marital Status China 1982 Colombia 1973 Kenya 1989 Mexico 1970 U.S.A. 1990
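The idea behind integrated variables such as marital status can be sketched as a per-sample crosswalk from each census's own codes to one common coding scheme. The snippet below is only an illustration: the sample names, source codes, and common codes are all hypothetical and are not taken from the IPUMS codebooks.

```python
# Hypothetical common coding scheme for marital status (not the actual IPUMS codes).
HARMONIZED = {
    1: "single",
    2: "married or in union",
    3: "separated or divorced",
    4: "widowed",
    9: "unknown",
}

# Hypothetical crosswalks: each sample maps its own source codes to the common codes.
# The source codes here are invented for illustration.
CROSSWALKS = {
    "sample_a": {1: 1, 2: 2, 3: 2, 4: 3, 5: 4},  # e.g. source code 3 = consensual union
    "sample_b": {0: 1, 1: 2, 2: 4, 3: 3},
}


def harmonize(sample, source_code):
    """Return the common code for a sample-specific code, or 9 (unknown) if unmapped."""
    return CROSSWALKS[sample].get(source_code, 9)


# Pool records from two samples and recode them into the common scheme.
records = [("sample_a", 3), ("sample_b", 2), ("sample_b", 7)]
harmonized = [harmonize(sample, code) for sample, code in records]
labels = [HARMONIZED[code] for code in harmonized]
```

Once every sample is recoded this way, records from different countries and years can be pooled and tabulated on the same variable.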
Family Interrelationship Variables (Simple household)
Pernum | Relate | Age | Sex | Marst | Chborn | Spouse's Location | Mother's Location | Father's Location
1 | head | 46 | male | married | n/a | 2 | |
2 | spouse | 44 | female | | | 1 | |
3 | aunt | 77 | | widow | 7 | | |
4 | child | 15 | | single | | | 2 | 1
5 | | 13 | | | | | 2 | 1
6 | | 11 | | | | | 2 | 1
IPUMS “Pointer” Variables (Complex household) Spouse’s Mother’s Father’s Pernum Relationship Age Sex Marst Chborn 1 head 53 female separated 6 2 child 28 male single n/a 3 22 4 21 5 25 married child-in-law 7 grandchild 8 9 non-relative 32 10 11 Location Location Location 1 6 5 5 6 5 6 9 9
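A location pointer holds the person number (Pernum) of a spouse, mother, or father within the same household, so a relative's characteristics can be attached by a simple lookup. The sketch below uses the ages and pointer values from the simple household shown earlier. The field names (`sploc`, `momloc`, `poploc`, `age_sp`, `age_mom`), the use of 0 for "no such person in the household", and the reading of persons 5 and 6 as children are assumptions made for illustration.

```python
# Simple household from the slides. Field names and the 0 = "no link" convention
# are illustrative assumptions; persons 5 and 6 are assumed to be children.
household = [
    {"pernum": 1, "relate": "head",   "age": 46, "sploc": 2, "momloc": 0, "poploc": 0},
    {"pernum": 2, "relate": "spouse", "age": 44, "sploc": 1, "momloc": 0, "poploc": 0},
    {"pernum": 3, "relate": "aunt",   "age": 77, "sploc": 0, "momloc": 0, "poploc": 0},
    {"pernum": 4, "relate": "child",  "age": 15, "sploc": 0, "momloc": 2, "poploc": 1},
    {"pernum": 5, "relate": "child",  "age": 13, "sploc": 0, "momloc": 2, "poploc": 1},
    {"pernum": 6, "relate": "child",  "age": 11, "sploc": 0, "momloc": 2, "poploc": 1},
]


def attach(persons, pointer, source_field, new_field):
    """Copy source_field from the person each pointer refers to into new_field."""
    by_pernum = {p["pernum"]: p for p in persons}
    for person in persons:
        target = by_pernum.get(person[pointer])  # None when the pointer is 0
        person[new_field] = target[source_field] if target else None
    return persons


attach(household, "sploc", "age", "age_sp")    # age of spouse
attach(household, "momloc", "age", "age_mom")  # age of mother
```

The same lookup works for any characteristic, for example the father's occupation or employment status, by changing the pointer and the source field.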
Family Interrelationship Pointers 13 censuses include data on location of parent or spouse Agree Disagree Spouse 99.5 0.5 Mother 98.7 1.3 Father 99.4 0.6 Mother 97.5 2.5 Father 98.7 1.3 Under age 18
IPUMS Home Page
Variables Page
Sample Filtering
Variables Page
Unharmonized Variables
Variable Description (Marital status)
Comparability Discussion (Marital status)
Enumeration Text (Marital status)
Enumeration Text (Marital status, Cambodia)
Variable Codes (Marital status)
IPUMS Home Page
Extract Step 1 – Login
Extract Step 2 – Select Samples
Extract Step 3 – Select Variables
Extract Step 4 – Variable Options
Extract Step 4 – Select Cases
Extract Step 4 – Attach Characteristics Age of spouse Employment status of father Occupation of father
Extract Step 5 – Customize Sample Sizes
Extract Step 6 – Submit
Download or Revise Extract
Key Strengths of the Census Samples Large: enable study of relatively small populations. Internationally comparable: pool data across countries using integrated variables. Temporal depth: provide historical perspective.
Key Strengths of the Census Samples Microdata: all of a person’s characteristics, allowing multivariate analysis. Hierarchical: characteristics of everyone a person resided with; cohabitation and family interrelationships.
Limitations Due to Confidentiality Samples Too small to answer some questions Geography 20,000 population or larger Sensitive variables, very small categories
Other Issues and Limitations Cross-sectional data Not longitudinal User burden Information overload; culturally specific knowledge Variable labels are insufficient
IPUMS Users 2200 registered users; 54% graduate students. Academic field (%): Economics 47, Demography 21, Sociology 10, Other 22
Samples Extracted 67% multiple samples 45% multiple countries 17% 5 or more countries
Decade of Extracted Sample Decade Percent 1960s 11 1970s 14 1980s 16 1990s 30 2000s 29
Most Frequently Extracted Countries 1. Mexico 2. Brazil 3. United States 4. Colombia 5. France 6. Chile 7. Ecuador 8. Vietnam 9. Kenya 10. Argentina
Most Frequently Extracted Variables Relation to head Age Sex Marital status Educational attainment Years of schooling School attendance Literacy Employment status Class of worker Occupation recode Industry recode Occupation Industry Urban-rural status Country of birth Nativity status Migration status, 5 years Children ever born Children surviving Religion Ownership of dwelling Water Electricity Sewage Number of rooms Toilet Earned income Total income Spouse’s location in household
Median Age by Country (Calculated from the most recent sample from each country.)
Population Pyramids Palestine Egypt Iraq
Population Pyramids Young (Uganda 2002) Medium (Philippines 2000) Old (USA 2005)
Population Pyramids Belarus 1998 Cambodia 1998 China 1990
Population Pyramids Mexico 1960 1990 2005
Married Female Labor Force Participation in Latin America (age 18 to 65) [Figure: percent in labor force, 1960 to 2005; series: Brazil, Chile, Ecuador, Colombia, Venezuela, Mexico, Costa Rica]
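A participation rate like the one charted here can be computed directly from person records by restricting to married women aged 18 to 65 and taking the share in the labor force. The sketch below is a minimal illustration: the field names, the person weight, and the toy records are all invented for the example and are not IPUMS data.

```python
def percent_in_labor_force(persons, min_age=18, max_age=65):
    """Weighted percent of married women in the age range who are in the labor force."""
    eligible = [
        p for p in persons
        if p["sex"] == "female"
        and p["marst"] == "married"
        and min_age <= p["age"] <= max_age
    ]
    total = sum(p["weight"] for p in eligible)
    if total == 0:
        return None  # no eligible cases in this sample
    in_lf = sum(p["weight"] for p in eligible if p["in_labor_force"])
    return 100.0 * in_lf / total


# Toy records (invented). "weight" is an assumed number of persons each record represents.
persons = [
    {"sex": "female", "marst": "married", "age": 30, "weight": 10, "in_labor_force": True},
    {"sex": "female", "marst": "married", "age": 40, "weight": 10, "in_labor_force": False},
    {"sex": "female", "marst": "married", "age": 50, "weight": 20, "in_labor_force": False},
    {"sex": "female", "marst": "single",  "age": 25, "weight": 10, "in_labor_force": True},
    {"sex": "male",   "marst": "married", "age": 35, "weight": 10, "in_labor_force": True},
    {"sex": "female", "marst": "married", "age": 70, "weight": 10, "in_labor_force": True},
]

rate = percent_in_labor_force(persons)
```

Running the same function on each country and census year, using harmonized variables, would give one point per sample on a chart of this kind.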
Married Female Labor Force Participation: Latin America and U.S. (age 18 to 65) [Figure: percent in labor force, 1920 to 2010; series: United States, Latin America]
Married Female Labor Force Participation: Latin America and U.S. (age 18 to 65) Compare Latin America to U.S. 40 years earlier [Figure: percent in labor force, 1920 to 2010; series: United States, Brazil, Colombia, Chile, Ecuador, Venezuela, Mexico, Costa Rica]
Married Female Labor Force Participation: Mexican-born Women, 1970-2000 [Figure: percent in labor force, 1970 to 2000; series: Mexican-born women in United States, women in Mexico]
Percent of elders in elder-head intergenerational families
Percent of elders in younger-head families
Trends in Intergenerational Families Intergenerational families headed by the older generation are becoming more common in most countries, with exceptions mainly in Africa. Intergenerational families headed by the younger generation—the configuration that suggests old-age support—are much rarer, and they are on the decline in most countries.