1
Integrated Public Use Microdata Series (IPUMS)
Matt Sobek, Minnesota Population Center, sobek@pop.umn.edu
2
IPUMS Overview 1. What is the IPUMS
2. Harmonization 3. Additional Data Enhancements 4. Access 5. Strengths and Limitations 6. Research examples
3
Brief History IPUMS-USA 1991 -- Steve Ruggles
All existing samples of US census Data extraction system 1998 IPUMS-International 2001 2004 IPUMS-Latin America 2005 IPUMS-Europe 2005 NSF Expansion World’s largest collection census data 30 samples per year for the next 3 years Bob McCaa
4
Datasets in IPUMS
5
IPUMS Census Sample Holdings and Release Dates
6
IPUMS Global Coverage
Dark green = disseminating
Medium green = data held by IPUMS
Light green = negotiating
Yellow = not negotiating
7
Selected Variable Availability -- PERSON
8
Selected Variable Availability -- HOUSEHOLD
9
What Are Microdata?
Individual-level data
• every record represents a separate person
• all of their individual characteristics are recorded
• users must manipulate the data themselves
Different from aggregate/summary/tabular data
• a count of persons by municipality
• an employment status table by sex from a published census volume
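To make the contrast concrete, here is a minimal sketch of how individual-level records can be collapsed into the kind of aggregate table a published census volume would contain. The records, field names (`sex`, `empstat`), and values are invented for illustration and are not taken from any IPUMS sample.

```python
from collections import Counter

# Hypothetical microdata: one dict per person (all values invented).
persons = [
    {"sex": "male", "empstat": "employed"},
    {"sex": "female", "empstat": "employed"},
    {"sex": "female", "empstat": "not in labor force"},
    {"sex": "male", "empstat": "unemployed"},
    {"sex": "female", "empstat": "employed"},
]

# Collapse the person records into an aggregate table:
# a count of persons for each (sex, employment status) cell.
table = Counter((p["sex"], p["empstat"]) for p in persons)

for (sex, empstat), n in sorted(table.items()):
    print(f"{sex:8s}{empstat:20s}{n}")
```

Going from microdata to a table is a one-line aggregation; going the other way is not possible, which is why the individual records are the more flexible starting point.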
10
Kenya 1999 Census Questionnaire
Now, I want to back up for a minute. Before we get too far into examples of how microdata can be used, I want to talk briefly about how microdatasets are constructed, and what a user actually gets when they request one of our datasets. This is the source of it all. I’m sure all of you have seen these records. They’re organized by household. [go through (a) geography, (b) a family] [Emphasize how variables are in COLUMNS]. What we do is type all this in and convert every possible variable to a number. So when users request a dataset from us, they don’t get a table or an immediate answer to their question. Rather, they get mostly raw numeric data that they have to manipulate themselves.
11
Raw Census Microdata from IPUMS
So, here’s a typical slice of a census microdataset. Note the “H”s and “P”s at the beginning of each line. H = household record: information common to the whole household. P = person record: these are the people from the census form, with all of their information turned into numbers. Without going into too much detail, I want to show you a little more about the structure of a census microdataset.
12
IPUMS Data Structure
[Figure: sample records with labeled columns: Age, Birthplace, Sex, Mother’s birthplace, Relationship, Race, Occupation]
Household record (shaded) followed by a person record for each member of the household
For each type of record, columns correspond to specific variables
Just like the original census form, the data are organized by columns. Each column represents a variable. [go through some variables, especially relationship]
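The hierarchical layout described here (an "H" record followed by one "P" record per household member, with each variable in fixed columns) can be read with a few lines of code. The sketch below is only an illustration: the column positions, variable names (`serial`, `numprec`, `pernum`, `relate`, `age`, `sex`), and sample lines are hypothetical, and a real extract comes with its own codebook giving the actual positions.

```python
# Hypothetical fixed-width layouts: name -> (start, end) column slice.
# Column 0 always holds the record type ("H" or "P").
H_LAYOUT = {"serial": (1, 6), "numprec": (6, 8)}
P_LAYOUT = {"pernum": (1, 3), "relate": (3, 5), "age": (5, 8), "sex": (8, 9)}

# Invented sample data: two households, three persons then one person.
RAW = """\
H0000103
P01010461
P02020442
P03030151
H0000201
P01010332
"""


def parse(line, layout):
    """Slice one fixed-width line into a dict of integer variables."""
    return {name: int(line[a:b]) for name, (a, b) in layout.items()}


def read_households(lines):
    """Group each run of P records under the H record that precedes it."""
    households = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line[0] == "H":
            hh = parse(line, H_LAYOUT)
            hh["persons"] = []
            households.append(hh)
        elif line[0] == "P":
            if not households:
                raise ValueError("person record before any household record")
            households[-1]["persons"].append(parse(line, P_LAYOUT))
        else:
            raise ValueError(f"unknown record type: {line[0]!r}")
    return households


households = read_households(RAW.splitlines())
for hh in households:
    print(hh["serial"], [p["age"] for p in hh["persons"]])
```

Keeping persons nested under their household preserves the link between household-level and person-level information, which is what later makes co-resident analysis possible.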
13
The Advantages of Microdata
Combination of all of a person’s characteristics
Characteristics of everyone with whom a person lived
Freedom to make any table you need
Freedom to make models examining multivariate relationships
Basically, you are only limited by the questions asked in the particular census
Why would you want to work with data like this? It’s certainly more of a hassle to use.
-- Combination of all of a person’s characteristics: you don’t just know, say, how many southern migrants there were in the north. You know their age, marital status, occupation, income, how long they’ve been there, etc.
-- Characteristics of everyone with whom a person lived: a friend of mine studying junk peddlers wanted to know whether their children’s rate of school attendance was comparable to that of other children. You would never see a census table on this. With IPUMS, it’s a cinch.
-- Freedom to make any table you need.
-- Freedom to model multivariate relationships. EXAMPLE?
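The junk-peddler question can be sketched in code: attach the household head's occupation to each child in the same household, then compare school attendance rates across occupations. Everything below is an assumption made for illustration (the field names `relate`, `occ`, `school`, the age range, and the invented records); it is not the analysis described in the talk.

```python
# Invented households; each person is a dict with hypothetical fields.
households = [
    {"persons": [
        {"relate": "head", "age": 40, "occ": "peddler", "school": False},
        {"relate": "child", "age": 10, "occ": None, "school": True},
        {"relate": "child", "age": 13, "occ": None, "school": False},
    ]},
    {"persons": [
        {"relate": "head", "age": 38, "occ": "clerk", "school": False},
        {"relate": "child", "age": 9, "occ": None, "school": True},
        {"relate": "child", "age": 12, "occ": None, "school": True},
    ]},
]


def attendance_by_head_occ(households, min_age=6, max_age=14):
    """School attendance rate of children, grouped by the head's occupation."""
    totals = {}  # occupation -> [children attending, children counted]
    for hh in households:
        head = next((p for p in hh["persons"] if p["relate"] == "head"), None)
        if head is None:
            continue  # no head recorded; nothing to attach
        for p in hh["persons"]:
            if p["relate"] == "child" and min_age <= p["age"] <= max_age:
                cell = totals.setdefault(head["occ"], [0, 0])
                cell[0] += int(p["school"])
                cell[1] += 1
    return {occ: attending / n for occ, (attending, n) in totals.items()}


rates = attendance_by_head_occ(households)
print(rates)
```

No published census table crosses children's schooling with a parent's detailed occupation, but with household-organized microdata it is a short loop.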
14
IPUMS Overview 1. What is the IPUMS 2. Harmonization
3. Additional Data Enhancements 4. Access 5. Strengths and Limitations 6. Research examples
15
Translation Table – Marital Status (IPUMS-International)
China 1982 Colombia 1973 Kenya 1989 Mexico 1970 U.S.A. 1990
16
Translation Table – Marital Status
General Codes
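A translation table of this kind amounts to a lookup from each sample's original codes to one shared set of general codes. The sketch below shows the idea only: the sample names, source codes, general codes, and labels are all invented and should not be read as the actual IPUMS-International marital status codes.

```python
# Hypothetical general (harmonized) codes and their labels.
GENERAL = {
    1: "single/never married",
    2: "married/in union",
    3: "separated/divorced",
    4: "widowed",
    9: "unknown",
}

# Hypothetical translation table: sample -> {source code: general code}.
TRANSLATION = {
    "sample_a": {1: 1, 2: 2, 3: 4, 4: 3},
    # Two source categories (e.g. married, consensual union) share code 2.
    "sample_b": {0: 1, 1: 2, 2: 2, 3: 3, 4: 4},
}


def harmonize(sample, source_code):
    """Map a sample-specific code to the general code (9 if unmapped)."""
    return TRANSLATION[sample].get(source_code, 9)


print(GENERAL[harmonize("sample_a", 3)])
print(GENERAL[harmonize("sample_b", 2)])
```

Because every sample maps into the same general codes, a single tabulation can run across countries and census years without per-sample recoding.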
17
Variable Description: Literacy (International)
18
IPUMS Overview 1. What is the IPUMS 2. Harmonization
3. Additional Data Enhancements 4. Access 5. Strengths and Limitations 6. Research examples
19
IPUMS “Pointer” Variables
(Simple household)
Columns: Pernum, Relate, Age, Sex, Marst, Chborn, Spouse’s Location, Mother’s Location, Father’s Location
1 head 46 male married n/a
2 spouse 44 female
3 aunt 77 widow 7
4 child 15 single
5 13
6 11
Spouse’s Location values: 2 1
Mother’s and Father’s Location values: 2 1 2 1 2 1
20
IPUMS “Pointer” Variables
(Complex household) Spouse’s Mother’s Father’s Pernum Relationship Age Sex Marst Chborn 1 head 53 female separated 6 2 child 28 male single n/a 3 22 4 21 5 25 married child-in-law 7 grandchild 8 9 non-relative 32 10 11 Location Location Location 1 6 5 5 6 5 6 9 9
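The idea behind pointer variables can be sketched for the simple household: each person gets the Pernum of their spouse, mother, and father within the household, or 0 if none is present. The rule below is a deliberately simplified assumption (head and spouse point to each other; every "child" points to the head and spouse by sex) and is not the actual IPUMS procedure, which must also handle complex households. The names `sploc`, `momloc`, and `poploc` are used here as shorthand for the spouse's, mother's, and father's location columns.

```python
# Simple household using only the values that survive on the slide.
household = [
    {"pernum": 1, "relate": "head", "age": 46, "sex": "male"},
    {"pernum": 2, "relate": "spouse", "age": 44, "sex": "female"},
    {"pernum": 3, "relate": "aunt", "age": 77},
    {"pernum": 4, "relate": "child", "age": 15},
    {"pernum": 5, "relate": "child", "age": 13},
    {"pernum": 6, "relate": "child", "age": 11},
]


def add_pointers(persons):
    """Attach sploc/momloc/poploc using a simplified head/spouse/child rule."""
    head = next((p for p in persons if p["relate"] == "head"), None)
    spouse = next((p for p in persons if p["relate"] == "spouse"), None)

    # 0 means "no such person in this household".
    for p in persons:
        p["sploc"] = p["momloc"] = p["poploc"] = 0

    # Head and spouse point at each other.
    if head is not None and spouse is not None:
        head["sploc"] = spouse["pernum"]
        spouse["sploc"] = head["pernum"]

    # Children of the head point at the head and spouse, by parent's sex.
    for p in persons:
        if p["relate"] != "child":
            continue
        for parent in (head, spouse):
            if parent is None:
                continue
            key = "momloc" if parent["sex"] == "female" else "poploc"
            p[key] = parent["pernum"]
    return persons


for p in add_pointers(household):
    print(p["pernum"], p["relate"], p["sploc"], p["momloc"], p["poploc"])
```

In a complex household a rule this simple would fail, since a grandchild's parents are themselves children of the head; linking them requires weighing relationship, age, marital status, and record order together.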
21
IPUMS Overview 1. What is the IPUMS 2. Harmonization
3. Additional Data Enhancements 4. Access 5. Strengths and Limitations 6. Dissemination
22
IPUMS Access Restricted access Scholarly and educational purposes
Conditions of use: key is not to redistribute Serious vetting
23
IPUMS Overview 1. What is the IPUMS 2. Harmonization
3. Additional Data Enhancements 4. Access 5. Strengths and Limitations 6. Research examples
24
Census Microdata Samples
4 Key Strengths of the Census Microdata Samples
Large
• More cases than any comparable datasets
• Enable study of relatively small populations
National in scope
• Results not subject to local peculiarities
• Provide context for local studies
Temporal depth
• Provide historical perspective
Microdata
• Can make your own tabulations
• Apply multivariate techniques
To summarize…
25
Limitations of the Microdata Samples
Samples
• Too small to answer some questions
Confidentiality
• Geography: 20,000 population or larger
• Sensitive variables, swapping, etc.
26
Other Issues and Limitations
Not annual
• Any historical analysis will have gaps
Cross-sectional data
• Not longitudinal
Very large extracts
• Need knowledge of a statistical package
User burden
• Information overload; culturally specific knowledge
27
IPUMS Overview 1. What is the IPUMS 2. Harmonization
3. Additional Data Enhancements 4. Users and Access 5. Strengths and Limitations 6. Research examples
28
IPUMS-International Research Topics
• Child labor outside the household in Mexico and Colombia
• Effect of NAFTA on educational attainment and school enrollment by region within Mexico
• Concentration of mortality within families in Kenya
• Life course patterns of co-residence among Mexicans in Mexico, Mexicans in the U.S., and Mexican Americans
• Brain drain from developing countries
• How language diversity is affected by migration and economic factors
29
Married Female Labor Force Participation in Latin America
(age 18 to 65)
[Line chart: percent in labor force, 1960 to 2005; series for Brazil, Chile, Ecuador, Colombia, Venezuela, Mexico, and Costa Rica]
30
Married Female Labor Force Participation: Latin America and U.S. (age 18 to 65)
[Line chart: percent in labor force, 1920 to 2010; series for United States and Latin America]
31
Married Female Labor Force Participation: Latin America and U.S. (age 18 to 65)
Compare Latin America to U.S. 40 years ago
[Line chart: percent in labor force, 1920 to 2010; series for United States, Brazil, Colombia, Chile, Ecuador, Venezuela, Mexico, and Costa Rica]
32
Married Female Labor Force Participation: Mexican-born Women,
[Line chart: percent in labor force, 1970 to 2000; series for Mexican-born Women in United States and Women in Mexico]
36
End