Matt Sobek Minnesota Population Center

Integrated Public Use Microdata Series (IPUMS) www.ipums.org Matt Sobek Minnesota Population Center sobek@pop.umn.edu

IPUMS Overview 1. What is the IPUMS 2. Harmonization 3. Additional Data Enhancements 4. Access 5. Strengths and Limitations 6. Research examples

Brief History IPUMS-USA 1991 -- Steve Ruggles All existing samples of US census Data extraction system 1998 IPUMS-International 2001 2004 IPUMS-Latin America 2005 IPUMS-Europe 2005 NSF Expansion World’s largest collection census data 30 samples per year for the next 3 years Bob McCaa

Datasets in IPUMS

IPUMS Census Sample Holdings and Release Dates

IPUMS Global Coverage Dark green = disseminating Medium green = data held by IPUMS Light green = negotiating Yellow = not negotiating

Selected Variable Availability -- PERSON

Selected Variable Availability -- HOUSEHOLD

What Are Microdata? Individual-level data • every record represents a separate person • all of their individual characteristics are recorded • users must manipulate the data themselves Different from aggregate/summary/tabular data • a count of persons by municipality • an employment status table by sex from a published census volume
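The difference between microdata and a published table can be sketched in a few lines of Python. This is only an illustration: the records, the field names `sex` and `empstat`, and the helper `crosstab` are made up for the example and do not come from any IPUMS file. The point is that the user builds the table from person records, rather than receiving the table.

```python
from collections import Counter

# Hypothetical person-level microdata: one record per person.
# Field names and values are illustrative assumptions.
persons = [
    {"sex": "male", "empstat": "employed"},
    {"sex": "female", "empstat": "employed"},
    {"sex": "female", "empstat": "not in labor force"},
    {"sex": "male", "empstat": "unemployed"},
    {"sex": "female", "empstat": "employed"},
]


def crosstab(records, row_var, col_var):
    """Count records for each (row_var, col_var) combination."""
    return Counter((r[row_var], r[col_var]) for r in records)


# Build the "employment status by sex" table that a census volume
# would publish ready-made.
table = crosstab(persons, "empstat", "sex")
for (empstat, sex), n in sorted(table.items()):
    print(f"{empstat:20s} {sex:8s} {n}")
```

Any other pair of variables could be passed to `crosstab`, which is the flexibility a fixed published table does not offer.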

Kenya 1999 Census Questionnaire Now, I want to back up for a minute. Before we get too far into examples of how microdata can be used, I want to talk to you briefly about how microdata sets are constructed, and what a user actually gets when they request one of our datasets. This is the source of it all. I’m sure all of you have seen these records. They’re organized by household. [go through (a) geography, (b) a family] [Emphasize how variables are in COLUMNS]. What we do is type all this in and convert every possible variable to a number. So when users request a dataset from us, they don’t get a table or an immediate answer to their question. Rather, they get mostly raw numeric data that they have to manipulate.

Raw Census Microdata from IPUMS So, here’s a typical slice of a census microdataset. Note the “H”s and “P”s at the beginning of each line. H = household record: info common to the whole household. P = person record: these are the people from the census form, with all of their info turned into numbers. Without going into too much detail, I want to show you a little more about the structure of a census microdata set.

IPUMS Data Structure [Figure labels: Age, Birthplace, Sex, Mother’s birthplace, Relationship, Race, Occupation] Household record (shaded) followed by a person record for each member of the household. For each type of record, columns correspond to specific variables. Just like the original census form, the data are organized by columns. Each column represents a variable. [go through some variables, especially relationship]
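A rough sketch of how such a hierarchical file might be read: each line starts with “H” or “P”, and every person record belongs to the most recent household record. The column positions, variable names, and sample lines below are invented for illustration. A real extract comes with a codebook giving the actual layout.

```python
# Hypothetical fixed-width layouts: variable -> (start, end) string slice.
# Real column positions come from the codebook delivered with an extract.
HOUSEHOLD_LAYOUT = {"serial": (1, 5), "region": (5, 7)}
PERSON_LAYOUT = {"pernum": (1, 3), "age": (3, 6), "sex": (6, 7)}


def parse_record(line, layout):
    """Slice one fixed-width line into integer-valued variables."""
    return {name: int(line[start:end]) for name, (start, end) in layout.items()}


def read_households(lines):
    """Group person records under the household record that precedes them."""
    households = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        rectype = line[0]
        if rectype == "H":
            households.append(
                {"household": parse_record(line, HOUSEHOLD_LAYOUT), "persons": []}
            )
        elif rectype == "P":
            if not households:
                raise ValueError("person record before any household record")
            households[-1]["persons"].append(parse_record(line, PERSON_LAYOUT))
        else:
            raise ValueError(f"unknown record type: {rectype!r}")
    return households


# Made-up sample lines in the hypothetical layout above.
sample = [
    "H000103",
    "P010461",
    "P020442",
    "H000207",
    "P010772",
]
households = read_households(sample)
for hh in households:
    print(hh["household"], len(hh["persons"]), "person(s)")
```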

The Advantages of Microdata
• Combination of all of a person’s characteristics
• Characteristics of everyone with whom a person lived
• Freedom to make any table you need
• Freedom to make models examining multivariate relationships
• Basically, you are only limited by the questions asked in the particular census
Why would you want to work with data like this? It’s certainly more of a hassle to use…
--combination of all of a person’s characteristics: so you don’t just know, say, how many southern migrants in the north there were. You know their age, marital status, occupation, income, how long they’ve been there, etc.
--characteristics of everyone with whom a person lived. A friend of mine studying junk peddlers wanted to know whether their children’s rates of school attendance were comparable to those of other children. You would never see a census table on this. With IPUMS, it’s a cinch.
--freedom to make any table you need
--freedom to model using multivariate relationships. [EXAMPLE?]
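The junk-peddler question can be sketched as follows, assuming person records that carry a household identifier. Everything here is hypothetical: the records, the field names (`serial`, `relate`, `occ`, `school`), and the age range are stand-ins chosen for the example. It is not real census data or a real IPUMS layout.

```python
from collections import defaultdict

# Hypothetical person records; "serial" identifies the household.
persons = [
    {"serial": 1, "relate": "head", "age": 40, "occ": "peddler", "school": None},
    {"serial": 1, "relate": "child", "age": 10, "occ": None, "school": True},
    {"serial": 1, "relate": "child", "age": 12, "occ": None, "school": False},
    {"serial": 2, "relate": "head", "age": 38, "occ": "clerk", "school": None},
    {"serial": 2, "relate": "child", "age": 9, "occ": None, "school": True},
    {"serial": 2, "relate": "child", "age": 14, "occ": None, "school": True},
]


def attendance_by_head_occupation(records, min_age=6, max_age=14):
    """School attendance rate of children, grouped by the household head's occupation."""
    # Look up each household head's occupation once.
    head_occ = {r["serial"]: r["occ"] for r in records if r["relate"] == "head"}
    counts = defaultdict(lambda: [0, 0])  # occupation -> [attending, total]
    for r in records:
        if r["relate"] == "child" and min_age <= r["age"] <= max_age:
            occ = head_occ.get(r["serial"])
            if occ is None:
                continue  # no head record found for this household
            counts[occ][0] += int(r["school"])
            counts[occ][1] += 1
    return {occ: attending / total for occ, (attending, total) in counts.items()}


rates = attendance_by_head_occupation(persons)
print(rates)
```

The key step is attaching a characteristic of one household member (the head's occupation) to other members (the children), which a published table cannot do after the fact.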

IPUMS Overview 1. What is the IPUMS 2. Harmonization 3. Additional Data Enhancements 4. Access 5. Strengths and Limitations 6. Research examples

Translation Table – Marital Status (IPUMS-International) China 1982 Colombia 1973 Kenya 1989 Mexico 1970 U.S.A. 1990

Translation Table – Marital Status General Codes
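In code, a translation table amounts to a lookup from each sample's original codes to one shared set of general codes. The sketch below is only an assumption-laden illustration: the sample names, the source codes, and the general code values and labels are invented here and are not the contents of the actual IPUMS translation table.

```python
# Illustrative general (harmonized) codes; values and labels are assumptions.
GENERAL_LABELS = {
    1: "single/never married",
    2: "married/in union",
    3: "separated/divorced",
    4: "widowed",
    9: "unknown",
}

# Hypothetical translation table: sample -> {source code: general code}.
# Each sample used its own coding scheme on the original census form.
TRANSLATION = {
    "sample_a": {1: 1, 2: 2, 3: 2, 4: 3, 5: 4},  # codes 2 and 3 both map to "married/in union"
    "sample_b": {0: 1, 1: 2, 2: 4, 3: 3},
}


def harmonize(sample, source_code):
    """Map a sample-specific marital status code to the general code."""
    return TRANSLATION[sample].get(source_code, 9)  # unmapped codes become "unknown"


for sample, code in [("sample_a", 3), ("sample_b", 2), ("sample_b", 7)]:
    general = harmonize(sample, code)
    print(sample, code, "->", general, GENERAL_LABELS[general])
```

Once every sample is recoded this way, the same code means the same thing across countries and years, so pooled files can be analyzed with a single set of categories.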

Variable Description: Literacy (International)

IPUMS Overview 1. What is the IPUMS 2. Harmonization 3. Additional Data Enhancements 4. Access 5. Strengths and Limitations 6. Research examples

IPUMS “Pointer” Variables (Simple household)

Pernum  Relate  Age  Sex     Marst    Chborn  Spouse’s Location  Mother’s Location  Father’s Location
1       head    46   male    married  n/a     2
2       spouse  44   female                   1
3       aunt    77           widow    7
4       child   15           single                              2                  1
5               13                                               2                  1
6               11                                               2                  1

IPUMS “Pointer” Variables (Complex household)
Columns: Pernum, Relationship, Age, Sex, Marst, Chborn, Spouse’s Location, Mother’s Location, Father’s Location
1 head 53 female separated 6
2 child 28 male single n/a
3 22
4 21
5 25 married
child-in-law
7 grandchild
8
9 non-relative 32
10
11
Location values: 1 6 5 5 6 5 6 9 9
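A minimal sketch of how pointer variables are used, based on the simple household from the earlier slide. The field names `sploc`, `momloc`, and `poploc` are chosen here for the spouse's, mother's, and father's location. Using 0 for "no such person in the household" and treating persons 5 and 6 as children are assumptions made for the example.

```python
# Simple household from the slide. Pointers hold the pernum of the
# spouse, mother, or father; 0 (an assumption) means "not present".
household = [
    {"pernum": 1, "relate": "head", "age": 46, "sploc": 2, "momloc": 0, "poploc": 0},
    {"pernum": 2, "relate": "spouse", "age": 44, "sploc": 1, "momloc": 0, "poploc": 0},
    {"pernum": 3, "relate": "aunt", "age": 77, "sploc": 0, "momloc": 0, "poploc": 0},
    {"pernum": 4, "relate": "child", "age": 15, "sploc": 0, "momloc": 2, "poploc": 1},
    {"pernum": 5, "relate": "child", "age": 13, "sploc": 0, "momloc": 2, "poploc": 1},
    {"pernum": 6, "relate": "child", "age": 11, "sploc": 0, "momloc": 2, "poploc": 1},
]


def attach(members, pointer, field):
    """For each person, look up `field` on the person their pointer refers to."""
    by_pernum = {p["pernum"]: p for p in members}
    return {
        p["pernum"]: by_pernum[p[pointer]][field] if p[pointer] else None
        for p in members
    }


mother_age = attach(household, "momloc", "age")
spouse_age = attach(household, "sploc", "age")
print(mother_age)
print(spouse_age)
```

The same lookup works for any characteristic, so a user can build variables such as "mother's age" or "spouse's occupation" without working out family relationships from the relationship-to-head codes.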

IPUMS Overview 1. What is the IPUMS 2. Harmonization 3. Additional Data Enhancements 4. Access 5. Strengths and Limitations 6. Dissemination

IPUMS Access Restricted access Scholarly and educational purposes Conditions of use: key is not to redistribute Serious vetting

IPUMS Overview 1. What is the IPUMS 2. Harmonization 3. Additional Data Enhancements 4. Access 5. Strengths and Limitations 6. Research examples

4 Key Strengths of the Census Microdata Samples
• Large: more cases than any comparable datasets; enable study of relatively small populations
• National in scope: results not subject to local peculiarities; provide context for local studies
• Temporal depth: provide historical perspective
• Microdata: can make your own tabulations; apply multivariate techniques
To summarize…

Limitations of the Microdata Samples
• Samples: too small to answer some questions
• Confidentiality: geography (20,000 population or larger); sensitive variables, swapping, etc.

Other Issues and Limitations
• Not annual: any historical analysis will have gaps
• Cross-sectional data: not longitudinal
• Very large extracts: need knowledge of a statistical package
• User burden: information overload; culturally specific knowledge

IPUMS Overview 1. What is the IPUMS 2. Harmonization 3. Additional Data Enhancements 4. Users and Access 5. Strengths and Limitations 6. Research examples

IPUMS-International Research Topics
• Child labor outside the household in Mexico and Colombia
• Effect of NAFTA on educational attainment and school enrollment by region within Mexico
• Concentration of mortality within families in Kenya
• Life course patterns of co-residence among Mexicans in Mexico, Mexicans in the U.S., and Mexican Americans
• Brain drain from developing countries
• How language diversity is affected by migration and economic factors

[Figure: Married Female Labor Force Participation in Latin America (age 18 to 65). Y-axis: percent in labor force; x-axis: 1960 to 2005. Series: Brazil, Chile, Ecuador, Colombia, Venezuela, Mexico, Costa Rica.]

[Figure: Married Female Labor Force Participation: Latin America and U.S. (age 18 to 65). Y-axis: percent in labor force; x-axis: 1920 to 2010. Series: United States, Latin America.]

[Figure: Married Female Labor Force Participation: Latin America and U.S. (age 18 to 65). Y-axis: percent in labor force; x-axis: 1920 to 2010. Series: United States, Brazil, Colombia, Chile, Ecuador, Venezuela, Mexico, Costa Rica. Annotation: Compare Latin America to U.S. 40 years ago.]

[Figure: Married Female Labor Force Participation: Mexican-born Women, 1970-2000. Y-axis: percent in labor force; x-axis: 1970 to 2000. Series: Mexican-born Women in United States, Women in Mexico.]

End sobek@pop.umn.edu