IPUMS-International: high-precision household census samples integrated cross-nationally and chronologically
Robert McCaa, Professor Emeritus of Population History, University of Minnesota
For additional details, please see:
Outline
1. IPUMS-International: a global collaboratory
2. UK: Users, Usage, SARs, and a plan
3. 62 countries, 185 samples, plus more annually
4. Integrated census microdata and metadata; dynamic metadata
5. Impact: Teaching and Research
6. Conclusion: Invitation
1. IPUMS-International, a global collaboratory to preserve, integrate and disseminate high-precision census samples; 90+ National Statistical Offices participate
See handouts: Brochure for list of participating statistical offices, census microdata available, and info on IPUMS-International
MPC assumes responsibilities and risks for integrating & disseminating microdata and metadata (as CCSR for UK)
» Uniform Memorandum of Understanding
» Founding partners (2001): France, Spain, China, Vietnam, Kenya, Colombia, Mexico, USA … now 97 countries … including UK (2004)
» Specifies conditions for entrusting microdata and for access
» Uniform license agreement: transparent, democratic access for qualified researchers and students
» Over 4,500 registered researchers in 90+ countries
» Hundreds of publications (click & select IPUMS-International)
» NSOs, many formerly paralyzed by fear, are relieved to be freed of risks, responsibilities, and the work!
IPUMS-International: a global collaboratory
Map legend (Mollweide projection): dark green = integrated and disseminating (62 countries, 185 samples, 397 million person records); medium green = integrating; light green = negotiating
By 2015: 80 countries, 250 samples integrated
Invitation to Register and Use
Europe Mirror Site:
IPUMS-International Microdata Access
» NSI (NSI 1 … 62+ NSI) entrusts census metadata and anonymized microdata to MPC
» MPC integrates metadata and confidentializes microdata samples
» IPUMS-International manages access and entrusts researchers with custom-tailored SAS, Stata, and SPSS metadata and microdata extracts for any combination of countries, censuses, sub-populations, and variables
» Trusted researcher analyzes customized extracts using own hardware and software
6 steps using IPUMS-International
1. Log in with password
2a. Study documentation
2b. Create extract
3. Receive notification; log on with password
4. Download extract (SSL encrypted)
5. Unzip data
6. Analyze using favored software
2. UK Users, Usage, SARs and IPUMS
Usage Statistics
» ~5k users: UK ranks #2
» ~800 institutions, top 40: 2 UK (Sussex and LSE)
» ~25k extracts, by country: UK ranks #30 …
» UK samples in IPUMS (H-SARs 91, SARs 01) are not comparable, and are relatively poor in content
» Do not despair: ONS-UK has agreed, in principle, to a plan to construct a uniform series of high precision household samples for …
» ~900k variables extracted, 33 most popular variables
» Note: MOMLOC and SPLOC require household samples
» 1/3 of extracts use Attach Characteristics, which requires household samples
Good news from ONS: plan to assemble a complete series of high precision household samples,
3. 62 countries, 185 samples available now; samples integrated annually
See handout: Card for list of available samples, density, and # of person records by country and census year
IPUMS-International: Census samples for research and policy making
» 397 million integrated person records in 185 samples, representing 62 countries (Jun 2011)
» Annually, new samples for 6-8 countries are incorporated
» Samples for 2010 round censuses have highest priority
» Hundreds of variables are integrated: demography; education; work and income; migration, immigration; disability; dwellings
IPUMS-I Samples
» Universal population coverage
» High precision, confidentialized samples, most drawn to IPUMS-International specifications
» Samples (microdata and metadata) are integrated: identical concepts have the same codes in all samples
» Analyze:
  » People in household context: employment, education, etc.
  » Fertility and mortality: own-child method (implicit, requires no fertility module)
  » Migration, immigration, language, ethnicity, religion
  » Household sanitation variables (water supply, toilet, waste disposal, rooms)
IPUMS-International Value-Addeds:
1. Open metadata system: universal, free access
2. Dynamic, integrated metadata system to compare text of any question for any combination of countries and censuses
3. Microdata integrated: cross-nationally and chronologically
4. Custom-tailored extracts constructed by each researcher: select country/ies, census/es, variables, sub-populations, sample size
5. Pointer variables (momloc, poploc, sploc; also the -rule variables): parent-child and spouse links
6. Attach characteristics: of parents to children and spouses to spouses. Facilitates relational analysis with no programming required
7. And much more…
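As an illustration of what items 5 and 6 do, here is a minimal Python sketch of attaching a mother's characteristic to her child through a pointer variable. It assumes that MOMLOC holds the mother's person number (PERNUM) within the same household (SERIAL), with 0 meaning no mother present. The records, the EDUC values, the `attach_mother` helper and the `_MOM` suffix are hypothetical and are not the extract system's own code.

```python
from typing import List, Optional

# Hypothetical person records. SERIAL = household id, PERNUM = person number
# within the household, MOMLOC = PERNUM of the mother (0 = no mother present).
# All values are invented for illustration.
persons: List[dict] = [
    {"SERIAL": 1, "PERNUM": 1, "MOMLOC": 0, "AGE": 34, "EDUC": 3},
    {"SERIAL": 1, "PERNUM": 2, "MOMLOC": 1, "AGE": 8, "EDUC": 1},
    {"SERIAL": 1, "PERNUM": 3, "MOMLOC": 1, "AGE": 5, "EDUC": 0},
    {"SERIAL": 2, "PERNUM": 1, "MOMLOC": 0, "AGE": 61, "EDUC": 2},
]


def attach_mother(records: List[dict], var: str) -> List[dict]:
    """Return copies of the records with the mother's value of `var` attached."""
    # Index every person by (household, person number) for direct lookup.
    index = {(p["SERIAL"], p["PERNUM"]): p for p in records}
    linked = []
    for p in records:
        mother: Optional[dict] = None
        if p["MOMLOC"] != 0:
            mother = index.get((p["SERIAL"], p["MOMLOC"]))
        row = dict(p)
        # Assumed naming: original variable name plus a _MOM suffix.
        row[f"{var}_MOM"] = mother[var] if mother is not None else None
        linked.append(row)
    return linked


linked = attach_mother(persons, "EDUC")
for row in linked:
    print(row["SERIAL"], row["PERNUM"], row["EDUC_MOM"])
```

The lookup only works when the mother is in the same household record, which is why the pointer variables need household samples.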
4. IPUMS-International open-access, integrated, dynamically generated, compatible metadata (copies shared with NSO partners and World Bank IHSN)
See p. 3 of: "IPUMS-International: Free, Worldwide Microdata Access Now for Censuses of 62 Countries--80 by 2015," 58th International Statistical Institute, Dublin, Ireland, August 2011.
» User registration, conditions of use license
» Browse (metadata) and select data
» Link to Official Statistical Agency home pages
» View sample descriptions (integrated metadata)
» Download data extract (and codebook)
» Bibliography: view cites, link to publications
Metadata are integrated, open access, dynamically constructed. Example: Marital Status
Integrated IPUMS-I Metadata: Codes and Frequencies
Detailed, Case-Count View
2 rules: 1. Retain details 2. Harmonize everything
IPUMS integration empowers the researcher to make informed research decisions
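One way to picture how the two rules can coexist is a composite code whose leading digit carries the harmonized general category and whose trailing digits keep the national detail. The Python sketch below assumes such a scheme. The country names, source codes, harmonized codes and helper functions are all invented for illustration and are not the actual IPUMS-International code lists.

```python
# Hypothetical crosswalk from (country, source code) to a 3-digit harmonized
# code. First digit = general category (1 = never married, 2 = married);
# remaining digits = detail retained from the source. All values invented.
HARMONIZED = {
    ("country_a", 1): 100,  # never married
    ("country_a", 2): 210,  # married, civil ceremony
    ("country_a", 3): 220,  # married, religious ceremony
    ("country_b", 1): 200,  # married, type not recorded
    ("country_b", 2): 100,  # never married
}


def harmonize(country: str, source_code: int) -> int:
    """Map a national source code to the harmonized detailed code."""
    return HARMONIZED[(country, source_code)]


def general(detailed_code: int) -> int:
    """Collapse a detailed code to its general (first-digit) category."""
    return detailed_code // 100


# Detail is retained (210 vs 220 vs 200), yet all three compare as "married".
codes = [harmonize("country_a", 2), harmonize("country_a", 3), harmonize("country_b", 1)]
print(codes, [general(c) for c in codes])
```

Under this assumption the researcher, not the integrator, decides whether to analyze at the detailed or the general level.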
Integrated IPUMS-I Metadata: Enumeration text
View text in English for any combination of countries and censuses.
2 documents: 1. Form 2. Instructions
5. Impact: teaching and research
IPUMS-I is an excellent resource for teaching… -- David Lam, President, Population Association of America
1. Free, easy access to data for many countries and censuses
2. Large sample sizes:
  » Make it possible to include many different variables in a regression… multi-level model…
  » Produce separate estimates for population sub-groups
  » Easy to extract samples with a target sample size (e.g., 50 MB)
  » Easy to revise an extract for a larger size or to include more countries, censuses, variables or sub-populations
3. Students show a great deal of creativity in using IPUMS-I
4. Skills acquired have an immediate pay-off when applying for jobs (e.g., World Bank), graduate school, etc.
Research: What are integrated census samples used for? see
Or: IPUMS-International + key-word: subject, country, etc.
Research Topics: amazingly diverse
Economists:
» Comparative study of labor force participation
» Demand and supply of public services (water, electricity, sewage, etc.)
» Economic impact of family planning and fertility decline
» Brain drain, brain gain
» Econometric analysis of labor force and income
» Effect of long-term youth unemployment
» Effects of volume of human capital on returns to education
» Human capital and aging
» Impact of trade policies on growth, development, immigration, labor markets, and inequality
» Etc.
Impact: Ramadan fasting and fetal health
D. Almond & B. Mazumder, "Health Capital and the Prenatal Environment: The Effect of Ramadan Observance During Pregnancy," American Economic Journal: Applied Economics (2011)
» Vital registration data: among births to Arab parents in Michigan, prenatal exposure to Ramadan fasting is associated with:
  » lower birth weight
  » fewer male births, when exposure falls in the first month of gestation
» IPUMS-International census samples, Uganda 02 (n = 80k Muslims), Iraq 97 (250k): disability, religion, month of birth. Long-term fetal origins effect, where early pregnancy overlapped with Ramadan:
  » adults 20% more likely to be disabled
  » effects are larger for mental (learning) disabilities
  » relatively mild prenatal exposures can have persistent effects
  » maternal behavior, particularly during the first month of pregnancy, can have permanent impacts on offspring health
6. Invitation to use IPUMS-International census samples
1. Use data responsibly:
  a. Respect confidentiality.
  b. Share data only with registered users.
  c. Secure data from unregistered users.
2. Analyze data carefully:
  a. Apply weights where appropriate.
  b. Use proper methods.
  c. Report results accurately.
3. Support NSOs and IPUMS-International:
  a. Cite NSOs and IPUMS in publications
  b. Register citations in IPUMS bibliography
  c. Report data errors to
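On point 2a, a minimal Python sketch of why weights matter when samples have unequal sampling fractions. It assumes each person record carries a person weight, here called PERWT. The records, the DISABLED flag and the `weighted_share` helper are hypothetical and invented for illustration.

```python
from typing import List


def weighted_share(records: List[dict], flag: str, weight: str = "PERWT") -> float:
    """Weighted proportion of records for which `flag` is true."""
    total = sum(r[weight] for r in records)
    if total == 0:
        raise ValueError("weights sum to zero")
    hits = sum(r[weight] for r in records if r[flag])
    return hits / total


# Invented records: each person represents PERWT people in the population.
sample = [
    {"PERWT": 10.0, "DISABLED": True},
    {"PERWT": 10.0, "DISABLED": False},
    {"PERWT": 20.0, "DISABLED": False},
    {"PERWT": 60.0, "DISABLED": True},
]

unweighted = sum(1 for r in sample if r["DISABLED"]) / len(sample)
weighted = weighted_share(sample, "DISABLED")
print(f"unweighted={unweighted:.2f} weighted={weighted:.2f}")
```

In this invented sample the unweighted share is 2 of 4 records, while the weighted share is 70 of 100 represented persons, so ignoring weights would misstate the population estimate.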
Thank you