5. Integration of Microdata and Metadata (9 slides)


Integrating samples
» Project requests new, high-quality samples for historical censuses (before 1995), drawn to uniform criteria
» Complete documentation, in the official language and in English translation
  » Census forms
  » Enumerator instructions
  » Data dictionaries and codebooks
» Complete metadata for all samples
» Systematic metadata for all variables
  » Universe
  » Definitions
  » Comparability
» Dynamic metadata system to compare the phrasing of any question in any combination of countries and censuses

IPUMS-International: High precision samples with implicit stratification
» Suppress all identifying information: names, id numbers, street addresses, low-level administrative geography (NUTS-5, NUTS-4?, NUTS-3?, NUTS-2?)
» Sample is stratified by lowest level geography (census tract)
» Lower standard errors than a classic random sample, to the extent that variables of interest are correlated with geography
» Implicit geographical stratification is equivalent to extremely fine geographic stratification with proportional weighting
» Many of our NSI partners have adopted the IPUMS sample design (see table).
» 26 countries provided 100% microdata for the MPC to draw the sample
» Europe: almost all NSIs have drawn samples to IPUMS specs. for all censuses

188 high precision samples for 78 countries entrusting microdata (08/08/2007)

Sample    Censuses    Countries
10%       125         62
5%        34          10
<5%       29          6
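The slide does not spell out the selection procedure, so the following is only a minimal sketch of one common way to obtain implicit geographic stratification: sort the records by the lowest available geography, then take every k-th record from a random start. The function name `systematic_sample`, the `tract` and `hh_id` fields, and the toy household list are all assumptions made for illustration.

```python
import random
from collections import Counter


def systematic_sample(records, geo_key, fraction, seed=None):
    """Take every k-th record from a geography-sorted list.

    Sorting by geography before a systematic draw spreads the sample
    across areas in proportion to their size (implicit stratification).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(records, key=geo_key)      # group records by area
    interval = round(1 / fraction)              # e.g. 10 for a 10% sample
    start = random.Random(seed).randrange(interval)  # random start in [0, interval)
    return ordered[start::interval]


# Toy data (invented): three tracts with 100 households each.
households = [
    {"tract": tract, "hh_id": i}
    for tract in ("A", "B", "C")
    for i in range(100)
]

sample = systematic_sample(
    households, geo_key=lambda r: r["tract"], fraction=0.10, seed=1
)
per_tract = Counter(r["tract"] for r in sample)
```

In this toy case every tract contributes the same share of its households, which is the proportional allocation the slide describes. A simple random draw of the same size would only match that allocation on average.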

International and chronological integration of microdata and metadata: methods and procedures
» Integrated microdata
  » Challenge: retain all significant detail, integrate everything
  » Solution: composite coding scheme (multiple digits, each carrying meaning; think ISCO)
  » Use international standards where appropriate
» Integrated metadata (documentation)
  » Summarize and describe commonalities
  » Explain unique characteristics
  » Dynamically generate metadata according to the needs (countries, samples, variables) of the individual researcher, using an XML database
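To make the idea of dynamically generated metadata concrete, here is a small sketch that filters variable documentation held in XML down to the countries a researcher selects. The XML layout (`variable`, `comparability`, `scope`), the function `render_metadata`, and the "Countryland" entry are hypothetical and are not the project's actual schema; the note texts are shortened paraphrases or placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata store; element and attribute names are assumptions.
METADATA_XML = """
<variables>
  <variable name="EMPSTAT" label="Employment status">
    <comparability scope="General">Reference period and age universe vary across samples.</comparability>
    <comparability scope="Mexico">Universe and reference period are comparable across the Mexico samples.</comparability>
    <comparability scope="Countryland">Placeholder note for an invented country.</comparability>
  </variable>
</variables>
"""


def render_metadata(xml_text, variable, countries):
    """Return the documentation for one variable, limited to chosen countries."""
    root = ET.fromstring(xml_text)
    wanted = {"General", *countries}   # general notes are always included
    lines = []
    for var in root.iter("variable"):
        if var.get("name") != variable:
            continue
        lines.append(f"{var.get('name')}: {var.get('label')}")
        for note in var.findall("comparability"):
            if note.get("scope") in wanted:
                lines.append(
                    f"Comparability -- {note.get('scope')}: {note.text.strip()}"
                )
    return "\n".join(lines)


page = render_metadata(METADATA_XML, "EMPSTAT", ["Mexico"])
print(page)
```

The design point is that the documentation is stored once per variable and country, and each researcher's view is assembled on request rather than written by hand.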

Use international standards as points of departure:
» UNESCO (1997). The International Standard Classification of Education (ISCED 1997).
» International Labor Office (1990). International Standard Classification of Occupations (ISCO-88).
» United Nations Statistics Division (1990). International Standard Industrial Classification of All Economic Activities (ISIC-88).
» United Nations Economic Commission for Europe (1999). Recommendations for the 2000 Censuses of Population and Housing in the ECE Region.

IPUMSi INTEGRATES
» Integrate (harmonize), not standardize
  1. retain all original detail
  2. harmonize every digit
» How is this possible?
  Composite codes (multiple digits, 111)
  Not serial (1, 2, 3, ...)
  (example: next slide)
» Why?
  Researcher confidently understands the logic
  Researcher uses as much detail as needed

Composite coding scheme: Employment Status
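As a rough illustration of how a composite code can be used, the sketch below treats employment status as a three-digit code whose first digit gives the broad group (employed, unemployed, inactive) and whose later digits add detail where a census provides it. The specific numeric values, the labels in `DETAILED`, and the helper names are hypothetical; they are not taken from the project's codebook.

```python
# Hypothetical first-digit groups; the 1/2/3 numbering is an assumption.
GENERAL = {1: "employed", 2: "unemployed", 3: "inactive"}

# Hypothetical detailed codes, for illustration only.
DETAILED = {
    100: "employed, not specified",
    110: "at work",
    200: "unemployed, not specified",
    310: "housework",
    330: "in school",
}


def general_group(code):
    """Broad group carried by the first digit of a three-digit code."""
    return GENERAL[code // 100]


def collapse(code, digits):
    """Keep the leading `digits` digits and zero out the rest."""
    if digits not in (1, 2, 3):
        raise ValueError("digits must be 1, 2 or 3")
    factor = 10 ** (3 - digits)
    return code // factor * factor


def in_labor_force(code):
    """Labor force = employed plus unemployed."""
    return general_group(code) in ("employed", "unemployed")


print(general_group(310))   # broad group of a detailed code
print(collapse(312, 1))     # comparable across all samples
print(collapse(312, 2))     # keeps one extra level of detail
```

A researcher who needs full comparability can collapse to the first digit, while one working with a single country can keep all three digits. A serial code (1, 2, 3, ...) offers no such choice.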

Metadata: Employment Status

EMPSTAT: Employment status

Description
EMPSTAT indicates whether or not the respondent was part of the labor force -- working or seeking work -- over a specified period of time. Depending on the sample, EMPSTAT can also convey further information. The first digit of EMPSTAT is fully comparable, and classifies the population into three groups: employed, unemployed, and inactive. The combination of employed and unemployed yields the total labor force. The second and third digits of EMPSTAT preserve additional information available for some countries and census years but not for others. Employment status is sometimes referred to in other sources as "activity status."

Comparability -- General
The age of persons to whom the question applies varies across the samples (see Universe). The reference period for the employment status question varies. For most samples, employment status was reported with respect to the day of the census or…

Metadata: Employment Status, example: Mexico

Comparability -- Mexico
The universe and reference period are fully comparable across the Mexico samples. The 1970 Census did not provide detail on the inactive population except for "houseworkers," while the later samples have numerous subcategories. In 1990, the employment status question refers to "Principal Activity" and therefore under-reports secondary economic activity by students, housewives, family-workers, the semi-retired, and others. The 2000 Census sought to overcome deficiencies in reporting work status for people whose primary activity was not work (students, housewives, retirees, etc.), but who in fact were working according to international definitions. A second question introduced for the first time in 2000 sought to capture this secondary economic activity. For strict comparability with earlier Mexican censuses, this recovered activity (codes ) should be considered "inactive." …

Integrate: retain all significant detail, harmonize everything
Not standardize: force square pegs in round holes