IPUMS-Europe, 2004-2008: Restricted-access, anonymized microdata for scientific and policy research * * * Robert McCaa,

Presentation transcript:

IPUMS-Europe, 2004-2008: Restricted-access, anonymized microdata for scientific and policy research
* * *
Robert McCaa, University of Minnesota Population Center
Nikolai Botev, UN-ECE Population Activities Unit (Geneva)
hist.umn.edu/~rmccaa/ipums-europe

Outline
» PAU 1990s project
» IPUMS-International means: Restricted access, anonymized microdata
» IPUMS-Europe: sister project (Latin America), connections with PAU
» IPUMS-International partners
» Principles: integration, dissemination

Population Activities Unit 1990 census round harmonization project: focused on Aging
» Begun 1992: PAU/UNECE, UNFPA, US-NIA
» Microdata acquired for 15 countries
» Harmonized 26 core person variables plus 13 optional; 10 dwelling/household variables, 18 optional
» Extensive metadata: questionnaires, nomenclatures, classifications
» Progressive over-sampling with age

Population Activities Unit, 1990 census round harmonization project: focused on Aging
» General release: samples for 8 countries
» Samples for the other 7 countries available under more restrictive conditions
» Dissemination: CDs or other media; no online access
» Sustainability: ICPSR (U. of Michigan)

Problems with PAU effort:
» Sample design too complex
» Need for time series
» Lacked legal authority
» Inadequate funding
» Insufficient computing infrastructure and human resources
» Antiquated distribution system
» Sustainability problematic

Population Activities Unit: samples of older persons based on the 2000-round of censuses
» Tightly integrated with IPUMS-Europe
» Based on the same coding schemes, nomenclatures, and classifications
» Utilize the same anonymization techniques and approaches; same data access modalities
» Ensure sustainability through the integration with IPUMS-Europe: ICPSR & European Data Centers

Population Activities Unit: samples of older persons based on the 2000-round of censuses
» Sample design:
- sample of households not included in the core IPUMS-Europe sample, where at least one member is over age 60 (recommended sampling density: 5 percent);
- geography to match that of core samples;
» Advantages:
- more straightforward than the design used for 1990s;
- in line with the practice of national statistical offices (e.g. PUMS-A and PUMS-O of the US Census Bureau);
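
The supplementary sample design described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`id`, `ages`), the function name `draw_older_person_sample`, and the toy data are hypothetical, and the slides do not say how the draw is actually implemented. Only the eligibility rule (households outside the core sample with at least one member over age 60) and the 5 percent density come from the slide.

```python
import random

def draw_older_person_sample(households, core_ids, density=0.05, min_age=60, seed=0):
    """Return ids of households drawn for the older-persons supplement.

    A household is eligible if it is NOT in the core sample and has
    at least one member older than `min_age`.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    eligible = [
        h["id"]
        for h in households
        if h["id"] not in core_ids and any(age > min_age for age in h["ages"])
    ]
    k = round(len(eligible) * density)  # 5 percent of eligible households
    return sorted(rng.sample(eligible, k))

# Hypothetical toy data: 1000 households, every 10th one already in the core sample.
households = [{"id": i, "ages": [30 + (i % 50), 25]} for i in range(1000)]
core_ids = set(range(0, 1000, 10))
older_sample = draw_older_person_sample(households, core_ids)
```

Drawing only from households outside the core sample keeps the two samples disjoint, which is what makes the design simpler than progressive over-sampling by age.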

From IPUMS-USA (1989-) & PAU-Aging (1992-) to IPUMS-International (1999-), Latin America (2003-), Europe (2004?) and beyond

IPUMS-International means Restricted access, Anonymized microdata
» Should be “IRAMS” not IPUMS
» Who are IPUMS-International users? Those who:
» Have a demonstrated need for the data (project abstract)
» Agree to abide by the restrictions of use
» Place themselves under the jurisdiction of Institutional Review Boards

IPUMSi ANONYMIZES
Using the most demanding standards: legal & administrative as well as technical:
» Suppress geographical detail (NUTS2/3?)
» Corrupt the data! (just a little…)
» Blur/aggregate sensitive codes
» Convert dates to ages (blur key vars.)
» Swap cases between districts! (just a few…)
» Scramble order of unit records
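
Three of the technical measures listed here (converting dates to ages, swapping a few cases between districts, and scrambling record order) can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration: the field names, the 10 percent swap fraction, the census day, and the functions `age_on` and `anonymize` are hypothetical and are not taken from the IPUMS-International procedures.

```python
import random
from datetime import date

def age_on(born, day):
    """Age in completed years on `day` for someone born on `born`."""
    return day.year - born.year - ((day.month, day.day) < (born.month, born.day))

def anonymize(records, census_day, swap_fraction=0.01, seed=0):
    """Return a new list of records with dates converted to ages,
    a few district codes swapped, and the record order scrambled."""
    rng = random.Random(seed)
    # 1. Convert dates to ages; the birth date itself is dropped.
    out = [
        {"age": age_on(r["birth_date"], census_day),
         "district": r["district"],
         "sex": r["sex"]}
        for r in records
    ]
    # 2. Swap district codes between a small number of randomly paired records.
    n_pairs = int(len(out) * swap_fraction / 2)
    idx = rng.sample(range(len(out)), 2 * n_pairs)
    for a, b in zip(idx[::2], idx[1::2]):
        out[a]["district"], out[b]["district"] = out[b]["district"], out[a]["district"]
    # 3. Scramble the order of unit records.
    rng.shuffle(out)
    return out

# Hypothetical toy records.
records = [
    {"birth_date": date(1950 + i % 40, 1 + i % 12, 1 + i % 28),
     "district": i % 4,
     "sex": i % 2}
    for i in range(200)
]
anon = anonymize(records, census_day=date(1991, 10, 20), swap_fraction=0.1)
```

Swapping pairs of district codes leaves the district totals unchanged, so aggregate counts are preserved while the link between a given record and its true district is weakened.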

Anonymization example: Italy, 1991
First assessment
» 1. Suppress geographical variables below commune
» 2. Convert
» Dates of birth, marriage, immigration to ages
» Band small groups
» 3. Suppress sensitive codes for small groups:
» Citizenship
» Year of immigration to Italy
» Commune of work/study
Note: population uniques are anonymized after integration
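
Step 3 (suppressing sensitive codes for small groups) can be sketched as a frequency check. The threshold of 3, the placeholder value `"suppressed"`, the function name, and the toy citizenship codes are assumptions for illustration; the slide does not state the cut-off used for Italy.

```python
from collections import Counter

def suppress_small_groups(records, variable, min_count, placeholder="suppressed"):
    """Replace values of `variable` that occur fewer than `min_count` times."""
    counts = Counter(r[variable] for r in records)
    return [
        {**r, variable: r[variable] if counts[r[variable]] >= min_count else placeholder}
        for r in records
    ]

# Hypothetical toy data: one large group, one small group, one unique case.
people = (
    [{"citizenship": "IT"}] * 50
    + [{"citizenship": "AA"}] * 3
    + [{"citizenship": "BB"}]
)
masked = suppress_small_groups(people, "citizenship", min_count=3)
```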

EUROSTAT statistical anonymity standards (Thorogood, 1999), all accepted by IPUMS-International
» 1. small sample size
» 2. limited geographical detail
» 3. top and bottom coding of unique categories
» 4. signed non-disclosure agreement
» 5. prohibit redistribution of datasets to third parties
» 6. prohibit attempts to identify individuals or the making of any claim to that effect
» 7. require users to provide copies of publications
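
Standard 3, top and bottom coding, amounts to clamping extreme values into the nearest retained category. A minimal sketch follows; the bounds and the function name are arbitrary illustrations, not values from the EUROSTAT standard.

```python
def top_bottom_code(values, bottom, top):
    """Clamp each value into the closed range [bottom, top]."""
    return [min(max(v, bottom), top) for v in values]

# Hypothetical ages, bottom-coded at 15 and top-coded at 90.
coded_ages = top_bottom_code([0, 17, 45, 103], bottom=15, top=90)
```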

EUROSTAT statistical anonymity standards (Thorogood, 1999), all accepted by IPUMSi, and more
» 8. Age (constructed from birth date, where necessary)
» 9. Never identify date of birth
» 10. Never identify place of birth
» 11. Migration: timing and place not identified in detail
» 12. Place of residence identified by major civil division (pop>60k, 120k, 250k, 1 million--national rule)
» 13. Sensitivity analysis of variables by national experts
» 14. Confidentiality assessment by national experts

Sister-project: IPUMS-Latin America: 17 countries, ~500 million pop., 5 census rounds, 80+ samples, 100+ million person records
» Scope: Latin American census microdata, 1960-present
» Work Plan (funded by National Institutes of Health)
» 2001: Sign licensing agreements with official agencies
» 2002: Obtain funding from U.S. NIH
» 2003: Develop/translate microdata & metadata
» 2004: Country expert teams design national integrations
» 2005: MPC/expert teams design regional integration
» 2006: MPC anonymizes/integrates microdata and metadata
» 2007: MPC disseminates to bona fide researchers who sign non-disclosure license. National census/data/research institutes may distribute national versions via CDs/web.

IPUMS-Europe Partnership: More…
» Censuses: 1960s – 2000, where microdata exist
» Countries: 16 inclined at present, >350 million population (* = signed): Austria, Bulgaria, Czech Republic *, France *, Germany, Greece, Ireland, Israel, Hungary *, Poland, Portugal, Romania, Slovenia *, Spain *, Switzerland, Turkey
» Research: more knowledge, more users

IPUMS-Europe Partnership: More uniformity…
» Legal: signed memorandum of understanding
» Administrative: restricted to approved users; strong enforcement procedures
» Sample design: every nth household
» Anonymization: includes corrupting data
» Integration: more variables, composite coding
» Dissemination: extract custom-tailored datasets, never entire samples
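
The "every nth household" sample design is a systematic sample. A minimal sketch, assuming the households are held in a list in enumeration order and that the starting point is chosen at random (the slide does not say how the start is picked; the function name and seed are illustrative):

```python
import random

def systematic_sample(households, n, seed=0):
    """Take every nth household, starting from a random offset in [0, n)."""
    start = random.Random(seed).randrange(n)
    return households[start::n]

# Hypothetical frame of 1000 household ids; n=10 gives a 10 percent sample.
frame = list(range(1000))
tenth = systematic_sample(frame, n=10)
```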

Advantages… proven record of accomplishments:
» Uniform legal protocols
» Substantial institutional infrastructure
» Experienced census microdata integrators
» Cost-effective academic environment
» Sustained funding from National Science Foundation, National Institutes of Health
» Successful web-based distribution system: users!

Advantages of IPUMS-International
» Comparability: data are rigorously integrated; documentation is extensive, both primary (from NSIs) and integrated (from MPC)
» Accountability: reports on users, usage and publications; advisory board of statisticians and scientists
» Sustainability: MPC, ICPSR

IPUMS-Europe: coverage ~20 countries, representing ~400m. people
» Scope: European census microdata, 1950-present
» Work Plan (contingent upon funding)
» 2003: Sign licensing agreements with census agencies; obtain funding from US NIH
» 2004: Develop/translate microdata & metadata
» 2005: Country expert teams design national integrations
» 2006: MPC/expert teams design regional integration
» 2007: MPC integrates microdata and metadata
» 2008: MPC disseminates to bona fide researchers who sign non-disclosure license. National census/data/research institutes via CDs/web.

IPUMS-International
Imagine a new statistical product: scientifically anonymized, integrated census microdata samples made up of unidentifiable individuals...
» Easy-to-use web-interface
» Highest scientific standards
» Proven, powerful integration
» A quantum leap in usage
» 1998: 1 country signed
» 1999: 3 countries
» 2000: 9
» 2001: 15
» 2002: 32; first release, 6 countries

IPUMSi RESCUES
UN Demographic Center for Latin America (CELADE, Santiago, Chile): ~3000 microdata tapes recovered, and metadata (documentation)

IPUMSi PAYS
National experts in each country are contracted to assist with:
» Assembling microdata and documentation
» Developing samples
» to minimize confidentiality risks
» and to maximize robustness
» Designing national integration plan
» census-by-census
» concept-by-concept
» code-by-code
» Writing integrated documentation

IPUMSi PARTNERSHIP
Photos from Colombia integration project, February-March, 2000: 4 experts from DANE (census office) + 7 academics (3 universities)
Standard: UN/Eurostat Principles & Recs...
Census documentation compiled for Colombian microdata

IPUMSi integration principles
» 1. Respect absolute anonymity and confidentiality
» 2. Preserve all original data, except adjustments to ensure privacy (top codes, blurrings, masking, re-ordering, etc.)
» 3. Harmonize codes using international standards
occupation: ISCO-88 (detailed, general)
education: ISCED (detailed, general)
family: IPUMS, etc. (detailed, general)
» 4. Enhance with constructed variables

Composite coding scheme example: marital status
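
The table from this slide did not survive in the transcript. As a rough illustration of the idea, a composite code can carry a general category in its leading digit and country-specific detail in the remaining digits, so that samples with different levels of detail remain comparable at the general level. All codes, labels, and sample names below are hypothetical and are not the actual IPUMS marital status codes.

```python
# Hypothetical composite codes: first digit = general category,
# full three-digit code = detailed category.
COMPOSITE_LABELS = {
    100: "single/never married",
    200: "married or in union (no further detail)",
    210: "formally married",
    220: "consensual union",
    300: "separated or divorced",
    400: "widowed",
}

# Hypothetical national codebooks mapped onto the composite scheme.
NATIONAL_TO_COMPOSITE = {
    "country_a": {1: 100, 2: 210, 3: 220, 4: 300, 5: 400},  # detailed source
    "country_b": {1: 100, 2: 200, 3: 300, 4: 400},          # general source only
}

def harmonize(sample, national_code):
    """Translate a national code into the composite (detailed) code."""
    return NATIONAL_TO_COMPOSITE[sample][national_code]

def general(composite_code):
    """Return the general category, i.e. the leading digit."""
    return composite_code // 100

# A consensual union in country_a and a "married" response in country_b
# differ in detail but share general category 2.
code_a = harmonize("country_a", 3)
code_b = harmonize("country_b", 2)
```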

Occupation: the ISCO standard
preliminary release: “1” digit
final: 2-3 or 4 digit, depending upon country

Variable availability, preliminary release

IPUMSi DISSEMINATES
Legally-binding license agreement
» protects privacy and confidentiality
» assures proper use
» new sanction: loss of employment.
Web-based extraction system
Researcher selects
» countries
» censuses
» cases/sub-populations
» variables
» sample densities
» Facilitates comparative research
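
The selections a researcher makes in the extraction system (countries, censuses, cases, variables, sample density) map naturally onto a filter over person records. The sketch below is not the IPUMS extraction system or its API; `make_extract`, the record fields, and the country codes are hypothetical and serve only to show how the five choices combine into one custom-tailored dataset.

```python
import random

def make_extract(records, countries, years, variables,
                 case_filter=None, density=1.0, seed=0):
    """Build a custom extract: chosen countries, censuses, cases,
    variables, and a sampling density between 0 and 1."""
    rng = random.Random(seed)
    extract = []
    for r in records:
        if r["country"] not in countries or r["year"] not in years:
            continue  # wrong country or census
        if case_filter is not None and not case_filter(r):
            continue  # outside the requested sub-population
        if rng.random() >= density:
            continue  # thinned out by the requested sample density
        extract.append({v: r[v] for v in variables})  # keep chosen variables only
    return extract

# Hypothetical integrated file: 2 countries x 2 censuses x 10 ages.
records = [
    {"country": c, "year": y, "age": a, "sex": a % 2}
    for c in ("CO", "MX")
    for y in (1990, 2000)
    for a in range(0, 100, 10)
]
extract = make_extract(
    records,
    countries={"CO"},
    years={1990, 2000},
    variables=["country", "year", "age"],
    case_filter=lambda r: r["age"] >= 60,
)
```

Because only the requested variables and cases are returned, the researcher never receives an entire sample, which matches the dissemination principle stated earlier in the slides.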

additional information at: contact: * * * * * Thank you
