IPUMS-Europe, 2004-2008: Restricted-access, anonymized microdata for scientific and policy research * * * Robert McCaa


1 IPUMS-Europe, 2004-2008: Restricted-access, anonymized microdata for scientific and policy research
Robert McCaa, University of Minnesota Population Center
Nikolai Botev, UN-ECE Population Activities Unit (Geneva)
www.hist.umn.edu/~rmccaa/ipums-europe

2 Outline
» PAU 1990s project
» IPUMS-International means: restricted-access, anonymized microdata
» IPUMS-Europe: sister project (Latin America), connections with PAU
» IPUMS-International partners
» Principles: integration, dissemination

3 Population Activities Unit, 1990 census round harmonization project: focused on aging
» Begun 1992: PAU/UNECE, UNFPA, US-NIA
» Microdata acquired for 15 countries
» Harmonized 26 core person variables plus 13 optional; 10 dwelling/household variables plus 18 optional
» Extensive metadata: questionnaires, nomenclatures, classifications
» Progressive over-sampling with age

4 Population Activities Unit, 1990 census round harmonization project: focused on aging

5 Population Activities Unit, 1990 census round harmonization project: focused on aging
» General release: samples for 8 countries
» Samples for the other 7 countries available under more restrictive conditions
» Dissemination: CDs or other media; no online access
» Sustainability: ICPSR (U. of Michigan)

6 Problems with PAU effort:
» Sample design too complex
» Need for time series
» Lacked legal authority
» Inadequate funding
» Insufficient computing infrastructure and human resources
» Antiquated distribution system
» Sustainability problematic

7 Population Activities Unit: samples of older persons based on the 2000 round of censuses
» Tightly integrated with IPUMS-Europe
» Based on the same coding schemes, nomenclatures, and classifications
» Use the same anonymization techniques and approaches; same data access modalities
» Ensure sustainability through integration with IPUMS-Europe: ICPSR & European Data Centers

8 Population Activities Unit: samples of older persons based on the 2000 round of censuses
» Sample design:
- sample of households not included in the core IPUMS-Europe sample, where at least one member is over age 60 (recommended sampling density: 5 percent);
- geography to match that of the core samples
» Advantages:
- more straightforward than the design used for the 1990s;
- in line with the practice of national statistical offices (e.g. PUMS-A and PUMS-O of the US Census Bureau)
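
The supplementary sample design above can be sketched in a few lines. This is only an illustration, not the PAU procedure: the data layout (a dict of household id to member ages), the function name, and the use of simple random sampling within the eligible pool are assumptions. Only the eligibility rule and the 5 percent density come from the slide.

```python
import random

def older_person_supplement(households, core_ids, density=0.05, seed=0):
    """Draw a supplementary sample of households with a member over age 60.

    households: dict mapping household id -> list of member ages (assumed layout)
    core_ids:   set of household ids already drawn for the core sample
    density:    sampling fraction applied to the eligible households (assumption)
    """
    rng = random.Random(seed)
    # Eligible: not in the core sample and at least one member over age 60.
    eligible = [
        hid for hid, ages in households.items()
        if hid not in core_ids and any(age > 60 for age in ages)
    ]
    k = round(len(eligible) * density)
    return sorted(rng.sample(eligible, k))
```

Excluding the core households first keeps the supplement from duplicating cases a researcher may already hold from the core sample.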

9 From IPUMS-USA (1989-) & PAU-Aging (1992-) to IPUMS-International (1999-), Latin America (2003-), Europe (2004?) and beyond

10 IPUMS-International means restricted-access, anonymized microdata
» Should be “IRAMS”, not IPUMS
» Who are IPUMS-International users? Those who:
  » Have a demonstrated need for the data (project abstract)
  » Agree to abide by the restrictions of use
  » Place themselves under the jurisdiction of Institutional Review Boards

11 IPUMSi ANONYMIZES
Using the most demanding standards, legal & administrative as well as technical:
» Suppress geographical detail (NUTS2/3?)
» Corrupt the data! (just a little…)
» Blur/aggregate sensitive codes
» Convert dates to ages (blur key vars.)
» Swap cases between districts! (just a few…)
» Scramble order of unit records
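
Two of the techniques listed above, swapping a few cases between districts and scrambling the order of unit records, can be sketched as follows. The record layout (dicts with a "district" key), the function name, and the default swap fraction are hypothetical; the slide does not say how many cases are swapped or how they are chosen.

```python
import random

def swap_and_scramble(records, swap_fraction=0.01, seed=0):
    """Swap the district code between a few random pairs of records, then shuffle.

    records:       list of dicts, each with a "district" key (assumed layout)
    swap_fraction: share of records involved in a swap (hypothetical default)
    """
    rng = random.Random(seed)
    out = [dict(r) for r in records]           # copy so the input is untouched
    n_pairs = int(len(out) * swap_fraction) // 2
    chosen = rng.sample(range(len(out)), 2 * n_pairs)
    for a, b in zip(chosen[::2], chosen[1::2]):
        out[a]["district"], out[b]["district"] = out[b]["district"], out[a]["district"]
    rng.shuffle(out)                           # scramble the order of unit records
    return out
```

Because districts are only exchanged between records, the count of records per district is unchanged, so district-level totals are preserved in this sketch.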

12 Anonymization example: Italy, 1991 (first assessment)
» 1. Suppress geographical variables below commune
» 2. Convert
  » Dates of birth, marriage, immigration to ages
  » Band small groups
» 3. Suppress sensitive codes for small groups:
  » Citizenship
  » Year of immigration to Italy
  » Commune of work/study
Note: population uniques are anonymized after integration
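
Step 2 of the Italian example, converting dates to ages and banding small groups, might look like the sketch below. The function names, the reference date used in the usage note, and the reading of "band small groups" as collapsing rare categories into a residual band are assumptions made for illustration.

```python
from collections import Counter
from datetime import date

def age_at(reference, birth):
    """Completed years between a birth date and a reference (census) date."""
    years = reference.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference.month, reference.day) < (birth.month, birth.day):
        years -= 1
    return years

def band_small_groups(values, min_count, other="other"):
    """Replace categories with fewer than min_count cases by a residual band."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]
```

For instance, with an assumed reference date of `date(1991, 10, 20)`, a person born on `date(1930, 12, 1)` is recorded only as age 60, and the exact date is dropped from the file.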

13 EUROSTAT statistical anonymity standards (Thorogood, 1999), all accepted by IPUMS-International
» 1. small sample size
» 2. limited geographical detail
» 3. top and bottom coding of unique categories
» 4. signed non-disclosure agreement
» 5. prohibit redistribution of datasets to third parties
» 6. prohibit attempts to identify individuals or the making of any claim to that effect
» 7. require users to provide copies of publications
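
Top and bottom coding (standard 3 above) is simple to illustrate. The function name and the thresholds in the usage note are hypothetical; actual cut-offs would be set per variable and per country.

```python
def top_bottom_code(values, low, high):
    """Clamp each value into [low, high] so extreme values are no longer unique."""
    return [min(max(v, low), high) for v in values]
```

With `low=0` and `high=90`, ages of 97 and 104 are both released as 90, so the two oldest persons can no longer be told apart by age alone.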

14 EUROSTAT statistical anonymity standards (Thorogood, 1999), all accepted by IPUMSi, and more
» 8. Age (constructed from birth date, where necessary)
» 9. Never identify date of birth
» 10. Never identify place of birth
» 11. Migration: timing and place not identified in detail
» 12. Place of residence identified by major civil division (pop > 60k, 120k, 250k, 1 million; national rule)
» 13. Sensitivity analysis of variables by national experts
» 14. Confidentiality assessment by national experts
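
Rule 12 above can be read as a population threshold applied to each civil division. The sketch below assumes a lookup of division populations and a record layout with a "division" key; how units below the threshold are actually handled (suppressed, merged with neighbours, etc.) is not stated in the slides, so replacing the code with a placeholder is an assumption.

```python
def mask_residence(records, unit_population, threshold, masked=None):
    """Keep a civil-division code only if that unit's population exceeds threshold.

    records:         list of dicts with a "division" key (assumed layout)
    unit_population: dict mapping division code -> population
    threshold:       national rule, e.g. 60_000 or 250_000
    """
    out = []
    for rec in records:
        rec = dict(rec)                        # leave the input records untouched
        if unit_population[rec["division"]] <= threshold:
            rec["division"] = masked
        out.append(rec)
    return out
```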

15 Sister project: IPUMS-Latin America: 17 countries, ~500 million pop., 5 census rounds, 80+ samples, 100+ million person records
» Scope: Latin American census microdata, 1960-present
» Work plan (funded by National Institutes of Health)
  » 2001: Sign licensing agreements with official agencies
  » 2002: Obtain funding from U.S. NIH
  » 2003: Develop/translate microdata & metadata
  » 2004: Country expert teams design national integrations
  » 2005: MPC/expert teams design regional integration
  » 2006: MPC anonymizes/integrates microdata and metadata
  » 2007: MPC disseminates to bona fide researchers who sign non-disclosure license. National census/data/research institutes may distribute national versions via CDs/web.

16 IPUMS-Europe Partnership: More…
» Censuses: 1960s-2000, where microdata exist
» Countries: 16 inclined at present, >350 million population (* = signed): Austria, Bulgaria, Czech Republic *, France *, Germany, Greece, Ireland, Israel, Hungary *, Poland, Portugal, Romania, Slovenia *, Spain *, Switzerland, Turkey
» Research: more knowledge, more users

17 IPUMS-Europe Partnership: More uniformity…
» Legal: signed memorandum of understanding
» Administrative: restricted to approved users; strong enforcement procedures
» Sample design: every nth household
» Anonymization: includes corrupting data
» Integration: more variables, composite coding
» Dissemination: extract custom-tailored datasets, never entire samples
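
The "every nth household" design above is a systematic sample. A minimal sketch, assuming the households are available as an ordered list and that the starting point is chosen at random (the slide does not specify the start):

```python
import random

def every_nth_household(household_ids, n, seed=None):
    """Systematic sample: random start in [0, n), then every n-th household."""
    start = random.Random(seed).randrange(n)
    return household_ids[start::n]
```

Calling `every_nth_household(ids, 10)` on a list of 1,000 household ids returns 100 of them, i.e. a 10 percent sample. Whole households are drawn, so relationships among household members stay intact.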

18 Advantages… proven record of accomplishments:
» Uniform legal protocols
» Substantial institutional infrastructure
» Experienced census microdata integrators
» Cost-effective academic environment
» Sustained funding from National Science Foundation, National Institutes of Health
» Successful web-based distribution system: users!

19 Advantages of IPUMS-International
» Comparability: data are rigorously integrated; documentation is extensive, both primary (from NSIs) and integrated (from MPC)
» Accountability: reports on users, usage and publications; advisory board of statisticians and scientists
» Sustainability: MPC, ICPSR

20 IPUMS-Europe, 2004-2008: coverage ~20 countries, representing ~400m. people
» Scope: European census microdata, 1950-present
» Work plan (contingent upon funding)
  » 2003: Sign licensing agreements with census agencies; obtain funding from US NIH
  » 2004: Develop/translate microdata & metadata
  » 2005: Country expert teams design national integrations
  » 2006: MPC/expert teams design regional integration
  » 2007: MPC integrates microdata and metadata
  » 2008: MPC disseminates to bona fide researchers who sign non-disclosure license. National census/data/research institutes via CDs/web.

21 IPUMS-International
Imagine a new statistical product: scientifically anonymized, integrated census microdata samples made up of unidentifiable individuals...
» Easy-to-use web interface
» Highest scientific standards
» Proven, powerful integration
» A quantum leap in usage
Countries signed:
» 1998: 1 country signed
» 1999: 3 countries
» 2000: 9
» 2001: 15
» 2002: 32; first release, 6 countries

22 IPUMSi RESCUES
UN Demographic Center for Latin America (CELADE, Santiago, Chile): ~3000 microdata tapes recovered, and metadata (documentation)

23 IPUMSi PAYS
National experts in each country are contracted to assist with:
» Assembling microdata and documentation
» Developing samples
  » to minimize confidentiality risks
  » and to maximize robustness
» Designing national integration plan
  » census-by-census
  » concept-by-concept
  » code-by-code
» Writing integrated documentation

24 IPUMSi PARTNERSHIP
Standard: UN/Eurostat Principles & Recs...
Photos from Colombia integration project, February-March 2000: 4 experts from DANE (census office) + 7 academics (3 universities)
Census documentation compiled for Colombian microdata

25 IPUMSi integration principles
» 1. Respect absolute anonymity and confidentiality
» 2. Preserve all original data, except adjustments to ensure privacy (top codes, blurrings, masking, re-ordering, etc.)
» 3. Harmonize codes using international standards
  occupation: ISCO-88 (detailed, general)
  education: ISCED (detailed, general)
  family: IPUMS, etc. (detailed, general)
» 4. Enhance with constructed variables

26 Composite coding scheme example: marital status
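
The table for this slide did not survive in the transcript. As a rough illustration of the idea of composite coding, the sketch below uses two-digit codes in which the first digit carries a general category shared by all countries and the second digit carries detail available only in some censuses. The code values and labels are hypothetical and are not the IPUMS marital status codes.

```python
# Hypothetical composite codes: first digit = general category, second = detail.
COMPOSITE = {
    "single": 10,
    "married": 20,
    "consensual union": 21,
    "separated": 31,
    "divorced": 32,
    "widowed": 40,
}

def general_code(composite_code):
    """Return the general (first-digit) category of a composite code."""
    return composite_code // 10

def harmonize(national_label, detail_available=True):
    """Map a national label to its composite code, dropping detail if absent."""
    code = COMPOSITE[national_label]
    return code if detail_available else general_code(code) * 10
```

A researcher comparing many countries can work with `general_code`, while one studying a single detailed census keeps the full two-digit value, so no original information is lost.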

27 Occupation: the ISCO standard. Preliminary release: 1 digit; final: 2-3 or 4 digits, depending upon country
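
ISCO-88 codes are hierarchical, so a coarser release can be derived from a detailed code by keeping its leading digits. A minimal sketch, assuming codes are stored as digit strings (the function name is hypothetical):

```python
def isco_at_level(code, digits):
    """Truncate an ISCO-88 code string to its first `digits` digits."""
    if not 1 <= digits <= 4:
        raise ValueError("ISCO-88 codes have 1 to 4 digits")
    return code[:digits]
```

For a four-digit code such as "2411", `isco_at_level("2411", 1)` gives "2", the major group (professionals), which corresponds to the one-digit level of the preliminary release.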

28 Variable availability, preliminary release

29 IPUMSi DISSEMINATES
Legally-binding license agreement
» protects privacy and confidentiality
» assures proper use
» new sanction: loss of employment
Web-based extraction system: researcher selects
» countries
» censuses
» cases/sub-populations
» variables
» sample densities
» Facilitates comparative research
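
The selections a researcher makes in the extraction system map naturally onto a filter over person records. The sketch below is not the IPUMS extract engine; the function name, parameters, and record layout (dicts with "country" and "year" keys) are assumptions chosen to mirror the list above.

```python
import random

def make_extract(records, countries, censuses, variables,
                 density=1.0, keep=lambda rec: True, seed=0):
    """Build a custom extract from person records (assumed to be dicts).

    countries, censuses: sets of country codes and census years to include
    variables:           names of the variables to keep in each output row
    density:             fraction of matching cases to retain (1.0 = all)
    keep:                predicate selecting cases/sub-populations
    """
    rng = random.Random(seed)
    extract = []
    for rec in records:
        if rec["country"] not in countries or rec["year"] not in censuses:
            continue
        if not keep(rec):
            continue
        if rng.random() >= density:            # thin the sample to the density
            continue
        extract.append({v: rec[v] for v in variables})
    return extract
```

Returning only the requested variables for the requested cases reflects the principle stated earlier in the deck: custom-tailored datasets, never entire samples.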

30 Additional information at: www.hist.umn.edu/~rmccaa/ipums-europe
Contact: rmccaa@umn.edu
* * * * *
Thank you


