
* * * Robert McCaa and Albert Esteve Palos www.ipums.org/international www.iecm-project.org IPUMS-International and Integrated European Census Microdata.


1 IPUMS-International and Integrated European Census Microdata Projects Reduce Risks of Managing Trans-border Access and Add Significant Value * * * Robert McCaa and Albert Esteve Palos, Minnesota Population Center and Centre d’Estudis Demografics--Barcelona. www.ipums.org/international www.iecm-project.org

2 “Dissemination [means] opening up the value inherent in our data.” -- Walter Radermacher and Pieter Everaers, Seminar on Emerging Trends in Data Communication and Statistics, UNSC, New York, Feb. 19, 2010

3 Trans-border access is essential in the 21st century. Many researchers (e.g., demographers, members of IUSSP) reside outside their country of birth:
  New Zealanders 60%
  Dutch 40%
  Germans 38%
  Danes 34%
  Chinese 30%
  Belgians 31%
  British 25%
  Australians 22%
  Canadians, Finns, French, Japanese, Swiss, etc. ~20%
Limiting access to in-country use is old-fashioned, inefficient, costly, and unfair. It encourages violations and brain drain.

4 IPUMS-International: 2012 (Mollweide projection, weighted by population size). Dark green = anonymized, harmonized and disseminating (69 countries, 212 censuses, 480 million person records). Medium green = to be integrated (29 countries, 75 censuses, ~100 million person records). 2012 launch: El Salvador (2), Indonesia (9), Mexico (2010), Morocco (3), Nicaragua (3), Turkey (3), Uruguay (5). Work began in 1999. By 2020 we hope to integrate census microdata of 100 countries, including 2010 round censuses.

5 IPUMS-International. IECM/IPUMS-Europe: 2012 (Mollweide projection, weighted by population size). Dark green = anonymized, harmonized and disseminating (17 countries, 56 censuses, 93 million person records). Medium green = to be integrated (2 countries, 6 censuses, ~5 million person records). Countries not yet participating are invited to consider doing so: Albania, Belgium, Bosnia-H, Croatia, Denmark, Estonia, Finland, Iceland, Latvia, Lithuania, Moldova R., Norway, Russia, Serbia, Slovak R., Sweden, etc.

6 Outline: IPUMS-International & IECM Reduce Risks of Managing Trans-border Access and Add Significant Value
NSOs that disseminate microdata by “going it alone” incur significant risks, substantial costs, & much user dissatisfaction.
  I. IPUMS & IECM offer a “one-stop” comprehensive solution to managing access to census microdata
  II. Statistical Confidentiality and Security
  III. Integration
  IV. Manage trans-border access
  V. Conclusion: Invitation to cooperate, entrust 2010 round census microdata as soon as feasible.

7 I. One-stop, comprehensive solution to disseminating census microdata & metadata… of Europe and the world
  1. Organize: Uniform agreement with each NSO
  2. Administer: We manage approval/denial of user access
  3. Anonymize: We are responsible for data anonymization
  4. Integrate: We do the work
     Metadata: Official language and integrated in English
     Microdata: Integrated globally & optimized for Europe
  5. Disseminate: Extracts, custom-tailored to each request
  6. Share: We share results, comprehensive electronic bibliography
No longer enough to prepare a CD or post a dataset on a web-site.

8 II. Statistical Confidentiality and Security
  A. Microdata security and confidentiality protections
     Employees face fines, job loss, and possible imprisonment for violations
     Security: “best practice” – Dennis Trewin, former Australian Statistician
  B. Statistical disclosure control protections:
     Suppression of records using sub-sampling, names, low-level geography, unique variates
     Paired swapping of geographical identifiers of households to create uncertainty
     Top/bottom coding, global recodes, deletion of digits, etc.
  C. Managing restricted access to microdata (next slide)
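Two of the disclosure controls named on this slide, top coding and paired swapping of household geography, can be illustrated with a minimal Python sketch. This is not the IPUMS or IECM procedure: the field names (`hh_id`, `geo`), the cap value, and the random choice of pairs are all assumptions made only for illustration.

```python
import random

def top_code(value, cap):
    """Top coding: any value above `cap` is reported as `cap`."""
    return min(value, cap)

def swap_geography(households, n_pairs, seed=0):
    """Swap the 'geo' code within n_pairs randomly chosen pairs of households.

    Hypothetical helper: real swapping rules would pick pairs by matching
    household characteristics, which this sketch does not attempt.
    """
    rng = random.Random(seed)
    swapped = [dict(h) for h in households]  # copy so the input stays untouched
    chosen = rng.sample(range(len(swapped)), 2 * n_pairs)
    for a, b in zip(chosen[0::2], chosen[1::2]):
        swapped[a]["geo"], swapped[b]["geo"] = swapped[b]["geo"], swapped[a]["geo"]
    return swapped

households = [
    {"hh_id": 1, "geo": "A"},
    {"hh_id": 2, "geo": "B"},
    {"hh_id": 3, "geo": "C"},
    {"hh_id": 4, "geo": "D"},
]
print(swap_geography(households, n_pairs=1))
print(top_code(97, cap=85))
```

Swapping leaves the overall distribution of geographic codes unchanged while making any single household's published location uncertain, which is the effect the slide describes.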

9 II. Statistical Confidentiality and Security (cont’d.)
  A. Microdata security and confidentiality protections
  B. Statistical disclosure control protections
  C. Managing restricted access to microdata
     Detailed registration form to establish bona-fides
     4/5ths of viewers do not complete the form! --automatic denial
     Conditions of use bind researcher & institution; violations penalize every researcher at institution
     Custom-tailored extracts encourage researchers to jealously guard their downloads.
     More than 5,000 researchers approved for access

10 III. Integration: Metadata & Microdata
  D. Comprehensive source metadata in official language(s)
     Questionnaires, instructions, manuals, etc.
  E. Integrated, DDI compatible metadata: definitions, concepts, variable names, value labels, codes--all link back to sources
     Descriptions of censuses and samples
     Variables defined, comparability discussions
     Example: educational attainment (next slide)
  F. Integrated, pooled microdata: multiple censuses in a single file
  G. Integrated boundary files (GIS) linked to microdata
  H. IPUMS value added variables

11 Example of composite coding: Educational attainment

12 III. Integration: Metadata & Microdata (cont’d.)
  D. Comprehensive source metadata in official language(s)
  E. Integrated, DDI compatible metadata: definitions, concepts, variable names, value labels, codes--all link back to sources
  F. Integrated, pooled microdata: many censuses in single file
  G. Integrated boundary files (GIS) linked to microdata
  H. IPUMS value added variables:
     Technical variables: weights, identifiers
     Family, household info: summary indicators
     Person variables: Locations of mother, father, spouse and rules for linking (momloc, poploc, sploc)
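To show how a location pointer such as momloc can be used, here is a minimal Python sketch that copies a mother's characteristic onto her co-resident child's record. It assumes that momloc holds the mother's person number within the same household and is 0 when no mother is present. The record layout (`serial`, `pernum`, `educ`) and the helper name `attach_mother` are illustrative, not the actual extract-system code.

```python
def attach_mother(persons, var):
    """Add '<var>_mom' to each person record by following the momloc pointer.

    Assumes 'serial' identifies the household, 'pernum' the person within it,
    and momloc == 0 means no mother is present in the household.
    """
    by_key = {(p["serial"], p["pernum"]): p for p in persons}
    out = []
    for p in persons:
        mom = by_key.get((p["serial"], p["momloc"])) if p["momloc"] else None
        out.append({**p, f"{var}_mom": mom[var] if mom else None})
    return out

persons = [
    {"serial": 1, "pernum": 1, "momloc": 0, "educ": 3},  # mother
    {"serial": 1, "pernum": 2, "momloc": 1, "educ": 1},  # her child
    {"serial": 2, "pernum": 1, "momloc": 0, "educ": 2},  # lives alone
]
for row in attach_mother(persons, "educ"):
    print(row)
```

The same lookup would work for poploc and sploc by changing the pointer field.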

13 IV. Managing Trans-border Access
  I. Trans-border access: uniform experience for access to all countries, regardless of nationality
  J. Custom-tailored extracts: user selects country(ies), censuses, variables, sub-populations
     Extract engine fulfills request, generates custom-tailored microdata and metadata
     3 unique IPUMS extract tools:
       1. Select cases
       2. Attach characteristics
       3. Customize sample size
  K. Usage: 8,048 extracts in 2011; 40,142 samples. See next page.
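The idea of a custom-tailored extract (choose samples, choose variables, select cases, limit the sample size) can be sketched in a few lines of Python. This is only an illustration under assumed names: `make_extract`, the `sample` field, and sample codes such as `MX2000` are hypothetical and do not describe the real extract engine.

```python
import random

def make_extract(records, samples, variables, keep=lambda r: True,
                 max_cases=None, seed=0):
    """Build a small extract from pooled records.

    samples   : set of sample codes to include (hypothetical codes)
    variables : names of the variables to keep in the output
    keep      : case-selection rule applied to each record
    max_cases : optional cap on the number of records returned
    """
    selected = [r for r in records if r["sample"] in samples and keep(r)]
    if max_cases is not None and len(selected) > max_cases:
        # Simple random sub-sample; weights are not rescaled in this sketch.
        selected = random.Random(seed).sample(selected, max_cases)
    return [{v: r[v] for v in variables} for r in selected]

records = [
    {"sample": "MX2000", "age": 34, "sex": 2},
    {"sample": "MX2000", "age": 12, "sex": 1},
    {"sample": "MX2000", "age": 51, "sex": 1},
    {"sample": "MX2000", "age": 27, "sex": 2},
    {"sample": "BR2000", "age": 40, "sex": 2},
]
extract = make_extract(records, {"MX2000"}, ["age", "sex"],
                       keep=lambda r: r["age"] >= 15, max_cases=2)
print(extract)
```

Because each request produces its own file, no two researchers necessarily hold the same dataset, which fits the slide's point about custom-tailored downloads.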

14 Disclosure Controls for Trans-Border access to Census Microdata via a Single License, Access Point: The IPUMS-IECM partnership * * * Robert McCaa and Albert Esteve Palos, Minnesota Population Center and Centre d’Estudis Demografics--Barcelona. www.ipums.org/international
“You have to do due diligence, something to assure yourself that the people you’re giving your data to can be trusted.” --http://www.nytimes.com/2011/09/09/us/09breach.html?hp
IPUMS-International Google Analytics: 2011. Trans-Border Access: 169 countries/territories, 3,033 cities, 45,000 page views. Up 4X from 2010.

15 Table 2. Rank of the Top Five and all European Countries plus Canada and the USA by Number of Extracts for the 2000 round census (statistics for calendar year 2011)

Rank  Country          Sample %*  Variables (n)*  Years of census samples          Extracts
1     Brazil           5          106             1960, 70, 80, 91, 2000           712
2     Mexico           10         120             1960p, 70, 90, 95, 2000, 05      626
3     United States    5          92              1960, 70, 80, 90, 2000, 05       554
4     Colombia         10         120             1964p, 72, 85, 93, 2005          516
5     South Africa     10         108             1996, 2001, 2007                 428
7     Canada           2.5        59              1971p, 81p, 91p, 2001p           409
9     France           33         94              1962, 68, 75, 82, 90, 99, 06     380
10    Spain            5          99              1981, 91, 2001                   366
13    Greece           10         89              1971, 81, 91, 2001               327
18    Austria          10         75              1971, 81, 91, 2001               310
25    Italy            5          81              2001                             285
26    Portugal         5          96              1981, 91, 2001                   283
29    Romania          10         97              1976, 92, 2002                   272
30    Switzerland      5          79              1970, 80, 90, 2000               266
32    United Kingdom   3          47              1991, 2001p                      263
38    Hungary          5          74              1970, 80, 90, 2001               222
42    The Netherlands  1          33              1960p, 71p, 2001p                211
45    Slovenia         10         80              2002                             185
48    Belarus          10         84              1999                             179

Total samples extracted for 55 countries (162 samples) available from January 1, 2011: 8,048
*2000 round census; refers to all integrated variables, including IPUMS constructed variables. “p” = person sample; all other samples are of households.

16 IECM value-added (in beta test): Password protected, trans-border on-line tabulator

17 Reflections
  Substantial returns to NSOs; no cost: economies of scale, low risk.
  96 NSOs are participating.
  If yours is not, let’s discuss how to resolve the obstacles: amend legislation, revise regulations, advocate statistical transparency, etc.
  Entrust 2011 census microdata, as soon as feasible.
  Provide boundary files at low-level geography for each census possible.

18 IPUMS at the 59th ISI (Hong Kong, Aug 24-30, 2013) http://www.isi2013.hk/ » IPUMS Workshop » Microdata session » IPUMS Funding for delegates from developing countries » IPUMS booth

19 Thank you. If your NSO is not participating yet, please contact: rmccaa@umn.edu. When processing of your 2011 census microdata is completed, please contact: rmccaa@umn.edu

