1
www.ipums.org/international
IPUMS-Eurasia, 2003-2007: Preserving Eurasian census microdata, making them useful, and promoting their use
* * *
Robert McCaa, Steven Ruggles, Matthew Sobek, Deborah Levison and Miriam King
University of Minnesota Population Center
rmccaa@umn.edu
2
IPUMS-Eurasia before Europe? If so, the following needs to be done now:
» Official:
» Formalize agreement
» Release 1989 & 1994 samples for project development
» Unofficial, agree upon:
» Sample density: entire long-form preferred; 10% OK
» License fee: $$$ proportional to sample density
» Division of tasks (provisional): equitable
» Calendar (provisional): begin in 2003
» 1989 sample: OK? Or will a new one be drawn?
» 1979 and 1970: do any microdata tapes still exist?
3
…official statistics that meet the test of practical utility are to be compiled and made available on an impartial basis by official statistical agencies to honor citizens’ entitlement to public information.
-- UN Statistical Commission, 1994
Widespread Internet Technology diffusion is “a pre-requisite for the development of civil society based on free access to information through the global Internet.”
-- President Putin, March 6, 2001
http://president.kremlin.ru/events/178.html
4
INTERNATIONAL IPUMS
» Easy-to-use web-interface
» Highest scientific standards
» Proven, powerful integration
» A quantum leap in usage
Imagine a new statistical product: scientifically anonymized, integrated census microdata samples made up of unidentifiable individuals...
» 1998: 1 country signed
» 1999: 3 countries
» 2000: 9
» 2001: 15
» 2002: 32; first release, 6 countries
5
Before Europe?
IPUMS-EURASIA
Eurasia Phase: 2003-2007
Advantages of a Eurasia-phase, before Europe
» Statistical coherence of 1989/2000 censuses
» Readily organizable
» 12 countries, not 40
» One linguistic standard: Russian
Progress on negotiating agreements
» Technical OKs: Belarus, Moldova Republic
» Negotiating: Armenia, Azerbaijan Republic, Georgia, Kazakhstan, Kyrgyz Republic, Russia, Tajikistan, Turkmenistan, Ukraine, Uzbekistan
» Not participating: none, as yet.
6
BENEFITS
IPUMSi
» Researchers, world-wide: free, high quality data; harmonized, comprehensive
» National Statistics Institutes: increased usage; enhanced cost-benefit ratio; payment for license fees, expertise
» People: who we are; what the future may bring; how policies might improve
7
IPUMS-International, a global collaboratory of National Statistical/Research Institutes:
» 1. Inventories the world’s census microdata
» 2. Preserves endangered microdata and documentation
* * *
» 3. Integrates datasets of selected countries using UNSD, Eurostat and other standards
» 4. Anonymizes census microdata to preserve statistical confidentiality, using highest standards
» 5. Disseminates customized extracts free of charge (with complete copies on CDs to all partners)
Integrated Public Use Microdata Series - International
8
PARTNERS
IPUMSi
Phase 1: 1999-2004
Brazil: 1960, 1970, 1980, 1991, 2001
Colombia: 1964, 1973, 1985, 1993, 2003
Mexico: 1960, 1970, 1980, 1990, 2000
France: 1962, 1968, 1975, 1982, 1990
Hungary: 1970, 1980, 1990, 2000
Spain: 1981, 1991, 2001
Kenya: 1989, 1999
Ghana: 1984, 2000
China: 1982, 1990, 2000
Vietnam: 1989, 1999
USA: 1850-1880, 1900-1930, 1940-2000
9
IPUMS-Latin America, 2003-2007: 16 countries, ~500m. people
» Scope: Latin American census microdata, 1960-present
» Work Plan (funded by National Institutes of Health)
» 2001-2: Sign licensing agreements with official agencies
» 2002: Obtain funding from U.S. NIH
» 2003: Develop/translate microdata & metadata
» 2004: Country expert teams design national integrations
» 2005: MPC/expert teams design regional integration
» 2006: MPC integrates microdata and metadata
» 2007: MPC disseminates to bona fide researchers who sign non-disclosure license. National census/research institutes disseminate via CDs/web.
10
PARTNERS
IPUMS-EUROPE
Europe Phase: 2004-8
Phase 1 European partners:
» INSEE-France: 1962, 1968, 1975, 1982, 1990
» CSO-Hungary: 1970, 1980, 1990, 2000
» INE-Spain: 1981, 1991, 2001
Phase 2, 2004-2007:
» 10 OK: Austria, Bulgaria, Czech Republic, Germany, Ireland, Lithuania, Poland, Romania, Slovenia, UK
» 5 Approval pending: Finland, Iceland, Israel, Norway, Portugal
» 11 Negotiating: Belgium, Denmark, Greece, Italy, Latvia, Netherlands, Russia, Sweden, Switzerland, Turkey, Yugoslavia
» 2 Not participating: Estonia, Slovakia
11
PRESERVES
IPUMSi
UN Demographic Center for Latin America (CELADE, Santiago, Chile) ~3000 microdata tapes preserved
and metadata (documentation)
12
12100102600700720000011210000104
22200202600700720000011210000104
32300100600700720000012123000000
42300200400700000000000000000000
52300200200700000000000000000000
62300200000700000000000000000000
Census microdata of the late 20th century: Who will preserve them? Who will make them useful?
Census microdata: Public goods should be democratized. Censuses are costly. Where microdata are available, they are used.
13
SAMPLES
IPUMSi
14
PAYS
IPUMSi
National experts are paid to:
» Assemble microdata and documentation
» Develop samples
  » to minimize confidentiality risks
  » and to maximize robustness
» Design national/regional integration plan
  » census-by-census
  » concept-by-concept
  » code-by-code
» Write integrated documentation
National Statistical Institutes are paid a non-exclusive license fee for integrated data
15
INTEGRATES
IPUMSi
Photos from Colombia integration project, February-March, 2000: 4 experts from DANE (census office) + 7 academics (3 universities)
Standard: UN/Eurostat Principles & Recs...
Census documentation compiled for Colombian microdata
16
IPUMSi integration principles
» 1. Respect absolute anonymity and confidentiality
» 2. Preserve all original data, except adjustments to ensure privacy (top codes, blurrings, masking, re-ordering, etc.)
» 3. Harmonize codes using international standards
  occupation: ISCO, HISCO (detailed, general)
  education: ISCED (detailed, general)
  family: IPUMS, etc. (detailed, general)
» 4. Enhance with constructed variables
17
Variable availability, preliminary release
18
Composite coding scheme example: marital status
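A composite code of this kind can be sketched in a few lines of Python. The sketch assumes that the leading digit carries a general category comparable across countries and that the trailing digits carry detail retained from a national census. The specific codes and labels are hypothetical placeholders, not the scheme shown on the slide.

```python
# Hypothetical composite codes: first digit = general category,
# remaining digits = detail kept from a national census.
COMPOSITE_MARITAL = {
    100: "single / never married",
    210: "married, civil",
    220: "married, religious",
    230: "consensual union",
    310: "separated",
    320: "divorced",
    400: "widowed",
}

# Hypothetical one-digit general categories shared by all countries.
GENERAL_MARITAL = {
    1: "single / never married",
    2: "married / in union",
    3: "separated / divorced",
    4: "widowed",
}


def general_code(composite: int) -> int:
    """Collapse a three-digit composite code to its one-digit general category."""
    return composite // 100


def general_label(composite: int) -> str:
    """Return the general-category label for a composite code."""
    return GENERAL_MARITAL[general_code(composite)]
```

A researcher comparing countries would use `general_code`, while a single-country study could keep the full composite code and its national detail.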
19
Occupation: the ISCO standard
preliminary release: “1” digit
final: 2-3 or 4 digit, depending upon country
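Because ISCO codes are hierarchical, a detailed code can be reduced to a coarser level by keeping its leading digits. The sketch below assumes four-digit ISCO-88 codes stored as strings; the slide does not say which ISCO revision is meant, so the revision and the function name `isco_truncate` are assumptions.

```python
# ISCO-88 major groups (one-digit level).
ISCO88_MAJOR_GROUPS = {
    "1": "Legislators, senior officials and managers",
    "2": "Professionals",
    "3": "Technicians and associate professionals",
    "4": "Clerks",
    "5": "Service workers and shop and market sales workers",
    "6": "Skilled agricultural and fishery workers",
    "7": "Craft and related trades workers",
    "8": "Plant and machine operators and assemblers",
    "9": "Elementary occupations",
    "0": "Armed forces",
}


def isco_truncate(code: str, digits: int = 1) -> str:
    """Keep the first `digits` digits of a hierarchical ISCO code."""
    if not code.isdigit() or not 1 <= digits <= len(code):
        raise ValueError(f"cannot truncate {code!r} to {digits} digit(s)")
    return code[:digits]
```

With this, a one-digit preliminary release and a more detailed final release could be derived from the same stored codes by changing `digits`.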
20
ANONYMIZES
IPUMSi
» Suppress geographical detail
» Blur/aggregate sensitive codes
» Convert dates to ages (blur key vars.)
» Swap cases between districts
» Scramble records
Using the highest standards available: administrative (license), legal, and technical (US Census Bureau, Eurostat, & others)
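Three of the steps listed above (suppressing geographic detail, converting dates to ages, and scrambling records) can be illustrated with a minimal Python sketch. The field names, the top-code of 90 and the census day in the usage note are hypothetical; this is not the project's actual procedure, and swapping cases between districts is not shown.

```python
import random
from datetime import date


def age_at(census_day: date, birth: date) -> int:
    """Completed years of age on census day."""
    had_birthday = (census_day.month, census_day.day) >= (birth.month, birth.day)
    return census_day.year - birth.year - (0 if had_birthday else 1)


def anonymize(records, census_day, top_age=90, seed=0):
    """Return anonymized copies of `records` in scrambled order."""
    out = []
    for rec in records:
        out.append({
            # Suppress geographic detail: keep only the major civil division.
            "province": rec["province"],
            "sex": rec["sex"],
            # Convert date of birth to age, then top-code rare high ages.
            "age": min(age_at(census_day, rec["birth_date"]), top_age),
        })
    # Scramble record order so file position reveals nothing.
    random.Random(seed).shuffle(out)
    return out
```

For example, calling `anonymize(records, date(2000, 6, 30))` on records that carry `province`, `district`, `sex` and `birth_date` fields returns records with only `province`, `sex` and a top-coded `age`.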
21
‘statistical confidentiality’ shall mean the protection of data related to single statistical units which are obtained directly for statistical purposes or indirectly from administrative or other sources against any breach of the right to confidentiality. It implies the prevention of non-statistical utilization of the data obtained and unlawful disclosure.
-- COUNCIL REGULATION (EC) No 322/97 of 17 February 1997
22
Anonymization plan: Kenya, 1989
Kenya: Anonymization Based on Unique Characteristics Threshold (100,000 for geographic variables; 10,000 for other variables)
Type | Procedure | Variable Name
Key | Suppressed | Division, Location, Sublocation, Enumeration area
Key | Aggregated | 100,000 minimum: Province, District of Residence, Birth and Past Residence
Key | None | Sex, Marital Status, Relationship to Head
Sensitive | Aggregated | 10,000/1,000 minimum: Tribe/Ethnicity, Occupation, Employment Status
Transitory (information is considered too changeable to be used to identify individuals from microdata) | None | Age, Urban/Rural Residence, Literacy, Educational Status, Educational Level, Labor Activity, Children Everborn/Alive/Dead, Last Birth Year, Mortality variables
Note: For greater detail and a reproduction of the 1989 enumeration form, see Appendix 3.
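The threshold rule in this plan can be sketched as follows: any category that represents fewer persons than the minimum is folded into a residual code. The function name, the `weight` parameter (persons represented by each sample record) and the `"OTHER"` label are assumptions made for illustration.

```python
from collections import Counter


def aggregate_rare(values, threshold, weight=1, other="OTHER"):
    """Replace categories representing fewer than `threshold` persons.

    `values` holds one category code per sample record; each record is
    assumed to represent `weight` persons in the population.
    """
    counts = Counter(values)
    return [v if counts[v] * weight >= threshold else other for v in values]
```

A fuller implementation would also check that the residual category itself reaches the threshold, and merge it with a neighbouring category if it does not.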
23
EUROSTAT statistical anonymity standards (Thorogood, 1999) -- all used by IPUMS-International
» 1. small sample size
» 2. limited geographical detail
» 3. top and bottom coding of unique categories
» 4. signed non-disclosure agreement
» 5. prohibit redistribution of datasets to third parties
» 6. prohibit attempts to identify individuals or making any claim to that effect
» 7. require users to provide copies of publications
24
EUROSTAT statistical anonymity standards (Thorogood, 1999) -- all used by IPUMSi and more
» 8. Age (constructed, where necessary)
» 9. Never identify date of birth
» 10. Never identify place of birth
» 11. Migration: timing and place not identified in detail
» 12. Place of residence identified by major civil division (pop>60k, 120k, 250k, 1 million--national rule)
» 13. Sensitivity analysis of variables by national experts
» 14. Confidentiality assessment by national experts
25
International Monetary Fund’s General Data Dissemination System
52 countries with uniform standards
» All embrace strict standards of statistical confidentiality
» All prohibit disclosure of information which may identify individuals or entities
» And 37 of 52 countries distribute census microdata samples
» Why not Russia, Armenia, Azerbaijan Republic, Belarus, Georgia, Kazakhstan, Kyrgyz Republic, Moldova Republic, Tajikistan, Turkmenistan, Ukraine, or Uzbekistan?
26
DISSEMINATES
IPUMSi
Legally-binding license agreement
» protects privacy and confidentiality
» assures proper use;
» new sanction: loss of employment.
Researcher selects
» Countries,
» Censuses,
» Cases/sub-populations,
» Variables, and
» Sample densities -- makes chronological &/or cross-national research possible
Open architecture software and mirror sites
Web-based extraction system
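The researcher's choices listed above map naturally onto a filter over pooled records. The following Python sketch assumes in-memory records stored as dictionaries with `country` and `year` fields; the function `make_extract`, its parameters, and the use of systematic thinning to lower sample density are all hypothetical and do not describe the actual extraction system.

```python
def make_extract(records, samples, variables, case_filter=None, keep_every=1):
    """Build a customized extract from pooled census records.

    samples     -- (country, census year) pairs to include
    variables   -- names of the variables to keep
    case_filter -- optional predicate selecting cases/sub-populations
    keep_every  -- 1 keeps full density, 10 keeps every tenth selected case
    """
    wanted = set(samples)
    selected = [
        rec for rec in records
        if (rec["country"], rec["year"]) in wanted
        and (case_filter is None or case_filter(rec))
    ]
    thinned = selected[::keep_every]  # systematic thinning of the selection
    return [{name: rec[name] for name in variables} for rec in thinned]
```

Pooling samples from several countries or censuses in one call is what makes the chronological or cross-national comparison possible: every returned record carries the same harmonized variables.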
27
IPUMS-Eurasia, 2003-2007: 12 countries, >280 m. people
» Scope: Eurasia census microdata, 1989-present
» Work Plan (contingent upon funding):
» Jan 2003: Sign licensing agreements with official agencies
» Nov 2003: Obtain funding from US NIH
» 2004: Pay licenses/sign contracts to develop/translate microdata & metadata
» 2005: Country expert teams design national integrations
» 2006: MPC/expert teams design Eurasia integration
» 2007: MPC integrates microdata and metadata
» 2008 and beyond: MPC disseminates to bona fide researchers who sign non-disclosure license. National census/research institutes disseminate via CDs/web.
28
On a millennial scale, censuses and census microdata survive for only a short, but significant period
29
IPUMS-Eurasia, 2003-2007: What needs to be done now?
» Official:
» Formalize agreement
» Release 1989 & 1994 samples for project development
» Unofficial, agree upon:
» Sample density: entire long-form preferred; 10% OK
» License fee: $$$ proportional to sample density
» Division of tasks (provisional): equitable
» Calendar (provisional): begin in 2003
» 1989 sample: OK? Or will a new one be drawn?
» 1979 and 1970: do any microdata tapes still exist?
30
additional information at: www.hist.umn.edu/~rmccaa/ipums-eurasia
contact: rmccaa@umn.edu
* * * * *
Thank you