Published by Stéphanie Vilarinho Escobar. Modified over 6 years ago.
1
Nordic Demography Symposium, Tjøme 2001
The census in global perspective and the coming census microdata revolution
Robert McCaa & Steven Ruggles
Minnesota Population Center
IPUMS International, funded by the National Science Foundation
2
Subtext: Why should Nordic countries participate in a project to preserve the world’s census microdata and help make them usable?
- Longest historical series of census microdata in the world
- Cross-national research on a global scale requires representation of all cultural regions
- Intriguing demographic, historical laboratory
- Large pool of scientific talent with global concerns
- Persisting cultural, scientific ties with Minnesota (would, for example, U. of Texas be as interested?)
3
Globalization of the census & the coming census microdata revolution
1. Introduction: census & census microdata
2. The population census goes global: coverage, periodicity, and content
3. Liberating census microdata: preservation, anonymization, integration, & dissemination
4. Statistical confidentiality and census samples: a 36-year-long perfect record
5. International norms of statistical confidentiality
6. Harmonizing and disseminating scientifically anonymized census samples: IPUMSi
4
1. Introduction
- The census: what is it?
- Census microdata: what are they?
- How can they be made usable?
- Why should we care?
5
16th c. “census” of Mexico (Nahuatl, 1530s). “Here is the home of one...” (from Museum of Anthropology, Mexico City)
Original ms., transcribed, translated, digitized
6
16th c. “census” of Mexico (Nahuatl, 1530s). “Here is the home of one...” (from Museum of Anthropology, Mexico City)
Original ms., transcribed, translated, digitized
When is a census a census? Goyer (1986):
1. National legal authority
2. Defined enumeration area
3. Complete coverage
4. Simultaneous enumeration
5. Individual enumeration
6. Periodic enumeration
7. Publication of results
8. Dissemination of results
7
An Aztec extended family, 1530: 5 conjugal units, 4 generations, 3 married brothers
[Household diagram labels: Simply an old widow; Female, 20, not yet married; Married male; Married female; Married female; Married male (1 yr. ago); Married head of house; Married female; Married male (1 yr. ago); Married female; Male, 10 years old, not married]
8
450 years later: an example of a patrilateral household from rural Morelos, 1990: 5 conjugal unions, 3 generations
[Household diagram labels: Married head, 50; Married, 48; Son, 15; Daughter, 10; Son, 22 free union; Daughter, 22; Daughter, 14, free union; Free union, 21; Free union, 25; Free union, 25 years old; Free union, 29; Daughter, 5; Son, 2; Daughter, months old; Daughter, 2; Free union, 19; Free union, 16; (not kin)]
9
Examples to percentages: Have there been changes in 4 1/2 centuries?
[Two charts, each with the categories: head, spouse, child, kin, non-kin]
10
Census microdata of the late 20th century:
- What are they?
- Who bears preservation responsibility?
- Who will make them usable?
[Sample record layout: person number, age, sex]
Census microdata:
- Censuses are costly
- Public goods should be democratized
- Where microdata are available, they are used
11
Globalization of the census & the coming census microdata revolution
1. Introduction: census & census microdata
2. The population census goes global: coverage, periodicity, and content
3. Liberating census microdata: preservation, anonymization, integration, & dissemination
4. Statistical confidentiality and census samples: a 36-year-long perfect record
5. International norms of statistical confidentiality
6. Harmonizing and disseminating scientifically anonymized census samples: the case of IPUMSi
12
2. The population census goes global.
- Coverage becomes universal (thanks to A.N. Kiær, Statistics Norway, who promoted globalization of the census at the beginning of the 20th c.)
- Content becomes uniform
- Decennial censuses become the norm
13
Population censuses became universal in the 20th century.
Will census microdata ... in the 21st?
153 countries with 1 million + pop. in 2000
2000 round figures are provisional
14
Content ... increasingly uniform, principal source of population information.
Social variables:
15
Content ... increasingly uniform education and migration variables:
16
Content ... increasingly uniform demographic and economic variables:
17
Decennial censuses are the rule ( ).
Of 153 countries with 1 million + pop., totaling 6 billion people in 2000:
- At least one census per decade: 66 countries, 50% of world’s population
- Missed a single decennial enumeration: 43 countries, 38% of world’s population
- Missed 2 or 3 enumerations: 32 countries, 10% of pop.
- Fewer than 3 enumerations: 12 countries, % of pop.
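As a quick consistency check on the tally above, the four country counts can be summed and compared with the stated total of 153. This is only an arithmetic sketch; the variable names are illustrative, and the population share of the last group is left as None because it is not given in the source.

```python
# Country counts and population shares as listed on the slide.
# The share for the last group is missing in the source, so it stays None.
groups = [
    ("at least one census per decade", 66, 50),
    ("missed a single decennial enumeration", 43, 38),
    ("missed 2 or 3 enumerations", 32, 10),
    ("fewer than 3 enumerations", 12, None),
]

# Sum the country counts across all four groups.
total_countries = sum(count for _, count, _ in groups)

# Sum only the population shares that the slide actually states.
stated_share = sum(share for _, _, share in groups if share is not None)

print(total_countries)  # number of countries covered by the four groups
print(stated_share)     # percent of world population accounted for by stated shares
```

The counts add up to the 153 countries named in the heading, so the four groups appear to be exhaustive.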
18
On a millennial scale, censuses and census microdata survive for only a short, but significant period
19
Globalization of the census & the coming census microdata revolution
1. Introduction: census & census microdata
2. The population census goes global: coverage, periodicity, and content
3. Liberating census microdata: preservation, anonymization, integration, & dissemination
4. Statistical confidentiality and census samples: a 36-year-long perfect record
5. International norms of statistical confidentiality
6. Harmonizing and disseminating scientifically anonymized census samples: the case of IPUMSi
20
…official statistics that meet the test of practical utility are to be compiled and made available on an impartial basis by official statistical agencies to honor citizens’ entitlement to public information.
-- UN Statistical Commission, 1994
21
IPUMSi helps five ways:
1. Inventory the world’s census microdata
2. Preserve endangered microdata and documentation
3. Anonymize census microdata to preserve statistical confidentiality, using highest standards (Stat. Nether.)
4. Integrate datasets of selected countries using UN, Eurostat and other standards
5. Disseminate database free with complete copies to all partners
(IPUMSi = Integrated Public Use Microdata Series - International)
22
IPUMSi INVENTORIES
Microdata...for any population or administrative division: nation, province, district, city, ethnic group, etc.
Example: Latin America, countries
- 67 censuses inventoried
- 1% - 100% sample densities
- 100,000 to 150 million cases
[Counts of censuses by period (19th century, 1960s, and three later decades) are illegible in the source; the last reads 17]
Found: complete census data for Colombia 1973 and 16 other countries
23
IPUMSi PRESERVES
UN Demographic Center for Latin America (CELADE, Santiago, Chile): ~3000 microdata tapes to be preserved, and metadata (documentation)
24
Preserve against accident, deterioration and technological obsolescence
Microdata:
- transfer to stable media
- use standard data storage protocols
- entrust copies with at least two depositories
Metadata: collect, catalogue, and reproduce
- Enumeration forms (preserve all versions used)
- Enumerator and data processing instructions
- Codebooks (photocopies and scanned images)
- Technical studies, evaluations, reports
UN Stat. Div.: entire archive deposited, to be scanned
25
Globalization of the census & the coming census microdata revolution
1. Introduction: census & census microdata
2. The population census goes global: coverage, periodicity, and content
3. Liberating census microdata: preservation, anonymization, integration, & dissemination
4. Statistical confidentiality and census samples: a 36-year-long perfect record
5. International norms of statistical confidentiality
6. Harmonizing and disseminating scientifically anonymized census samples: the case of IPUMSi
26
How anonymized census samples became a standard statistical product:
US Census Bureau:
- census 0.1% “public use microdata series”
- census: six 1% samples harmonized with 1960
- 1984: 1940, % samples
- 1980, 1990 samples: varying densities, contents
CELADE: Latin America
- 1960s: 16 countries, densities 1-5%
- 1970s: 19 countries, 1-10%
27
How anonymized census samples became a standard statistical product:
Canada:
- 1971, 1976, 1981, 1986, 1991, 1996: varying designs, densities
- 1996: Data Liberation Initiative led to an explosion of usage in research and teaching
UK:
- 1991: 2% individuals, 0.5% households; hundreds of publications, thousands of users
- 2001: double the densities because confidentiality assessments were too conservative.
28
Risk assessment of statistical confidentiality:
Take into account error, coding variability and changes in personal characteristics over time
Dale and Elliott, JRSS-A (forthcoming):
“For a user of an outside database, attempting this sort of match with no opportunity for verification would prove fruitless. In the first place, the small degree of expected overlap would be a considerable deterrent to an intruder. However, if a match between the two files was attempted the large number of apparent matches would be highly confusing as an intruder would have no way of checking correct identification.”
29
Statistical confidentiality in the USA: a brief history
Before 1954:
- 1850: “exclusively for the use of the government, and not to be used...to the gratification of curiosity...”
- 1920s: deny access to data on individuals
- 1942: refused to supply War Dept. w/ addresses of Japanese-Americans
After 1954:
- census microdata do not reveal identities of individuals
- basic geographical identifiers, low sample densities, masking, swapping, top-coding, re-coding
In practice, not a single breach or allegation of a breach!
30
Heightened concerns about confidentiality in USA
- Assault on privacy by businesses
- Distrust of “government”
Never a question of use of census microdata. Yet must avoid any possible perception of misuse to retain confidence and cooperation of citizens.
Pro-active strategy:
- Publicize confidentiality safeguards
- Offer a variety of microdata products: higher risks, higher security
- Data enclaves: expensive, low usage, exceedingly detailed microdata
31
Globalization of the census & the coming census microdata revolution
1. Introduction: census & census microdata
2. The population census goes global: coverage, periodicity, and content
3. Liberating census microdata: preservation, anonymization, integration, & dissemination
4. Statistical confidentiality and census samples: a 36-year-long perfect record
5. International norms of statistical confidentiality
6. Harmonizing and disseminating scientifically anonymized census samples: the case of IPUMSi
32
‘statistical confidentiality’ shall mean the protection of data related to single statistical units which are obtained directly for statistical purposes or indirectly from administrative or other sources against any breach of the right to confidentiality. It implies the prevention of non-statistical utilization of the data obtained and unlawful disclosure.
-- COUNCIL REGULATION (EC) No 322/97 of 17 February 1997
33
Statistical confidentiality standards in Eurostat countries (* = in IPUMSi consortium)
Norway: Statistics Norway is prohibited from publishing or disclosing data from which information about individual persons or firms can be derived. Researchers may be given access to such information under strict rules and conditions. Guidelines provided by the Norwegian Data Inspectorate form the framework for internal management of data security.
Other countries with strict provisions: *Austria, Canada, Denmark, Finland, *France, Germany, Ireland, Netherlands, Sweden
34
Anonymized census microdata sample availability for European countries (* = in IPUMSi consortium, * = negotiating)
15 countries available via PAU, 1990 round (3 in IPUMSi): Belgium, Czech Republic, Estonia, Finland, *Hungary, *Italy, Latvia, Lithuania, Norway, Poland, *Spain, Sweden, Switzerland, Turkey, *UK
11 countries not available via PAU (2 in IPUMSi): *Austria, Croatia, Denmark, *France, Germany, Iceland, Ireland, Netherlands, Portugal, Slovak Republic, Slovenia
35
EUROSTAT statistical anonymity standards (Thorogood, 1999) -- all accepted by IPUMSi
1. small sample size
2. limited geographical detail
3. top and bottom coding of unique categories
4. signed non-disclosure agreement
5. prohibit redistribution of datasets to third parties
6. prohibit attempts to identify individuals or making any claim to that effect
7. require users to provide copies of publications
36
EUROSTAT statistical anonymity standards (Thorogood, 1999) -- all accepted by IPUMSi, and more
8. Age (constructed, where necessary)
9. Never identify date of birth
10. Never identify place of birth
11. Migration: timing and place not identified in detail
12. Place of residence identified by major civil division (pop>60k, 120k, 250k, 1 million -- national rule)
13. Sensitivity analysis of variables by national experts
14. Confidentiality assessment by national experts
37
International Monetary Fund’s General Data Dissemination System
- 52 countries with uniform standards
- All embrace strict standards of statistical confidentiality
- Prohibit disclosure of information which may identify individuals or entities
- 37 countries distribute anonymized census microdata samples
38
Globalization of the census & the coming census microdata revolution
1. Introduction: census & census microdata
2. The population census goes global: coverage, periodicity, and content
3. Liberating census microdata: preservation, anonymization, integration, & dissemination
4. Statistical confidentiality and census samples: a 36-year-long perfect record
5. International norms of statistical confidentiality
6. Harmonizing and disseminating scientifically anonymized census samples: the case of IPUMSi
39
IPUMSi: Making the data usable ... and used.
IPUMSi, ~20 countries
40
IPUMSi PAYS
National experts in each country are contracted to:
- Assemble microdata and documentation
- Develop samples to minimize confidentiality risks and maximize robustness
- Design national integration plan: census-by-census, concept-by-concept, code-by-code
- Write integrated documentation
41
IPUMSi INTEGRATES
Census documentation compiled for Colombian microdata
Standard: UN/Eurostat Principles & Recs...
Photos from Colombia integration project, February-March, 2000: 4 experts from DANE (census office) + 7 academics (3 universities)
42
IPUMSi integration principles
1. Respect absolute anonymity
2. Preserve all original data, except adjustments to ensure privacy (top codes, blurring, masking, re-ordering, etc.)
3. Harmonize codes for countries
- occupation: ISCO, HISCO (detailed, general)
- education: ISCED (detailed, general)
- family: IPUMS, etc. (detailed, general)
4. Enhance with constructed variables
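Principles 2 and 3 can be illustrated with a small crosswalk: each national code is mapped to a shared category while the original code is kept on the record. This is a minimal sketch under stated assumptions; the country names, national codes, category labels, and the `harmonize` helper are all hypothetical and do not reproduce ISCED or any actual IPUMSi coding scheme.

```python
# Hypothetical national education codes mapped to a hypothetical shared scheme.
# None of these codes come from the presentation or from ISCED itself.
CROSSWALKS = {
    "country_a": {1: "none", 2: "primary", 3: "secondary", 4: "tertiary"},
    "country_b": {"A": "none", "B": "primary", "C": "primary",
                  "D": "secondary", "E": "tertiary"},
}


def harmonize(record, country):
    """Return a copy of the record with a harmonized field added.

    The original national code is preserved (principle 2); codes that the
    crosswalk does not cover are labelled "unknown" rather than guessed.
    """
    out = dict(record)
    out["educ_harmonized"] = CROSSWALKS[country].get(record["educ"], "unknown")
    return out


rows = [
    harmonize({"age": 34, "educ": 3}, "country_a"),
    harmonize({"age": 51, "educ": "C"}, "country_b"),
    harmonize({"age": 19, "educ": 9}, "country_a"),  # code missing from crosswalk
]

for row in rows:
    print(row)
```

Keeping the source code next to the harmonized one lets a researcher fall back on national detail when the shared categories are too coarse.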
43
IPUMSi INTEGRATES
10 projects started (first 18 months):
- USA
- France , 1968, 1975, 1982, 1990
- Norway , 1865, 1875; negotiating: 1960, 1970, 1980, 1990, 2001
- Canada , 1881, 1901; negotiating:
- United Kingdom (1851, 1881), 1991; negotiating: 1961, 1971, 1981, 2001
- Argentina 1869, 1895
- Colombia 1964, 1973, 1985, 1993, 2003
- Vietnam 1989, 1999
- Hungary 1970, 1980, 1990, 2000
44
IPUMSi INTEGRATES
5 projects planned:
- Mexico 1960, 1970, 1980, 1990, 2000
- Spain 1981, 1991, 2001
- Brazil 1960, 1970, 1980, 1991, 2001
- China 1982, 1990, 2000
- Kenya 1989, 1999
3 negotiations underway:
- Ghana , 2000
- Italy , 1991, 2001
- Austria , 1981, 1991, 2001
45
IPUMSi ? ?
7 future possibilities
Country: census microdata
a: , 1870, 1880, 1950, 1960, 1970, , 1990, 2000
b: , 1971, 1981, 1991, 2001
c: , 1971, 1976, 1981, 1986, , 1996
d: , 1965, 1970, 1975, 1980, , 1990, 1995
e: , 1966, 1970, 1975, 1980, , 1990, 1995
f: , 1981, 1991, 2001
g: , 1980, 1990, 2000
and .... ???
46
IPUMSi ANONYMIZES
Using the highest standards currently available:
- technical (Statistics Netherlands)
- administrative (license agreement)
Imagine a new statistical product: a scientifically anonymized census microdata sample made up of unidentifiable individuals...
47
IPUMSi preserves statistical confidentiality (in addition to NSO safeguards):
1. Construct small samples
2. Suppress geographical detail (minor civil divisions and others with less than 100,000 population), date of birth, 3-4 digit occupational codes, etc.
3. Blur codes for sensitive variables where identity might be compromised (income)
4. Top-code income, education, etc.
5. Swap a small fraction of records
6. Assess confidentiality risks for unique records for all defined geographical areas (“ARGUS”, Statistics Netherlands)
48
Repositories of anonymized census microdata samples for scientific research
- ICPSR, University of Michigan
- ACAP, University of Pennsylvania
- CELADE, Centro Latino Americano de Demografía, Santiago, Chile
- ECE/PAU, Population Affairs Unit, Geneva, Switzerland
- EWC, East-West Center, U. of Hawaii
- IPUMSi, University of Minnesota
Will others (a Nordic institution?) join the effort?
49
IPUMSi DISSEMINATES
International web-based access system
- End-user license agreement protects privacy and confidentiality, assures proper use
- User selects countries, cases, variables, and samples -- makes cross-national research possible
- Open architecture software and mirror sites available to all partners
50
Why should Nordic countries participate now?
- Legal and scientific foundations in place: EUROSTAT, France, Austria, UK, etc.
- The project is 18 months into its 5 years; if resources are required, budget planning must begin soon.
- Historical census microdata projects are well advanced: 1801, 1865 (100% club), 1875, 1900.
- Time to turn to contemporary census microdata
51
Additional information at: http://www.ipums.org
Thank you
52
Work plan, part II: make census microdata usable
3. Integrate: March
National partners:
- integrate phase I countries using UN/Eurostat Principles & Recommendations
- help to design prototype
Analyze all concepts, variables and codes of census schedules for 30 target countries
- help to implement for phase I and II countries
4. Disseminate:
- October 2004
- Design international data access engine
- Implement with phase I and II countries