5. Integration of Microdata and Metadata (9 slides)

1 5. Integration of Microdata and Metadata (9 slides)

2 Integrating samples » Project requests new, high-quality samples for historical censuses (before 1995), drawn to uniform criteria » Complete documentation, in official language and English translation: census forms, enumerator instructions, data dictionaries and codebooks » Complete metadata for all samples » Systematic metadata for all variables: universe, definitions, comparability » Dynamic metadata system to compare phrasing of any question in any combination of countries and censuses

3 IPUMS-International: High-precision samples with implicit stratification » Suppress all identifying information: names, id numbers, street addresses, low-level administrative geography (NUTS-5, NUTS-4?, NUTS-3?, NUTS-2?) » Sample is stratified by lowest level geography (census tract) » Lower standard errors than a classic random sample, to the extent that variables of interest are correlated with geography » Implicit geographical stratification is equivalent to extremely fine geographic stratification with proportional weighting » Many of our NSI partners have adopted the IPUMS sample design (see table). » 26 countries provided 100% microdata for the MPC to draw the sample » Europe: almost all NSIs have drawn samples to IPUMS specs. for all censuses » 188 high-precision samples for 78 countries entrusting microdata (08/08/2007) » 10% samples: 125 censuses, 62 countries » 5% samples: 34 censuses, 10 countries » <5% samples: 29 censuses, 6 countries
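
The sample design on this slide can be illustrated with a minimal sketch: sort the records by geography, then take every k-th record from a random start. This is one common way to obtain implicit stratification, not necessarily the exact procedure used by the MPC or its partners. The function name `systematic_sample`, the record layout, and the toy data are assumptions made for illustration.

```python
import random


def systematic_sample(records, fraction, geo_key, seed=0):
    """Draw a systematic sample from records sorted by geography.

    Sorting by the lowest-level geography before sampling is what makes
    the stratification implicit: every area is represented roughly in
    proportion to its size.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(records, key=geo_key)      # stable sort by geography
    interval = 1 / fraction                     # e.g. 10.0 for a 10% sample
    start = random.Random(seed).uniform(0, interval)
    picks = []
    k = 0
    while True:
        index = int(start + k * interval)       # every interval-th record
        if index >= len(ordered):
            break
        picks.append(ordered[index])
        k += 1
    return picks


# Hypothetical data: 50 census tracts with 20 households each.
households = [(tract, hh) for tract in range(50) for hh in range(20)]
sample = systematic_sample(households, fraction=0.10, geo_key=lambda r: r[0])
tracts_hit = {tract for tract, _ in sample}
print(len(sample), len(tracts_hit))
```

Because the sampling interval steps through the geographically sorted list, no tract in the toy data can be skipped, which a simple random sample of the same size would not guarantee.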

5 International and chronological integration of microdata and metadata: methods and procedures » Integrated microdata » Challenge: retain all significant detail, integrate everything » Solution: composite coding scheme (multiple digits, each carrying meaning, as in ISCO) » Use international standards where appropriate » Integrated metadata (documentation) » Summarize and describe commonalities » Explain unique characteristics » Dynamically generate metadata according to the needs (countries, samples, variables) of the individual researcher, using an XML database
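
To make the idea of dynamically generated metadata concrete, here is a minimal sketch that filters documentation notes by variable and country. The XML layout, the element and attribute names, the country labels, and the helper `select_notes` are all hypothetical; the slides do not describe the actual schema of the project's XML database.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata document; the real schema is not given in the slides.
METADATA_XML = """
<metadata>
  <variable name="EMPSTAT">
    <note country="CountryA" year="1990">Placeholder note A-1990.</note>
    <note country="CountryA" year="2000">Placeholder note A-2000.</note>
    <note country="CountryB" year="1995">Placeholder note B-1995.</note>
  </variable>
  <variable name="AGE">
    <note country="CountryA" year="2000">Placeholder note on age.</note>
  </variable>
</metadata>
"""


def select_notes(xml_text, variable, countries):
    """Return (country, year, text) for the requested variable and countries."""
    root = ET.fromstring(xml_text)
    selected = []
    for var in root.iter("variable"):
        if var.get("name") != variable:
            continue
        for note in var.iter("note"):
            if note.get("country") in countries:
                selected.append(
                    (note.get("country"), int(note.get("year")), note.text)
                )
    return selected


# A researcher interested only in EMPSTAT for CountryA sees two notes.
notes = select_notes(METADATA_XML, "EMPSTAT", {"CountryA"})
for country, year, text in notes:
    print(country, year, text)
```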

6 Use international standards as points of departure: » UNESCO (1997). International Standard Classification of Education (ISCED 1997). » International Labour Office (1990). International Standard Classification of Occupations (ISCO-88). » United Nations Statistics Division (1990). International Standard Industrial Classification of All Economic Activities (ISIC Rev. 3). » United Nations Economic Commission for Europe (1999). Recommendations for the 2000 Censuses of Population and Housing in the ECE Region.

7 IPUMSi integrates » Integrate (harmonize), not standardize: 1. retain all original detail; 2. harmonize every digit » How is this possible? Composite codes (multiple digits, e.g. 111), not serial codes (1, 2, 3, ...) (example: next slide) » Why? The researcher confidently understands the logic and uses as much detail as needed

8 Composite coding scheme: Employment Status

9 Metadata: Employment Status » EMPSTAT: Employment status » Description: EMPSTAT indicates whether or not the respondent was part of the labor force -- working or seeking work -- over a specified period of time. Depending on the sample, EMPSTAT can also convey further information. The first digit of EMPSTAT is fully comparable, and classifies the population into three groups: employed, unemployed, and inactive. The combination of employed and unemployed yields the total labor force. The second and third digits of EMPSTAT preserve additional information available for some countries and census years but not for others. Employment status is sometimes referred to in other sources as "activity status." » Comparability -- General: The age of persons to whom the question applies varies across the samples (see Universe). The reference period for the employment status question varies. For most samples, employment status was reported with respect to the day of the census or…
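
The role of the first digit can be sketched in a few lines. The description above says only that the first digit separates employed, unemployed, and inactive; the specific digit-to-label mapping below (1, 2, 3), the helper names, and the sample codes are assumptions for illustration and should be checked against the actual codebook.

```python
# Assumed mapping of the first digit to the three comparable groups.
GENERAL_LABELS = {1: "employed", 2: "unemployed", 3: "inactive"}


def general_status(code):
    """Reduce a composite code to its fully comparable first digit."""
    first_digit = int(str(code)[0])
    return GENERAL_LABELS.get(first_digit, "other/unknown")


def in_labor_force(code):
    """Employed plus unemployed yields the total labor force."""
    return general_status(code) in {"employed", "unemployed"}


# Hypothetical detailed codes; the trailing digits carry sample-specific detail.
codes = [110, 120, 200, 310, 320]
labor_force = [c for c in codes if in_labor_force(c)]
print(labor_force)
```

A researcher who needs only cross-national comparability works with `general_status`, while one studying a single country can keep the full code.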

10 Metadata: Employment Status, example: Mexico » Comparability -- Mexico: The universe and reference period are fully comparable across the Mexico samples. The 1970 Census did not provide detail on the inactive population except for "houseworkers," while the later samples have numerous subcategories. In 1990, the employment status question refers to "Principal Activity" and therefore under-reports secondary economic activity by students, housewives, family-workers, the semi-retired, and others. The 2000 Census sought to overcome deficiencies in reporting work status for people whose primary activity was not work (students, housewives, retirees, etc.), but who in fact were working according to international definitions. A second question introduced for the first time in 2000 sought to capture this secondary economic activity. For strict comparability with earlier Mexican censuses, this recovered activity (codes 1101-1106) should be considered "inactive." … » Integrate: retain all significant detail, harmonize everything » Not standardize: force square pegs in round holes
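
The Mexico note suggests a simple recode for researchers who want strict comparability across the Mexican samples. The sketch below treats codes 1101-1106 as inactive, as the note recommends; the first-digit mapping (1, 2, 3) and the function names are assumptions for illustration, not the project's published recode.

```python
# Codes for activity recovered by the second question in the 2000 census,
# taken from the comparability note (1101 through 1106 inclusive).
RECOVERED_ACTIVITY = range(1101, 1107)

# Assumed mapping of the first digit to the three comparable groups.
GENERAL_LABELS = {1: "employed", 2: "unemployed", 3: "inactive"}


def general_status(code):
    """Default classification: use the first digit of the composite code."""
    return GENERAL_LABELS.get(int(str(code)[0]), "other/unknown")


def strict_status_mexico(code):
    """Classification comparable with earlier Mexican censuses.

    Recovered secondary activity is counted as inactive, because earlier
    censuses had no question that could capture it.
    """
    if code in RECOVERED_ACTIVITY:
        return "inactive"
    return general_status(code)


for code in (1000, 1101, 1106, 2000, 3000):   # hypothetical sample of codes
    print(code, general_status(code), strict_status_mexico(code))
```

Because the detail is retained in the code itself, the same data support both readings: the international definition and the one comparable with earlier Mexican censuses.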

