The Luxembourg Income Study at Age 25: Providing Protected Microdata for the Analysis of Income, Employment and Wealth. Prepared for the 56th Session of the International Statistical Institute, 24 Aug 2007.
Outline. Background; Microdata Access; LIS Data Security; Data Comparability.
LIS Background. Research center and microdata archive; founded in 1983; international board of advisers; financed by the participating countries, with supplemental funding from the Luxembourg government and outside grants.
LIS Background. Staff: Janet Gornick, Director (Professor of Political Science and Sociology, Baruch College, CUNY, U.S.A.); Markus Jantti, Research Director (Professor of Economics, Åbo Akademi University, Finland). Two offices: Luxembourg and U.S. 7 researchers and statisticians.
Goal. Promote comparative social science and policy research. Provide a secure method for microdata access: LIS handles contact with data providers (administration, confidentiality agreements, translation); allows access not available to individual researchers (country-specific privacy restrictions). Harmonise social and economic microdata: standardised variable names promote ease of use; content selection improves comparability.
LIS Data Base. Socio-economic microdata: private households; representative of the country population. Availability: 6 waves; every 5 years starting in 1980; every 4 years starting with Wave VI (2004); historical data base pre-1980. 32 countries, expanding in Wave VI and beyond. 150 current data sets; 19 Wave VI data sets ready to be lissified.
LIS Research. Used by over 1,500 researchers in 35 countries to analyze economic and social policy effects: poverty; income inequality; employment status; wage patterns; gender inequality; family formation; child well-being; health status; immigration; political behavior and public opinion; women's economic status; economic gender inequality.
LIS Research. Articles in major academic journals: economics; political science; sociology. Well known for use in measuring comparative poverty and income inequality. Used to inform the OECD, UN and World Bank. Instrumental in changes in child policy in Great Britain (Bradshaw and Chen, 1997).
Microdata Access. Public Access: Key Figures; Working Paper Series; User Support. Restricted Access: Microdata Programming Access ("LISsy"); Web Tabulator; Visiting Scholar Program; Workshops.
LISsy System. Purpose: allow for user-programmed analysis; programs run directly on microdata; use any of 3 popular statistical packages (Stata, SAS, SPSS); fully automated system running 24/7. Results: output from user programs; extent of analysis limited only by software functionality and user ability; not confined to pre-packaged tables or analyses.
Web Tabulator. Purpose: provide user-friendly access to microdata; design and create tables derived from LIS datasets; no need for knowledge of statistical packages; secure internet interface. Results: user-created cross-national cross-tabulations; view on-line; export results to text file.
Visiting Scholar Program. Purpose: provide direct access to a subset of LIS microdata (individual software need; analysis requires viewing individual records); direct access to LIS experts. Application process; LIS pays all expenses. Results: individually-tailored analysis; knowledge and guidance of LIS staff.
Workshops. Purpose: provide intensive training for new users. Results: annual summer workshop (25-30 researchers; week-long training course; taught by entire staff, outside experts and researchers); country workshops (individually-tailored workshops; LIS staff travels to researchers).
LIS Data Security. Users must meet specific criteria. Available for social science research purposes only; no private or commercial use is permitted. Researchers must be: working for or attending an academic organization; member of government or non-profit organization research departments.
LIS Data Security. Users must register to analyze data. Describe research objectives and projected length of project. Sign a confidentiality pledge: not to attempt to identify individuals; not to attempt to copy or list individual records. Re-register annually: update contact information; renew confidentiality pledge.
LIS Data Security. Accessed by users through e-mail. E-mail system is not in direct contact with the microdata. Database cannot be downloaded.
LIS Data Security. Users submit requests by e-mail. LISsy: clearly identifies sender. LISsy: accepts requests from e-mail server; checks e-mail request (identity; registration; security and confidentiality issues); sends requests to batch processing; checks output for confidentiality issues; returns completed listings by e-mail (only to the address given during the registration process; listings only returned with aggregated information).
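The request checks listed on this slide (identity, registration, confidentiality) can be pictured as a simple gate in front of batch processing. The following is a minimal Python sketch under stated assumptions, not the actual LISsy implementation: the registry, the forbidden-command list, and the names `Request` and `check_request` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical registry: e-mail address -> whether the registration is current.
REGISTERED_USERS = {
    "researcher@example.org": True,
    "lapsed@example.org": False,
}

# Hypothetical commands that would list or copy individual records.
FORBIDDEN_TOKENS = ("list", "print", "export")


@dataclass
class Request:
    sender: str   # address the job was mailed from
    program: str  # text of the submitted statistical program


def check_request(req: Request) -> str:
    """Return 'accept' or a rejection reason for an incoming job."""
    # Identity check: the sender must be known.
    if req.sender not in REGISTERED_USERS:
        return "reject: unknown sender"
    # Registration check: the annual registration must be current.
    if not REGISTERED_USERS[req.sender]:
        return "reject: registration expired"
    # Confidentiality check: refuse commands that expose individual records.
    words = req.program.lower().split()
    if any(token in words for token in FORBIDDEN_TOKENS):
        return "reject: record-level listing"
    return "accept"


print(check_request(Request("researcher@example.org", "summarize dpi")))
print(check_request(Request("researcher@example.org", "list hid dpi")))
```

Only jobs that pass all three checks would go on to batch processing; a real system would also screen the output before returning it, which this sketch leaves out.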
Balancing Access and Data Security. Provides user-friendly microdata access: remote access; 3 popular statistical packages; individual-specific analysis; speed. Maintains confidentiality and security: country-specific requirements; general issues (registration and identification of user; user output).
Comparability Challenges. Country: different institutions/societal norms across countries. Surveys: different types of original collection instrument; level of detail of information collected differs. Time: changes in institutions and surveys. Technical differences: weighting procedures; treatment of missing values, imputation methods; topcoding. Differences in confidentiality requirements.
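To see why topcoding is a comparability issue, consider the same incomes recorded by two surveys that cap high values at different ceilings. The sketch below uses made-up numbers and hypothetical ceilings; it is an illustration of the general problem, not a description of any LIS procedure.

```python
def top_code(incomes, ceiling):
    """Replace every value above `ceiling` with the ceiling itself."""
    return [min(x, ceiling) for x in incomes]


def mean(values):
    return sum(values) / len(values)


# Made-up incomes, identical in both hypothetical surveys.
incomes = [10_000, 20_000, 30_000, 40_000, 400_000]

survey_a = top_code(incomes, 100_000)  # hypothetical ceiling in survey A
survey_b = top_code(incomes, 250_000)  # hypothetical ceiling in survey B

print(mean(incomes))   # untruncated mean
print(mean(survey_a))  # mean after survey A's top-code
print(mean(survey_b))  # mean after survey B's top-code
```

The underlying incomes are identical, yet the two recorded means differ, so a cross-national comparison would pick up the difference in ceilings rather than any difference in income.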
LIS Golden Rules. Maximise comparability: set clear definitions for each variable; follow the definitions as much as possible; preserve cross-sectional comparability first and comparability over waves after. Ease of use: create standardised codes within variables. Allow flexibility: keep country-specific detail to allow users to redefine to suit their specific needs; preserve as much detail as possible. Document: warn users of all deviations from the ideal definition.
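The "standardised codes" and "keep country-specific detail" rules can be combined by storing the harmonised code next to the original one. The following Python sketch assumes an invented education variable; the country labels, code values, and field names (`educ_std`, `educ_orig`) are hypothetical and are not actual LIS variable names.

```python
# Hypothetical mapping from country-specific codes to one standard coding.
RECODE = {
    "XX": {1: "low", 2: "low", 3: "medium", 4: "high"},
    "YY": {"A": "low", "B": "medium", "C": "high", "D": "high"},
}


def harmonise(country, original_code):
    """Return the standard code alongside the untouched original code."""
    standard = RECODE[country].get(original_code)
    return {
        "country": country,
        "educ_std": standard,        # None flags a code with no mapping
        "educ_orig": original_code,  # kept so users can redefine the variable
    }


print(harmonise("XX", 2))
print(harmonise("YY", "D"))
print(harmonise("YY", "Z"))  # unmapped code: a deviation to document
```

Keeping `educ_orig` means a user who disagrees with the standard grouping can build a different one, and an unmapped code surfaces as `None` instead of being silently forced into a category.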
On-line Documentation. Survey information: technical information on the original survey; data collection methods, reference period, sample, sampling errors, etc. Lissification tables: precise definition and contents of each LIS variable; explains deviations from ideal variable definitions. Basic descriptives: unweighted descriptive statistics of each variable. Institutional information: exhaustive information on the tax and transfer programs corresponding to microdata variables.
Additional Help. Additional comparative databases: country-level policy indicators; welfare states database; family policy database.
Future Challenges. Expansion: new countries; less likely to conform to existing definitions. Need to add more country-specific information in addition to core LIS variables? What is the right mix of data expansion and comparability?