Methodological Foundations of the Web-based PayWizard Dataset (USA)
Isabelle Ferreras (Harvard Labor and Worklife Program / FNRS / Louvain)
Damian Raess (MIT)
Members of the team headed by Prof. Richard Freeman (Harvard LWP / NBER)
- A 5-Minute Presentation -
3rd Global WageIndicator Conference, Amsterdam, April 16, 2008
General Approach
How to generalize from a non-representative sample?
Building a strategy to probe the significance of the web-generated dataset:
–1/ Questionnaire design
–2/ Basic statistical tests (a simple model determining income)
  – 2.a. Correlation analysis
  – 2.b. Regression analysis
  – 2.c. Weighting might not be enough
–3/ More to come... With YOU!
18 months after launch, a tough road in the US: a highly competitive market; now visitors a month, waiting for completed surveys!
Number of observations so far = 3000
1/ Questionnaire design
1/ Locate the “reference survey” in your country = a random sample representative of the entire population.
In the USA: the US Current Population Survey (CPS), produced by the Bureau of Labor Statistics and the Census Bureau.
2/ Identify the questions of the WageIndicator survey that correspond to questions covered by that “reference survey”, and aim to match them.
Example: the wage question(s) should conform to those of your “reference survey”.
2/ Basic statistical tests
–A simple model determining income by age, sex, education
–Descriptive variables
  –Sex, percentage of men: CPS, 51%; PayW, 54%
  –Age, median: CPS, 41; PayW, 37
  –Education, years of schooling
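The gap in the sex shares above (51% men in the CPS, 54% in PayWizard) is the kind of difference that post-stratification weighting is usually meant to correct. The sketch below is only an illustration of that idea, not the team's procedure: the function names are hypothetical, and sex is the only variable used because it is the only one with complete figures on the slide.

```python
# Share of men in each sample, taken from the slide above.
CPS_SHARE_MEN = 0.51   # reference survey
PAYW_SHARE_MEN = 0.54  # web sample

def poststrat_weights(ref_share, sample_share):
    """Per-group weights that align the sample's sex split with the reference.

    Hypothetical helper: each group's weight is its reference share
    divided by its share in the web sample.
    """
    return {
        "men": ref_share / sample_share,
        "women": (1 - ref_share) / (1 - sample_share),
    }

def weighted_share_men(weights, sample_share):
    """Share of men in the web sample after applying the weights."""
    men = weights["men"] * sample_share
    women = weights["women"] * (1 - sample_share)
    return men / (men + women)

weights = poststrat_weights(CPS_SHARE_MEN, PAYW_SHARE_MEN)
print(weights)  # men are weighted slightly down, women slightly up
print(weighted_share_men(weights, PAYW_SHARE_MEN))
```

Weights of this kind only align the sample on the variables used to build them. They do not correct for differences between web respondents and the general population on characteristics that are not observed.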
2/ Basic statistical tests –Weekly wage (logwage)
2/ Basic statistical tests –A simple model determining income –2.a. Correlation analysis
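As a rough illustration of the correlation step, the sketch below computes a Pearson correlation between years of schooling and log weekly wage. The records are invented toy values, not CPS or PayWizard data, and the helper name `pearson` is an assumption. In practice one would presumably compute the same coefficient in both samples and compare the two.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy records: (years of schooling, weekly wage in USD).
toy_sample = [(10, 400.0), (12, 520.0), (14, 610.0), (16, 900.0), (18, 1100.0)]

schooling = [years for years, _ in toy_sample]
logwage = [math.log(wage) for _, wage in toy_sample]

r = pearson(schooling, logwage)
print(round(r, 3))
```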
2/ Basic statistical tests –A simple model determining income –2.b. Regression analysis
2/ Basic statistical tests - 2.b. Regression analysis
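The regression step can be sketched as an ordinary least squares fit of log weekly wage on age, sex and years of schooling. The exact specification used by the team is not given on the slides, so the model form, the coefficient values and the toy records below are all assumptions made for illustration. The wages are generated without noise from known coefficients so that the fit can be checked.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical coefficients: intercept, age, male dummy, years of schooling.
TRUE_BETA = [5.0, 0.01, 0.2, 0.08]

# Hypothetical toy records: (age, male, years of schooling).
records = [(25, 1, 12), (37, 0, 16), (41, 1, 14), (52, 0, 12), (30, 1, 18), (45, 0, 10)]

X = [[1.0, age, male, school] for age, male, school in records]
logwage = [sum(b * x for b, x in zip(TRUE_BETA, row)) for row in X]

beta_hat = ols(X, logwage)
print([round(b, 4) for b in beta_hat])
```

Fitting the same specification separately on the reference survey and on the web sample, then comparing the estimated coefficients, is one way to judge how far the web-generated data depart from the representative sample.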
Conclusion: Weighting might not be enough
3/ More to come... We need YOU!
–A collaborative effort...
THANK YOU!