Establishing Comparable Poverty Estimates in Serbia (and elsewhere…) Jill Luoto January 25, 2007 Western Balkans Poverty Analysis Course: World Bank.



Goals
Introduce an adaptation of the poverty mapping methodology that predicts new, strictly comparable poverty estimates when the existing welfare estimates are not comparable
Present a brief summary of findings for Serbia
Lead everyone in an exercise using the PovMap software on Serbian data

The Problem
Estimating the evolution of poverty in Serbia over recent years is complicated by a change in official surveys
–Living Standards Measurement Survey (LSMS) implemented in 2002 and 2003
–Household Budget Survey (HBS) implemented
The two survey instruments have different consumption modules. Some of the differences include:
–LSMS included a list of item codes for consumption goods
–HBS utilized open diary format
–Different recall periods: 1 week in LSMS, 2 weeks in HBS
–Different imputation procedures for housing rents and other expenditure items
All in all, many differences in the way consumption and resulting poverty were estimated across surveys

Different Consumption Definitions Lead to…
Incomparable Poverty Estimates
–Lanjouw and Lanjouw (2001) offer real-world examples where only slight changes in the definition of the consumption aggregate affect resulting poverty estimates dramatically
For Serbia, the different consumption modules in LSMS and HBS have led policymakers to generally consider their respective poverty estimates not comparable
–This leaves open the question of what happened to poverty in Serbia between 2003 and 2005

Possible Solution: Adaptation of the poverty-mapping methodology that aims to restore comparability of consumption definitions across surveys
Other components of LSMS and HBS collect similar information
–Geographic information
–Household demographics
–Asset ownership
–Education and Labor Information
Instead of imputing a consumption definition from a survey into a census across space, impute from survey to survey across time
–Necessarily ensures an identical definition of consumption across data sources
–Implicit assumption that the relationship between consumption and its correlates remains stable over time

Methodology, In Brief
Establish the completely comparable components between surveys
Estimate a model of consumption in one survey using as explanatory variables only those correlates of consumption that are comparably defined across surveys
Take the point estimates from that model of consumption and apply them to the other survey to estimate new consumption figures using the same set of explanatory variables
Derive a new estimate of poverty using the predicted consumption figures
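The four steps above can be sketched in a few lines of Python. This is a minimal illustration on synthetic data, not the PovMap implementation: the function names, the three regressors, the coefficient values and the log poverty line of 7.5 are all assumptions made for the example. It also predicts consumption from the point estimates alone; the later stages of the exercise add simulated error terms, which matter for the poverty rate.

```python
import numpy as np

def fit_ols(X, y):
    """Return OLS coefficients (intercept first) for y regressed on X."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def predict(X, beta):
    """Apply coefficients estimated in one survey to regressors from another."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ beta

def headcount(log_consumption, log_poverty_line):
    """Share of households whose log consumption falls below the line."""
    return float(np.mean(log_consumption < log_poverty_line))

rng = np.random.default_rng(0)
n = 500

# Synthetic "base" survey: comparable regressors plus observed log consumption.
X_base = rng.normal(size=(n, 3))
true_beta = np.array([8.0, 0.5, -0.3, 0.2])  # hypothetical values
y_base = true_beta[0] + X_base @ true_beta[1:] + rng.normal(scale=0.1, size=n)

# Synthetic "target" survey: same regressor definitions, no usable consumption.
X_target = rng.normal(loc=0.2, size=(n, 3))

beta_hat = fit_ols(X_base, y_base)            # step 2: model in the base survey
y_target_hat = predict(X_target, beta_hat)    # step 3: impute into the target
rate = headcount(y_target_hat, log_poverty_line=7.5)  # step 4: poverty estimate
```

Because the same coefficients and the same variable definitions are used on both sides, the predicted consumption in the target survey follows the base survey's consumption definition by construction.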

Example: Two Surveys, Years 1 and 2
Find comparable survey components such that $X_1$ and $X_2$ have equal definitions, i.e., $X_1 = X_2$
Examples of X variables:
–Household Demographics
–Education of HH Members
–Asset Ownership
–Housing Quality Indicators

Poverty Mapping typically imputes a consumption definition across space from a survey to a census (within the same year or a short period of time)
This adaptation imputes a consumption definition from one survey into another from a different year, i.e., across time

Implementation
Gather all of the variables that collect similar information in LSMS and HBS (there are many…)
–Generally 5 main categories for the types of information that are useful in describing a household’s welfare and commonly collected in surveys:
»Geographic Information
»Demographics
»Education and Profession Variables
»Asset Ownership/Wealth Indicators
»Basic Health Information
–Define a new variable in each dataset that has the same definition (and name) across datasets
–Compare means and distributions of similar variables across surveys to ensure they capture the same information
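Defining a variable with the same name and definition in both datasets usually comes down to mapping each survey's own codes onto a shared set of categories. The sketch below shows the idea for an education variable. The code lists are hypothetical and do not reproduce the actual LSMS or HBS questionnaires.

```python
# Hypothetical code lists, for illustration only.
LSMS_EDU = {1: "none", 2: "primary", 3: "secondary", 4: "tertiary", 5: "tertiary"}
HBS_EDU = {0: "none", 1: "primary", 2: "secondary", 3: "secondary", 4: "tertiary"}

def harmonize(codes, mapping):
    """Translate survey-specific codes into the shared category labels."""
    return [mapping[c] for c in codes]

# The harmonized variable carries the same name and categories in both datasets.
edu_lsms = harmonize([1, 3, 5], LSMS_EDU)
edu_hbs = harmonize([0, 3, 4], HBS_EDU)
```

Where one survey distinguishes categories that the other merges, the shared definition has to use the coarser grouping, as the two "tertiary" codes in the hypothetical LSMS list illustrate.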

Finding common variables across surveys…
[Figure panels: HBS Questionnaire, LSMS Questionnaire]

Restoring Comparability to Education Variables
My variable definition matches exactly between surveys
My Definition:

Importing data into PovMap
We will be using subsets of pre-made Stata datasets from LSMS 2003 and HBS 2005 that have been matched and have identical variable names
Go to: File → New Project → Name your project
Each dataset must have a hierarchical household-level identifying variable that can be truncated to identify the cluster
–Example: HID = 32601, Cluster = 326
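Truncating a hierarchical household ID to get the cluster is an integer division. The sketch below assumes, consistent with the example above, that the last two digits number the household within its cluster; the function names and the `household_digits` parameter are illustrative and not part of PovMap.

```python
from collections import defaultdict

def cluster_from_hid(hid, household_digits=2):
    """Drop the trailing household digits to recover the cluster ID."""
    return hid // 10 ** household_digits

def group_by_cluster(hids, household_digits=2):
    """Group household IDs under their truncated cluster ID."""
    groups = defaultdict(list)
    for hid in hids:
        groups[cluster_from_hid(hid, household_digits)].append(hid)
    return dict(groups)

clusters = group_by_cluster([32601, 32602, 41005])
```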

Stage 1: “Checker” Stage
Compare distributions of variables across datasets
If, after this final stage of comparison, you think the variables are truly capturing the same information, “set” the variable to be included as a potential regressor
Since we’re imputing from one survey to another from a different year, keep in mind that some variables are going to change over time, e.g., % owning cell phones
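Outside PovMap, the same kind of check can be approximated by comparing means and a standardized difference for each candidate variable. The sketch below is one simple way to do this and is not PovMap's own checker; the function name and the toy 0/1 ownership data are made up for the example.

```python
import numpy as np

def compare_variable(a, b):
    """Summarize how a commonly defined variable differs between two surveys."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    # Difference in means expressed in pooled standard deviations.
    std_diff = (b.mean() - a.mean()) / pooled_sd if pooled_sd > 0 else 0.0
    return {"mean_a": a.mean(), "mean_b": b.mean(), "std_diff": std_diff}

# Toy example: a 0/1 ownership indicator in the earlier and the later survey.
summary = compare_variable([0, 1, 0, 1], [1, 1, 1, 0])
```

A large standardized difference does not by itself say whether the definition differs or the underlying quantity really changed over time, so a judgment call is still needed before a variable is set as a potential regressor.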

Summary Statistics for Comparably Defined Variables: Geographic Variables

Summary Statistics: Demographic Variables

You can also compare the entire distributions of similar variables across data sources

Summary Statistics: Education and Profession Variables

Summary Statistics: Housing Quality Indicators

Summary Statistics: Durables Ownership Variables

Stage 2: Building a Consumption Model
You've chosen all of the potential explanatory variables after all phases of screening (comparing surveys, comparing distributions) and now you move on to building your model of consumption
Categorical variables are translated into a sequence of dummies
Build models stepwise or “intuitively” choose explanatory variables using OLS
Aim for the highest $R^2$ possible to best capture variation in household welfare levels
Simultaneity and omitted-variable bias are not important for our purposes, because the model is used for prediction rather than causal interpretation
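To make the dummy translation and the $R^2$ criterion concrete, here is a small sketch using NumPy. It is an illustration with invented data, not PovMap's model builder: the education labels, the log consumption values and the helper names are assumptions.

```python
import numpy as np

def make_dummies(values):
    """One 0/1 column per category, omitting the first (reference) level."""
    levels = sorted(set(values))
    kept = levels[1:]
    cols = np.array([[float(v == lvl) for lvl in kept] for v in values])
    return cols, kept

def ols_r2(X, y):
    """Fit OLS with an intercept and return (coefficients, R^2)."""
    y = np.asarray(y, dtype=float)
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta, r2

# Invented data: education of the household head and log consumption.
education = ["primary", "secondary", "tertiary", "primary", "secondary", "tertiary"]
log_cons = [8.0, 8.4, 9.0, 8.1, 8.5, 9.1]

D, kept_levels = make_dummies(education)
beta, r2 = ols_r2(D, log_cons)
```

The intercept is the mean log consumption of the omitted reference category, and each dummy coefficient is the difference from that reference. Adding regressors never lowers in-sample $R^2$, which is why a stepwise search should be read alongside the comparability screening rather than on its own.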

Estimate consumption on the subset of variables comparably defined across surveys; aim for highest $R^2$
Regression results from LSMS 2003

Stage 3: Cluster Effects
Decompose the error term into a cluster effect and an idiosyncratic household effect:
–This stage deals with modeling the cluster effect
Since disturbance terms are likely to be correlated within clusters (due to unobserved geographic and other factors beyond those already included as regressors), this stage accounts for this by estimating a cluster random effect
If you click on the "no locational effect" button, you take away this cluster effect from your estimation
–The result is underestimated standard errors
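The decomposition referred to on this slide can be written as follows. The symbols are an assumption, chosen to match the notation commonly used in the poverty-mapping literature: $c$ indexes clusters, $h$ households, $\eta_c$ is the cluster effect and $\varepsilon_{ch}$ the idiosyncratic household effect. The variance and correlation lines additionally assume that the two components are independent and, for simplicity, that $\varepsilon_{ch}$ has a constant variance.

```latex
\begin{aligned}
\ln y_{ch} &= x_{ch}'\beta + u_{ch}, \qquad u_{ch} = \eta_c + \varepsilon_{ch} \\
\operatorname{Var}(u_{ch}) &= \sigma_\eta^2 + \sigma_\varepsilon^2 \\
\operatorname{Corr}(u_{ch}, u_{ch'}) &= \frac{\sigma_\eta^2}{\sigma_\eta^2 + \sigma_\varepsilon^2}, \qquad h \neq h'
\end{aligned}
```

When $\sigma_\eta^2 > 0$, households in the same cluster share part of their error, so treating all households as independent overstates the amount of information in the data. That is the sense in which dropping the locational effect leads to standard errors that are too small.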

Stage 4: Idiosyncratic Model
Here, still using the base survey data, you are trying to model the heteroskedasticity of the household idiosyncratic effect to allow it a more flexible form
This stage tries to model the variance in the household-specific error terms as functions of the included X variables and combinations of variables
Can use stepwise modeling or basic OLS or any other method to choose the explanatory variables that best explain variation in the household idiosyncratic effect
Generally very low $R^2$’s in this stage (it’s all unobserved variation); is sufficient
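One simple way to see what modeling the variance means is to regress the log of the squared residuals on household characteristics. The sketch below does that on synthetic data. It is a deliberately simplified stand-in and not the transformation PovMap itself uses; the function name, the single regressor `z` and the data-generating process are assumptions.

```python
import numpy as np

def fit_log_variance_model(resid, Z):
    """Regress log squared residuals on Z; return coefficients and fitted variances."""
    resid = np.asarray(resid, dtype=float)
    Z1 = np.column_stack([np.ones(len(resid)), Z])
    target = np.log(resid ** 2 + 1e-12)  # small constant guards against log(0)
    alpha, *_ = np.linalg.lstsq(Z1, target, rcond=None)
    # exp() keeps every fitted variance positive; the level is only identified
    # up to a constant factor because E[log e^2] differs from log E[e^2].
    fitted_var = np.exp(Z1 @ alpha)
    return alpha, fitted_var

rng = np.random.default_rng(1)
z = rng.uniform(0.0, 2.0, size=2000)
# Synthetic residuals whose variance grows with z: Var(resid | z) = exp(z).
resid = np.exp(0.5 * z) * rng.normal(size=2000)

alpha, fitted_var = fit_log_variance_model(resid, z)
```

Even when the variance truly depends on `z`, as it does here by construction, the fit of such a regression is poor, because the log of a squared normal draw is itself very noisy. This is consistent with the low $R^2$ values the slide describes.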

Stage 5: Household Effects
Shows you a plot of the residuals from the model of the idiosyncratic household-level error terms
The “Prediction Plot” generally shows that your predictions aren’t so great here
Empirical distribution of residuals can be compared to the normal and t-distributions (of varying degrees of freedom)
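A rough numerical counterpart to the visual comparison is the excess kurtosis of the residuals: it is zero for a normal distribution and equals $6/(\nu - 4)$ for a t-distribution with $\nu > 4$ degrees of freedom. The sketch below uses that relationship to back out an implied $\nu$. It is a moment-based heuristic offered as an illustration, not what PovMap computes, and the function names are made up.

```python
import numpy as np

def excess_kurtosis(resid):
    """Sample excess kurtosis (0 for a normal distribution)."""
    r = np.asarray(resid, dtype=float)
    d = r - r.mean()
    m2 = np.mean(d ** 2)
    m4 = np.mean(d ** 4)
    return m4 / m2 ** 2 - 3.0

def implied_t_df(resid):
    """Degrees of freedom of the t-distribution with matching excess kurtosis.

    Returns None when the tails are no heavier than the normal's.
    """
    k = excess_kurtosis(resid)
    return 4.0 + 6.0 / k if k > 0 else None

light = [-2, -1, 0, 1, 2]            # flatter than normal
heavy = [-3, 0, 0, 0, 0, 0, 0, 3]    # a few large residuals
df_heavy = implied_t_df(heavy)
```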

Stage 6: Simulation
Here, we need to simulate the residual terms (both the cluster effect and the household idiosyncratic effect) since they are necessarily unknown in the latter survey (or census)
Distributional forms: You can either impose a normal distribution or allow for a more flexible semi-parametric distributional form using information from the predicted residuals from the base data
Choose the level of aggregation of your poverty estimates
Choose the poverty line, household size variable, and poverty indicators
Go!
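The simulation step can be sketched as repeated draws of a cluster effect and a household effect added to the predicted log consumption. The version below imposes normal distributions for both and holds the regression coefficients fixed, which is a simplification; the function name, its parameters and the toy inputs are assumptions for illustration.

```python
import numpy as np

def simulate_headcount(xb, cluster_ids, sigma_eta, sigma_eps, log_line,
                       n_sims=100, seed=0):
    """Return the mean and standard deviation of simulated headcount rates."""
    rng = np.random.default_rng(seed)
    xb = np.asarray(xb, dtype=float)
    clusters, idx = np.unique(cluster_ids, return_inverse=True)
    rates = np.empty(n_sims)
    for s in range(n_sims):
        # One draw per cluster, shared by every household in that cluster.
        eta = rng.normal(0.0, sigma_eta, size=len(clusters))[idx]
        # One independent draw per household.
        eps = rng.normal(0.0, sigma_eps, size=len(xb))
        rates[s] = np.mean(xb + eta + eps < log_line)
    return rates.mean(), rates.std(ddof=1)

# Toy inputs: predicted log consumption for four households in two clusters.
xb = [7.0, 7.4, 7.6, 8.0]
cluster_ids = [1, 1, 2, 2]
mean_rate, sd_rate = simulate_headcount(xb, cluster_ids, 0.1, 0.3, log_line=7.5)
```

The spread of the simulated rates gives a rough sense of the uncertainty coming from the unobserved error terms. Setting both standard deviations to zero reproduces the headcount computed from the point predictions alone.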

Stage 7: Results Compare your resulting estimates of poverty with the baseline estimates from first survey

Conclusions
For Serbia, this exercise suggests a gradual decline in poverty between 2003 and 2005
Resulting poverty headcount estimate of 7.5% based on models of consumption from both LSMS 2002 and LSMS 2003
–Lower than the official estimate of 9.1% for 2005 based on the consumption module of HBS
–Nearly 30% drop in poverty from the LSMS 2003 headcount estimate of 10.5%, if results are believed
This methodology can be used in a variety of settings to restore comparability of surveys to estimate the evolution of poverty over time within a country or region
Download the PovMap software at: