Tricky Data Issues in the PEN Datasets


Tricky Data Issues in the PEN Datasets Monica Fisher & Arild Angelsen March 26, 2009

Organization
- Introduce the four main tricky data issues.
- Break into groups of 5-10 people to discuss each issue and the recommended solution. (Spend ~30 minutes in discussion.)
- One volunteer from each group reports back on the pros/cons of the recommended solution to each data problem. Other recommended solutions are very welcome.
- Discussion.

The Four Main Issues
Missing data problems:
- Wave nonresponse (quarter is missing)
- Item nonresponse (fields missing)
Challenges to meaningful welfare comparisons:
- Inter-household differences in size and composition
- Inter-country (26 PEN countries) price level differences

Missing Data 1: Wave Nonresponse
What is the problem?
Possible solutions:
- Case deletion
- Single imputation
  - Simple sample mean or conditional mean
  - Hot deck (randomly matched to similar hh)
  - Regression
- Multiple imputation
  - Estimates are uncertain -> several datasets
Recommended solution: Case deletion for cases having less than three quarters of income data
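The recommended case-deletion rule can be sketched in a few lines. This is only an illustration: the household IDs, the income figures, and the function name `has_enough_quarters` are hypothetical, and `None` is assumed to mark a missing quarter.

```python
from typing import Dict, List, Optional


def has_enough_quarters(quarterly_income: List[Optional[float]],
                        min_quarters: int = 3) -> bool:
    """Return True if at least `min_quarters` quarters of income were observed."""
    observed = sum(1 for value in quarterly_income if value is not None)
    return observed >= min_quarters


# Hypothetical households; None marks a quarter lost to wave nonresponse.
households: Dict[str, List[Optional[float]]] = {
    "hh_a": [120.0, 95.0, 110.0, 130.0],  # all four quarters observed
    "hh_b": [80.0, None, 70.0, 90.0],     # three quarters observed
    "hh_c": [60.0, None, None, 75.0],     # only two quarters observed
}

# Case deletion: drop households with fewer than three quarters of income data.
retained = {hh: q for hh, q in households.items() if has_enough_quarters(q)}
print(sorted(retained))  # ['hh_a', 'hh_b']
```

Households that are kept with exactly three quarters still have one missing quarter, which is where imputation comes in.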

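Imputation of sectoral incomes
- Multiple imputation: regression using income from other quarters, and hh and village characteristics
- Single imputation formula (example: forest income missing in quarter 3; i indexes the household, v the village):

$$\widehat{\mathrm{forinc}}_{3,i} = \left(\mathrm{forinc}_{1,i} + \mathrm{forinc}_{2,i} + \mathrm{forinc}_{4,i}\right) \cdot \frac{\mathrm{forinc}_{3,v}}{\mathrm{forinc}_{1,v} + \mathrm{forinc}_{2,v} + \mathrm{forinc}_{4,v}}$$

$$= \mathrm{forinc}_{3,v} \cdot \frac{\mathrm{forinc}_{1,i} + \mathrm{forinc}_{2,i} + \mathrm{forinc}_{4,i}}{\mathrm{forinc}_{1,v} + \mathrm{forinc}_{2,v} + \mathrm{forinc}_{4,v}}$$

(so HH forest income + seasonal adjustment)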
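As a rough sketch of the single-imputation formula, the function below scales the village value for the missing quarter by the household's share of village income in the observed quarters. It assumes that the `v` subscript denotes a village-level figure (mean or total) and that `i` denotes the household; the function name and all numbers are hypothetical.

```python
from typing import Dict


def impute_missing_quarter(hh_income: Dict[int, float],
                           village_income: Dict[int, float],
                           missing_quarter: int) -> float:
    """Impute one missing quarter of household income.

    hh_income holds only the observed quarters of the household;
    village_income must cover every quarter, including the missing one.
    """
    observed = [q for q in hh_income if q != missing_quarter]
    hh_sum = sum(hh_income[q] for q in observed)
    village_sum = sum(village_income[q] for q in observed)
    if village_sum == 0:
        raise ValueError("village income in observed quarters sums to zero")
    # Household share of village income, applied to the village value
    # in the missing quarter (this carries the seasonal adjustment).
    return village_income[missing_quarter] * hh_sum / village_sum


# Hypothetical figures: quarter 3 is missing for the household.
hh_forinc = {1: 10.0, 2: 20.0, 4: 30.0}
village_forinc = {1: 100.0, 2: 200.0, 3: 150.0, 4: 300.0}
print(impute_missing_quarter(hh_forinc, village_forinc, 3))  # 15.0
```

Here the household earns 60 of the village's 600 in the observed quarters, so it is assigned 10% of the village's quarter-3 value.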

Missing Data 2: Item Nonresponse
What is the problem?
Possible solutions (same as for wave nonresponse):
- Case deletion
- Single imputation
- Multiple imputation
Recommended solution: Single imputation, using regression to derive a simple formula

Welfare Comparison 1: Household
What is the problem?
- Some families are bigger than others
- Some eat more than others
- One TV is enough
Some possible solutions:
- Per capita adjustment
- Adult equivalence scales
- Nutritional equivalence scales
Recommended solution: Nutritional equivalence scale
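The difference between a per capita adjustment and an equivalence scale can be illustrated with a small sketch. The weights below are invented for illustration and are not the nutritional equivalence scale recommended for PEN; the function names are also hypothetical.

```python
from typing import Dict, List

# Hypothetical weights for illustration only; NOT an actual nutritional scale.
ILLUSTRATIVE_WEIGHTS: Dict[str, float] = {"adult": 1.0, "child": 0.5}


def income_per_capita(income: float, members: List[str]) -> float:
    """Divide household income by the number of members."""
    return income / len(members)


def income_per_equivalent(income: float, members: List[str],
                          weights: Dict[str, float]) -> float:
    """Divide household income by the sum of member equivalence weights."""
    equivalents = sum(weights[member] for member in members)
    return income / equivalents


household = ["adult", "adult", "child", "child"]
print(income_per_capita(1200.0, household))                            # 300.0
print(income_per_equivalent(1200.0, household, ILLUSTRATIVE_WEIGHTS))  # 400.0
```

With these made-up weights the household counts as three equivalents rather than four persons, so its adjusted income is higher than the per capita figure.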

Welfare Comparison 2: Country
What is the problem?
- Different currencies used
- Price levels differ
Recommended solution: Purchasing power parity
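A purchasing power parity conversion can be sketched as a single division, assuming the PPP factor is expressed as local currency units per international dollar. The factors and incomes below are hypothetical and do not refer to any PEN country.

```python
def to_ppp_dollars(local_income: float, ppp_factor: float) -> float:
    """Convert income in local currency to international (PPP) dollars.

    ppp_factor: local currency units per international dollar (assumed).
    """
    if ppp_factor <= 0:
        raise ValueError("ppp_factor must be positive")
    return local_income / ppp_factor


# Hypothetical incomes and PPP factors for two unnamed countries.
income_country_a = to_ppp_dollars(50_000.0, 250.0)
income_country_b = to_ppp_dollars(900.0, 3.0)
print(income_country_a, income_country_b)  # 200.0 300.0
```

After conversion the two incomes are in the same unit and reflect differences in price levels as well as currencies.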

Aggregation issue: Definitions of forest and env. income

Origin of product | Cultivated                        | Collected
Forest            | Forest income (incl. plantations) | Forest income
Non-forest        | Agricultural income               | Environmental income (non-forest env. inc.)

The forest product should depend on the existence of forests, but there are some in-between categories:
- Minerals?
- Fish?
- Plantations?
- FAO definition?
Read PEN guidelines!

Other issues
- Calculating net income
  - Allocating agr. costs
  - Negative income?
- Uneven timing of surveys
  - Increasing income over survey methods
- Data aggregation
  - Appropriate categories
  - Averages
- Pricing subsistence products, particularly firewood

Concluding remarks
- Get all issues on the table
- Some experimental work needed:
  - the cost and benefit of more refined methods
  - getting a simple formula (optimal ignorance)
- "Do things as simple as possible, but not simpler" (Albert Einstein)

Group discussion
Discuss and suggest solutions on how to deal with the major issues outlined:
- Missing values
- Income categories
- Firewood pricing
- List any new 'tricky' data issues
30 min in groups, then plenary
Groups: I (far corner): born 1-7; II (near corner): 8-15; III (coffee table): 16-23; IV (miombo): 24-31

Group 1
Missing values:
- Missing data reflects reality, so be careful when imputing
- -10: I don't want to respond
Firewood pricing:
- No market = zero price?
- Use value for price?
- Underestimation of illegal activities
Categories of income:
- Distinction between forest and non-forest env. income

Group 2
- What is a forest? Be consistent
- Negative values:
  - Seasonality (timing of costs and harvest)
  - Look at large input expenses in Q4, but also see how they fit with income data in Q1
  - Poor harvest (ok)
- No particular forest product dominates (except fuelwood)
  - Might be an aggregation problem (e.g. aggregate types of fruits)
  - Some products not considered forest products
  - Probing done by enumerators
- Wage income and business income: disaggregate
- Missing values: Do as Ronnie says
- Income categories: adult eq. Use regionally differentiated scales?
- Firewood pricing
  - Compare PEN and official price figures
  - WTP: use 'local estimated price' instead; respondents have difficulty putting a price on non-market items

Group 4
- Area estimates, intercropping
- Fuelwood prices: meta-analysis, how priced? Other fuelwood price studies?
- Missing quarters
  - Simplest formula that we are confident in
  - Experiment with more advanced methods
- High attrition rates: any biases?
- Representativeness of studies: "Meta study of case studies with good data"
- Adult equivalents: Agree on some simple ways to calculate them

Group 3

Additional New Tricky Issues
- Timing of surveys vs time-value of money (USD): How do we compare the different surveys?
- Allocation of input costs: what about subsistence inputs? Need to standardize these costs.
- Definition of forest products? Need to standardize. Definitions are systematically applied.
- How to deal with the site selection bias?
- Cost estimation of other inputs, e.g. fodder.
- How do we value an increase in the number and weight of livestock?

Fuelwood Pricing
- Need to crosscheck the proportion of income spent on energy by hhs away from forests vs those close to the forests.
- Hypothesis: Proportion of energy spending for the latter < the former.

Income Categories
- Household vs village level costs/prices?
- WTP vs market price? Hypothesis: WTP <= market price