1
Tricky Data Issues in the PEN Datasets
Monica Fisher & Arild Angelsen March 26, 2009
2
Organization
Introduce the four main tricky data issues.
Break into groups of 5-10 people to discuss each issue and the recommended solution. (Spend ~30 minutes in discussion.)
One volunteer from each group reports back on the pros/cons of the recommended solution to each data problem. Other recommended solutions are very welcome.
Discussion.
3
The Four Main Issues
Missing data problems:
  Wave nonresponse (quarter is missing)
  Item nonresponse (fields missing)
Challenges to meaningful welfare comparisons:
  Inter-household differences in size and composition
  Inter-country (26 PEN countries) price level differences
4
Missing Data 1: Wave Nonresponse
What is the problem?
Possible solutions:
  Case deletion
  Single imputation
    Simple sample mean or conditional mean
    Hot deck (randomly matched to a similar hh)
    Regression
  Multiple imputation
    Estimates are uncertain -> several datasets
Recommended solution: Case deletion for cases having fewer than three quarters of income data
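The recommended case-deletion rule can be sketched in a few lines. This is only an illustration: the data layout (household id mapped to four quarterly incomes, with `None` for a missed wave), the household ids, and the income figures are all hypothetical and not taken from the PEN datasets.

```python
from typing import Dict, List, Optional

# Hypothetical layout: household id -> incomes for Q1..Q4,
# with None marking a quarter in which the household was not surveyed.
QuarterlyIncome = Dict[str, List[Optional[float]]]

MIN_QUARTERS = 3  # threshold from the recommended solution


def drop_sparse_households(
    data: QuarterlyIncome, min_quarters: int = MIN_QUARTERS
) -> QuarterlyIncome:
    """Keep only households observed in at least `min_quarters` quarters."""
    return {
        hh: quarters
        for hh, quarters in data.items()
        if sum(q is not None for q in quarters) >= min_quarters
    }


# Made-up example data
sample = {
    "hh01": [120.0, 95.0, None, 110.0],  # 3 quarters observed: kept
    "hh02": [80.0, None, None, 60.0],    # 2 quarters observed: dropped
    "hh03": [50.0, 40.0, 45.0, 55.0],    # complete: kept
}
kept = drop_sparse_households(sample)
print(sorted(kept))
```

Households that are kept with exactly three quarters still have one gap, which is where the imputation of sectoral incomes comes in.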
5
Imputation of sectoral incomes
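Multiple imputation: regression using income in other quarters, plus hh and village characteristics
Single imputation formula (example for a missing quarter 3):
$$\widehat{\text{forinc}}_{3,i} = (\text{forinc}_{1,i} + \text{forinc}_{2,i} + \text{forinc}_{4,i}) \cdot \frac{\text{forinc}_{3,v}}{\text{forinc}_{1,v} + \text{forinc}_{2,v} + \text{forinc}_{4,v}} = \text{forinc}_{3,v} \cdot \frac{\text{forinc}_{1,i} + \text{forinc}_{2,i} + \text{forinc}_{4,i}}{\text{forinc}_{1,v} + \text{forinc}_{2,v} + \text{forinc}_{4,v}}$$
(so HH forest income + seasonal adjustment)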
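A minimal sketch of the single-imputation formula, assuming that the subscript i denotes the household and v a village-level figure (for example the village mean) for the same income variable. The slide does not spell this out, so treat it as an assumption. The function name, the data layout, and all numbers are hypothetical.

```python
from typing import Dict


def impute_missing_quarter(
    hh_income: Dict[int, float],       # household income for the observed quarters only
    village_income: Dict[int, float],  # village-level income for all four quarters (assumed)
    missing_q: int,
) -> float:
    """Scale the household's observed income by the village's seasonal share.

    Implements: forinc[missing, v] * sum(forinc[q, i]) / sum(forinc[q, v]),
    where both sums run over the quarters observed for the household.
    """
    if missing_q in hh_income:
        raise ValueError("quarter is not missing for this household")
    hh_sum = sum(hh_income.values())
    village_sum = sum(village_income[q] for q in hh_income)
    if village_sum == 0:
        raise ValueError("village income in the observed quarters is zero")
    return village_income[missing_q] * hh_sum / village_sum


# Made-up example: quarter 3 is missing for the household
hh = {1: 100.0, 2: 80.0, 4: 120.0}
village = {1: 200.0, 2: 150.0, 3: 250.0, 4: 250.0}
predicted_q3 = impute_missing_quarter(hh, village, missing_q=3)
print(predicted_q3)
```

In this example the household earns half of the village figure over the observed quarters (300 vs 600), so the imputed value is half of the village's quarter 3 figure.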
6
Missing Data 2: Item Nonresponse
What is the problem?
Possible solutions (same as for wave nonresponse):
  Case deletion
  Single imputation
  Multiple imputation
Recommended solution: Single imputation: Regression to derive a simple formula
7
Welfare Comparison 1: Household
What is the problem?
  Some families are bigger than others
  Some eat more than others
  One TV is enough
Some possible solutions:
  Per capita adjustment
  Adult equivalence scales
  Nutritional equivalence scales
Recommended solution: Nutritional equivalence scale
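To illustrate how the choice of adjustment changes the welfare measure, here is a small sketch comparing per capita income with income per adult equivalent. The slides do not give the weights of the recommended nutritional scale, so the example swaps in the OECD-modified scale (1.0 for the first adult, 0.5 for each additional adult, 0.3 for each child) purely as a stand-in. The household and its income are made up.

```python
def per_capita(income: float, n_adults: int, n_children: int) -> float:
    """Income divided by the plain head count."""
    return income / (n_adults + n_children)


def oecd_modified_size(n_adults: int, n_children: int) -> float:
    """Equivalent household size under the OECD-modified scale.

    Stand-in only: a nutritional equivalence scale would use different,
    typically age- and sex-specific, weights.
    """
    if n_adults < 1:
        raise ValueError("expects at least one adult")
    return 1.0 + 0.5 * (n_adults - 1) + 0.3 * n_children


def per_adult_equivalent(income: float, n_adults: int, n_children: int) -> float:
    """Income divided by the equivalent household size."""
    return income / oecd_modified_size(n_adults, n_children)


# Made-up household: 2 adults, 2 children, annual income of 2100
pc = per_capita(2100.0, 2, 2)
pae = per_adult_equivalent(2100.0, 2, 2)
print(pc, pae)
```

The per capita figure treats a child like an adult and ignores shared goods ("one TV is enough"), so it comes out well below the per-adult-equivalent figure for the same household.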
8
Welfare Comparison 2: Country
What is the problem?
  Different currencies used
  Price levels differ
Recommended solution: Purchasing power parity
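A purchasing power parity adjustment can be sketched as a simple division, assuming a PPP conversion factor expressed in local currency units per international dollar. The country labels, incomes, and factors below are invented for illustration and are not actual PPP figures.

```python
def to_ppp_dollars(amount_local: float, ppp_factor: float) -> float:
    """Convert a local-currency amount into international (PPP) dollars.

    ppp_factor: local currency units per international dollar (assumed convention).
    """
    if ppp_factor <= 0:
        raise ValueError("PPP factor must be positive")
    return amount_local / ppp_factor


# Hypothetical incomes and PPP factors for two study sites
incomes_local = {"country_a": 500_000.0, "country_b": 9_000.0}
ppp_factors = {"country_a": 250.0, "country_b": 4.5}

incomes_ppp = {
    country: to_ppp_dollars(incomes_local[country], ppp_factors[country])
    for country in incomes_local
}
print(incomes_ppp)
```

The two nominal incomes differ by orders of magnitude, but after conversion they represent the same purchasing power, which is the comparison the welfare analysis needs.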
9
Aggregation issue: Definitions of forest and environmental income

| Origin of product | Cultivated | Collected |
|---|---|---|
| Forest | Forest income (incl. plantations) | Forest income |
| Non-forest | Agricultural income | Environmental income (non-forest env. inc.) |

The forest product should depend on the existence of forests, but there are some in-between categories:
  Minerals?
  Fish?
  Plantations?
  FAO definition?
Read PEN guidelines!
10
Other issues
Calculating net income
  Allocating agricultural costs
  Negative income?
Uneven timing of surveys
Increasing income over survey methods
Data aggregation
  Appropriate categories
  Averages
Pricing subsistence products, particularly firewood
11
Concluding remarks
Get all issues on the table
Some experimental work needed:
  the cost and benefit of more refined methods
  getting a simple formula (optimal ignorance)
“Do things as simple as possible, but not simpler” (Albert Einstein)
12
Group discussion
Discuss and suggest solutions on how to deal with the major issues outlined:
  Missing values
  Income categories
  Firewood pricing
List any new ‘tricky’ data issues
30 min group work, then plenary
Groups by day of birth: I (far corner): born 1-7; II (near corner): 8-15; III (coffee table); IV (miombo): 24-31
13
Group 1
Missing values:
  Missing data reflects reality, careful to impute
  -10: I don’t want to respond
Firewood pricing:
  No market = zero price?
  Use value for price?
  Underestimation of illegal activities
Categories of income:
  Distinction between forest and non-forest env. income
14
Group 2
What is a forest? Be consistent
Negative values:
  Seasonality (timing of costs and harvest): look at large input expenses in Q4, but also see how they fit with income data in Q1
  Poor harvest (ok)
No particular forest product dominates (except fuelwood)
  Might be an aggregation problem (e.g. aggregate types of fruits)
  Some products not considered forest products
  Probing done by enumerators
Wage income and business income: disaggregate
Missing values: Do as Ronnie says
Income categories (adult eq.): Use regionally differentiated scales?
Firewood pricing:
  Compare PEN and official price figures
  WTP: use ‘local estimated price’ instead; respondents have difficulty putting a price on non-market items
15
Group 4
Area estimates, intercropping
Fuelwood prices: meta analysis, how priced? Other fuelwood price studies?
Missing quarters:
  Simplest formula that we are confident in
  Experiment with more advanced methods
High attrition rates: any biases?
Representativeness of studies: “Meta study of case studies with good data”
Adult equivalents: Agree on some simple ways to calculate them
16
Group 3
17
Additional New Tricky Issues
Timing of surveys vs time-value of money (USD): How do we compare the different surveys?
Allocation of input costs: what about subsistence inputs? Need to standardize these costs.
Definition of forest products? Need to standardize and ensure definitions are systematically applied.
How to deal with the site selection bias?
Cost estimation of other inputs, e.g. fodder.
How do we value an increase in the number and weight of livestock?
18
Fuelwood Pricing
Need to crosscheck the proportion of income spent on energy by hhs away from forests vs those close to the forests.
Hypothesis: the proportion spent on energy is lower for hhs close to forests than for hhs away from them.
19
Income Categories
Household vs village level costs/prices?
WTP vs market price? Hypothesis: WTP <= market price