Nobuo Yoshida SWIFT program lead and Lead Economist, World Bank

Presentation transcript:

Nobuo Yoshida SWIFT program lead and Lead Economist, World Bank July 19, 2017

What is SWIFT?

What is SWIFT
- SWIFT (Survey of Well-being via Instant and Frequent Tracking) is a new household survey instrument.
- SWIFT collects non-consumption data (X).
- SWIFT is designed to produce welfare indicators from X in a cost-effective, timely, and user-friendly manner.
- SWIFT can be used for:
  - Increasing the frequency of poverty data
  - Estimating poverty for a specific area or group of people
  - Monitoring the welfare effects of a government investment or program

How does SWIFT work? (1) SWIFT survey
- The SWIFT survey does not collect consumption or income directly.
- The SWIFT survey includes 15 to 20 simple questions (X), selected by models developed from the LSMS (Living Standards Measurement Study).
- Only this short SWIFT survey (X) is collected in the field.

How does SWIFT work? (2)
- In the LSMS, which contains both C and X: identify X and develop a projection model by regressing C on X, giving C = F(X).
- In the SWIFT survey: collect data (X) only, then impute Ĉ = F(X).
- Notation: C is consumption; X is household variables (e.g. education, employment); Ĉ = F(X) is projected consumption data.
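The two-survey workflow above can be sketched in a few lines. This is only an illustration under loud assumptions: the data are synthetic, there is a single made-up X variable (years of schooling of the household head), and the helper `fit_ols` is hypothetical. An actual SWIFT model uses 15 to 20 variables and adds simulated errors rather than relying on point predictions.

```python
import math
import random

def fit_ols(x, y):
    """Ordinary least squares with one regressor: y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

rng = random.Random(0)

# Synthetic "LSMS": both X and log consumption (ln C) are observed.
lsms_x = [rng.randint(0, 12) for _ in range(500)]
lsms_ln_c = [6.0 + 0.08 * x + rng.gauss(0, 0.4) for x in lsms_x]

# Step 1: develop the projection model by regressing ln C on X.
a, b = fit_ols(lsms_x, lsms_ln_c)

# Synthetic "SWIFT survey": only X is collected.
swift_x = [rng.randint(0, 12) for _ in range(200)]

# Step 2: impute projected consumption for each SWIFT household.
swift_c_hat = [math.exp(a + b * x) for x in swift_x]
```

The point prediction `exp(a + b * x)` ignores the error term, so on its own it would understate the spread of consumption; the simulation stage of the SWIFT guideline addresses this by drawing errors as well.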

Is SWIFT reliable?

SWIFT Guideline Version 2.0
SWIFT Guideline V 2.0 includes two main parts:
- Modeling and simulations
- Data collection (sampling, logistics, questionnaire design, and training)

Modeling
- Create a model by running regressions:
  $$\ln y_h = \alpha + \beta_1 x_{1h} + \beta_2 x_{2h} + \dots + \beta_k x_{kh} + \varepsilon_h$$
- Left-hand-side variable: log of household expenditure per capita (or per adult equivalent).
- Select right-hand-side variables from the group of poverty correlates and estimate the coefficients using "stepwise" regression.
- Estimate the distributions of the coefficients and errors.

Stepwise model selection
- Stepwise selection mechanically looks for variables that are statistically significant.
- A level of significance must be chosen; usually it is 5 percent.
- SWIFT instead selects an optimal level of significance.

Simulation stage
- Simulate household (HH) expenditure for each household in the SWIFT survey by randomly drawing coefficients (α, β) and errors (ε) from the estimated distributions of the model:
  $$\ln y_h = \alpha + \beta_1 x_{1h} + \beta_2 x_{2h} + \dots + \beta_k x_{kh} + \varepsilon_h$$
- Repeat the simulation 20 to 100 times (100 times in this example).
- Identify poor households using the simulated HH expenditures for each round.
- Calculate the poverty headcount rate for each round and use the mean across rounds as the poverty estimate.
- Use the standard deviation of the 100 poverty headcount rates as the standard error of the poverty estimate.
- Use software to run the simulations.
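A minimal sketch of the simulation loop described above, assuming for simplicity that each coefficient is drawn from an independent normal distribution and that errors are normal with a common standard deviation. The function name `simulate_poverty`, the coefficient values, and the household data are all hypothetical; this is not the SWIFT software itself.

```python
import random
import statistics

def simulate_poverty(households, coef_mean, coef_se, sigma, ln_line,
                     rounds=100, seed=0):
    """Return (poverty estimate, standard error) from repeated simulation.

    households: list of X vectors, each starting with 1.0 for the intercept.
    coef_mean, coef_se: estimated coefficients and their standard errors.
    sigma: standard deviation of the regression error (assumed normal).
    ln_line: log of the poverty line.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(rounds):
        # Draw one set of coefficients for this round (independence is an assumption).
        beta = [rng.gauss(m, s) for m, s in zip(coef_mean, coef_se)]
        poor = 0
        for x in households:
            # Simulated log expenditure: fitted value plus a drawn error.
            ln_y = sum(b * xi for b, xi in zip(beta, x)) + rng.gauss(0, sigma)
            if ln_y < ln_line:
                poor += 1
        rates.append(poor / len(households))
    # Mean of the round-level rates is the estimate; their SD is the standard error.
    return statistics.mean(rates), statistics.stdev(rates)

# Hypothetical SWIFT sample: intercept, household size, owns a TV (0/1).
data_rng = random.Random(42)
households = [(1.0, data_rng.randint(1, 8), data_rng.choice([0, 1]))
              for _ in range(300)]
coef_mean = [7.0, -0.10, 0.30]
coef_se = [0.05, 0.01, 0.03]
sigma = 0.4
ln_line = 6.6

poverty_rate, poverty_se = simulate_poverty(
    households, coef_mean, coef_se, sigma, ln_line, rounds=100, seed=1)
```

Drawing coefficients independently ignores their covariance; a fuller implementation would draw from the joint estimated distribution.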

Issues
- Over-fitting: a model performs very well in LSMS but might not outside it.
- Multi-collinearity: stepwise regression is vulnerable to multi-collinearity.
- Stability of coefficients over time: models developed in LSMS might no longer be valid.
- Misspecification of error structure: error distributions can be very complex.
- Estimation of standard errors: the formula of poverty mapping is not accurate.

A model might not be stable over time
[Diagram: timeline from 2010 to 2015, with LSMS (modeling) and SWIFT (simulations).]

A within-sample test might not be reliable
[Diagram: the model C = f(X) is developed on LSMS 2012/13 data (C, X), and the projection Ĉ = f(X) is evaluated on the same LSMS 2012/13 data.]
- The dataset used for developing the model is the same as the dataset to which the model is applied.
- If an over-fitting problem exists, the within-sample performance can be very different from the out-of-sample performance.

Distribution might not be normal
[Figure: actual vs. simulated (with normal distribution) log of household expenditure per capita. Note: data from Sri Lanka HIES 2012/13, Central province.]

To improve SWIFT modeling & simulations
1. Cross-validation
2. Multicollinearity checks
3. Stability test using backward/forward imputation
4. Addressing distributional issues with a combination of PovMap and MI

1. Cross-Validation
- Cross-validation is used to assess out-of-sample performance rather than within-sample performance.
- The risk of overfitting rises as more variables are included.
- Using the cross-validation approach, we try to find the optimal number of variables.
- To ease programming, we search for the optimal p-value for the stepwise regression.

Cross-Validation: Step 1
Randomly split GLSS 2012/13 (C, X) into three subsamples: (C_1, X_1), (C_2, X_2), (C_3, X_3).

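Cross-Validation: Step 2
GLSS 2012/13 (C, X) has been randomly split into three: (C_1, X_1), (C_2, X_2), (C_3, X_3).
- Training data: two subsamples are used for modeling, C = f(X_i).
- Testing data: (C_3, X_3); the projection Ĉ = f(X_3) is compared with the actual C_3.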

Cross-Validation: Step 3
[Diagram: the roles are rotated among (C_1, X_1), (C_2, X_2), and (C_3, X_3), so that the training data and testing data change from one round to the next.]

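Cross-Validation: Statistics of interest
- Mean squared error:
  $$\frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \hat{Y}_i\right)^2$$
- Average squared (or absolute) difference between actual and projected poverty rates:
  $$\frac{1}{3}\left[\left(H_1 - \hat{H}_1\right)^2 + \left(H_2 - \hat{H}_2\right)^2 + \left(H_3 - \hat{H}_3\right)^2\right]$$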
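The two statistics can be computed with a three-fold loop along the following lines. This is a sketch under stated assumptions: synthetic data, a single regressor, one fixed model rather than a stepwise search, and hypothetical helper names (`fit_ols`, `headcount`, `three_fold_cv`). In the procedure the slides describe, the same loop would be repeated for each candidate p-value of the stepwise regression and the p-value with the best statistics kept.

```python
import math
import random

def fit_ols(x, y):
    """One-regressor OLS; returns intercept, slope, and residual SD."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, math.sqrt(rss / (n - 2))

def headcount(ln_y, ln_line):
    """Share of households whose log expenditure is below the log poverty line."""
    return sum(v < ln_line for v in ln_y) / len(ln_y)

def three_fold_cv(x, ln_y, ln_line, seed=0):
    """Return (mean squared error, average squared poverty-rate difference)."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    rng.shuffle(idx)
    folds = [idx[k::3] for k in range(3)]  # random split into three
    sq_err, n_test, rate_gaps = 0.0, 0, []
    for k in range(3):
        test = folds[k]
        train = [i for j in range(3) if j != k for i in folds[j]]
        a, b, sigma = fit_ols([x[i] for i in train], [ln_y[i] for i in train])
        pred = [a + b * x[i] for i in test]
        sq_err += sum((ln_y[i] - p) ** 2 for i, p in zip(test, pred))
        n_test += len(test)
        # Projected poverty rate uses predictions plus drawn normal errors.
        simulated = [p + rng.gauss(0, sigma) for p in pred]
        h_actual = headcount([ln_y[i] for i in test], ln_line)
        h_projected = headcount(simulated, ln_line)
        rate_gaps.append((h_actual - h_projected) ** 2)
    return sq_err / n_test, sum(rate_gaps) / 3

data_rng = random.Random(7)
x = [data_rng.randint(0, 12) for _ in range(600)]
ln_y = [6.0 + 0.08 * xi + data_rng.gauss(0, 0.4) for xi in x]
mse, rate_gap = three_fold_cv(x, ln_y, ln_line=6.3, seed=1)
```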

Cross-Validation: Select the best p-value for stepwise regression
[Figure, two panels: mean squared errors; average of absolute differences between actual and projected poverty rates.]
P-value = 0.06 is the best.

2. Are signs of coefficients reasonable?
2010 Rural model

Variables                  Coef.    Std. Err.
Intercept                  16.87    0.06
Household size             -0.22    0.02
Household size^2            0.01    0.00
Dependency ratio           -0.77    0.16
Dependency ratio^2          0.52    0.17
Head: Male                  0.10    0.03
Head: Grades enrolled 2
Cooking: coal/wood          0.21
Own: Car                    0.32    0.09
Own: TV
Own: Vent                   0.12    0.04
Me-Zochi dist.              0.15
Cantagalo dist.             0.07

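3. Backward/Forward Imputation to Test Stability of Models
[Diagram: timeline of survey years 2005/6, 2006/7, 2007/8, 2008/9, 2009/10, 2010/11, 2011/12, 2012/13. A model C = f(X) is developed on one LSMS round (modeling) and used to simulate Ĉ = f(X) in another round (simulation); the projection is then compared with the actual data (C, X) of that round.]
- C, X: consumption and non-consumption data collected by LSMS
- Ĉ = f(X): projected consumption data in LSMS 05/06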

Example of Backward/Forward Imputation from Afghanistan analysis
i) Backward/Forward imputation (2007-08, 2011-12); ii) Final estimation for 2013/14

Poverty rate (%)
Survey year   Actual   95% Confidence Interval   Imputed   95% Confidence Interval
2007-08       36.3     [34.94, 37.60]            37.2      [35.75, 38.63]
2011-12       35.8     [34.14, 37.40]            35.2      [33.56, 36.78]
2013-14                                          39.1      [37.71, 40.55]

4. Estimation of poverty rates
There are two major simulation approaches:
- Poverty Mapping Method, often called ELL: developed by Elbers, Lanjouw and Lanjouw (2003).
- Multiple Imputation Method, often called MI: developed by Rubin (1987) and Harvard Univ.

Comparisons between ELL and MI
- ELL: good for incorporating a complex error structure of regression models.
- MI: easy to incorporate sampling errors and produce an accurate estimation of standard errors of poverty estimates.
- Ideally, we should combine these two methodologies:
  - First, estimate models and simulate household expenditures with ELL.
  - Then, estimate poverty statistics and the standard errors with MI.

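Rubin-Schafer’s formula
For a scalar population parameter Q:
$$\bar{Q} = \frac{1}{m}\sum_{j=1}^{m} \hat{Q}_j, \qquad \mathrm{Var}(\bar{Q}) = \left(1 + \frac{1}{m}\right) B + \bar{U}$$
where B is the between-imputation variance
$$B = \frac{1}{m-1}\sum_{j=1}^{m} \left(\hat{Q}_j - \bar{Q}\right)^2$$
and \bar{U} is the average of the within-imputation variances
$$\bar{U} = \frac{1}{m}\sum_{j=1}^{m} U_j$$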
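The combining formula translates directly into code. The sketch below uses a hypothetical function name, `rubin_combine`, and made-up inputs: five imputed poverty rates and their within-imputation variances.

```python
import statistics

def rubin_combine(estimates, within_variances):
    """Combine m imputation-specific estimates with the formula above.

    estimates: the m point estimates Q_j (e.g. poverty rates per imputation).
    within_variances: the m within-imputation variances U_j.
    Returns (Q_bar, total variance).
    """
    m = len(estimates)
    q_bar = statistics.mean(estimates)
    between = statistics.variance(estimates)   # B, with the 1/(m-1) divisor
    within = statistics.mean(within_variances)  # U_bar
    total_var = (1 + 1 / m) * between + within
    return q_bar, total_var

# Made-up example: five simulated poverty rates with equal sampling variance.
rates = [0.36, 0.38, 0.35, 0.37, 0.39]
sampling_vars = [0.0001] * 5
q_bar, total_var = rubin_combine(rates, sampling_vars)
standard_error = total_var ** 0.5
```

Unlike the plain standard deviation of the simulated rates, this total variance also carries the sampling variance of each round, which is the reason the slides give for computing standard errors with MI.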

Simulations with a more flexible distribution using PovMap
[Figure: actual vs. simulated (by PovMap) log of household expenditure per capita. Note: data from Sri Lanka HIES 2012/13, Central province.]

Steps for SWIFT modeling & simulation
1. Run cross-validation to decide the optimal p-value.
2. Run the stepwise regression using the optimal p-value determined by the cross-validation.
3. Check the coefficients of the final model.
4. Simulate household expenditures using PovMap or MI.
5. Estimate poverty rates using the simulated expenditures and MI's formula using "mi estimate".
6. To check stability, conduct backward imputation analysis.

Review process of SWIFT
- Selected as one of the Innovation Challenge Programs in 2013
- SWIFT Guideline V 1.0
  - Concept Note Review Meeting in July 2014
  - Decision Meeting in June 2015
- SWIFT Guideline V 2.0
  - Decision Note
  - Decision Meeting for FY16
- Papers
  - Serbia: telephone data experimentation (soon PRWP)
  - Bangladesh: high-frequency poverty data collection (PRWP)
  - Sri Lanka: survey-to-survey imputation (PRWP)
  - Ethiopia: survey-to-survey imputation (seeking permission from the govt)
  - Afghanistan: survey-to-survey imputation (soon PRWP)
  - A note on the Paraguay SWIFT Water survey (published in CPF)
  - A paper on the Bangladesh Energy-SWIFT Survey (under preparation)

Research
- SWIFT: econometrics of projections (Y) rather than econometrics of structural estimation (beta)
  - Explore LASSO and other estimators
- Effects of the number of simulations
  - The impact of increasing the number of simulations on the selection of the optimal p
- SWIFT V 2.0 for Iraq
  - Collection of consumption data as well

Advantages of SWIFT

Advantages of SWIFT
[Table: Equity Tool, PPI, and SWIFT compared (Yes/No) on creation of quintiles, estimation of poverty, and estimation of income growth.]

Comparison of different methodologies (Sharma et al. 2014)

Improving marketing with partners

Training for SWIFT
We have training for SWIFT. It includes:
- Introduction and theory of SWIFT (half day)
- Hands-on training with demo data (1.5 days)
- Hands-on training with your data (3 days)
A new online course is now available!

Training for SWIFT