Consumption calculations with real data – CORRECTED VERSION (CORRECTIONS IN RED)
Gretchen Donehower
Day 3, Session 2, NTA Time Use and Gender Workshop
Wednesday, May 23, 2012
Institute for Labor, Science and Social Affairs (ILSSA), Hanoi, Vietnam
Outline
1. One note of caution about gender implications of consumption allocation assumptions
2. Arranging data to start the calculations
3. Sample code fragments that can be modified based on your own time use survey
   – Use of this code is COMPLETELY optional. There are many ways to program this algorithm. Write your programs the way you think best.
4. How to check your results
5. Sensitivity tests
A note of caution
Our allocation assumptions are the same for males and females
– So the only way male and female consumption can differ is through household structure (e.g. if households with only baby girls produce more childcare on average, baby girls will appear to consume more care than baby boys)
In contexts with heavy “son preference,” this may be a very bad assumption!
– Do some sensitivity tests that allocate x% of childcare or other activities to girls and (100 - x)% to boys
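One way to read the x% / (100 - x)% sensitivity test is as a per-child weight: each girl gets weight x, each boy gets weight 1 - x, and the weights are rescaled to sum to one within the household. The sketch below follows that reading only; the function name, the "F"/"M" sex codes and the `girl_weight` parameter are illustrative assumptions, not part of the workshop code.

```python
def allocate_childcare(hours, children, girl_weight=0.5):
    """Split childcare hours among children using sex-specific weights.

    children    : list of (age, sex) tuples, sex coded "F" or "M" (hypothetical coding)
    girl_weight : weight per girl; each boy gets 1 - girl_weight (assumption)
    Returns one allocated amount per child, in the same order as `children`.
    """
    weights = [girl_weight if sex == "F" else 1.0 - girl_weight
               for _, sex in children]
    total = sum(weights)
    if total == 0:
        # No children, or every child present has zero weight
        return [0.0] * len(children)
    return [hours * w / total for w in weights]


# girl_weight=0.5 reproduces the equal-allocation baseline
baseline = allocate_childcare(10.0, [(2, "F"), (4, "M")], girl_weight=0.5)
# A "son preference" scenario: girls weighted 0.4, boys 0.6
biased = allocate_childcare(10.0, [(2, "F"), (4, "M")], girl_weight=0.4)
```

Because the weights are rescaled within each household, a household with children of only one sex still allocates all of its childcare, so the aggregate total is unchanged by the scenario.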
Arranging data to start the calculations
Time use survey data should be arranged as in the simplified example spreadsheet
– One line for each hh member (time use respondents and non-respondents together)
– Includes variables for the hh production activities performed by any time use respondents
    Can be in money or time units
    Time produced should be listed on the same row as the person who produced it; non-producers should have zeros
– Survey weight variable for time respondents (if available)
– Could use the survey data before you “collapse” to age- and sex-means to calculate the production age profiles
Also need total population counts by age and sex
– Sort by age and sex for later merge with time use data
Impute age of consumption [1]
First generate variables that count the number of persons in the household of each sex and age, for each of the possible target age groups. Example:

    gen child=(age<18)
    gen adult=(age>=18)
    gen all=1
    foreach allocgrp in child adult all {
        * household total in this target group
        bysort hhid: egen num`allocgrp'=sum(`allocgrp')
        foreach sss in 1 2 {
            foreach aaa of numlist 0(1)80 85 {
                * household members of this sex and age who are in the group
                gen tempctr=(sex==`sss')*(age==`aaa')*(`allocgrp')
                bysort hhid: egen numsex`sss'age`aaa'`allocgrp'=sum(tempctr)
                drop tempctr
            }
        }
    }
Impute age of consumption [2]
For general household activities, create consumption variables by multiplying the amount produced by the fraction of persons in the household of each age and sex. Example for variable “clean”:

    foreach sss in 1 2 {
        foreach aaa of numlist 0(1)80 85 {
            gen clean_cons_sex`sss'_age`aaa'=clean*(numsex`sss'age`aaa'all/numall)
        }
    }

Then do the same for all other general household activity variables.
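The arithmetic of this step can be checked by hand on one small household. The sketch below is an illustration in Python, not a translation of the Stata code: the function name and the example household are made up, and sex is coded 1 and 2 as in the loops above.

```python
from collections import Counter


def allocate_general(produced, members):
    """Allocate one producer's output equally across all household members.

    members : list of (sex, age) tuples, one per household member
    Returns {(sex, age): amount}, i.e. produced * (count in group / household size).
    """
    counts = Counter(members)
    size = len(members)
    return {group: produced * n / size for group, n in counts.items()}


# Hypothetical household: two adults and two girls aged 6
household = [(1, 35), (2, 33), (2, 6), (2, 6)]
shares = allocate_general(8.0, household)
```

Here the two six-year-olds together receive half of the 8 units, and the shares always sum back to the amount produced.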
Impute age of consumption [3]
For age-targeted care activities in the household, create consumption variables by multiplying the amount produced by the fraction of persons in the target age group who are of each age and sex. Examples for variables “childcarehh” and “adultcarehh”:

    foreach sss in 1 2 {
        foreach aaa of numlist 0(1)17 {
            gen childcarehh_cons_sex`sss'_age`aaa'=childcarehh*(numsex`sss'age`aaa'child/numchild)
        }
    }
    foreach sss in 1 2 {
        foreach aaa of numlist 18(1)80 85 {
            gen adultcarehh_cons_sex`sss'_age`aaa'=adultcarehh*(numsex`sss'age`aaa'adult/numadult)
        }
    }
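For targeted care the denominator is the number of household members in the target group rather than household size. A minimal sketch of that rule follows; the function name and the `in_target` predicate are hypothetical, and returning an empty result for a household with nobody in the target group is a choice made here so that the division by zero never arises.

```python
def allocate_targeted(produced, members, in_target):
    """Allocate care equally among household members in the target age group.

    members   : list of (sex, age) tuples, one per household member
    in_target : function taking an age and returning True for target ages
    Returns {(sex, age): amount}; empty if nobody is in the target group.
    """
    targets = [m for m in members if in_target(m[1])]
    if not targets:
        return {}
    share = produced / len(targets)
    allocation = {}
    for member in targets:
        # Members with the same sex and age accumulate in one cell
        allocation[member] = allocation.get(member, 0.0) + share
    return allocation


# Hypothetical household: two adults, a boy aged 10 and a girl aged 3
household = [(1, 40), (2, 38), (1, 10), (2, 3)]
childcare = allocate_targeted(6.0, household, lambda age: age < 18)
```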
Impute age of consumption [4]
For any care activities outside of the household, the consumption variable is not allocated by age at this stage. Examples for variables “volunteer”, “childcarenhh” and “adultcarenhh”:

    gen volunteer_cons_age_unk=volunteer
    gen childcarenhh_cons_age_unk=childcarenhh
    gen adultcarenhh_cons_age_unk=adultcarenhh
Collapse consumption variables to level of producers
Example with a variable called “timeresp”, which is 1 if the person gave time use information and 0 if not:

    keep if timeresp==1
    collapse (mean) *_cons_* [w=surveywt], by(age sex) fast

So now, what do I have?
– A line for each age/sex group of producers and a variable for each age/sex group that consumed each activity
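The collapse step is a survey-weighted mean within each age/sex group of producers. The sketch below shows that calculation for a single consumption variable in Python; the row layout and function name are assumptions made for illustration.

```python
from collections import defaultdict


def weighted_means(rows):
    """Survey-weighted mean of one variable by (age, sex) group.

    rows : iterable of (age, sex, weight, value) tuples, one per respondent
    Returns {(age, sex): weighted mean}; groups with zero total weight are skipped.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for age, sex, weight, value in rows:
        weighted_sum[(age, sex)] += weight * value
        weight_total[(age, sex)] += weight
    return {group: weighted_sum[group] / weight_total[group]
            for group in weighted_sum if weight_total[group] > 0}


# Hypothetical respondents: two women aged 30 and one man aged 45
rows = [(30, 2, 1.0, 2.0), (30, 2, 3.0, 4.0), (45, 1, 2.0, 1.0)]
means = weighted_means(rows)
```

In the example, the woman with weight 3 pulls the group mean toward her value of 4, giving 3.5 rather than the unweighted 3.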
Merge with population data and calculate aggregate matrix
Make sure the population data has the same age and sex variables as the average producer matrix first. Then:

    sort sex age
    merge sex age using population
    drop _merge
    foreach sss in 1 2 {
        foreach aaa of numlist 0(1)80 85 {
            foreach vvv in clean laund cook hhmaint lawngar hhmgmt petcare purch trav childcarehh adultcarehh {
                capture replace `vvv'_cons_sex`sss'_age`aaa' = `vvv'_cons_sex`sss'_age`aaa'*pop/
            }
        }
    }
    foreach vvv in childcarenhh adultcarenhh volunteer {
        replace `vvv'_cons_age_unk=`vvv'_cons_age_unk*pop/
    }

Note: “capture” tells Stata not to stop the loop if it can’t find one of the variables called. (Not all age variables were made for all activities.)
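The logic of the aggregate matrix is: for each producer group, multiply the mean amount it provides to each consumer group by the number of people in the producer group, then add up over producer groups. The sketch below shows only that logic. The nested-dictionary layout, the function name and the numbers are illustrative assumptions, and no scaling divisor is applied.

```python
def aggregate_consumption(mean_by_producer, pop):
    """Turn per-producer means into aggregate consumption by consumer group.

    mean_by_producer : {producer_group: {consumer_group: mean amount per producer}}
    pop              : {producer_group: population count}
    Returns {consumer_group: aggregate amount consumed}.
    """
    totals = {}
    for producer, row in mean_by_producer.items():
        for consumer, mean in row.items():
            # Aggregate produced for this consumer = mean per producer * producers
            totals[consumer] = totals.get(consumer, 0.0) + mean * pop[producer]
    return totals


# Hypothetical two-producer example, groups keyed as (sex, age)
mean_by_producer = {
    (1, 35): {(2, 6): 0.5, (1, 35): 1.0},
    (2, 33): {(2, 6): 1.5, (2, 33): 1.0},
}
pop = {(1, 35): 100, (2, 33): 120}
totals = aggregate_consumption(mean_by_producer, pop)
```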
Sum Down Columns, Transpose and Make Age and Sex Variables
Transpose exchanges rows for columns. We want this so we can merge with the population data again.

    collapse (sum) *_cons_*, fast
    xpose, clear varname
    gen activity=substr(_varname,1,strpos(_varname,"_")-1)
    gen sex=substr(_varname,length(activity)+10,1)
    gen agefinder=strpos(_varname,"age")
    gen totlength=length(_varname)
    gen age=substr(_varname,agefinder+3,agefinder+4-totlength)
    replace age=substr(_varname,agefinder+3,2) if age==""
    replace sex=" " if age=="_u"
    replace age=" " if age=="_u"
    destring age sex, replace
    /* CAUTION: MAKE SURE THAT THE VARIABLES AGE, SEX, AND ACTIVITY
       ARE CORRECT BEFORE RUNNING THE NEXT COMMAND! */
    drop _varname agefinder totlength
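Since the activity, sex and age are all recovered from the variable name, it is worth testing the parsing rule on a few names before trusting it. The sketch below does the same extraction with a regular expression in Python. The pattern assumes names of the form activity_cons_sexS_ageA, also accepts both the "_cons_age_unk" and "_cons_unk" endings for unallocated care, and the function name is hypothetical.

```python
import re

# activity, then either "sex<digit>_age<digits>" or an "unknown age" ending
NAME_PATTERN = re.compile(
    r"^(?P<activity>[^_]+)_cons_"
    r"(?:sex(?P<sex>\d)_age(?P<age>\d+)|(?:age_)?unk)$"
)


def parse_name(varname):
    """Return (activity, sex, age); sex and age are None for unallocated care."""
    match = NAME_PATTERN.match(varname)
    if match is None:
        raise ValueError(f"unexpected variable name: {varname}")
    sex = int(match.group("sex")) if match.group("sex") else None
    age = int(match.group("age")) if match.group("age") else None
    return match.group("activity"), sex, age
```

Raising an error on any name that does not fit the pattern serves the same purpose as the caution in the Stata code: a misparsed name is caught before the helper variables are dropped.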
The thing I forgot
End up with four variables: age, sex, activity, v1 (“v1” is the name Stata gave to my aggregate values after the “xpose” command).
`hhproduction' below is a local macro holding the list of activity variables:

    reshape wide v1, i(age sex) j(activity) string
    foreach vvv in `hhproduction' {
        rename v1`vvv' `vvv'
    }
Merge with Population and Distribute Non-household Care [1]
Merge with population data (same as before computing the aggregate by producers):

    sort sex age
    merge sex age using population
    drop _merge

Generate variables that give the age distributions appropriate for distributing non-household care:

    gen child=(age<18)*pop
    gen adult=(age>=18)*pop
    gen all=pop
    foreach vvv in child adult all {
        egen num`vvv'=sum(`vvv')
    }

Generate variables that give the total amount of non-household care to be distributed:

    foreach vvv in childcarenhh adultcarenhh volunteer {
        egen tot`vvv'=sum(`vvv')
    }
Merge with Population and Distribute Non-household Care [2]
Total childcare variable is the sum of household childcare plus a fraction of total non-household childcare:

    gen childcare=childcarehh + totchildcarenhh * (child/numchild)

Total adultcare variable is the sum of household adultcare plus a fraction of total non-household adultcare. If volunteering will have the same imputed wage as adultcare, can combine them:

    gen adultcare=adultcarehh + totadultcarenhh * (adult/numadult) + totvolunteer * (all/numall)
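Non-household care is spread over the target population in proportion to population counts, so each age/sex group receives total * (group population / target population). A small Python sketch of that proportional rule is below; the function name and the example counts are made up for illustration.

```python
def distribute_total(total, pop, in_target):
    """Distribute an aggregate amount across target groups by population share.

    pop       : {(sex, age): population count}
    in_target : function taking an age and returning True for target ages
    Returns {(sex, age): aggregate amount}; groups outside the target get 0.
    """
    target_pop = sum(n for (sex, age), n in pop.items() if in_target(age))
    if target_pop == 0:
        return {group: 0.0 for group in pop}
    return {
        (sex, age): (total * n / target_pop if in_target(age) else 0.0)
        for (sex, age), n in pop.items()
    }


# Hypothetical population: 200 boys and 300 girls aged 5, 500 men aged 40
pop = {(1, 5): 200, (2, 5): 300, (1, 40): 500}
nhh_childcare = distribute_total(1000.0, pop, lambda age: age < 18)
```

The result is an aggregate per group, so it can be added to the household care aggregate before the per capita division.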
Divide by Population for Per Capita Consumption

    foreach vvv in clean cook hhmaint hhmgmt laund lawngar petcare purch trav childcare adultcare {
        replace `vvv'=`vvv'/(pop/ )
    }

This data is in time units. Multiply by imputed wages and adjustment factors to change into money units.
How to check that calculations are correct
Look at production and consumption for each activity. Do they look reasonable?
– Is childcare only being consumed by children? Adult care only by adults?
– Is consumption of general household activities reasonably smooth?
Does aggregate consumption equal aggregate production for each type of activity?
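The last check can be automated: for each activity, compare aggregate production with aggregate consumption and list any that differ by more than rounding error. The sketch below assumes the two sets of aggregates are already available as dictionaries keyed by activity; the function name and tolerance are illustrative choices.

```python
import math


def check_balance(production, consumption, rel_tol=1e-9):
    """Return activities whose aggregate production and consumption differ.

    production, consumption : {activity: aggregate amount}
    Returns {activity: (produced, consumed)} for every mismatch.
    """
    problems = {}
    for activity, produced in production.items():
        # An activity missing from consumption counts as zero consumed
        consumed = consumption.get(activity, 0.0)
        if not math.isclose(produced, consumed, rel_tol=rel_tol, abs_tol=1e-12):
            problems[activity] = (produced, consumed)
    return problems


# Hypothetical aggregates: cooking does not balance
production = {"clean": 100.0, "cook": 50.0}
consumption = {"clean": 100.0, "cook": 48.0}
problems = check_balance(production, consumption)
```

An empty result means every activity balances. A mismatch usually points to households where the target group was empty or to a variable that was skipped in one of the loops.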
Sensitivity Tests
Run the consumption program after any alternative methods you use for production
– Change imputed wages
– Include or exclude multiple activities
– etc.
Run the consumption program with alternative allocations if that is relevant in your context
– Examples:
    Make up a scenario of “son preference” for childcare
    Change the allocation of time spent cooking if you think that males are given preference for food
Sensitivity testing comparing a gender bias scenario with a no-bias scenario could be its own research project
Lab Session Start programming consumption algorithm!