Using official data sources
Tom Spencer, Social Justice Analysis
To discuss…
Me!
What official sources?
Getting the data
Understanding the data
Using the data
Me!
Social Justice Statistician
–Measure Scottish poverty and income inequality
Use Family Resources Survey data
Collected by DWP
What is an official dataset?
Secondary data from survey/admin records
Collected by central government
Covers many different subject areas
Advantages of using official (secondary) data
Free!
Collection methodology of a high standard (check for National Statistics logo)
Comparable across UK/Scotland
Under-used resource
Getting hold of the data
Statistical publications
Other government reports and ‘ad-hocs’
Downloadable tables and Excel files
Scottish Neighbourhood Statistics or similar
Analysis datasets from
Complete (?) dataset from data owners
Can you use one of the first 4?
We download the dataset. And then what?
How was it collected?
Who is included/excluded?
Are these people, households or families?
What is GS_NEWWA?
Does child income include free milk?
Things to read…
Statistical publication + appendices
–Methodology
–Key figures
Documentation for dataset
–Explanation of variables
Sample code
Speak to statistician?
Getting to know your data (1)
Import to analysis package
Simple summary figures:
–Counts
–Means
Tables
This can take a long time
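As an illustration of these first steps, here is a minimal sketch in Python using only the standard library. The column names (`person_id`, `region`, `weekly_income`) and the values are made up for the example and do not come from any real dataset; a real extract would be read from the downloaded file rather than from an inline string.

```python
import csv
import io
from collections import Counter
from statistics import mean

# Hypothetical extract (invented columns and values, for illustration only).
SAMPLE_CSV = """person_id,region,weekly_income
1,Scotland,250
2,Scotland,410
3,England,380
4,Scotland,190
5,England,520
"""

def load_rows(text):
    """Read CSV text into a list of dicts, converting income to float."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["weekly_income"] = float(row["weekly_income"])
    return rows

def summarise(rows):
    """Return the record count, the mean income, and a count by region."""
    return {
        "count": len(rows),
        "mean_income": mean(r["weekly_income"] for r in rows),
        "by_region": Counter(r["region"] for r in rows),
    }

rows = load_rows(SAMPLE_CSV)
summary = summarise(rows)
print(summary["count"], summary["mean_income"], dict(summary["by_region"]))
```

These are unweighted sample figures, so they describe the respondents only, not the population.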
Getting to know your data (2) - reproduce some published figures
Will probably involve using weighting/grossing factors
–Used to produce population-wide estimates from sample
–# poor people in sample * (# people in Scotland / # people in sample) = # poor people in Scotland
–Complicated! But probably already calculated in dataset
Example code or advice from data owners may help.
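The grossing idea can be sketched in a few lines of Python. The first function applies the simple scaling formula above; the second assumes the dataset already carries a per-record grossing factor, here given the hypothetical name `gross_weight`. All records and numbers are invented for illustration.

```python
# Hypothetical sample: each record has a poverty flag and a grossing factor
# (the number of people in the population that the respondent represents).
sample = [
    {"in_poverty": True,  "gross_weight": 1500.0},
    {"in_poverty": False, "gross_weight": 900.0},
    {"in_poverty": False, "gross_weight": 1100.0},
    {"in_poverty": True,  "gross_weight": 800.0},
]

def uniform_estimate(records, population_size):
    """Scale the sample count by (population size / sample size)."""
    poor_in_sample = sum(1 for r in records if r["in_poverty"])
    return poor_in_sample * (population_size / len(records))

def weighted_estimate(records):
    """Sum the per-record grossing factors over people flagged as poor."""
    return sum(r["gross_weight"] for r in records if r["in_poverty"])

# Population implied by the weights: every respondent's factor added up.
population = sum(r["gross_weight"] for r in sample)

print(uniform_estimate(sample, population))
print(weighted_estimate(sample))
```

The two estimates differ whenever respondents carry different weights, which is why published figures are usually reproduced with the supplied grossing factors rather than a single scaling ratio.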
What do you want to know?
Find a related, published figure
Reproduce it
Adjust the code
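One way to make the "reproduce it" step explicit is a small check that compares your own estimate with the published figure before the code is adjusted. This is only a sketch: the 1% tolerance and the example numbers are assumptions, and the right tolerance depends on how the published figure was rounded.

```python
import math

def matches_published(reproduced, published, rel_tol=0.01):
    """Return True if the reproduced figure is within rel_tol of the published one."""
    return math.isclose(reproduced, published, rel_tol=rel_tol)

# Hypothetical figures, for illustration only.
print(matches_published(2150.0, 2160.0))  # within 1%
print(matches_published(2150.0, 2300.0))  # outside 1%
```

If the check fails, the usual suspects are the weighting variable, the unit of analysis (people, households or families) and the exclusions applied in the publication.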
Then what?
Check figures
–Do they look sensible?
–How do they compare to recent trends/other areas?
Are you publishing?
Useful/friendly to copy work to data owners
Concluding thoughts
Speak to the relevant analysts – this is what they are there for!
Is there an easier way?
Using raw data takes time – but is a powerful tool.
Reproduce existing figures to ensure you understand the data
Questions? wse/Social-Welfare/IncomePoverty