Weighting and Imputation for CORE Social Housing Statistics Julia Bowman & Niall Goulding.

1 Weighting and Imputation for CORE Social Housing Statistics Julia Bowman & Niall Goulding

2 What CORE is: COntinuous REcording of Social Housing Lettings. A census: a hybrid of interview and administrative data. Household-level data collected from Private Registered Providers and Local Authorities. Collected from all housing providers in England since 2004. Many types of information are collected, not just the number of lettings…

3 Lettings log

4 2012/13 Headline stats. Context: 378,700 lettings. Household characteristics: 91% UK nationals, 22% in work, 3% under 18. Most common reason given for why the household left their last settled home: overcrowding. Average weekly rent: £79.58 / £104.52. Length of time vacant: 32 days. Staying within local authority: 90%.

5 Complementary data sets: Local Authority Housing Statistics (LAHS), English Housing Survey (EHS)

6 Users

7 Interests around household characteristics And media interest…

8 QIF bid Two problems we sought to resolve… Placed bid to the UKSA’s Quality Improvement Fund (QIF) Work carried out by the ONS Methodology Advisory Board

9 Problem 1: LA missing records Lettings volume varies greatly by local authority Local Authority Housing Statistics (LAHS): nearly complete lettings data at LA level CORE: lettings data at household level

10 Problem 1: LA missing records Some LAs do not provide logs for every letting in CORE Introduces bias into demographic statistics Lettings grossed to LAHS counts on urban/rural classification Does not account for demographics of population

11 Solution 1: Improved Weighting Geographic approach maintained ONS area classifications (OACs) are used to replace urban/rural classifications. Areas grouped on many factors using a cluster methodology

12

13 Solution 1: Improved Weighting What is our best estimate for lettings per ONS cluster area? The highest of LAHS or CORE for each LA If neither, we use an imputed LAHS figure Sum these to get total lettings per ONS cluster area

14 Solution 1: Improved Weighting. Take the highest of LAHS, CORE and imputed LAHS for each LA. Sum lettings per ONS cluster area group. Compare to the reported CORE figure per area group. Ratio of best estimate to CORE figure = weight.
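The weighting steps on slides 13 and 14 can be sketched in a few lines of Python. This is a minimal illustration, not the ONS or DCLG implementation: the record layout, the field names (`cluster`, `lahs`, `core`, `imputed_lahs`) and the example counts are all hypothetical, and `None` is assumed to mean "no figure available".

```python
from collections import defaultdict

def best_estimate(lahs, core, imputed_lahs):
    """Best lettings estimate for one LA: the higher of the LAHS and CORE
    counts that are available, else the imputed LAHS figure."""
    available = [x for x in (lahs, core) if x is not None]
    return max(available) if available else imputed_lahs

def cluster_weights(las):
    """Weight per ONS cluster = summed best estimates / summed CORE counts."""
    best = defaultdict(float)
    core = defaultdict(float)
    for la in las:
        best[la["cluster"]] += best_estimate(la["lahs"], la["core"], la["imputed_lahs"])
        core[la["cluster"]] += la["core"] or 0  # no CORE logs counts as zero
    # A cluster with no CORE logs at all cannot be weighted up, so skip it.
    return {c: best[c] / core[c] for c in best if core[c] > 0}

# Hypothetical local authorities (names and numbers invented for illustration).
las = [
    {"name": "LA A", "cluster": 1, "lahs": 1000, "core": 800, "imputed_lahs": None},
    {"name": "LA B", "cluster": 1, "lahs": None, "core": 500, "imputed_lahs": None},
    {"name": "LA C", "cluster": 2, "lahs": None, "core": None, "imputed_lahs": 300},
    {"name": "LA D", "cluster": 2, "lahs": 600, "core": 600, "imputed_lahs": None},
]

weights = cluster_weights(las)
```

In this invented example, cluster 2 has a best estimate of 900 lettings against 600 CORE logs, so each CORE record in that cluster would carry a weight of 1.5.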

15 Problem 2: Record level missing data. Both LAs and PRPs submit logs with missing household characteristics (age, sex, ethnicity, nationality and economic status). This can happen because: the tenant refuses to provide the information; some LAs do not interview; admin data constraints; IT constraints.

16 Solution 2: Imputation So how do we account for this? Donor imputation: Neighbour Imputation Method Canadian Census Edit and Imputation System – CanCEIS (Canadian Census 2001, UK Census 2011) Efficient, free license, variety of record editing rules

17 Solution 2: Imputation. Raw data comes to DCLG (SPSS). Data reformatted for CanCEIS (ASCII). CanCEIS finds incomplete and donor records. CanCEIS matches records on: household characteristics that are available (age, sex, ethnicity, nationality, economic status); area classification, provider type (LA/PRP), previous tenure, size of property, asylum seeker, refugee status (and client type). Record randomly picked from pool of donors. Imputed output data set.

Example records as raw data:

Age  Sex  Nationality  Area  Asylum
45   M    UK           6     N
35   M    EEA          2     N
27   F    MISSING      4     N

The same records coded for CanCEIS:

Age  Sex  Nationality  Area  Asylum
45   1    1            6     0
35   1    2            2     0
27   2    -10          4     0

Incomplete record before and after imputation:

Age  Sex  Nationality  Area  Asylum
27   2    -10          4     0
27   2    2            4     0
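The donor step can be illustrated with a simplified nearest-neighbour sketch in Python. This is not CanCEIS and does not reproduce its distance measure or edit rules: the distance function below (one point per mismatched categorical variable plus the age difference divided by a hypothetical scale of 10) is an assumption chosen for illustration, as are the function names. The records are the coded example from the slide, with -10 as the missing-value code.

```python
import random

MISSING = -10  # missing-value code used in the slide's coded example

def distance(recipient, donor, categorical, numeric_scale):
    """Illustrative distance: categorical mismatches plus scaled numeric gaps."""
    d = sum(recipient[v] != donor[v] for v in categorical)
    d += sum(abs(recipient[v] - donor[v]) / s for v, s in numeric_scale.items())
    return d

def impute_from_nearest(recipient, donors, target, categorical, numeric_scale, rng):
    """Copy `target` from a donor picked at random among the nearest donors."""
    usable = [d for d in donors if d[target] != MISSING]
    if recipient[target] != MISSING or not usable:
        return dict(recipient)  # nothing to impute, or nobody to donate
    scored = [(distance(recipient, d, categorical, numeric_scale), d) for d in usable]
    nearest = min(score for score, _ in scored)
    pool = [d for score, d in scored if score == nearest]
    donor = rng.choice(pool)  # random pick from the pool of closest donors
    return {**recipient, target: donor[target]}

donors = [
    {"age": 45, "sex": 1, "nationality": 1, "area": 6, "asylum": 0},
    {"age": 35, "sex": 1, "nationality": 2, "area": 2, "asylum": 0},
]
incomplete = {"age": 27, "sex": 2, "nationality": MISSING, "area": 4, "asylum": 0}

completed = impute_from_nearest(
    incomplete, donors, target="nationality",
    categorical=["sex", "area", "asylum"],
    numeric_scale={"age": 10},  # hypothetical scaling of the age gap
    rng=random.Random(0),
)
```

Under these assumed distances the 35-year-old record is the closer donor, so the incomplete record receives nationality code 2, which agrees with the imputed row shown on the slide.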

18 The complete process Raw data comes to DCLG Weighting Imputation Complete records Weights assigned Final data set

19 Results. What happens when we weight and impute?

Original reported data:

                 PRP       LA        Total %
UK               113,071   69,256    91.8%
A10              4,258     2,547     3.4%
Other EEA        1,286     936       1.1%
Other            3,537     3,710     3.6%
Missing          4,324     17,131    9.7%
Total lettings   220,056

Weighted and imputed data:

                 PRP       LA        Total %
UK               116,944   96,410    91.4%
A10              4,427     3,569     3.4%
Other EEA        1,347     1,369     1.2%
Other            3,758     5,510     4.0%
Total lettings   233,334

Imputed data:

                 PRP       LA        Total %
UK               116,944   84,439    91.5%
A10              4,427     3,118     3.4%
Other EEA        1,347     1,204     1.2%
Other            3,758     4,819     3.9%
Total lettings   220,056
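The percentages on this slide can be checked from the counts. The Python sketch below uses the figures as transcribed from the slide; the helper names are hypothetical. One inference is needed and is not stated on the slide: the category percentages in the original data appear to be shares of records with a known nationality (that is, excluding "Missing"), since that is the base that reproduces the published 91.8%.

```python
# (PRP, LA) lettings by nationality, transcribed from the slide.
original = {
    "UK": (113_071, 69_256), "A10": (4_258, 2_547),
    "Other EEA": (1_286, 936), "Other": (3_537, 3_710),
    "Missing": (4_324, 17_131),
}
imputed = {
    "UK": (116_944, 84_439), "A10": (4_427, 3_118),
    "Other EEA": (1_347, 1_204), "Other": (3_758, 4_819),
}
weighted_imputed = {
    "UK": (116_944, 96_410), "A10": (4_427, 3_569),
    "Other EEA": (1_347, 1_369), "Other": (3_758, 5_510),
}

def grand_total(table):
    """Total lettings across providers and categories."""
    return sum(sum(counts) for counts in table.values())

def shares(table, exclude=()):
    """Percentage share of each category, rounded to one decimal place."""
    totals = {k: sum(v) for k, v in table.items() if k not in exclude}
    base = sum(totals.values())
    return {k: round(100 * t / base, 1) for k, t in totals.items()}

original_shares = shares(original, exclude=("Missing",))  # known nationality only
imputed_shares = shares(imputed)
weighted_shares = shares(weighted_imputed)
missing_share = round(100 * sum(original["Missing"]) / grand_total(original), 1)
```

The PRP column is identical in the imputed and the weighted and imputed tables, so the increase from 220,056 to 233,334 lettings comes entirely from the LA column, which is scaled up by a factor of roughly 1.14.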

20 Testing But what further tests can we do? Remove logs from a complete data set and then test weighting against the complete version Deleting data and then imputing it to check error rate Finding other unaccounted biases needing weighting Any other thoughts?
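The "delete data and then impute it" test can be sketched as follows. This is an illustrative Python outline, not the test actually run on CORE: the stand-in imputer is a simple random hot deck within the same area rather than CanCEIS, and the function names, field names and toy data are hypothetical.

```python
import random

def mask_values(records, target, fraction, rng):
    """Blank out `target` in a random fraction of records; return the masked
    copy and the set of indices that were blanked."""
    n = max(1, round(len(records) * fraction))
    blanked = set(rng.sample(range(len(records)), n))
    masked = [dict(r, **{target: None}) if i in blanked else dict(r)
              for i, r in enumerate(records)]
    return masked, blanked

def hot_deck(records, target, group_var, rng):
    """Stand-in imputer: copy `target` from a random complete record in the
    same group (here, the same area)."""
    out = []
    for r in records:
        if r[target] is None:
            donors = [d for d in records
                      if d[target] is not None and d[group_var] == r[group_var]]
            if donors:
                r = dict(r, **{target: rng.choice(donors)[target]})
        out.append(r)
    return out

def error_rate(truth, imputed, blanked, target):
    """Share of blanked values that the imputer failed to recover."""
    wrong = sum(truth[i][target] != imputed[i][target] for i in blanked)
    return wrong / len(blanked)

# Toy complete data set in which nationality is fully determined by area,
# so a within-area donor always recovers the deleted value.
complete = [{"area": a, "nationality": "UK" if a == 1 else "EEA"}
            for a in (1, 2) for _ in range(10)]

rng = random.Random(1)
masked, blanked = mask_values(complete, "nationality", 0.2, rng)
refilled = hot_deck(masked, "nationality", "area", rng)
rate = error_rate(complete, refilled, blanked, "nationality")
```

The toy data gives an error rate of zero by construction; on real lettings logs the same comparison would show how often the imputed value differs from the deleted one.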

21 Future work. CORE is now National Statistics: improvements pending. Use areas from 2011 census data. Affordable rent weighting and imputation. Improve data quality and volume from LAs: 2013/14 is the first year all LAs will participate. Ongoing disclosure control investigations. Make CORE data more easily available via Open Data Communities.

22 Thank you. Questions and comments please!

