Matching of administrative data to validate the 2011 Census in England and Wales NRS & RSS Edinburgh, October 2012.

Presentation on theme: "Matching of administrative data to validate the 2011 Census in England and Wales NRS & RSS Edinburgh, October 2012."— Presentation transcript:

1 Matching of administrative data to validate the 2011 Census in England and Wales NRS & RSS Edinburgh, October 2012

2 AGENDA
- Context: 2011 Census quality assurance and the role of administrative data
- Data matching challenges and solutions
- Data to be matched
- Matching methods and interpretation
- Substantive results so far...

3 An overview of the methods
[Diagram; only its labels survive in the transcript]
Product: 5 yr age/sex, CCS areas; 5 yr age/sex, EA/LA level; 1 yr age/sex, OA level
Method: DSE, Bias adj, Overcount (shown twice); Ratio estimator; Nat adj; Coverage imputation
Quality assurance: Supplementary analysis; Core checks; Main QA Panel; High Level QA Panel; First Release QA Review and sign-off
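The "DSE" label in the diagram presumably stands for dual system estimation. As a rough illustration only, the sketch below uses the textbook Lincoln-Petersen form of that estimator; the slides do not give the formula actually applied, the function name is hypothetical, and the counts are invented for the example.

```python
def dual_system_estimate(census_count, ccs_count, matched_count):
    """Textbook Lincoln-Petersen estimate of the true population size.

    census_count  -- people counted by the Census in the sampled area
    ccs_count     -- people counted by the coverage survey in the same area
    matched_count -- people found in both sources
    """
    if matched_count <= 0:
        raise ValueError("at least one matched record is needed")
    return census_count * ccs_count / matched_count


# Invented counts, for illustration only.
estimate = dual_system_estimate(census_count=900, ccs_count=100, matched_count=90)
print(estimate)
```

The bias adjustment and overcount steps named alongside DSE in the diagram are not modelled here.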

4 Challenges and solutions
- Issue: Matching limited to small QA ‘window’. Solution: Match selected LAs ahead of QA
- Issue: Some data not available in advance. Solution: Flexible data architecture so new sources can be added
- Issue: Research questions only emerge during QA. Solution: Stratified approach to matching so the methods were tailored to the questions
- Issue: Scale of matching task potentially huge. Solution: Initially restrict matching to CCS postcode clusters
- Issue: One:many address matches. Solution: Revised address data architecture

5 Data to be matched
Census:
- Post-out Address Register
- Address Register History File
- Census returns
- ‘Associated Address’ data
- Census Management Information System
Non-Census:
- NHS Patient Register
- Higher Education Statistics Agency (HESA) data
- English and Welsh School Censuses
- Electoral Registers
- Valuation Office Agency data

6 Methods
- Data cleaning, de-duplication, standardisation, quality analysis
- Definitional alignment with Census enumeration base
- Exact matching (dwelling: address; person: name, DoB, gender and postcode)
- Score-based address matching
- Probabilistic person matching
- Clerical resolution of candidate pairs from automatch
- Clerical search for unmatched residuals
- Resolution of unmatched residuals against the Address Register History File and Census ‘associated addresses’
- Evidence-based assessment of residuals
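The first two person-matching stages on this slide can be sketched as follows. This is a minimal illustration, not the method actually used: the record layout, field names, and the m/u agreement probabilities are all assumptions, and the scoring follows the commonly used Fellegi-Sunter form because the slides do not say which probabilistic model was applied.

```python
from math import log2


def standardise(record):
    """Normalise the fields that form the exact-match key."""
    return (
        record["name"].strip().upper(),
        record["dob"],
        record["gender"].strip().upper(),
        record["postcode"].replace(" ", "").upper(),
    )


def exact_match(census_records, pr_records):
    """Pair PR records with Census records that share a unique key.

    Returns (matched id pairs, PR residual left for later stages).
    """
    census_index = {}
    for rec in census_records:
        census_index.setdefault(standardise(rec), []).append(rec)
    matched, residual = [], []
    for rec in pr_records:
        candidates = census_index.get(standardise(rec), [])
        if len(candidates) == 1:
            matched.append((candidates[0]["id"], rec["id"]))
        else:
            residual.append(rec)  # no match, or ambiguous: resolve later
    return matched, residual


# Hypothetical (m, u) probabilities per field, in the same order as the key:
# m = P(fields agree | same person), u = P(fields agree | different people).
M_U = {
    "name": (0.95, 0.01),
    "dob": (0.97, 0.003),
    "gender": (0.99, 0.5),
    "postcode": (0.90, 0.001),
}


def match_weight(rec_a, rec_b):
    """Sum of log-likelihood-ratio weights over the compared fields."""
    total = 0.0
    for (m, u), a, b in zip(M_U.values(), standardise(rec_a), standardise(rec_b)):
        total += log2(m / u) if a == b else log2((1 - m) / (1 - u))
    return total


# Invented records, for illustration only.
census = [{"id": "C1", "name": "Ann Lee", "dob": "1990-01-02",
           "gender": "F", "postcode": "EH1 1AA"}]
pr = [
    {"id": "P1", "name": " ann lee", "dob": "1990-01-02",
     "gender": "f", "postcode": "eh11aa"},
    {"id": "P2", "name": "Bob Ray", "dob": "1985-05-05",
     "gender": "M", "postcode": "EH2 2BB"},
]
matched, residual = exact_match(census, pr)
```

In a workflow like the one listed, pairs with a high weight would be accepted automatically, pairs with a middling weight would go to clerical resolution, and the rest would stay in the residual; the thresholds themselves would need to be chosen from the data.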

7 Interpretation: Who is actually present?
Non-URs:
- Census non-usual residents (matched and unmatched to PR)
PR records unmatched to Census respondents and assessed as not present:
- Matched to address deactivated in the field
- Matched to unoccupied or vacant/absent/2nd res dummy
- Matched to ARHF invalid address
- UR elsewhere, this is Usual Address 1 Year Ago
- Matched to Census UR elsewhere
Unaccounted:
- Unmatched and unaccounted for
PR records unmatched to Census respondents and assessed present:
- PR matched to Census missed/unaccounted-for address
- PR matched to address with ‘occupied’ dummy
- PR validated through other administrative sources
PR/Census confirmed URs:
- PR/Census matched records
- Census URs unmatched to PR

8 Match rates in a ‘control’ LA

9 Female outcomes in a ‘control’ LA

10 Male outcomes in a ‘control’ LA

11 Match results in university towns

12 University town: female outcomes

13 University town: male outcomes

14 London: population churn

15 London churn: female outcomes

16 London churn: male outcomes

17 London LA: implied sex ratios

18 Data mining to address specific Census/PR anomalies
[Table; only the column headings survive: University Hall of Residence; GP registrations/Hall capacity]

19 Female students living in halls in April 2011 by NHS Authority acceptance date

20 Male students living in halls in April 2011 by NHS Authority acceptance date

21 LA summary: proportion of F4s and proportion unresolved, within CCS postcode clusters

22 LA summary: concentration of Flag 4s in the PR residual

23 LA summary: LA types, residual size and Flag 4s

24 Further investigations
- Planned analysis of the PR residuals’ addresses and households to identify ‘ghost’ records
- Longitudinal matching of the 2012 Patient Register to 2011 data to identify registrations that have been cancelled by GP practices in the year following Census
- Cluster analysis of all E&W LAs to see whether the typology of LAs identified through matching is mirrored in list inflation patterns nationally
- Multi-level modelling to summarise results, with individual and area level explanatory variables
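The longitudinal matching idea can be sketched as a simple comparison of identifiers across the two years. The sketch assumes each Patient Register record carries a stable person identifier (called `patient_id` here, a hypothetical name) and that both extracts cover the same register; neither assumption comes from the slides.

```python
def flag_dropped_registrations(residual_2011, register_2012):
    """Return 2011 residual records whose identifier is absent in 2012.

    Absence a year later is only a signal that the registration may have
    been cancelled; deaths or emigration would look the same here.
    """
    still_registered = {rec["patient_id"] for rec in register_2012}
    return [rec for rec in residual_2011
            if rec["patient_id"] not in still_registered]


# Invented records, for illustration only.
residual_2011 = [{"patient_id": "A"}, {"patient_id": "B"}, {"patient_id": "C"}]
register_2012 = [{"patient_id": "A"}, {"patient_id": "C"}, {"patient_id": "D"}]
dropped = flag_dropped_registrations(residual_2011, register_2012)
```

Records flagged this way would be candidates for the ‘ghost’ record analysis rather than confirmed list inflation.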

