Matching of administrative data to validate the 2011 Census in England and Wales NRS & RSS Edinburgh, October 2012.

AGENDA
- Context: 2011 Census quality assurance and the role of administrative data
- Data matching challenges and solutions
- Data to be matched
- Matching methods and interpretation
- Substantive results so far...

An overview of the methods
[Diagram: products, methods and quality assurance stages]
Product:
- 5 yr age/sex, CCS areas
- 5 yr age/sex, EA/LA level
- 1 yr age/sex, OA level
Method:
- DSE, bias adjustment, overcount
- Ratio estimator, national adjustment
- Coverage imputation
Quality assurance:
- Core checks
- Supplementary analysis
- Main QA Panel
- High Level QA Panel
- First Release QA
- Review and sign-off
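The DSE step named in the overview can be illustrated with a minimal sketch. It assumes DSE stands for dual system estimation in its simplest (Lincoln-Petersen) form; the function name and the counts are hypothetical, and the bias and overcount adjustments listed on the slide are not modelled.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Simplest dual system estimate of a population total.

    census_count  -- people counted by the census in the area
    survey_count  -- people counted by the independent coverage survey
    matched_count -- people found in both sources after matching
    """
    if matched_count <= 0:
        raise ValueError("at least one matched record is required")
    return census_count * survey_count / matched_count


# Hypothetical counts for one sampled area, not taken from the presentation.
estimate = dual_system_estimate(900, 800, 720)
```

The estimate depends directly on the matched count, which is why match quality matters: a missed match lowers the denominator and inflates the estimated population.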

Challenges and solutions
- Issue: Matching limited to small QA ‘window’
  Solution: Match selected LAs ahead of QA
- Issue: Some data not available in advance
  Solution: Flexible data architecture so new sources can be added
- Issue: Research questions only emerge during QA
  Solution: Stratified approach to matching so the methods were tailored to the questions
- Issue: Scale of matching task potentially huge
  Solution: Initially restrict matching to CCS postcode clusters
- Issue: One-to-many address matches
  Solution: Revised address data architecture

Data to be matched
Census:
- Post-out Address Register
- Address Register History File
- Census returns
- ‘Associated Address’ data
- Census Management Information System
Non-Census:
- NHS Patient Register
- Higher Education Statistics Agency (HESA) data
- English and Welsh School Censuses
- Electoral Registers
- Valuation Office Agency data

Methods
- Data cleaning, de-duplication, standardisation, quality analysis
- Definitional alignment with Census enumeration base
- Exact matching (dwelling: address; person: name, DoB, gender and postcode)
- Score-based address matching
- Probabilistic person matching
- Clerical resolution of candidate pairs from automatch
- Clerical search for unmatched residuals
- Resolution of unmatched residuals against the Address Register History File and Census ‘associated addresses’
- Evidence-based assessment of residuals
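The exact and probabilistic person-matching steps listed above can be sketched as follows. This is an illustration only, not the method used in the project: it assumes a Fellegi-Sunter style agreement weight for the probabilistic step, and the field names, the m/u probabilities and the example records are all hypothetical.

```python
import math

KEYS = ("name", "dob", "gender", "postcode")

# Hypothetical (m, u) probabilities per field: m is the chance the field agrees
# for a true match, u the chance it agrees for a non-match.
M_U = {
    "name": (0.95, 0.01),
    "dob": (0.97, 0.003),
    "gender": (0.99, 0.5),
    "postcode": (0.85, 0.001),
}


def clean(value):
    """Standardise a field: upper case, no spaces."""
    return str(value).upper().replace(" ", "")


def exact_match(census, register):
    """Pair register records with the unique census record agreeing on every key."""
    index = {}
    for i, rec in enumerate(census):
        index.setdefault(tuple(clean(rec[k]) for k in KEYS), []).append(i)
    matched, residual = [], []
    for j, rec in enumerate(register):
        hits = index.get(tuple(clean(rec[k]) for k in KEYS), [])
        if len(hits) == 1:
            matched.append((hits[0], j))
        else:
            residual.append(j)  # no hit, or ambiguous: left for later stages
    return matched, residual


def match_weight(a, b):
    """Sum of log2 agreement/disagreement weights over the comparison fields."""
    weight = 0.0
    for key, (m, u) in M_U.items():
        if clean(a[key]) == clean(b[key]):
            weight += math.log2(m / u)
        else:
            weight += math.log2((1 - m) / (1 - u))
    return weight


# Hypothetical records, invented for illustration.
census = [
    {"name": "Ann Smith", "dob": "1990-01-02", "gender": "F", "postcode": "EH1 1AA"},
    {"name": "Raj Patel", "dob": "1985-07-30", "gender": "M", "postcode": "CF10 3AT"},
]
register = [
    {"name": "ANN SMITH", "dob": "1990-01-02", "gender": "F", "postcode": "eh11aa"},
    {"name": "Raj Patel", "dob": "1985-07-30", "gender": "M", "postcode": "SW1A 1AA"},
]

matched, residual = exact_match(census, register)
# Score the one residual register record against every census record.
scores = {i: match_weight(census[i], register[residual[0]]) for i in range(len(census))}
```

In this toy example the second register record fails the exact match because its postcode differs, but it still scores highly against the census record with the same name, date of birth and gender. In a workflow like the one described, such a pair would be a candidate for clerical resolution rather than an automatic match.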

Interpretation: Who is actually present?
Non-URs
- Census non-usual residents (matched and unmatched to PR)
- PR records unmatched to Census respondents and assessed as not present:
  - Matched to address deactivated in the field
  - Matched to unoccupied or vacant/absent/2nd residence dummy
  - Matched to ARHF invalid address
  - UR elsewhere, this is Usual Address 1 Year Ago
  - Matched to Census UR elsewhere
Unaccounted
- Unmatched and unaccounted for
PR records unmatched to Census respondents and assessed present
- PR matched to Census missed/unaccounted-for address
- PR matched to address with ‘occupied’ dummy
- PR validated through other administrative sources
PR/Census confirmed URs
- PR/Census matched records
- Census URs unmatched to PR

Match rates in a ‘control’ LA

Female outcomes in a ‘control’ LA

Male outcomes in a ‘control’ LA

Match results in university towns

University town: female outcomes

University town: male outcomes

London: population churn

London churn: female outcomes

London churn: male outcomes

London LA: implied sex ratios

Data mining to address specific Census/PR anomalies
[Table columns: University Hall of Residence; GP registrations/Hall capacity]

Female students living in halls in April 2011 by NHS Authority acceptance date

Male students living in halls in April 2011 by NHS Authority acceptance date

LA summary: proportion of Flag 4s and proportion unresolved, within CCS postcode clusters

LA summary: concentration of Flag 4s in the PR residual

LA summary: LA types, residual size and Flag 4s

Further investigations
- Planned analysis of the PR residuals’ addresses and households to identify ‘ghost’ records
- Longitudinal matching of the 2012 Patient Register to 2011 data to identify registrations that have been cancelled by GP practices in the year following Census
- Cluster analysis of all E&W LAs to see whether the typology of LAs identified through matching is mirrored in list inflation patterns nationally
- Multi-level modelling to summarise results, with individual and area level explanatory variables
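The longitudinal matching idea can be sketched as a simple anti-join between two register extracts. This assumes, purely for illustration, that both extracts carry a stable person identifier (here the hypothetical field `patient_id`); the records are invented. A record that disappears between extracts is only a candidate cancellation, since deaths, emigration and moves would also need to be ruled out.

```python
def dropped_registrations(pr_2011, pr_2012, key="patient_id"):
    """Return 2011 register records whose identifier is absent from the 2012 extract."""
    ids_2012 = {rec[key] for rec in pr_2012}
    return [rec for rec in pr_2011 if rec[key] not in ids_2012]


# Hypothetical extracts, invented for illustration.
pr_2011 = [
    {"patient_id": "A1", "postcode": "EH1 1AA"},
    {"patient_id": "B2", "postcode": "CF10 3AT"},
    {"patient_id": "C3", "postcode": "SW1A 1AA"},
]
pr_2012 = [
    {"patient_id": "A1", "postcode": "EH1 1AA"},
    {"patient_id": "C3", "postcode": "N1 9GU"},
]

candidates = dropped_registrations(pr_2011, pr_2012)
```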