Adjusting for coverage error in administrative sources in population estimation Owen Abbott Research, Development and Infrastructure Directorate.


Agenda
  Framework for population estimates
  Where are we now
    Census
    Beyond 2011
  Recent work on estimation
  Plans for future research
  Summary

Introduction
  Fundamental need for high quality population estimates (Local Authority by age and sex)
  Currently obtained through decennial census and cohort component method for intervening years
  Quality varies (and is not even)
  This session reviews:
    Making a population estimate with and without a census
    Future plans for research

Framework for producing population estimates

Where are we now?
  Reminder of how the framework was used in the 2011 Census
  Outline how we applied the framework with administrative data only

2011 Census framework for producing population estimates

The 2011 Census
  Census Coverage Survey
    Large (350k households)
    Designed around expected coverage patterns
  High quality matching
    Automated and lots of clerical effort
  Dual System Estimation
  Bias adjustments:
    Corrections for biases in the DSE
    Overcoverage

Producing population estimates using linked administrative data

Population estimation without a census
  Construction of Statistical Population Datasets (SPDs) from admin data
    Reliance on matching
    Developed rules using multiple sources
  Large Population Coverage Survey (PCS), similar to the Census Coverage Survey (CCS)
    Can use web data collection
  Similar estimation methodology
    DSE based
    Explored alternative weighting classes
    Beginning to develop bias adjustments

Estimation methodology research
  Coverage survey non-response is key issue
  Have used DSE, but requires:
    Accurate matching of persons
    Independence
    Overcoverage adjustments
    Lots of other assumptions
  Have been considering using weighting classes as an alternative
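
The dual system estimator mentioned on this slide can be sketched in its simplest (Lincoln-Petersen) form. This is a minimal illustration rather than the ONS implementation: the function name and the counts are hypothetical, and it assumes exactly the conditions the slide lists (accurate matching, independence between the two sources, no overcoverage).

```python
def dual_system_estimate(n_list: int, n_survey: int, n_matched: int) -> float:
    """Lincoln-Petersen dual system estimate of population size.

    n_list    -- persons counted by the first source (census or admin list)
    n_survey  -- persons counted by the coverage survey in the same area
    n_matched -- persons found in both sources after matching
    """
    if n_matched <= 0:
        raise ValueError("need at least one matched record")
    return n_list * n_survey / n_matched


# Hypothetical counts for one estimation stratum (not from the presentation)
estimate = dual_system_estimate(n_list=900, n_survey=450, n_matched=405)
print(estimate)  # 1000.0
```

If matching misses true links, n_matched is too small and the estimate is inflated, which is one reason the slide lists accurate matching first.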

Estimation methodology research
  Weighting Classes:
    This approach requires addresses to be linked between the survey and auxiliary data
    Can then use information about (survey) responding and non-responding addresses
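
A weighting class adjustment of the kind described here might look like the sketch below. It is an assumption-laden illustration, not the method used in the research: the field names, the class labels and the data are hypothetical, and it assumes that non-response is unrelated to household size within each class.

```python
from collections import defaultdict


def weighting_class_estimate(addresses):
    """Estimate a population total from responding addresses only.

    Each address is a dict with hypothetical keys:
      'cls'           -- weighting class, derived from auxiliary data linked
                         to the address (known for non-responders too)
      'design_weight' -- sampling weight of the address
      'responded'     -- True if the survey obtained a response
      'persons'       -- persons counted by the survey (responders only)
    """
    sampled = defaultdict(float)
    responded = defaultdict(float)
    for a in addresses:
        sampled[a["cls"]] += a["design_weight"]
        if a["responded"]:
            responded[a["cls"]] += a["design_weight"]

    total = 0.0
    for a in addresses:
        if a["responded"]:
            # Inflate responders to stand in for non-responders in the class
            adjustment = sampled[a["cls"]] / responded[a["cls"]]
            total += a["design_weight"] * adjustment * a["persons"]
    return total


# Hypothetical sample: class A has 50% response, class B has full response
sample = [
    {"cls": "A", "design_weight": 10, "responded": True, "persons": 2},
    {"cls": "A", "design_weight": 10, "responded": True, "persons": 3},
    {"cls": "A", "design_weight": 10, "responded": False, "persons": None},
    {"cls": "A", "design_weight": 10, "responded": False, "persons": None},
    {"cls": "B", "design_weight": 10, "responded": True, "persons": 1},
    {"cls": "B", "design_weight": 10, "responded": True, "persons": 4},
]
print(weighting_class_estimate(sample))  # 150.0
```

Unlike DSE, this needs linkage only at address level, not person level, but it relies on the classes capturing the non-response pattern.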

New developments - estimation

Plans for future research
  2021 Census
  Administrative data based

Improvements for 2021 Census
  Expand use of admin data in data collection
    Aim to reduce variability in response rates
  Use admin data to enhance base census data
    NISRA did this in 2011
    Can use SPD construction ideas
  Explore weighting classes
    What would 2011 estimate have looked like?
  Revise sample design
    Aim to reduce variability in quality across LAs

Further work on admin data based estimates
  Continue to explore matching methods
    Understanding and measuring matching error
  Continue to learn more about key sources
    List lag/inflation/cleaning/changes
  Continue to explore ways of combining sources to construct SPDs
    Develop signs of life indicators
    Use of address register

Further work on alternative
  Coverage survey
    Sample design – clustered/unclustered?
    Practicalities (e.g. timing)
  Carry on work to explore estimation methodology
    Comparing DSE vs Weighting Class
    Performance in presence of matching error
    Adjusting for erroneous inclusions
    Adjusting for within-household non-response
  Develop small area estimation method(s)

Key research questions
  What will the coverage patterns be like in an online 2021 Census?
  What are the coverage patterns in the evolving SPDs?
  Where does administrative (or other) data have the most benefit (cost/quality)?

Summary
  Population estimates are the key outputs
  Need to focus on how these are delivered from an online census
  AND carry on developing potential administrative based methods
  Understanding and influencing the underlying coverage patterns is critical

Discussant Li-Chun Zhang University of Southampton & Statistics Norway

Population size estimation
  Internationally speaking
  England & Wales: options so far explored
  Trimmed Dual-System Estimation (TDSE)
  Modelling erroneous enumerations
  Census 2021 and Beyond 2021

Internationally speaking
  Register-based population counts
    Negligible cost; no field work
    ‘Near-perfect’ Central Population Register (CPR)
  “Traditional” census
    Census enumeration + 2 coverage surveys
    Independent sample for under-coverage adjustment
    Dependent sample for over-coverage adjustment
  In-between
    CPR-enumeration + 2 coverage surveys
    Can afford much larger surveys

England & Wales
  Dependent sampling of records from SPD deemed infeasible
  Dependent sampling of addresses/postcodes from SPD deemed feasible
  Independent under-coverage survey cannot yield valid “type 4” over-coverage estimates
    “Type 4”: erroneous inclusion

Options explored: SPD, Weighting, DSE

Trimmed DSE (TDSE)
  Score selection of SPD records → k
  PCS matching → k = (k_1, k_0)
  TDSE

TDSE: an illustration

TDSE: N=1000, high-quality scenario
  Scoring rate: P(erroneous) high, say, 70%
  Catch rate (PCS, SPD): high, say, 90%
  Erroneous SPD enumeration: low, say, 2%
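
The scenario above can be worked through in expectation with a short sketch. Everything beyond the three rates is an assumption made for illustration: the trimmed estimator is taken to be (list size − k) × survey count / (matches − k_1), with k_1 read as the trimmed records that matched the PCS; the 2% erroneous rate is applied to N; and the PCS and SPD are assumed independent with no matching error. The function and parameter names are hypothetical.

```python
def expected_tdse(k, N=1000, catch_rate=0.9, err_rate=0.02, scoring_rate=0.7):
    """Expected trimmed DSE after removing k scored records from the SPD.

    Assumptions (not from the slides): the SPD holds catch_rate * N genuine
    records plus err_rate * N erroneous ones; the PCS catches each genuine
    person with probability catch_rate, independently of the SPD; a share
    scoring_rate of trimmed records is erroneous until none are left.
    """
    genuine_on_list = catch_rate * N
    n_erroneous = err_rate * N
    list_size = genuine_on_list + n_erroneous
    survey_count = catch_rate * N
    matches = catch_rate * genuine_on_list  # erroneous records never match

    trimmed_erroneous = min(scoring_rate * k, n_erroneous)
    trimmed_genuine = k - trimmed_erroneous
    k1 = catch_rate * trimmed_genuine  # trimmed records that matched the PCS

    return (list_size - k) * survey_count / (matches - k1)


for k in (0, 10, 20, 20 / 0.7):
    print(round(k, 1), round(expected_tdse(k), 1))
```

Under these assumptions the untrimmed DSE overshoots N because erroneous records inflate the list, and the expected estimate falls to N once trimming has removed all of them. Trimming genuine records as well costs precision but not bias, since matched and unmatched genuine records are removed in proportion.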

Stopping rule: r=50, N=1000

Stopping rule: r=250, N=1000

Stopping rule in expectation: N=1000
  [Table; cell values not recoverable from the transcript. Headings as transcribed: Rates (%) Initial DSE Stoppage TDSE Ideal SD(DSE) Approx SD(TDSE) No. errors Expected selection. Notes: (1) (2) (3)]

Modelling erroneous counts: 2021
  Model-A: P(erroneous | in Census and T-SPD) = P(erroneous | in Census but not in T-SPD) * P(erroneous | in T-SPD but not in Census)
  Model-B: P(erroneous | in Census and T-SPD) = P(erroneous | in Census) * P(erroneous | in T-SPD)
  (Can be fitted with PCS in addition)
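
The two models differ only in which probabilities are multiplied. A minimal sketch, with hypothetical function names and made-up input probabilities, shows how each would be evaluated:

```python
def p_erroneous_both_model_a(p_err_census_only: float, p_err_tspd_only: float) -> float:
    """Model-A: product of the error probabilities for records found in
    only one of the two sources (Census only, T-SPD only)."""
    return p_err_census_only * p_err_tspd_only


def p_erroneous_both_model_b(p_err_census: float, p_err_tspd: float) -> float:
    """Model-B: product of the marginal error probabilities for records
    in the Census and in the T-SPD."""
    return p_err_census * p_err_tspd


# Hypothetical inputs, chosen only to show the arithmetic
print(p_erroneous_both_model_a(0.10, 0.20))
print(p_erroneous_both_model_b(0.03, 0.05))
```

Either way, a record present in both sources is assigned a much lower error probability than a record present in only one.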

Discrimination: Model A (left) B (right)

Beyond 2021 option: unwinding SPD?
  SPD has multiple input datasets
  Unwinding SPD, say,
    SPD-I = PR, somewhat trimmed
    SPD-II = everything else, somewhat trimmed
  Less stringent model assumptions?
  SPD-I, SPD-II, SPD-III
  (Analogous: independence vs. null 2nd-order interaction)

Discrimination: Model A (left) B (right)

Investigations forward
  Premise: no dependent sampling?
  Weighting class adjustment
    Nonresponse bias after reweighting acceptable?
  SPDs: trimming & scoring
  Connecting SPDs and TDSE-modelling
    Early-stoppage once model captures remaining bias
    Improve efficiency via bias-adjusted TDSE
  Use SPDs to improve census 2021 estimates
    Small-area smoothing of adjustments?
  Future population statistics without census