Adjusting for coverage error in administrative sources in population estimation Owen Abbott Research, Development and Infrastructure Directorate.


1 Adjusting for coverage error in administrative sources in population estimation Owen Abbott Research, Development and Infrastructure Directorate

2 Agenda Framework for population estimates Where are we now Census Beyond 2011 Recent work on estimation Plans for future research Summary

3 Introduction Fundamental need for high quality population estimates (Local Authority by age and sex) Currently obtained through decennial census and cohort component method for intervening years Quality varies (and is not even) This session reviews: Making a population estimate with and without a census Future plans for research

4 Framework for producing population estimates

5 Where are we now? Reminder of how the framework was used in the 2011 Census Outline how we applied the framework with administrative data only

6 2011 Census framework for producing population estimates

7 The 2011 Census Census Coverage Survey Large (350k households) Designed around expected coverage patterns High quality matching Automated and lots of clerical effort Dual System Estimation Bias adjustments: Corrections for biases in the DSE Overcoverage
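As a rough illustration of the Dual System Estimation step named on this slide, the sketch below applies the textbook Lincoln-Petersen form to hypothetical counts. The function name and all numbers are assumptions for illustration; the 2011 method also applied the bias and overcoverage adjustments listed above, which are not shown here.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Textbook Lincoln-Petersen dual system estimate of population size.

    Assumes the census and the coverage survey are independent, matching is
    error-free, and neither source contains erroneous inclusions.
    """
    if matched_count <= 0:
        raise ValueError("need at least one matched record")
    return census_count * survey_count / matched_count


# Hypothetical counts for one estimation area (not taken from the slides).
estimate = dual_system_estimate(census_count=940, survey_count=900, matched_count=850)
print(round(estimate, 1))
```

The estimate exceeds both input counts because people missed by both sources are inferred from the match rate.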

8 Producing population estimates using linked administrative data

9 Population estimation without a census Construction of SPDs from admin data Reliance on matching Developed rules using multiple sources Large PCS similar to CCS Can use web data collection Similar estimation methodology DSE based Explored alternative weighting classes Beginning to develop bias adjustments
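The idea of building an SPD from rules over multiple linked sources can be pictured with a toy example. The source names and the "present on at least two sources" rule below are hypothetical, chosen only to show the shape of such a rule; the slides do not state the rules actually developed.

```python
def build_spd(sources, min_sources=2):
    """Return the sorted IDs of people appearing on at least `min_sources` lists.

    `sources` maps a (hypothetical) source name to a set of linked person IDs.
    """
    counts = {}
    for person_ids in sources.values():
        for person_id in person_ids:
            counts[person_id] = counts.get(person_id, 0) + 1
    return sorted(pid for pid, n in counts.items() if n >= min_sources)


# Hypothetical linked administrative sources (illustrative IDs only).
sources = {
    "health_register": {"a", "b", "c", "d"},
    "tax_benefits": {"a", "b", "e"},
    "school_census": {"c", "f"},
}
spd = build_spd(sources)
print(spd)
```

A stricter rule lowers erroneous inclusions at the cost of more undercoverage, which is why the result depends so heavily on matching quality.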

10 Estimation methodology research Coverage survey non-response is key issue Have used DSE, but requires: accurate matching of persons Independence overcoverage adjustments Lots of other assumptions Have been considering using weighting classes as an alternative
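To see why accurate matching matters for DSE, the following sketch works in expectation with hypothetical rates: when a share of true matches goes undetected, the matched count shrinks and the estimate inflates. The capture probabilities and the 5% missed-match rate are assumptions, not figures from the research.

```python
def expected_dse(population, p_list, p_survey, missed_match_rate=0.0):
    """Expected-value DSE when a fraction of true matches goes undetected.

    Assumes independent capture with probabilities `p_list` and `p_survey`,
    and no erroneous inclusions in either source.
    """
    list_count = population * p_list
    survey_count = population * p_survey
    true_matches = population * p_list * p_survey
    found_matches = true_matches * (1.0 - missed_match_rate)
    return list_count * survey_count / found_matches


perfect = expected_dse(1000, 0.9, 0.9)
with_errors = expected_dse(1000, 0.9, 0.9, missed_match_rate=0.05)
print(round(perfect), round(with_errors))
```

Under these assumptions, missing 5% of true matches inflates the estimate by roughly 5%, since the estimate scales with the inverse of the matched count.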

11 Estimation methodology research Weighting Classes: This approach requires addresses to be linked between survey and auxiliary Then can use information about (survey) responding and non-responding addresses
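A minimal sketch of a weighting class adjustment, assuming sampled addresses have already been linked to the auxiliary source and grouped into classes. The class definitions, dictionary keys and counts are hypothetical, and the approach assumes non-response is ignorable within each class.

```python
def weighting_class_estimate(classes):
    """Weight responding addresses up by the inverse class response rate.

    Each class is a dict with (hypothetical) keys:
      sampled   - sampled addresses in the class, known from the auxiliary
      responded - sampled addresses that responded to the survey
      persons   - persons counted at the responding addresses
    Returns the estimated number of persons at all sampled addresses.
    """
    total = 0.0
    for c in classes:
        response_rate = c["responded"] / c["sampled"]
        total += c["persons"] / response_rate
    return total


# Two hypothetical classes with different response rates.
classes = [
    {"sampled": 100, "responded": 80, "persons": 200},
    {"sampled": 100, "responded": 50, "persons": 100},
]
estimate = weighting_class_estimate(classes)
print(estimate)
```

A single overall response rate (130 of 200 addresses) would give about 462 persons instead, which shows how the choice of classes changes the answer when response varies between them.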

12 New developments - estimation

13 Plans for future research 2021 Census Administrative data based

14 Improvements for 2021 Census Expand use of admin data in data collection Aim to reduce variability in response rates Use admin data to enhance base census data NISRA did this in 2011 Can use SPD construction ideas Explore Weighting classes What would 2011 estimate have looked like? Revise sample design Aim to reduce variability in quality across LAs

15 Further work on admin data based estimates Continue to explore matching methods Understanding and measuring matching error Continue to learn more about key sources List lag/inflation/cleaning/changes Continue to explore ways of combining sources to construct SPDs Develop signs of life indicators Use of address register

16 Further work on alternative Coverage survey Sample design – clustered/unclustered? Practicalities (e.g. Timing) Carry on work to explore estimation methodology Comparing DSE vs Weighting Class Performance in presence of matching error Adjusting for erroneous inclusions Adjusting for within-household non-response Develop small area estimation method(s)

17 Key research questions What will the coverage patterns be like in an online 2021 Census? What are the coverage patterns in the evolving SPDs? Where does administrative (or other) data have the most benefit (cost/quality)?

18 Summary Population estimates are the key outputs Need to focus on how these are delivered from an online census AND carry on developing potential administrative based methods Understanding and influencing the underlying coverage patterns is critical

19 Discussant Li-Chun Zhang University of Southampton & Statistics Norway

20 Population size estimation Internationally speaking England & Wales: options so far explored Trimmed Dual-System Estimation (TDSE) Modelling erroneous enumerations Census 2021 and Beyond 2021

21 Internationally speaking Register-based population counts Negligible cost; no field work ‘Near-perfect’ Central Population Register (CPR) “Traditional” census Census enumeration + 2 coverage surveys Independent sample for under-coverage adj. Dependent sample for over-coverage adjustment In-between CPR-enumeration + 2 coverage surveys Can afford much larger surveys

22 England & Wales Dependent sampling of records from SPD deemed infeasible Dependent sampling of addresses/postcodes from SPD deemed feasible Independent under-coverage survey cannot yield valid “type 4” over-coverage estimates “Type 4”: erroneous inclusion

23 Options explored: SPD, Weighting, DSE

24 Trimmed DSE (TDSE) Score selection of SPD records → k PCS matching → k = (k_1, k_0) TDSE
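The slide does not spell out the estimator. One form used in the trimmed DSE literature, taken here as an assumption, removes the k scored records from the SPD, of which k_1 were matched to the PCS and k_0 were not, and recomputes the DSE on what remains. The function name and counts below are hypothetical.

```python
def trimmed_dse(spd_count, pcs_count, matched_count, k1, k0):
    """Trimmed DSE after removing k = k1 + k0 scored records from the SPD.

    k1: trimmed records that were matched to the PCS
    k0: trimmed records that were not matched to the PCS
    Assumed form: (spd_count - k) * pcs_count / (matched_count - k1).
    """
    k = k1 + k0
    return (spd_count - k) * pcs_count / (matched_count - k1)


# Hypothetical area: 900 genuine + 20 erroneous SPD records,
# 900 PCS records and 810 matches.
untrimmed = trimmed_dse(920, 900, 810, k1=0, k0=0)
trimmed = trimmed_dse(920, 900, 810, k1=0, k0=20)
print(round(untrimmed), round(trimmed))
```

If every trimmed record is erroneous, and so unmatched, the estimate falls towards the true size; trimming genuine records removes them from both the list count and the matches, leaving the estimate roughly unchanged in expectation.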

25 TDSE: an illustration

26 TDSE: N=1000, high-quality scenario Scoring rate: P(erroneous) high, say, 70% Catch rate (PCS, SPD): high, say, 90% Erroneous SPD enumeration: low, say, 2%

27 Stopping rule: r=50, N=1000

28 Stopping rule: r=250, N=1000

29 Stopping rule in expectation: N=1000

Rates (%)     Initial  Stoppage  Ideal    Approx    No.     Expected
(1) (2) (3)   DSE      TDSE      SD(DSE)  SD(TDSE)  errors  selection
70, 90, 90    1022     1001      4        4         20      29
70, 90, 90    1056     1001      4        4         50      71
70, 75, 70    1071     1001      12       13        50      71
30, 75, 70    1071     1000      12       15        50      167
70, 90, 90    1278     1001      4        5         250     357
70, 75, 75    1332     1000      10       14        250     357
30, 75, 70    1357     1006      12       51        250     833
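Two columns of this table appear to follow from simple expectations. Assuming rate (1) is the scoring rate, (2) the PCS catch rate and (3) the SPD catch rate, and that erroneous SPD records never match to the PCS, the sketch below gives values close to the "Initial DSE" and "Expected selection" columns. This reading is inferred from the numbers rather than stated on the slide, and the function names are illustrative.

```python
def initial_dse(population, pcs_rate, spd_rate, n_errors):
    """Expected untrimmed DSE when the SPD holds `n_errors` erroneous records."""
    spd_count = population * spd_rate + n_errors  # genuine + erroneous records
    pcs_count = population * pcs_rate
    matched = population * pcs_rate * spd_rate    # only genuine records match
    return spd_count * pcs_count / matched


def expected_selection(n_errors, scoring_rate):
    """Expected number of records to select to reach all erroneous ones,
    if a share `scoring_rate` of selected records is erroneous."""
    return n_errors / scoring_rate


# (scoring rate, PCS catch rate, SPD catch rate, number of errors)
rows = [(0.70, 0.90, 0.90, 20), (0.30, 0.75, 0.70, 250)]
for scoring, pcs, spd, errors in rows:
    print(round(initial_dse(1000, pcs, spd, errors)),
          round(expected_selection(errors, scoring)))
```

The lower the scoring rate, the more genuine records are trimmed alongside the erroneous ones, which is consistent with the larger SD(TDSE) in the last row.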

30 Modelling erroneous counts: 2021 Model-A: P(erroneous | in Census and T-SPD) = P(erroneous | in Census but not in T-SPD) * P(erroneous | in T-SPD but not in Census) Model-B: P(erroneous | in Census and T-SPD) = P(erroneous | in Census) * P(erroneous | in T-SPD) (Can be fitted with PCS in addition)
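Both models express the error probability for records found in both sources as a product of two other probabilities; they differ only in which probabilities are multiplied. The sketch below writes that out directly. All input probabilities are hypothetical, and how they would be estimated (for example with the PCS) is not covered.

```python
def model_a(p_err_census_only, p_err_tspd_only):
    """Model-A: product of the error probabilities for records found in
    only one of the two sources."""
    return p_err_census_only * p_err_tspd_only


def model_b(p_err_census, p_err_tspd):
    """Model-B: product of the marginal error probabilities in each source."""
    return p_err_census * p_err_tspd


# Hypothetical error probabilities, for illustration only.
p_both_a = model_a(p_err_census_only=0.10, p_err_tspd_only=0.30)
p_both_b = model_b(p_err_census=0.02, p_err_tspd=0.05)
print(p_both_a, p_both_b)
```

Under either model a record present in both sources is far less likely to be erroneous than a record present in only one.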

31 Discrimination: Model A (left) B (right)

32 Beyond 2021 option: unwinding SPD? SPD has multiple input datasets Unwinding SPD, say, SPD-I = PR, somewhat trimmed SPD-II = everything else, somewhat trimmed Less stringent model assumptions? SPD-I SPD-II SPD-III (Analogous: independence vs. null 2nd-order interaction)

33 Discrimination: Model A (left) B (right)

34 Investigations forward Premise: no dependent sampling? Weighting class adjustment Nonresponse bias after reweighting acceptable? SPDs: trimming & scoring Connecting SPDs and TDSE-modelling Early-stoppage once model captures remaining bias Improve efficiency via bias-adjusted TDSE Use SPDs to improve census 2021 estimates Small-area smoothing of adjustments? Future population statistics without census

