Beyond 2011: Administrative data sources and low-level aggregate models for producing population counts
Outline: Overview of the Beyond 2011 Programme; potential models; administrative sources; models using aggregate data.
Context: The Census covers population statistics, social conditions, and housing. Uses: resource allocation; service planning (local use, hospitals, commercial); policy development and monitoring; EU regulations and duty to report to parliament. Outputs cover small areas and multivariate combinations.
Beyond 2011: Why look beyond 2011? Rapidly changing society; evolving user requirements; traditional Census difficulties; new opportunities under the Statistics and Registration Service Act (SRSA) 2007; international trend. Exploring options for the future provision of population and socio-demographic statistics.
Statistical Development: Three work streams. (1) Census-type designs: short form / long form (Canada); short form / survey (USA); rolling Census (France). (2) Individual-level administrative data. (3) Low-level aggregate data.
Integrating Administrative Sources
Statistical Development: Three work streams: Census-type designs; individual-level administrative data; low-level aggregate data. Use of social surveys considered across all options. Short list of models in early 2012, to be compared with 2011 Census outputs.
Administrative Sources: Current focus on five administrative sources: DWP/HMRC Customer Information System (CIS); NHS Patient Register; School Census; HESA Student Data; Migrant Worker Scan (MWS). Quality assessment and suitability for purpose: coverage; quality; comparison with Mid Year Population Estimates.
[Figure: Percentage difference between CIS and Patient Register (2010)]
[Figure: CIS vs Patient Register (2010); panels: London, Inner City London, Other Inner City]
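The comparison shown in these charts comes down to a simple percentage difference per area. The sketch below is illustrative only: it assumes the Patient Register is the base of the percentage (the slides do not say which source is the denominator), and the area names and counts are invented.

```python
def pct_difference(cis_count, pr_count):
    """Percentage difference of the CIS count relative to the Patient Register.

    Using the Patient Register as the base is an assumption of this sketch.
    """
    if pr_count == 0:
        raise ValueError("Patient Register count must be non-zero")
    return 100.0 * (cis_count - pr_count) / pr_count


# Hypothetical area-level counts, invented for illustration only.
counts = {
    "Area A": {"cis": 105_000, "pr": 100_000},
    "Area B": {"cis": 190_000, "pr": 200_000},
}

pct = {area: pct_difference(c["cis"], c["pr"]) for area, c in counts.items()}
```

A positive value means the CIS holds more records than the Patient Register for that area; a negative value means fewer.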
Low-level aggregates overview: Beginning to consider models to produce estimates from aggregate admin sources. Initial focus on estimating LAD-level (local authority district) counts by age and sex. Socio-demographic variables added using small area estimation techniques.
Low-level aggregate models: Basic model for population counts: combine admin sources at low level; weighted combination of broad-coverage datasets; specific datasets for certain subgroups of the population, e.g. students.
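The weighted combination at the heart of the basic model can be sketched as follows. This is a minimal illustration, not the programme's actual method: the function name `combine_sources`, the equal weights, and the counts are all hypothetical.

```python
def combine_sources(counts, weights):
    """Weighted average of source counts for one LAD / age / sex cell.

    counts  -- {source name: count in the cell}
    weights -- {source name: non-negative weight}; rescaled to sum to 1
    """
    total_weight = sum(weights[s] for s in counts)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[s] * counts[s] for s in counts) / total_weight


# Hypothetical counts for one cell (e.g. females aged 20-24 in one LAD).
cell_counts = {"CIS": 5200, "PatientRegister": 5600}
# Equal weights are an assumption made purely for illustration.
cell_weights = {"CIS": 0.5, "PatientRegister": 0.5}

estimate = combine_sources(cell_counts, cell_weights)
```

Rescaling the weights inside the function means callers can pass relative weights without normalising them first.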
Low-level aggregate models: Enhancements to basic model: LA classification, e.g. urban, rural, university, London; different weights for different groups; different datasets for different groups.
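One way to picture these enhancements is a lookup of weights by LA classification and age group, plus a dedicated source for selected groups. Everything in this sketch is an assumption made for illustration: the weight values, the choice of the School Census for ages 5-15, and the names `CIS_WEIGHT`, `DEDICATED_SOURCE`, and `enhanced_estimate`.

```python
# Hypothetical weight on the CIS count by (LA classification, age group);
# the Patient Register receives the remainder. All values are invented.
CIS_WEIGHT = {
    ("urban", "20-24"): 0.6,
    ("rural", "20-24"): 0.5,
    ("university", "20-24"): 0.3,
}
DEFAULT_CIS_WEIGHT = 0.5

# Hypothetical rule: some groups are taken from one dedicated source.
DEDICATED_SOURCE = {"5-15": "SchoolCensus"}


def enhanced_estimate(la_class, age_group, counts):
    """Cell estimate whose weights and sources vary by LA class and group."""
    dedicated = DEDICATED_SOURCE.get(age_group)
    if dedicated is not None and dedicated in counts:
        return float(counts[dedicated])
    w = CIS_WEIGHT.get((la_class, age_group), DEFAULT_CIS_WEIGHT)
    return w * counts["CIS"] + (1.0 - w) * counts["PatientRegister"]


young_adults = enhanced_estimate(
    "university", "20-24", {"CIS": 8000, "PatientRegister": 10000}
)
children = enhanced_estimate(
    "rural", "5-15", {"CIS": 3100, "PatientRegister": 3300, "SchoolCensus": 3200}
)
```

Keeping the weights and source rules in plain tables, separate from the estimation function, makes it straightforward to recalibrate them without touching the logic.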
Low-level aggregate models: Basic model with coverage survey. 2011 Census data used as gold standard to compare with admin data to determine assumptions and weights. Survey needed post-2011 to carry out this recalibration.
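As an illustration of how a gold standard could determine the weights, the sketch below fits a single weight w by least squares so that w × CIS + (1 − w) × Patient Register best matches the benchmark counts across areas. The two-source form, the least-squares criterion, and the function name `calibrate_cis_weight` are assumptions of this sketch, and the counts are invented (the benchmark values are constructed so that the true weight is 0.3).

```python
def calibrate_cis_weight(census, cis, patient_register):
    """Least-squares weight w so that w*CIS + (1-w)*PR best matches census.

    Inputs are equal-length sequences of counts, one entry per area.
    Rearranging gives census - PR = w * (CIS - PR), a regression through
    the origin with closed-form solution sum(y*d) / sum(d*d).
    """
    num = 0.0
    den = 0.0
    for c, a, b in zip(census, cis, patient_register):
        d = a - b              # CIS minus Patient Register
        num += (c - b) * d
        den += d * d
    if den == 0:
        raise ValueError("sources are identical; weight is not identifiable")
    return num / den


# Hypothetical counts for three areas; census = 0.3*CIS + 0.7*PR by construction.
cis = [1000, 2000, 1500]
pr = [1100, 1900, 1700]
census = [1070, 1930, 1640]

w = calibrate_cis_weight(census, cis, pr)
```

The fitted weight is not constrained to lie between 0 and 1; a value outside that range would suggest the two sources err in the same direction relative to the benchmark.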
Low-level aggregate models: Basic model with small area estimation. Models used to estimate additional variables; combine survey with admin sources; geographical detail depends on sample size and sample design; limitations for cross-classification.
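A common textbook way to combine a survey with admin-based predictions is a composite (shrinkage) estimator, which leans on the direct survey estimate where the sample is large and on the model-based estimate where it is small. The sketch below shows that general idea only; it is not presented as the method the programme adopted, and the variances and proportions are invented.

```python
def composite_estimate(direct, direct_var, synthetic, model_var):
    """Shrinkage combination of a direct survey estimate and a synthetic one.

    direct     -- survey estimate for the area
    direct_var -- sampling variance of the direct estimate
    synthetic  -- model-based estimate (e.g. predicted from admin sources)
    model_var  -- assumed between-area variance of the model error
    """
    if direct_var < 0 or model_var < 0 or direct_var + model_var == 0:
        raise ValueError("variances must be non-negative and not both zero")
    gamma = model_var / (model_var + direct_var)   # weight on the survey
    return gamma * direct + (1.0 - gamma) * synthetic


# Hypothetical proportions with some characteristic in two areas.
large_sample = composite_estimate(0.20, 0.0001, 0.10, 0.0004)  # gamma = 0.8
small_sample = composite_estimate(0.20, 0.0036, 0.10, 0.0004)  # gamma = 0.1
```

The two calls show why geographical detail depends on sample size: with a small sample the sampling variance is large, so the estimate falls back almost entirely on the model.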
Low-level aggregate models: Bayesian approach: combine administrative sources as prior information in a Bayesian model; approaches developed by Southampton Uni and Stats NZ / University of Canterbury NZ. Hybrid approach: basic model (with coverage survey?) used in some areas; other approaches used in hard-to-count areas.
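To make "administrative sources as prior information" concrete, here is a deliberately simple conjugate Gamma-Poisson sketch. It is a textbook illustration and does not reproduce the Southampton or Stats NZ approaches. The assumptions are: the true count has a Gamma prior centred on the admin count, a survey observes a Poisson count at a known sampling fraction, and all numbers and the name `posterior_gamma` are hypothetical.

```python
def posterior_gamma(admin_count, prior_shape, survey_count, sampling_fraction):
    """Return (shape, rate) of the Gamma posterior for the true population.

    Prior:      population ~ Gamma(prior_shape, rate = prior_shape / admin_count),
                so the prior mean equals the admin count.
    Likelihood: survey_count ~ Poisson(sampling_fraction * population).
    """
    if admin_count <= 0 or prior_shape <= 0 or sampling_fraction <= 0:
        raise ValueError("admin_count, prior_shape, sampling_fraction must be > 0")
    prior_rate = prior_shape / admin_count
    return prior_shape + survey_count, prior_rate + sampling_fraction


# Hypothetical inputs: admin count 1000, a 10% survey that finds 120 people.
shape, rate = posterior_gamma(
    admin_count=1000, prior_shape=10.0, survey_count=120, sampling_fraction=0.1
)
posterior_mean = shape / rate
```

The posterior mean falls between the admin count (1000) and the survey-implied count (120 / 0.1 = 1200); a larger `prior_shape` expresses more trust in the admin source and pulls the result towards it.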
Methodological issues: Reliance on administrative data (quality / coverage / definitions / future proofing); provision of socio-demographic data; measuring the accuracy of the estimates; UK solution.