Beyond 2011 The future for population statistics? IMA Mathematics 2012 Pete Benton Beyond 2011 Programme Director Office for National Statistics.

Outline
- Background to the Census
- The Beyond 2011 Programme
- Statistical options for the future
- Key mathematical challenges
- Timeframes
- Next steps

The purpose of the census
The basis for national decision making:
- Service planning
  - where to locate schools, hospitals, etc.
  - housing plans
  - transport
- Resource allocation
  - health and local govt: £100bn each per year
- Policy making and monitoring
  - Equality: age, sex, ethnicity, disability
  - Ageing population: pensions etc.
- Academic and social research

Key Census outputs
Benchmark statistics on:
- Population units: people and housing, with key demographics (age, sex, ethnicity)
- Population structures: households, families
- Population and housing attributes
For small areas and small population groups
With multivariate analysis
Consistent and comparable

The 2011 Census
Very successful
- 94% response overall
- Over 90% across London overall
- Over 80% response in every Local Authority
Significant improvement in key Local Authorities
The result of extensive mathematical modelling
- Response targets to achieve required output quality
- Predicted initial response from key groups / areas
- Numbers of field staff required to reach final targets
- Daily live response rate modelling to support operational decisions

The Beyond 2011 Programme
Why change? Why look beyond 2011?
- Rapidly changing society
- Evolving user requirements
- New opportunities: data sharing
- Traditional census: costly and infrequent??
UK Statistics Authority to Minister for Cabinet Office:
“As a Board we have been concerned about the increasing costs and difficulties of traditional Census-taking. We have therefore already instructed the ONS to work urgently on the alternatives, with the intention that the 2011 Census will be the last of its kind.”

Beyond 2011: Statistical options
Census options
- Traditional Census (long form to everyone)
- Rolling Census (over 5/10 year period)
- Short Form (everyone), Long Form (sample)
- Short Form + Annual Survey (US model)
Survey option(s)
- Address register + Survey
Administrative data options
- Aggregate analysis
- (Intermediate) Sample linkage, e.g. 1% of postcodes
- 100% linkage to create ‘statistical population spine’

Beyond 2011 - statistical options (diagram)
Flow: SOURCES, FRAME, DATA, ESTIMATION, OUTPUTS; all from national to small area level
- Sources: Census; admin source; commercial sources? (increasing later?); socio-demographic survey(s); surveys to fill gaps
- Frame: Address Register (household, communal); a maintained national address gazetteer that provides the frame for population data & surveys
- Data: population data; socio-demographic attribute data
- Estimation:
  - Coverage assessment, incl. under & over-coverage, by survey and admin data?
  - Adjusting for missing data and error
  - Adjusting for non-response bias in survey (or sources)
  - Population distribution provides weighting for attributes
  - Quality measurement
- Outputs: population estimates; attribute estimates; household structure etc.; longitudinal data; interactional analysis, e.g. TTWA

Potential data sources
Population data
- NHS Patient Register
- DWP/HMRC Customer Information System
- Electoral roll (> 17 yrs)
- School Census (5-16 yrs)
- Higher Education Statistics Agency data (students)
- Birth and Death registrations
Socio-demographic sources
- Surveys
- DVLA?
- Commercial sources?
- Utilities?
- TV licensing?

DWP CIS population counts compared with ONS Mid Year population estimates

Patient Register population counts compared with ONS Mid Year population estimates

Electoral Roll population counts compared with ONS Mid Year population estimates

Coverage Of Main Administrative Sources (diagram)
Each source is drawn against the Resident Population. Source labels: Customer Information System (CIS), Patient Register Data (PRD), Electoral Roll (ER), School Census (SC), Higher Education Students (HESA), UK Driving Licence (DVLA).
Text boxes, in the order they were extracted:
- Extras includes: some duplicates; international students on short-term courses; students ceased studying, not formally deregistered
- Extras includes: short-term migrant children
- Missing includes: under 17s; ineligible voters; non responders
- Missing includes: non school aged people; independent school children; home schooled children
- Missing includes: some migrant worker dependants; some international students; undocumented asylum seekers
- Missing includes: migrants not (yet) registered; newborn babies; some private only patients
- Missing includes: non higher education students; independent university students
- Extras includes: some duplicates; some ex-pats; some deceased; short-term migrants
- Extras includes: multiple registrations; some ex-pats; some deceased; short-term migrants
- Extras includes: some ex-pats; some deceased; short-term migrants
- Missing includes: non-drivers; under 17s; some foreign-licence holders
- Extras includes: some ex-pats; some deceased

Key risks of non census alternatives
- Public opinion
- Technical challenge
- Changes in administrative datasets
- UK harmonisation
- Getting a decision

Key mathematical challenges
Methods for production of statistics
- Coverage assessment and adjustment
- Data matching
- Correcting for missing data
- Small area population attribute modelling
Methods for protection of confidentiality
- Data pre-processing and encryption
- Statistical Disclosure Control
Evaluation
- Quantifying financial benefits
- Defining what is an ‘acceptable’ level of quality

Coverage assessment
How many fish in your pond?
- Day 1: catch 100, tag them, put them back
- Day 2: catch 50, find 25 already tagged
How many fish in your pond? Answer: 200 (ish)
- According to day 2, half in the pond are marked
- We marked 100, so there must be about 200 altogether
“Dual System Estimation”
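The fish-pond arithmetic can be sketched in a few lines of Python. This is only an illustration of the capture-recapture idea on the slide (often called the Lincoln-Petersen estimator); the function name and argument names are made up for this sketch and are not ONS code.

```python
def dual_system_estimate(first_catch: int, second_catch: int, recaptured: int) -> float:
    """Estimate total population size from two independent 'catches'.

    first_catch  -- number caught (and tagged) on day 1
    second_catch -- number caught on day 2
    recaptured   -- number in the day 2 catch that were already tagged
    """
    if recaptured <= 0:
        raise ValueError("need at least one recaptured individual")
    # If the tagged share of the second catch reflects the tagged share
    # of the whole pond, then total = first_catch / (recaptured / second_catch).
    return first_catch * second_catch / recaptured


# The numbers from the slide: tag 100, catch 50, find 25 tagged.
pond_estimate = dual_system_estimate(first_catch=100, second_catch=50, recaptured=25)
print(pond_estimate)  # 200.0
```

The estimate relies on the two catches being independent and on every fish having a similar chance of being caught, which is why the answer is "200 (ish)".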

Application to the census
We ‘fish’ twice, in 1% of postcodes
- Census
- Then census coverage survey (CCS) 6 weeks later
No need for tags
- They have names, addresses, dates of birth
We match the two separate lists of people (500k) to work out
- What percentage of people in the CCS had first been ‘caught’ in the census
- Thus, the total population in each postcode

Coverage adjustment
Apply the adjustment factor to the other 99% of postcodes where we did no CCS
- With appropriate stratification
Add ‘synthetic’ records
- Extra households
- Extra people
- With the right key characteristics
- In roughly the right locations
Using ‘donor imputation’ to complete each record
So that all the final tables add up to the right number
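As a rough sketch of "apply the adjustment factor ... with appropriate stratification", the snippet below computes a separate factor per stratum from the sampled postcodes and scales the census count for the whole stratum. The stratum names and all counts are invented for illustration, and the real methodology is considerably more elaborate than this simple ratio.

```python
def adjustment_factor(census_count: int, ccs_count: int, matched: int) -> float:
    """Ratio of the dual system estimate to the census count in sampled postcodes."""
    dse = census_count * ccs_count / matched
    return dse / census_count


# Hypothetical strata: (census count, CCS count, matched count) in the
# sampled postcodes, plus the census count across the whole stratum.
strata = {
    "stratum_a": {"sample": (950, 900, 855), "census_total": 95_000},
    "stratum_b": {"sample": (800, 820, 656), "census_total": 80_000},
}

estimates = {}
for name, info in strata.items():
    factor = adjustment_factor(*info["sample"])
    # Scale up the census count for all postcodes in the stratum.
    estimates[name] = info["census_total"] * factor
    print(f"{name}: factor {factor:.3f}, estimated population {estimates[name]:,.0f}")
```

Keeping the factors separate by stratum matters here: the second hypothetical stratum has much lower census coverage, so a single pooled factor would under-adjust it and over-adjust the first.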

Dual system estimation - formulae

                           Counted by CCS?
                           Yes         No          TOTAL
Counted by     Yes         $n_{11}$    $n_{10}$    $n_{1+}$
Census?        No          $n_{01}$    $n_{00}$    $n_{0+}$
               TOTAL       $n_{+1}$    $n_{+0}$    $n_{++}$

Total population:
$$n_{++} = \frac{n_{1+} \times n_{+1}}{n_{11}}$$

We can make life very complicated for people who aren’t mathematicians!
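A short derivation may help connect the formula to the fish-pond example. It assumes, as the pond example implicitly does, that being counted by the census and being counted by the CCS are independent; treating day 1 as the "census" and day 2 as the "CCS" is an illustrative mapping only.

```latex
% Independence: the share of CCS people also found in the census
% estimates the census coverage of the whole population.
\frac{n_{11}}{n_{+1}} \approx \frac{n_{1+}}{n_{++}}
\quad\Longrightarrow\quad
n_{++} \approx \frac{n_{1+} \times n_{+1}}{n_{11}}

% Fish pond: n_{1+} = 100 (tagged on day 1), n_{+1} = 50 (caught on day 2),
% n_{11} = 25 (caught on both days).
n_{++} \approx \frac{100 \times 50}{25} = 200

% The cell nobody observed, n_{00}, then follows by subtraction:
n_{00} = n_{++} - n_{11} - n_{10} - n_{01} = 200 - 25 - 75 - 25 = 75
```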

Application to administrative data
Administrative data sources also have undercount
But the bigger problems are due to time lags
- Emigration; deaths
  - Results in overcount in administrative sources
- Internal migration
  - Results in people recorded in the wrong location
  - overcount in one area, undercount in another
Just applying Dual System Estimation would result in significant over-estimation
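A toy calculation shows why stale records push a naive dual system estimate upwards. All numbers are invented: a true population of 1,000, an administrative list holding 900 current residents plus 150 records for people who have emigrated or died, and a survey that reaches 800 residents, 720 of whom are also on the list.

```python
true_population = 1000

admin_current = 900   # admin records for people who still live in the area
admin_stale = 150     # admin records for people who have left or died
survey_count = 800    # residents found by the coverage survey
matched = 720         # survey residents who also appear on the admin list

# Naive DSE treats every admin record as a real resident. The stale
# records can never be matched, so they inflate the numerator only.
dse_naive = (admin_current + admin_stale) * survey_count / matched

# If the stale records could be removed first, DSE recovers the truth.
dse_clean = admin_current * survey_count / matched

print(f"true population:        {true_population}")
print(f"DSE with stale records: {dse_naive:.0f}")
print(f"DSE without them:       {dse_clean:.0f}")
```

In this made-up case the naive estimate overshoots by roughly the overcount rate of the list, which is why the overcount has to be measured or removed before dual system estimation is applied.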

Potential overcount estimation approaches (1)
Redesigned coverage survey asking:
- who usually lives here?
- when did you move in?
- where are you registered to vote?
- where are you registered with a GP?
- who lived here before you?
- where do they live now?
- does John Smith still live here?
(Arrows alongside the questions: increasing sensitivity; reducing appropriateness / legality)

Potential overcount estimation approaches (2)
Match new coverage survey to admin data
- Measure coverage patterns, develop models
Intermediate model
- Match records only in CS postcodes
Full linkage model
- Match records in all sources across all postcodes
- Keep records if same location on all datasets => more likely to be correct
  - Particularly if recently recorded ‘activity’
- Develop intelligent rules to resolve residual records
- Reduces scale of overcount, but increases undercount
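The "keep records if same location on all datasets" rule could look something like the sketch below. It assumes records have already been linked to a common person identifier, which is the hard part in practice; the identifiers, source names and postcodes are hypothetical, and the `min_sources` threshold is an assumption added for this illustration rather than anything stated on the slide.

```python
from collections import defaultdict

# Hypothetical linked records: (person id, source, postcode).
records = [
    ("p1", "patient_register", "AB1 2CD"),
    ("p1", "cis", "AB1 2CD"),
    ("p1", "electoral_roll", "AB1 2CD"),
    ("p2", "patient_register", "AB1 2CD"),
    ("p2", "cis", "EF3 4GH"),
    ("p3", "patient_register", "EF3 4GH"),
]


def split_by_agreement(records, min_sources=2):
    """Keep people whose sources all agree on location; return the rest as residual."""
    locations = defaultdict(dict)
    for person_id, source, postcode in records:
        locations[person_id][source] = postcode

    kept, residual = {}, []
    for person_id, by_source in locations.items():
        postcodes = set(by_source.values())
        if len(by_source) >= min_sources and len(postcodes) == 1:
            kept[person_id] = postcodes.pop()
        else:
            residual.append(person_id)  # needs further rules or survey evidence
    return kept, residual


kept, residual = split_by_agreement(records)
print(kept)      # {'p1': 'AB1 2CD'}
print(residual)  # ['p2', 'p3']
```

Person `p3` may well be a genuine resident seen on only one source, so discarding such records is one way the rule "reduces scale of overcount, but increases undercount".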

Small Area Estimation
Surveys only give sufficient precision at relatively high levels of geography
Users require information at lower levels
- Census ‘output area’ ~ 125 households / 300 people
SAE: family of methods to increase precision of survey estimates at lower geographies by “borrowing strength” from other, more detailed data sources, or neighbouring areas
Widely used by National Statistical Institutes, e.g. unemployment, income, households in poverty
- but generally univariate, estimating means

Precision of direct survey outputs
CVs by prevalence; sample size = 1,000,000 people (… marks digits or cells lost in extraction)

Area           Population    0.5%    1%       5%       10%      15%     20%     50%
National       50,000,000    1.4%    1.0%     0.4%     0.3%     0.2%    …       0.1%
Region         5,500,000     4.3%    3.0%     1.3%     0.9%     0.7%    0.6%    0.3%
LA             150,…         …       18.2%    8.0%     5.5%     4.3%    3.7%    1.8%
LA (small)     50,…          …       31.5%    13.8%    9.5%     7.5%    6.3%    3.2%
MSOA (avg)     7,…           …       82.9%    36.3%    25.0%    19.8%   16.7%   8.3%
MSOA (min)     5,…           …       99.5%    43.6%    30.0%    23.8%   20.0%   10.0%
LSOA (avg)     1,…           …       175.9%   77.1%    53.0%    42.1%   35.4%   17.7%
LSOA (min)     1,…           …       222.5%   97.5%    67.1%    53.2%   44.7%   22.4%
OA             …             …       406.2%   178.0%   122.5%   97.2%   81.6%   40.8%
Ward (Eng)     7,…           …       84.1%    36.8%    25.4%    20.1%   16.9%   8.5%
Ward (Wales)   3,…           …       118.9%   52.1%    35.9%    28.5%   23.9%   12.0%
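The slide does not say how the CVs were calculated. The sketch below assumes a simple random sample of 1,000,000 people spread across areas in proportion to population, with the usual coefficient of variation for an estimated proportion, sqrt((1 - p) / (n p)). Under that assumption it appears to reproduce the legible National and Region rows, but the formula and function name are this sketch's assumptions rather than the presenter's stated method.

```python
import math

NATIONAL_POPULATION = 50_000_000
SAMPLE_SIZE = 1_000_000


def direct_cv(area_population: float, prevalence: float) -> float:
    """CV of a directly estimated proportion under simple random sampling.

    Assumes the area's share of the sample equals its share of the population.
    """
    area_sample = SAMPLE_SIZE * area_population / NATIONAL_POPULATION
    return math.sqrt((1 - prevalence) / (area_sample * prevalence))


prevalences = (0.005, 0.01, 0.05, 0.10, 0.15, 0.20, 0.50)
for label, population in [("National", 50_000_000), ("Region", 5_500_000)]:
    row = "  ".join(f"{direct_cv(population, p):.1%}" for p in prevalences)
    print(f"{label:<10}{row}")
```

The pattern in the table follows directly: halving the area sample or the prevalence multiplies the CV by about 1.4, so direct estimates for rare characteristics in small areas quickly become unusable.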

Potential components
- (Very?) Large survey
- Administrative sources
  - aggregate (area based) or unit record
  - available for lower geographic levels than survey outputs
Possible models
- Generalised Linear Models (GLM): multi-level models
  - spatial / temporal extensions can add power
- Bayesian or frequentist estimation frameworks
- Micro-simulation

Small area modelling - issues
- Quality of ancillary data is absolutely critical
- Most existing applications use census covariates
- More powerful models incorporate time and space effects, but are more complex
- Every variable is different, and requires different models
- There’s often no substitute for geography as a predictor
  - ‘similar people gather in similar areas’
BUT clear academic view: the methods exist, it just depends on data

Beyond Timeline - the key decision (diagram)
- BEYOND 2011 ‘Phase 1’: initiation; research / definition
- Sept 2014: recommendation & decision point
- ADMIN DATA SOLUTION: detailed design; procure / develop; develop / test; outputs (population estimates, population characteristics)
- TRADITIONAL CENSUS SOLUTION: detailed design; procure / develop; develop / test; rehearse; run; outputs

Beyond Timeline (non census solution) (diagram)
Phases: initiation; research / definition; detailed design; procure / develop; develop / test; outputs (population estimates, population characteristics). Year label shown: 2024
- address register: required on an ongoing basis; ideally the National Address Gazetteer, subject to confirmation of quality
- admin sources: public sector & commercial?; developing over time
- coverage surveys: testing; continuous assessment
- attribute surveys: info from existing surveys (e.g. labour force survey, integrated household survey etc.), supplemented by new targeted surveys as required
- linkage: test linkage; increasing linkage over time
- modelling: increasing modelling over time

Beyond and into the future
- address register: required on an ongoing basis
- administrative sources: will change, disappear, be added and develop over time
- continuous coverage survey
- existing surveys
- increasing linkage over time
- increasing modelling over time
- need for attribute surveys declines over time?
- regular production of population and attribute estimates
- ongoing methodology refinement

Improving quality & quantity
- accuracy of population estimates
- accuracy of characteristics estimates
- range of topics
- small area detail
- multivariate small area detail
experimental statistics develop to become national statistics

Statistical benefit profile (chart)
Axis: Benefit. Series: Census; Alternative method. Annotated regions: loss, gain, loss, gain.

Cost profile (real terms) (chart)
Axis: Cost. Series: Census; Alternative method (???).

Next steps
- Research potential methods and models
- Using census data
  - To understand coverage patterns in admin data
  - To simulate new survey designs
  - As a gold standard: how well can we replicate census results?
- Assess quality, costs, benefits, risks
- Discuss with stakeholders (!)
- Public acceptability research
- Report progress every six months
- Make recommendations in 2014

Advice and assistance very welcome!