Statistical Disclosure Control for Communal Establishments in the UK 2011 Census. Joe Frend, Office for National Statistics.

Presentation transcript:

1 Statistical Disclosure Control for Communal Establishments in the UK 2011 Census Joe Frend Office for National Statistics

2 Definitions UK SDC Policy decision Household methodology Communal establishment methodology Summary Q&A Overview

3 Communal Establishments (CEs): An establishment providing managed residential accommodation CE type: The broad category under which a CE appears in the Census output tables Residents: All persons living in a CE Client: Non-staff residents that the CE caters for (e.g. patients of a hospital, clients of a hotel) Staff: Staff / owners living in a CE Family: Family members / partners who live in a CE with either a member of staff or a client Definitions

4 104 Delivery Groups (DGs) in England & Wales (≈ 500,000 persons & 200,000 households per DG); LADs in a DG; ≈ 20 MSOAs in an LAD (≈ 7,500 persons & 3,000 households per MSOA) Geography DG LAD MSOA

5 Registrars General’s Agreement, November 2006. In line with the Statistics and Registration Service Act, 2007 (SRSA). More importance placed on protecting against attribute disclosure than against identification. Small cells (0s, 1s, 2s) allowed provided there is sufficient uncertainty as to whether those cell counts are real, and that creating this uncertainty does not cause significant damage to the data. Targeted Record Swapping selected. UK SDC Policy

6 Whole households are swapped Risk score calculated for each household Non-response affects the swap rate SDC for households I High risk = Higher chance of being sampled Low non-response rate in delivery group = Higher swap rate

7 House is sampled Matched only as far as necessary Match on household size and other variables SDC for households II MSOA 1 MSOA 2 Household matched on: No. of adults No. of children Are there pets? Household unique within MSOA = Find match outside MSOA

8 House is swapped SDC for households III MSOA 1 MSOA 2
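
The three household slides above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ONS implementation: the risk scale, the `swap_rate` value, the record fields and the function names `sample_by_risk` and `swap_geography` are all hypothetical. It shows risk-weighted sampling (high risk = higher chance of being sampled) and a swap that exchanges the MSOA of two matched households, so household counts per MSOA stay unchanged.

```python
import random

def sample_by_risk(risk_scores, swap_rate, seed=0):
    """Pick a risk-weighted sample of household ids without replacement.

    risk_scores: dict of household id -> non-negative risk score (hypothetical scale).
    swap_rate: fraction of households to sample, between 0 and 1 (assumed input).
    """
    rng = random.Random(seed)
    n_target = round(swap_rate * len(risk_scores))
    remaining = dict(risk_scores)
    sampled = []
    # Draw one household at a time; zero-risk households are never drawn.
    while len(sampled) < n_target and sum(remaining.values()) > 0:
        ids = list(remaining)
        weights = [remaining[i] for i in ids]
        chosen = rng.choices(ids, weights=weights, k=1)[0]
        sampled.append(chosen)
        del remaining[chosen]
    return sampled

def swap_geography(hh_a, hh_b):
    """Exchange the MSOA codes of two matched household records in place."""
    hh_a["msoa"], hh_b["msoa"] = hh_b["msoa"], hh_a["msoa"]

# Hypothetical risk scores: only h2 and h4 carry any risk.
risk = {"h1": 0.0, "h2": 5.0, "h3": 0.0, "h4": 5.0}
sampled = sample_by_risk(risk, swap_rate=0.5)

# Two households that agree on the matching variables but sit in different MSOAs.
house_a = {"id": "h2", "msoa": "MSOA 1", "adults": 2, "children": 1}
house_b = {"id": "h9", "msoa": "MSOA 2", "adults": 2, "children": 1}
swap_geography(house_a, house_b)
```

Because the swap only exchanges geography between two whole households, each MSOA keeps the same number of households before and after.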

9 1. SDC methodology for CEs to remain consistent with that of households: Targeted record swapping 2. Keep the numbers of persons and the numbers of CEs unchanged at all geographies: Individual records swapped 3. Keep swapping within delivery group SDC for CEs: The rules

10 The wide range of CE types and resident types Population characteristics will vary between CE types and resident types The risk and impact of disclosure will vary between CE types and resident types The public nature of CEs If a CE is identified it can essentially be viewed as a smaller geography SDC for CEs: The challenge

11 Maximise utility / Minimise damage Minimise swap rate Create an efficient matching process Swap individuals within the same CE type Swap individuals within the same resident type (e.g. staff with staff, clients with clients, family residents with family residents) SDC for CEs: The aims

12 Response rates are likely to vary as much between CE types as between delivery groups Impact and likelihood of disclosure vary between CE types The factors which will affect the disclosure risk are: Rarity of CE type in the area Number of residents in the CE type Other factors impacting on uncertainty Set swap rate for each CE type in each MSOA Minimising the swap rate I

13 Numbers of clients and staff vary within CE types Set protection scores for staff and client residents, in each CE type, in each MSOA Family residents have set swap rate within the delivery group Minimising the swap rate II

14 Calculating the Protection Scores For client residents: For staff residents:

15 Characteristics of residents will be different between CE types Swap within CE type The problem: Rule 3: Keep swapping within delivery group How do we swap individuals in a CE type that is unique in the delivery group? Must swap between CE types when this happens Matching variables chosen so that key attributes remain consistent with the CE type Swap within CE type

16 Swap rates may not be the same Characteristics of staff, clients and family residents will be different Swap within resident type Matching variables chosen so that key attributes remain consistent with the CE type Matching variables will be different between staff, client and family residents Swap within resident type

17 Resident type: Clients 73 client residents CE type: Hotels, B&Bs and guest houses 8 CEs of this type in MSOA 1 Protection score: A = 1 B = 1 C = 1 D = 2 E = 1 1 x 1 x 1 x 2 x 1 = 2 So, CPS = 2 Example 1: Creating the CPS I MSOA 1

18 Resident type: Clients 73 client residents CE type: Hotels, B&Bs and guest houses 8 CEs of this type in MSOA 1 Protection score: A = 1 B = 1 C = 1 D = 2 E = 1 1 x 1 x 1 x 2 x 1 = 2 So, CPS = 2 Low swap rate Example 1: Creating the CPS II MSOA 1
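
The arithmetic in Example 1 can be reproduced as a short sketch. The transcript does not say what the factors A to E stand for, nor how a score maps to a swap rate, so the code below assumes only what the slide shows: the client protection score (CPS) is the product of the five factor values. The function name `protection_score` is hypothetical.

```python
from math import prod

def protection_score(factors):
    """Multiply the factor values together, as in 1 x 1 x 1 x 2 x 1 = 2."""
    return prod(factors.values())

# Factor values for client residents of hotels, B&Bs and guest houses in MSOA 1,
# copied from the slide; the meaning of each letter is not given in the transcript.
cps = protection_score({"A": 1, "B": 1, "C": 1, "D": 2, "E": 1})
print(cps)  # 2
```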

19 Individuals are swapped Risky records are targeted Swap rate dependent on Protection Score Example 1: Matching I MSOA 1 High risk = Higher chance of being sampled Low protection score = Lower swap rate

20 Individual is sampled Matched only as far as necessary Matched on CE type, resident type and client specific variables Example 1: Matching II MSOA 1 MSOA 2 Clients matched on: Pattern of jumper Do they have a hat? Do they have glasses?

21 Example 1: Matching III MSOA 1 MSOA 2 Individual is swapped

22 Prison is unique within delivery group = Swap individual outside of the CE type Matched only as far as necessary Matched on resident type and client specific variables Example 2: Matching I MSOA 1 MSOA 2 Clients matched on: Pattern of jumper Do they have a hat? Do they have glasses?

23 Individual is swapped Still able to find a match Limit the damage to the data Example 2: Matching II MSOA 1 MSOA 2 Individual matched on: Pattern of jumper Is there a hat? Do they have glasses?
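
The two matching examples can be sketched as follows. This is a minimal illustration, not the ONS matching routine: the record layout, the helper names `person`, `find_match` and `swap_location`, and the requirement that the partner sits in a different MSOA are all assumptions, and the matching variables are the cartoon ones from the slides. A match is looked for within the same CE type first; only if none exists (as for the prison that is unique in its delivery group) is a partner taken from another CE type, always within the same resident type.

```python
from collections import Counter

MATCH_VARS = ("jumper", "hat", "glasses")  # illustrative variables from the slides

def person(pid, msoa, ce_id, ce_type, resident_type, jumper, hat, glasses):
    """Build a hypothetical individual record."""
    return {"id": pid, "msoa": msoa, "ce_id": ce_id, "ce_type": ce_type,
            "resident_type": resident_type,
            "jumper": jumper, "hat": hat, "glasses": glasses}

def find_match(record, candidates, match_vars=MATCH_VARS):
    """Return a swap partner, relaxing the CE type only if necessary."""
    agree = [
        c for c in candidates
        if c["msoa"] != record["msoa"]                      # assumed: partner in another MSOA
        and c["resident_type"] == record["resident_type"]   # clients with clients, etc.
        and all(c[v] == record[v] for v in match_vars)
    ]
    same_ce_type = [c for c in agree if c["ce_type"] == record["ce_type"]]
    pool = same_ce_type or agree  # fall back to other CE types only when needed
    return pool[0] if pool else None

def swap_location(rec_a, rec_b):
    """Exchange the location fields of two individuals in place."""
    for field in ("msoa", "ce_id", "ce_type"):
        rec_a[field], rec_b[field] = rec_b[field], rec_a[field]

# Example 2: the prison is the only CE of its type, so no same-type match exists.
prisoner = person("p1", "MSOA 1", "PRISON-1", "Prison", "client", "striped", True, False)
others = [
    person("p2", "MSOA 2", "HOTEL-1", "Hotel", "staff", "striped", True, False),
    person("p3", "MSOA 2", "HOTEL-1", "Hotel", "client", "striped", True, False),
    person("p4", "MSOA 2", "HOTEL-1", "Hotel", "client", "plain", False, True),
]

before = Counter(r["ce_id"] for r in others + [prisoner])
match = find_match(prisoner, others)
swap_location(prisoner, match)
after = Counter(r["ce_id"] for r in others + [prisoner])
```

Since the two individuals simply trade places, the number of persons in every CE and every MSOA is the same before and after the swap.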

24 SDC for CEs: The Process 1. Calculate protection score for each CE type and resident type in each MSOA 2. Set swap rate dependent on the protection score 3. Assign risk score for individual records 4. Select a sample of records to be swapped using the risk score as a weighting 5. Match records on CE type and a selection of variables, dependent on resident type 6. Swap records

25 Both CE and household methodology will use targeted record swapping Numbers of households, CEs and persons will remain unchanged at all geographies CE methodology will swap individuals SDC methodology aims to maximise utility: Minimise amount of swapping using protection scores Swap only as far as necessary Aim to swap within CE type and resident type Match on different variables for different resident types Summary

26 Q&A For general SDC questions: For CE SDC questions: