Internet access to UK Census interaction data: that's WICID! John Stillwell Centre for Computational Geography University of Leeds, Leeds LS2 9JT
Focus §The provision and development of a Census Interaction Data Service (CIDS) §The development of a web-based information system (WICID) that allows researchers to extract data on interaction flows between origins and destinations
Presentation § Population censuses § CIDS challenge: aims and objectives § Census based interaction data sets § Importance of interaction data in planning § System architecture and metadata structure § Features of the interface § Some examples of data extraction and analysis
Acknowledgements § Funded by ESRC and JISC under the 2001 Census of Population Programme (Award number H ) § Began: 1 August 2001 § Ends: 31 July 2006 § Paul Boyle and Zhiqiang Feng, University of St Andrews § Keith Cole and Justin Hayes, University of Manchester
Population censuses § reliable § comprehensive § small area scale § very important for geographic research but § periodic § not problem free
Problems § Incorrect or missing information e.g. origin or workplace unstated § Measures taken to preserve confidentiality e.g. minimum thresholds for persons and households § Underenumeration e.g. the ‘missing million’ in 1991; 2001 is the first ‘One Number Census’, in which underenumeration has been adjusted for by imputation
2001 Census population counts: male deficit in young adult ages: cause for concern? [Chart label: Parity]
Digital data products § Various products of the 1991 Census - LBS: Local Base Statistics - SAS: Small Area Statistics - SMS: Special Migration Statistics - SWS: Special Workplace Statistics - SARs: Samples of Anonymised Records - LS: Longitudinal Study - DBD: Digital Boundary Data § Similar set of products for the 2001 Census
Data Support Units §Census Registration Service (CRS) (University of Essex) §Census Dissemination Unit (University of Manchester) for CAS (LBS/SAS) §Geography Data Unit: UKBORDERS (University of Edinburgh) for DBD §Census Interaction Data Service (CIDS) (University of Leeds) for SMS/SWS §Census Microdata Unit, CCSR (University of Manchester) for SARs §Centre for Longitudinal Study Information and User Support (CeLSIUS) (University of London School of Hygiene and Tropical Medicine) for LS
CIDS Challenge § Aim is to increase the use of the Origin-Destination (interaction) data sets that will be produced as outputs from the 2001 UK Census (as well as the equivalent special migration and commuting data sets produced from the 1991 and 1981 censuses)
Objectives § To provide users with access to existing census-based interaction data sets (for 1991 and 1981) via the Internet § To develop a ‘user-friendly’ interface for query building and data extraction § To develop a flexible system that allows new data sets (e.g. Census interaction data) to be added without any major system redesign
Objectives § To allow users the option of constructing queries that involve the selection of origin and destination areas at different spatial scales § To produce a library of popular sub-sets of interaction data for quick extraction and downloading § To provide facilities to download the data in alternative formats
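The objectives above (selecting origins and destinations at different spatial scales, then downloading in alternative formats) can be illustrated with a minimal Python sketch. It is not the WICID implementation: the ward and district codes, the flow values, the `DISTRICT_OF` lookup and the function names `extract` and `to_delimited` are all hypothetical, chosen only to show origins picked at ward level, destinations picked at district level, and the same result written as comma- or tab-delimited text.

```python
import csv
import io
from collections import defaultdict

# Hypothetical ward-to-ward flows (origin ward, destination ward, persons).
# Illustrative values only, not census data.
WARD_FLOWS = [
    ("W1", "W2", 12),
    ("W1", "W3", 5),
    ("W2", "W3", 7),
    ("W3", "W1", 3),
    ("W3", "W2", 9),
]

# Hypothetical lookup from each ward to the district that contains it.
DISTRICT_OF = {"W1": "D1", "W2": "D1", "W3": "D2"}


def extract(flows, origin_wards, destination_districts):
    """Sum flows from selected origin wards to selected destination districts."""
    totals = defaultdict(int)
    for origin, destination, persons in flows:
        district = DISTRICT_OF[destination]
        if origin in origin_wards and district in destination_districts:
            totals[(origin, district)] += persons
    # Sort the keys so the output order is stable.
    return [(o, d, n) for (o, d), n in sorted(totals.items())]


def to_delimited(rows, delimiter=","):
    """Serialise extracted rows as delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["origin_ward", "destination_district", "persons"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract(WARD_FLOWS, {"W1", "W2"}, {"D1", "D2"})
csv_text = to_delimited(rows)
tsv_text = to_delimited(rows, delimiter="\t")
print(csv_text)
```

Keeping the extraction step separate from the serialisation step means a further output format only needs another writer, not a new query.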
Interaction Data Sets for 1991 Original migration data sets §SMS Set 1 (2 tables, 12 counts, wards) §SMS Set 2 (12 tables, 93 counts, districts)
Interaction Data Sets for 1991 Original commuting and student data sets § SWS Set C (9 tables, 274 counts, wards) § Table 100 (2 tables, 47 counts, districts) Derived data sets § SMS Set 1 Tables 3-10 (adjusted by Rees and Duke-Williams), so-called SMSGAPS § SMS Set 1 Table 3 (adjusted by Simpson and Middleton), so-called SMS-MIGPOP
Suppression problem § Only flows for some origin-destination pairs are available § Can identify the extent of suppression using a raster image based on SMS Set 2 Table 1 (no flows suppressed) § Each cell shaded - white when cell empty - grey when flow is non-zero but below 10 - black when flow is 10+
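The cell-shading rule on this slide can be sketched in a few lines of Python. This is an illustration only: the 3 x 3 matrix is made up, the function name `shade` is hypothetical, and the sketch assumes that grey marks non-zero flows below the 10+ threshold used for black.

```python
from collections import Counter


def shade(flow, threshold=10):
    """Return the shade for one origin-destination cell."""
    if flow == 0:
        return "white"  # empty cell
    if flow < threshold:
        return "grey"   # assumed: non-zero flow below the threshold
    return "black"      # flow of 10 or more


# Hypothetical origin-destination matrix (rows are origins, columns are
# destinations). Illustrative values only, not census data.
FLOW_MATRIX = [
    [0, 3, 25],
    [11, 0, 0],
    [9, 10, 0],
]

shaded = [[shade(flow) for flow in row] for row in FLOW_MATRIX]
counts = Counter(cell for row in shaded for cell in row)

for row in shaded:
    print(row)
print(dict(counts))
```

Counting the cells in each shade gives a quick summary of how much of the matrix is empty, small or large.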
Interaction Data Sets for 1981 § 1981 SMS Set 2 (male and female migrants between 1991 wards) § 1981 SWS Set C (five tables of counts of journey-to-work flows between 1991 wards) § All re-estimated for 1991 ward geography by Boyle and Feng as part of a Census Development Programme project
Importance of interaction data in planning § Interaction data sets are very important because they provide information on migration and commuting flows between small areas (wards) § e.g. migration data used for population estimation and projection by central and local government agencies § e.g. commuting data used to define Travel-to-Work Areas (TTWAs) and in transport planning
System Architecture
Metadata Structure
Features of the Interface: CIDS homepage
Login page
Welcome page
Different parts of the page Status line Feedback and Help Part of each page where you will interact Logout Key Links
Off the shelf
Flow summaries
General query page
Area Selection Tools
List Area Selection Tool
Origins selected
Data Selection Tools
Data Selection Options
Table Selection for 1991 SMS 2
Variable Selection with Table 1 of 1991 SMS 2
Current query summary
Data extraction
Output Planner
Output previewer
Download file
Examples of data extractions § In-migrants to Cardiff § Immigrants from Greece by age and sex § Immigrants from Greece § Net migration in Scotland § Students to and from Sussex § Social class of commuters in Leeds
Query: What were the in-migration flows to Cardiff district from various origins in ?
In-migration flow to Cardiff, using SMS Set 2, Source: 1991 Census Crown Copyright: ESRC/JISC Purchase
Query: What is the age structure of migrants from Greece to GB in ?
Age profile of migrants from Greece, Source: 1991 Census Crown Copyright: ESRC/JISC Purchase
Query: What is the distribution of migrants from Greece by district of GB in ?
Districts receiving most migrants from Greece, Source: 1991 Census Crown Copyright: ESRC/JISC Purchase
Query: What are the net migration balances of Scottish districts for (a) moves within Scotland and (b) moves between Scotland and England and Wales?
Source: 1991 Census Crown Copyright: ESRC/JISC Purchase
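The net migration balances asked for in this query are in-flows minus out-flows, computed separately for each set of partner areas. The Python sketch below shows that arithmetic under stated assumptions: the zone codes (`S1`, `S2` for two Scottish districts, `EW` for England and Wales as a single zone), the flow values and the function name `net_balance` are all hypothetical.

```python
# Hypothetical flows keyed by (origin, destination). Illustrative values only.
FLOWS = {
    ("S1", "S2"): 40,
    ("S2", "S1"): 25,
    ("S1", "EW"): 30,
    ("EW", "S1"): 50,
    ("S2", "EW"): 20,
    ("EW", "S2"): 10,
}

SCOTTISH_DISTRICTS = {"S1", "S2"}
ENGLAND_AND_WALES = {"EW"}


def net_balance(flows, district, partners):
    """In-flows to the district from partners minus out-flows to partners."""
    inflow = sum(n for (o, d), n in flows.items()
                 if d == district and o in partners)
    outflow = sum(n for (o, d), n in flows.items()
                  if o == district and d in partners)
    return inflow - outflow


# (a) moves within Scotland: partners are the other Scottish districts.
within = {
    s: net_balance(FLOWS, s, SCOTTISH_DISTRICTS - {s})
    for s in sorted(SCOTTISH_DISTRICTS)
}

# (b) moves between Scotland and England and Wales.
external = {
    s: net_balance(FLOWS, s, ENGLAND_AND_WALES)
    for s in sorted(SCOTTISH_DISTRICTS)
}

print(within)
print(external)
```

Within a closed set of districts the balances sum to zero, which is a useful check on an extraction of type (a).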
Query: Where do universities in Sussex get their students from?
Main county origins with term-time addresses in Sussex Source: 1991 Census Crown Copyright: ESRC/JISC Purchase
Social structure of Leeds and surrounding districts using 1991 SWS
Query: What are the journey to work flows between zones by social class in 1991?
Social class structure of commuters in Leeds and between Leeds and surrounding districts, 1991 Source: 1991 Census Crown Copyright: ESRC/JISC Purchase
Proportions of commuters to Central Leeds from other wards and surrounding districts by social class, 1991. Panels: Professional; Managerial and Technical. Source: 1991 Census Crown Copyright: ESRC/JISC Purchase
Panels: Skilled Non-manual; Skilled Manual. Source: 1991 Census Crown Copyright: ESRC/JISC Purchase
Panels: Partly Skilled; Unskilled. Source: 1991 Census Crown Copyright: ESRC/JISC Purchase
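Proportions by social class of this kind can be derived from extracted commuting counts with a short calculation. The Python sketch below is illustrative only: the origin names, the counts and the function name `class_shares` are hypothetical, and it assumes each proportion is a class's share of all commuters from one origin to a single destination.

```python
# Hypothetical commuter counts to one destination, by origin and social class.
# Illustrative values only, not census data.
COMMUTERS = {
    "Origin X": {"Professional": 10, "Skilled Manual": 30, "Unskilled": 10},
    "Origin Y": {"Professional": 30, "Skilled Manual": 15, "Unskilled": 15},
}


def class_shares(counts_by_class):
    """Return each class's share of the origin's total commuters."""
    total = sum(counts_by_class.values())
    if total == 0:
        # No commuters from this origin: report zero shares.
        return {cls: 0.0 for cls in counts_by_class}
    return {cls: n / total for cls, n in counts_by_class.items()}


shares = {origin: class_shares(counts) for origin, counts in COMMUTERS.items()}

for origin, by_class in shares.items():
    print(origin, {cls: round(p, 2) for cls, p in by_class.items()})
```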
Conclusions/Next steps § CIDS now available and WICID stable § Usage rates reasonable (particularly since Athens One Stop Sign On was introduced by Census Registration Service) §Access available at
Next steps § Flexible metadata structure of WICID will allow 2001 Census results to be included in the system § WICID will be used to verify the 2001 data in 2003
Next steps Mapping option for geographical area selection now being developed
Next steps § Analysis and modelling facilities in WICID are intended, but these raise difficult questions § Important to consider the research agenda for the future: what are the key research areas that should be investigated, e.g. validation studies, comparative studies, empirical research, modelling work?