InFuse: Data Feeds for the UK 2001 and 2011 Censuses and Beyond Justin Hayes Census Dissemination Unit (CDU) Mimas The University of Manchester.



Where are we going?
» CDU background
» Recent work on CAIRD Project
» Current work on InFuse Project
» Forthcoming work in collaboration with ONS
» Future ideas

Data Feed?
Keywords: Structure, Describe, Interoperable, Open Standards, Expose, Consolidate, Understandable, Usable, Transferable, Comprehensive, Comparable, Online, Integrate, Flexible
A data feed consolidates information relating to a dataset and integrates it by enforcing a structure, which it describes using open standards. This allows comprehensive and comparable information to be exposed and transferred online in ways that make it understandable, interoperable, flexible and, most importantly, usable.

Dimensions, Codelists and Codes: General Health
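
How a dimension, its codelists and their codes relate can be sketched in Python. This is an illustrative model only, not the InFuse schema: the class names, the identifiers (`GH_3`, `GH1` and so on) and the three category labels are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Code:
    """One category within a codelist (hypothetical identifiers)."""
    code_id: str
    label: str


@dataclass
class Codelist:
    """An ordered set of codes that classifies a dimension."""
    codelist_id: str
    codes: List[Code] = field(default_factory=list)

    def labels(self) -> List[str]:
        return [code.label for code in self.codes]


@dataclass
class Dimension:
    """A census concept together with the codelists that classify it."""
    name: str
    codelists: List[Codelist] = field(default_factory=list)


# Assumed three-category classification, for illustration only.
general_health = Dimension(
    name="General Health",
    codelists=[
        Codelist(
            "GH_3",
            [
                Code("GH1", "Good"),
                Code("GH2", "Fairly good"),
                Code("GH3", "Not good"),
            ],
        )
    ],
)
```

A dimension may carry several codelists (for example, a detailed and a collapsed classification), which is why `codelists` is a list rather than a single value.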

CDU Background
» Dissemination of aggregate outputs from recent UK censuses to UK academics
» Small team funded by ESRC
» Service, research and engagement roles
» Two decades of pioneering work
  › Casweb
  › Retrieval and reprocessing of UK 1971 Census
  › GeoConvert

Barriers to Effective Dissemination
» Large and complex dataset
» Lack of global structures
  › ‘Hand crafted’ tables as primary instrument
  › Inconsistent structures
  › ‘Age’ particularly problematic example
» No comprehensive description
» Scattered information
  › Poor connection of data and metadata
  › Approximately 300 tables with many inconsistencies
  › Metadata in multiple locations with varying access

Age Bands
» 99 age bandings
» 76 unique to a single table

223 Age Codes

Standard Table 13 Framework

Standard Table 13 Data

Text String Cell Descriptions
S013:37 (AGE OF HRP 24 OR UNDER - Rented from council : ALL HOUSEHOLDS)
S013:38 (AGE OF HRP 24 OR UNDER - Rented from council : Economically Active - Total)
S013:39 (AGE OF HRP 24 OR UNDER - Rented from council : Economically Active - Employee)
S013:40 (AGE OF HRP 24 OR UNDER - Rented from council : Economically Active - Self-employed)
S013:41 (AGE OF HRP 24 OR UNDER - Rented from council : Economically Active - Unemployed)
S013:42 (AGE OF HRP 24 OR UNDER - Rented from council : Economically Active - Full-time students)
S013:43 (AGE OF HRP 24 OR UNDER - Rented from council : Economically Inactive - Total)
S013:44 (AGE OF HRP 24 OR UNDER - Rented from council : Economically Inactive - Retired)
S013:45 (AGE OF HRP 24 OR UNDER - Rented from council : Economically Inactive - Student)
S013:46 (AGE OF HRP 24 OR UNDER - Rented from council : Economically Inactive - Looking after home/family)
S013:47 (AGE OF HRP 24 OR UNDER - Rented from council : Economically Inactive - Permanently sick or disabled)
S013:48 (AGE OF HRP 24 OR UNDER - Rented from council : Economically Inactive - Other)
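
Cell descriptions of this kind pack several dimensions into one string. A minimal sketch of pulling them apart follows. It assumes the pattern seen in the Standard Table 13 strings above (` : ` separates two groups of labels and ` - ` separates levels within a group); the field names `age_of_hrp`, `tenure` and `economic_activity` are interpretations for illustration, not names taken from the census outputs or from InFuse.

```python
import re
from typing import NamedTuple, Tuple


class Cell(NamedTuple):
    table: str
    cell: int
    age_of_hrp: str                     # assumed dimension name
    tenure: str                         # assumed dimension name
    economic_activity: Tuple[str, ...]  # assumed dimension name; may have several levels


# Table id, cell number, then the bracketed description.
CELL_RE = re.compile(r"^(?P<table>[A-Z]+\d+):(?P<cell>\d+)\s*\((?P<body>.*)\)\s*$")


def parse_cell(description: str) -> Cell:
    """Split one cell description string into its component labels."""
    match = CELL_RE.match(description.strip())
    if match is None:
        raise ValueError(f"Unrecognised cell description: {description!r}")
    # " : " separates the first group of labels from the second.
    left, right = (part.strip() for part in match["body"].split(" : ", 1))
    # Within the first group, " - " separates the age label from the tenure label.
    age, tenure = (part.strip() for part in left.split(" - ", 1))
    # The second group may be one label or a "parent - child" pair.
    activity = tuple(part.strip() for part in right.split(" - "))
    return Cell(match["table"], int(match["cell"]), age, tenure, activity)


example = parse_cell(
    "S013:39 (AGE OF HRP 24 OR UNDER - Rented from council : "
    "Economically Active - Employee )"
)
```

Splitting on ` - ` with surrounding spaces leaves hyphenated labels such as `Self-employed` intact. Tables that follow other conventions would need their own rules, which is one reason text strings make a poor primary carrier of structure.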

Effects of Barriers
» Incomplete and unconnected information
» Poor exploration
» Potential for misinterpretation and misuse
» Not interoperable
» Applications must provide specific metadata
» Frustrating for users and service providers

Challenges to Improve Services
» Consolidate all related information
» Extract and apply consistent structures
» Describe to make understandable and transferable
» Publish data via web service and API
» Build our own user applications
» Use open standards wherever possible
» Take advantage of external development
» Encourage ONS to do the same for 2011
» Find money to do all this!

CAIRD Project
» Additional funding from ESRC
» One researcher for one year from June 2008
» Feasibility project
  › Dimensionalised sample of 40 tables
  › Conceptual structure based on SDMX
  › SOAP-based web service and API
» CAIRD application
  › Codelist-based data selector
  › CSV and SDMX outputs

CAIRD Geography Selector

CAIRD Data Selector

CAIRD SDMX output

InFuse
» Mimas strategic funding to take results of CAIRD Project into service
» One researcher from August 2009 to present
» Initial application launch September 2010
» 2001 Census for England and Wales
» Tangible outputs just commencing

InFuse User Requirements
» Initial phase of work
  › Workshop for expert academic census users
  › Questionnaire
  › Functional and requirements specifications
  › IASSIST 2010

Structuring the 2001 Census
» Restructuring and parsing of output tables
» Information from Census Definitions Volume
» Development of master set of codelists
» Creation of geography codelists
» De-universification
» Encoding of hierarchies
» Incorporation of core set of metadata
» Multiple value counts problem

InFuse Features
» Theme-based exploration
» Handling sparsity through guided exploration
» Text search
  › Thesaurus and gazetteer
» Move to RESTful web service with private API
» URI schema for RDF development
» Encoding of, and operation on, hierarchies
» Modular, open source design for re-use
» Integration of digital boundary data
» Initial text output
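
One way to encode a code hierarchy and operate on it is a parent map with a roll-up of counts. The sketch below is not drawn from the InFuse implementation: the age codes, the shape of the hierarchy and the counts are all invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Optional

# Hypothetical age hierarchy: each code maps to its parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "AGE_ALL": None,
    "AGE_0_15": "AGE_ALL",
    "AGE_16_PLUS": "AGE_ALL",
    "AGE_0_4": "AGE_0_15",
    "AGE_5_15": "AGE_0_15",
}


def roll_up(leaf_counts: Dict[str, int]) -> Dict[str, int]:
    """Add each leaf count to the leaf itself and to every one of its ancestors."""
    totals: Dict[str, int] = defaultdict(int)
    for code, count in leaf_counts.items():
        node: Optional[str] = code
        while node is not None:
            totals[node] += count
            node = PARENT[node]
    return dict(totals)


# Made-up counts for three leaf codes.
totals = roll_up({"AGE_0_4": 10, "AGE_5_15": 25, "AGE_16_PLUS": 65})
```

With the hierarchy stored explicitly, a count for any broader band can be derived from the finest available codes instead of being looked up in a separate hand-crafted table.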

Initial InFuse Outputs
» InFuse URI schema
  › http:// /InFuseWS/InFuseWS.svc/data/contenttype/datasets?format=html
» InFuse text search with thesaurus
  › Search targets: codelists, codes, glossary, areas, areatypes
  › http:// /InFuseWS/InFuseWS.svc/data/contenttype/datasets/dsid/1/glossary/search?keywords=race
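
The two request patterns on this slide can be assembled programmatically. In the sketch below the host is a placeholder, because the slides omit it, and the function names are hypothetical. Only the path segments and the `format` and `keywords` parameters come from the URLs shown.

```python
from urllib.parse import urlencode, urljoin

# Placeholder host (assumption): the real host name is not given in the slides.
BASE = "http://example.org/InFuseWS/InFuseWS.svc/"


def datasets_url(fmt: str = "html") -> str:
    """URL listing the available datasets in the requested format."""
    return urljoin(BASE, "data/contenttype/datasets") + "?" + urlencode({"format": fmt})


def glossary_search_url(dataset_id: int, keywords: str) -> str:
    """URL for a keyword search of one dataset's glossary."""
    path = f"data/contenttype/datasets/dsid/{dataset_id}/glossary/search"
    return urljoin(BASE, path) + "?" + urlencode({"keywords": keywords})


listing = datasets_url()
search = glossary_search_url(1, "race")
```

Using `urlencode` for the query string means keywords containing spaces or punctuation are escaped correctly.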

InFuse URI Schema

InFuse URI Schema: Codelists

InFuse Thesaurus Text Search

CDU/ESRC/ONS Collaboration for 2011
» Data feed influence on ONS 2001 plans
  › Data Feed Network
  › Census Web Services Working Group (CWSWG)
  › ONS commitment to disseminate via API
» Collaborative funding
  › Two researchers for one year!
  › Test datasets for ONS API
  › Work on 2001 to 2011 comparability
  › Application development for testing of ONS API

In the InFuse Pipeline
» More datasets
» More metadata
» Work on definitional and geographical comparability
» Further application development
» SDMX and RDF interaction
» Release of a public API
» GeoConvert module
» Linkage of unit and aggregate data

Summary
» It’s possible to retrospectively structure and disseminate complex datasets via data feeds, but it is much easier to do so at source.
» The potential for improved and expanded secondary use of datasets will stimulate the development and use of open-standards methods and structures in dataset creation.

» Contact