Manchester Computing Supercomputing, Visualization & e-Science OntoGrid GridPrimer Training University of Manchester 18 th to 22 nd October 2004 ConvertGrid:

Slides:



Advertisements
Similar presentations
Access to Economic and Social Data via the UK Data Archive Jack Kneeshaw UKDA.
Advertisements

The Economic and Social Data Service (ESDS) Karen Dennison UK Data Archive Improving access to government datasets 18 January 2007.
What are Geographical Information Systems (GIS) & ArcView GIS software? What is a Geographical Information System (GIS)? Introduction to ESRI ArcView 3.x.
ESDS Government Resources for the GLF/ GHS ESDS Government Centre for Census and Survey Research University of Manchester.
ESDS Government Resources ESDS Government Centre for Census and Survey Research University of Manchester.
ESDS Government Resources for Government Crime Surveys ESDS Government Centre for Census and Survey Research University of Manchester.
ESDS Resources for BCS Users Vanessa Higgins Centre for Census and Survey Research University of Manchester.
ESDS Government Resources for the LFS and APS Anthony Rafferty ESDS Government Centre for Census and Survey Research University of Manchester.
Mapping and Visualising Census Data Keith Cole Jackie Carter Geo-data forum - 4/4/2001.
RELEASE OF THE 2001 CENSUS RESULTS March Release of the 2001 Census Content Media and formats Release schedule Arrangements for using the results.
Grid-Enabling Data: Sticking Plaster, Sellotape, & Chewing Gum? Colin C. Venters National Centre for e-Social Science University.
/ ConvertGrid: Grid Enabling Population Datasets Keith Cole National Centre for e-Social Science (NCeSS) & MIMAS University.
ESDS International Annual Conference 2006 Institute of Materials London 27 th November 2006 ESDS International Annual Conference 2006.
State of the Cities Database Philippa Robinson Neighbourhoods, Cities and Regions Communities and Local Government.
The micro-geography of UK demographic change Paul Norman School of Geography, University of Leeds Understanding Population Trends and Processes.
 Why the survey was devised – Interlink in partnership with TLI project and Salford City Council  Information collected from January till April’13 
T HE W EB - BASED I NTERFACE TO C ENSUS I NTERACTION D ATA - WICID Presentation to the ESRC Research Methods Festival Adam Dennett Centre for Interaction.
Sample of Anonymised Records: User Meeting Propensity to migrate by ethnic group: 1991 & 2001 Paul Norman 1, John Stillwell 2 & Serena Hussain 2 School.
Scottish Neighbourhood Statistics and the Scottish Index of Multiple Deprivation (SIMD) 2004 TRACEY STEAD OFFICE OF THE CHIEF STATISTICIAN SCOTTISH EXECUTIVE.
Northern Ireland Neighbourhood Information Service - NINIS Fiona Johnston Neighbourhood Statistics NISRA.
Modelling and Simulation for e-Social Science Mark Birkin School of Geography University of Leeds.
Part of the Arts and Humanities Data Service and the UK Data Archive. Funded by the Joint Information Systems Committee and the Arts and Humanities Research.
St. Lucia Country Report By Edwin St Catherine Director, Central Statistical Office Presented to IPUMS Workshop August 24 th, 2007.
Oxford eResearch Conference 2008 Paper Session 4A: NCeSS Oxford, UK, ( ) Experience of e-Social Science: A Case of Andy Turner and MoSeS Andy.
CCG 1 MoSeS Introduction and Progress Report Andy Turner
MOSES: Modelling and Simulation for e-Social Science Mark Birkin, Martin Clarke, Phil Rees School of Geography, University of Leeds Haibo Chen, Institute.
Access to UK Census Data for Spatial Analysis: Towards an Integrated Census Support Service John Stillwell 1, Justin Hayes 2, Rob Dymond-Green 2, James.
The micro-geography of UK demographic change Paul Norman School of Geography, University of Leeds understanding population trends and processes.
Scottish Index of Multiple Deprivation (SIMD) 2009 East Ayrshire Andrew White Office of the Chief Statistician 16 th February.
The Spatial Scale of Residential Segregation in Northern Ireland, 1991–2001 Chris Lloyd 1, Ian Shuttleworth 1 and David Martin 2 1 School of Geography,
Geospatial Data Needs - Workshop 19 January 2010 Geospatial needs – the private sector perspective Keith Dugmore Demographic Decisions & Honorary Professor,
Geographical Data Products Carol Blackwood UKBORDERS 3 rd July 2012.
“Applications of the NPD in Academic Research: Some Examples from the Centre for the Economics of Education” Joan Wilson, CEE, CEP, IoE DCSF, York, Friday.
Brent Diversity Profile Labour Market Work patterns in Brent May 2015.
Saadia GreenbergElena Fazio Office of Performance and Evaluation Administration on Aging US Department.
Supercomputing, Visualization & eScience1 e-Social Science Grid technologies for Social Science: the Seamless Access to Multiple Datasets (SAMD) project.
Population Changes in Salford Salford Strategic Partnership Executive 18 th March 2013 Jon Stonehouse Deputy Director Children’s Services John.
Scottish Index of Multiple Deprivation (SIMD) ScotPHO training course – day 4 Andrew White Office of the Chief Statistician, Scottish.
ESDS Resources Anthony Rafferty ESDS Government Centre for Census and Survey Research University of Manchester.
Scottish Index of Multiple Deprivation (SIMD) 2009 ScotStat Public Body Analyst Network Andrew White and Matt Perkins Office of.
Recent developments in the UK Using the indices and the underpinning data Tom Oxford Consultants for Social Inclusion (OCSI) David McLennan.
Health Statistics Information on STC website Calgary–DLI training–Dec 2003 Michel B. Séguin, Statistics Canada,
Manchester Computing Supercomputing, Visualization & eScience Celia Russell, Stephen Pickles and Mike Jones Combining Data Workshop ESRC Research Methods.
Grid Enabled Data Fusion for Calculating Poverty Measures Social Science Applications UK e-Science ALL HANDS MEETING 2006.
Understanding Geographies Statistics for the Voluntary Sector Alison Peacock Mission Planning Officer Wednesday 8 July 2009.
American Community Survey (ACS) 1 Oregon State Data Center Meeting Portland State University April 14,
Scottish Index of Multiple Deprivation (SIMD) 2009 Niamh Laffan Office of the Chief Statistician Scottish Government 29 th October.
The future of housing in Kensington and Chelsea 10 June 2013, London Lighthouse.
New and easier ways of working with aggregate data and geographies from UK censuses Justin Hayes UK Data Service Census Support.
ESDS International Celia Russell and Susan Noble Economic and Social Data Service University of Manchester ESDS International Conference 2007.
Housing and Health Is the glass half-empty or half-full? 8 th October 2009 Robert Cornwall.
Centre for Housing Research, University of St Andrews The Effect of Neighbourhood Housing Tenure Mix on Labour Market Outcomes: A Longitudinal Perspective.
General Register Office for S C O T L A N D information about Scotland's people Comparison between NHSCR and Community health index sources of migration.
SASPAC and the Census Making best use of the 2011 Census TWRI, York 5 October 2012 Alan Lewis SASPAC Programme Manager.
Demographic change at small area level Small area statistics to develop public policy Paul Norman School of Geography, University of Leeds ESRC RES
Combining the strengths of UMIST and The Victoria University of Manchester “Use cases” Stephen Pickles e-Frameworks meets e-Science workshop Edinburgh,
HISTORY DAY Project Categories. Types of Presentations n Research paper (individual only) n Documentary n Exhibit n Performance n Web site.
Scottish Index of Multiple Deprivation (SIMD) 2009 Understanding Deprivation and Making the Most of Local and National Datasets.
Exploiting census workplace data to build a daytime grid map of England and Wales. David Martin, Samantha Cockings, Alan Smith European Forum for Geostatistics,
Scottish Index of Multiple Deprivation (SIMD) 2009 The Prince’s Trust Scotland Andrew White Office of the Chief Statistician 6.
Scottish Index of Multiple Deprivation (SIMD) 2009 Dumfries & Galloway Andrew White Office of the Chief Statistician 05 th February.
Scottish Index of Multiple Deprivation (SIMD) 2009 Matt Perkins Office of the Chief Statistician 30 th October 2009.
Geneva, March 2012 Work Session on Gender Statistics INDICATORS OF GENDER EQUALITY IN LITHUANIA Sigutė Litvinavičienė Demographic.
Sinclair Sutherland Labour supply: Finding and using statistics.
Scottish Index of Multiple Deprivation (SIMD) 2009 Lothian NHS Board Andrew White Office of the Chief Statistician 01 st February.
Health & Social Care Information Centre SEPHIG: 12 th September, 2012.
UKBORDERS is the Geography Data Support Unit for the ESRC Census Programme. UKBORDERS focus is on Digital Boundaries and postcode directories.
How Can e-Social Science Promote the Re-Use of Data?
Pete Benton , Beyond 2011 Programme Director
Country Report of the Statistical Center of Iran for Workshop on Integrated Economic Statistics and Informal Sector for ECO Member Countries November.
Presentation transcript:

Manchester Computing Supercomputing, Visualization & e-Science OntoGrid GridPrimer Training University of Manchester 18 th to 22 nd October 2004 ConvertGrid: Cross-Referencing Data held in Different Geographies National Centre for eSocial Science (NCeSS)

Supercomputing, Visualization & e-Science2 Acknowledgements  Who’s doing the work? –Pascal Ekin –Linda Mason  Who’s paying? –ESRC Grant Reference RES  Who’s helping? –Keith Cole –Justin Hayes –Jon MacLaren –Stephen Pickles

Supercomputing, Visualization & e-Science3 What’s it all about?  MIMAS is a national data centre providing networked access to key data for the UK higher and further education and research communities.  Many researchers wish to cross-reference data from a number of MIMAS-provided datasets. We want to help them.  First, a little background...  These datasets are stored in different target geographies, e.g Wards, and 1991 Postcode Sectors  In order to cross-reference data from different datasets, some data will need to be converted from one target geography to another  There are a number of statistical methods for doing this.

Supercomputing, Visualization & e-Science Postcode Sectors Different Geographies 1991 Wards Source: Office for National Statistics

Supercomputing, Visualization & e-Science5 The “Convert” Project  The MIMAS helpdesk used to get many queries from people trying to do this themselves, identifying this as a common problem facing many researchers  These researchers would need to prepare their own conversion tables, leading to many people replicating the same, labour-intensive work  So, 225 UK-wide Geography conversion tables were developed as part of the “Updated UK Area Masterfiles” project (ESRC award H )  See also:

Supercomputing, Visualization & e-Science6 Creating the Convert Tables Postcodes were used as an intermediary to construct geography conversion tables as indicated in the diagram. Where the All Fields Postcode Directory (AFPD) indicates that all the postcodes in a source geography unit lie within one target geography unit (A), the conversion table has a single record for that source unit, with weight one. Where the source unit contains residential postcodes allocated to different target units (B), the AFPD provides a weight based on the number of residential addresses in the overlap of the source unit and target geography units. Source: “Summary of Research Results: Updated UK Area Masterfiles” ESRC award H Completed March 2001

Supercomputing, Visualization & e-Science7 What next?  These conversion tables form the basis for the Convert system, available at: where data supplied in one target geography can be converted to another.  Since its inception, the Convert service has been very popular (this year: 400 web page hits per month)  But...many researchers using this service are performing a common set of steps, namely: 1.Extract data from a number of datasets 2.Convert each set of data to the desired geography 3.Combine the converted sets into a single set of data  Makes sense to provide a service to automate this labour- intensive process. That’s what ConvertGrid will do.

Supercomputing, Visualization & e-Science8 Why use the “Grid” to do this?  Grid computing has been defined as “distributed computing across organizational boundaries” – why is this useful?  Grid technologies provide real Single Sign-On, similar to ATHENS authentication, but more general.  By Grid-enabling the datasets we encourage others to do the same – this is vital to the development of a real Grid.  Grid has good solutions for remote database access, e.g. OGSA-DAI which allows for encrypted transfer.  Also have distributed query processing (DQP) technology not currently available in commercial products

Supercomputing, Visualization & e-Science9 What the researcher is using...

Supercomputing, Visualization & e-Science10 What the researcher is using...

Supercomputing, Visualization & e-Science11 What the researcher sees... Households with 2 or more cars in Greater London and Greater Manchester, from the 1991 Census, displayed by Ward

Supercomputing, Visualization & e-Science12 So, what can we do with this?  Three “use cases” have been developed for the project, each based on a different theme.  Competition Time! –Each use case has been illustrated with a relevant still from a British film. –Name all three films correctly, and you can win a prize! –As a tie-breaker, put the year of release for each film too. –Written entries to be handed to me by the end of the drinks reception at the latest...

Supercomputing, Visualization & e-Science13 Scenario 1 – Education Theme  University administrator wishing to profile newly enrolled students in line with new ‘Widening Participation’ legislation  Target geography – 1998 ward  Datasets required: –User’s own dataset of postcode of student’s home residence. –Neighbourhood Statistics 1998 data Population estimates (1998 ward) University admissions by place of residence (1998 ward) –1991 Census Total population (1991 ward) Social class (1991 ward) Educational attainment (1991 ward) –Experian 1999 supply Total population (1999 PCS) Population in MOSAIC Group A (1999 PCS)

Supercomputing, Visualization & e-Science14 Scenario 2 – Crime Theme  Spatial correlation of recorded burglaries with house prices and other indicators of social wellbeing/deprivation.  Study target geography – 1998 LAD  Datasets required: –1991 Census Total population (1991 ward) Unemployment (1991 ward) Overcrowding (1991 ward) –Neighbourhood Statistics 1998 data Population estimates (1998 ward) Recorded household burglaries (1998 LAD) –Experian 1999 supply Total population (1999 PCS) Annual average house sale value (1999 PCS) Population in MOSAIC Group A (1999 PCS)

Supercomputing, Visualization & e-Science15 Scenario 3 – Health Theme  Health researcher wishing to look for relations between incidence of coronary heart disease and other demographic factors.  Study target geography – 1998 Primary Care Group  Datasets required: –1991 Census Total population (1991 ward) Limiting Long Term Illness (1991 ward) Unemployment (1991 ward) Ethnicity (1991 ward) –Neighbourhood Statistics 1998 data Population estimates (1998 ward) Heart disease diagnosis episodes (1998 LAD) –Experian 1999 supply Total population (1999 PCS) Population in MOSAIC Group A (1999 PCS)

Supercomputing, Visualization & e-Science16 Acknowledgements  Who’s doing the work? –Pascal Ekin –Linda Mason  Who’s paying? –ESRC Grant Reference RES  Who’s helping? –Keith Cole –Justin Hayes –Jon MacLaren –Stephen Pickles