Alternative Futures – ASAP Research Cluster Seminar, 16th November 2005: MoSeS Starts for the Promised Land, Andy Turner


Alternative Futures – ASAP Research Cluster Seminar, 16th November 2005
MoSeS Starts for the Promised Land
Andy Turner
Outline
–Introduction
–Population Modelling Progress
–Next Steps
–Feedback

Introduction

A religious story?
Lost in the Desert?
Our heading?
–The Promised Land
–SIM-UK
–GeoSIM

Modelling and Simulation for e-Social Science (MoSeS)
Mark Birkin, Martin Clarke, Phil Rees, Andy Turner, Belinda Wu (School of Geography)
Haibo Chen (Institute for Transport Studies)
Justin Keen (Institute for Health Sciences)
John Hodrien, Paul Townend, Jie Xu (School of Computing)

MoSeS is a Node of the National Centre for e-Social Science (NCeSS)
NCeSS aims to investigate, promote and support the use of eScience in social science research

eScience
Based on Grid Computing and collaboration
What is Grid Computing?
–Many definitions…
–A move towards ubiquitous computing
–A service/protocol for sharing Information Technology (IT) resources over the Internet
Computer scientists are building the next generation of computational infrastructure
–‘[The Grid] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information.’ (Tony Blair, 2002)

eScience
Grid Computing Environments and The Grid
–Enhance capabilities for IT resource sharing for research
–Provide easy and secure access to massive computational resources, software and data, promoting collaborative working of virtual organisations
e-Social Science is eScience targeted and geared for applications more specific to social science, including a major part of geography

MoSeS Aims and Objectives
Raise awareness of eScience and eResearch
Develop practical geographical e-Social Science applications demonstrating the potential of Grid Computing
Model the UK human population at individual and higher organisational levels
–households, communities, regions
–disparate and/or geographically diffuse organisations and society
–service orientated government
Develop and package a suite of modelling tools which allows specific research and policy questions to be addressed, with demonstrator applications for:
–Health
–Business
–Transport

MoSeS Initial Tasks
Develop methods to generate individual human population data for the UK from 2001 UK human population census data
Develop a Toy Model
–Build a dynamic agent-based microsimulation modelling toolkit and apply it to simulate change in the UK
Develop applications for
–Health
–Business
–Transport

MoSeS Challenges
Grid enabling the data and tools
Visualisation
–Google Earth
–Computer Games
Collaboration
Retaining a problem focus
Design and Development

MoSeS Current Parallel Developments
Belinda Wu is working on the applications, beginning with a Toy Model for Leeds
Paul Townend is working on Grid Enabling
Andy Turner is focussing on the population modelling
The MoSeS team is meeting regularly and plans a launch some time next year, when we hope to have something impressive to show off to NCeSS colleagues and invited guests from the eScience community, government and business

MoSeS Human Population Model
Current focus is on the contemporary situation, looking forwards over the next 25 years
Data are wanted primarily for individuals grouped into households
Need to develop a method to synthesise and enrich data, since available census and social survey data are not sufficient in coverage and detail
A method was outlined in the proposal
–This is being implemented and results are being tested

Population Modelling Method
Select a fitting set of individual records from the 2001 UK Population Census 3% Individual Sample of Anonymised Records (ISAR) to represent the individuals in regions described by the 2001 UK Population Census Area Statistics (CAS)
Initial focus is on regions called Output Areas
–Smallest Census Output Areas
–Typically about 300 people, 100 households
Begin with Leeds and scale up to the UK

Combination
Given the population (p) of an Output Area, we want to select a sub-sample of this size from the n = [value missing] records in the ISAR
The general formula for the number of permutations of size p taken from n objects is:
\[ {}^{n}P_{p} = \frac{n!}{(n-p)!} \]
Approximately \( n^{p} \)
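To give a feel for the scale of this number, the following is a minimal sketch that evaluates the permutation count on a logarithmic scale. The value of n used here is a hypothetical figure chosen only for illustration (the slide's own value is not preserved), and the function name is an assumption of this sketch.

```python
import math

def log10_permutations(n: int, p: int) -> float:
    """Return log10 of n! / (n - p)!, the number of ordered selections of p from n."""
    # lgamma(k + 1) equals ln(k!), which avoids building astronomically large integers.
    return (math.lgamma(n + 1) - math.lgamma(n - p + 1)) / math.log(10)

# Illustrative figures only: n is an assumed sample size, not the slide's value.
n = 1_500_000   # hypothetical number of ISAR records
p = 300         # typical Output Area population quoted in the talk

exact = log10_permutations(n, p)
approx = p * math.log10(n)   # log10 of the n**p approximation

print(f"log10(nPp) = {exact:.2f}, log10(n^p) = {approx:.2f}")
```

Because p is tiny compared with n, the two values agree closely, which is why n^p serves as a reasonable approximation.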

Computation
Is the number of potential solutions too great to find the best fitting solution by a brute force search?
–Probably, yes, even using all the computational power of The Grid
–Interestingly, the number of potential solutions is even greater for regions larger than Output Areas (although there are fewer of them)
Fortunately we are only interested in specific types of solution and can constrain our search
For some criteria hard constraints are appropriate; for other variables optimisation within these constraints is the key
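One way to picture a hard constraint is sketched below: candidate records are grouped by age band and exactly the required number is drawn from each band, so every sampled solution already satisfies the age profile. The record layout, the age bands and the function name are assumptions made for this illustration, not the MoSeS data structures.

```python
import random
from collections import Counter, defaultdict

def sample_with_age_constraint(records, age_targets, rng):
    """Draw records so that each age band is matched exactly (a hard constraint)."""
    by_band = defaultdict(list)
    for rec in records:
        by_band[rec["age_band"]].append(rec)
    chosen = []
    for band, count in age_targets.items():
        # Sample without replacement; raises ValueError if the band is too small.
        chosen.extend(rng.sample(by_band[band], count))
    return chosen

# Hypothetical sample records and target counts for one small area.
bands = ["0-15", "16-64", "65+"]
records = [{"id": i, "age_band": bands[i % 3]} for i in range(300)]
age_targets = {"0-15": 3, "16-64": 5, "65+": 2}

area = sample_with_age_constraint(records, age_targets, random.Random(42))
print(Counter(rec["age_band"] for rec in area))
```

Any optimisation of other variables can then search only among selections produced this way.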

Constraints
What can we constrain to?
–There are limits
  The more detailed the constraint criteria, the less likely they can be met
–The ISAR is only a 3% sample
–Specific CAS tabulations
  The aggregations of variables are bespoke
  Beware of errors, especially those systematically introduced by disclosure control measures
–Census data are estimates and contain an unknown level of error
What is most important to get right?
–Age/Gender profile
–Number of Household Reference People
–Household Composition
–Social Class
–Health status etc…

Getting to Grips with ISAR and CAS data
2001 UK Census data is unusual (like most census data)
–Details are lost by aggregation and accuracy is deliberately worsened via the application of disclosure control measures
–This is done for confidentiality reasons and as users we are forced to appreciate this
–On the one hand this generates jobs; on the other hand, it renders census data almost useless for supporting certain applications
Details on UK Census data including ISAR and CAS are available via
–[URL not preserved]
Usefully, 2001 CAS tables that do not currently exist can be commissioned
There is an application procedure for gaining access to Controlled Access Microdata Sample (CAMS) records from the 2001 Census
–The data is supposedly better
–It will be hard for us to use due to the way it is controlled

CAS
Key Statistics Tables
–31 tabulations
–E.g. KS001 Usually Resident Population: 6 cells
Standard offerings
–53 cross tabulations
–E.g. CS001 Age/Sex/Resident Type: 250 cells
Themed Tables
–6 cross tabulations
–E.g. CT001 Theme Table On All Dependent Children: 348 cells
Univariate Tables
–43 tabulations
–E.g. UV003 Sex: 3 cells

Constraint and Optimisation using Key Statistics
As a first step we have constrained by age and ensured that we have the correct number of household reference people
–Makes it easier to construct households for the Toy Model
Our fitness function is a simple Sum of Squared Errors (SSE) for a number of aggregate variables
–Measure of the difference between aggregate counts from the ISAR records and the published and aggregated CAS Key Statistics
Initial focus on health and household composition
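The Sum of Squared Errors fitness described above can be sketched as follows. This is a minimal illustration rather than the MoSeS implementation: it assumes each selected record is a dictionary of boolean flags keyed by variable name, and that the published counts are given in a matching dictionary.

```python
def sum_of_squared_errors(selected, targets):
    """SSE between counts aggregated from selected records and published counts."""
    sse = 0
    for variable, published in targets.items():
        # Aggregate count: number of selected records flagged with this variable.
        estimated = sum(1 for rec in selected if rec.get(variable, False))
        sse += (estimated - published) ** 2
    return sse

# Hypothetical selection of three records and made-up published counts.
selected = [
    {"peopleWhoseGeneralHealthWasGood": True, "peopleWithLimitingLongTermIllness": False},
    {"peopleWhoseGeneralHealthWasGood": True, "peopleWithLimitingLongTermIllness": False},
    {"peopleWhoseGeneralHealthWasGood": False, "peopleWithLimitingLongTermIllness": True},
]
targets = {
    "peopleWhoseGeneralHealthWasGood": 2,
    "peopleWithLimitingLongTermIllness": 3,
}

print(sum_of_squared_errors(selected, targets))
```

A lower value means the selected records reproduce the published aggregate counts more closely; zero is a perfect fit on the chosen variables.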

Optimisation Variables
Health variables
–peopleWhoseGeneralHealthWasGood
–peopleWhoseGeneralHealthWasFairlyGood
–peopleWhoseGeneralHealthWasNotGood
–peopleWithLimitingLongTermIllness
–peopleWithoutLimitingLongTermIllness (Derived)
Household Composition variables
–oneFamilyAndNoChildren (Derived)
–marriedOrCohabitingCoupleWithChildren (Derived)
–loneParentHouseholdsWithChildren (Derived)
(Derived) means calculated from other variables

Optimisation and Goodness of Fit
Initially, for each Output Area in Leeds we generated possibly different solutions and picked the best one
Now we are using a genetic algorithm to assist in finding a better solution
–More strategic
–Constraints form genes
–Effectively each genetic bit string is an ordered boolean array for the ISAR AGE0 and HRP order
Currently the genetic algorithm works by breeding, mutation and survival of the fittest
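The general idea of breeding, mutation and survival of the fittest over boolean selection arrays can be sketched as below. This is a generic genetic algorithm under stated assumptions, not the encoding used in MoSeS: it ignores the AGE0 and HRP ordering, treats the area population size as just another target count, and all names, rates and data are hypothetical.

```python
import random

def fitness(bits, records, targets):
    """Sum of squared errors between selected-record counts and targets (lower is better)."""
    chosen = [rec for bit, rec in zip(bits, records) if bit]
    return sum(
        (sum(rec[var] for rec in chosen) - target) ** 2
        for var, target in targets.items()
    )

def evolve(records, targets, rng, pop_size=30, generations=60, mutation_rate=0.02):
    """Evolve boolean arrays marking which records are selected."""
    n = len(records)
    population = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Survival of the fittest: the better half survives and becomes the parents.
        population.sort(key=lambda bits: fitness(bits, records, targets))
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # breeding by single-point crossover
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [(not b) if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = parents + children
    return min(population, key=lambda bits: fitness(bits, records, targets))

# Hypothetical candidate records; "person" is 1 for everyone so it counts area size.
rng = random.Random(0)
records = [{"person": 1, "good_health": int(rng.random() < 0.7)} for _ in range(40)]
targets = {"person": 10, "good_health": 7}

best = evolve(records, targets, rng)
print(fitness(best, records, targets), sum(best))
```

Because the parents are carried into the next generation unchanged, the best fitness in the population never gets worse from one generation to the next.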

Next Steps 1
Constraints
–Additional constraint by gender
  Should improve household formation
  Need to use Standard CAS cross tabulations
  –Problems due to confidentiality
    »Perhaps need to consider larger regions than Output Areas
–Begin investigating what other constraints are possible
Leeds to UK
–Identify problem Output Areas
Optimisation
–Use more optimisation variables
–Experiment with the genetic algorithm

Next Steps 2
Testing
–Examine results
Mapping
–Optimised variables
–Exogenous variables
Grid Enabling
–Data
–Provenance
Toy Model
Publication

MoSeS Recap
We are developing a dynamic geographic microsimulation of the UK
–A model comprising individual people who occupy the UK environment and move about it through time, interacting in numerous ways
–Each individual will have family, household and social networks and reasonably complex characteristics and behaviour
–The idea is to build a platform for simulating change in the UK for ASAP

Thank you! Any feedback or questions? Please

Acknowledgements
Thanks to all involved in the production of the maps that I grabbed off the internet for the start of this presentation