E-Social Science: scaling up social scientific investigations Alex Voss, Andy Turner, Rob Procter National Centre for e-Social Science Gabor Terstyanszky,

Slides:



Advertisements
Similar presentations
Complexity Settlement Simulation using CA model and GIS (proposal) Kampanart Piyathamrongchai University College London Centre for Advanced Spatial Analysis.
Advertisements

NCeSS e-Stat quantitative node Prof. William Browne & Prof. Jon Rasbash University of Bristol.
The methodology used for the 2001 SARs Special Uniques Analysis Mark Elliot Anna Manning Confidentiality And Privacy Group ( University.
C Introduction to the Geostat project Session on User needs (Geostat workshop in Bled 1-3 october 2008) Lars H. Backer
SICSA student induction day, 2009Slide 1 Multithreading RePast Models Alex Voss 1, Jing-Ya You 2, Eric Yen 2, Simon Lin 2, Ji-Ping Lin 3, Andy Turner 4.
E-Social Science: scaling up social scientific investigations Alex Voss, Andy Turner (ESRC National Centre for e-Social Science) Gabor Terstyanszky, Gabor.
Future Research Directions in Agent Based Modelling Workshop, Leeds, UK, Large Scale Social Simulation in Java and the NeISS Project Andy Turner.
The e-Social Science Research Agenda Peter Halfpenny and Rob Procter School of Social Sciences - University of Manchester UK e-Science All Hands Meeting.
GENESIS Web 2.0 Agent City Simulation: Establishing a user community and enabling collaborators to manipulate simulations and develop models Andy Turner.
Modelling and Simulation for e-Social Science Mark Birkin School of Geography University of Leeds.
MoSeS meets NEC 10 th March 2008 MoSeSMoSeS Andy Turner
Alternative Futures – ASAP Research Cluster Seminar 16 th November 2005 MoSeS Starts for the Promised Land Andy Turner Outline –Introduction –Population.
4 th International Conference on e-Social Science: Workshop 5: Agent-Based Modelling for the Spatial-Social Sciences Reconstruction of the.
International Symposium on Grid Computing 2010 Applications on Humanities & Social Sciences I Taipei, Taiwan ( ) GENESIS Social Simulation Modelling.
B1 -Biogeochemical ANL - Townhall V. Rao Kotamarthi.
Oxford eResearch Conference 2008 Paper Session 4A: NCeSS Oxford, UK, ( ) Experience of e-Social Science: A Case of Andy Turner and MoSeS Andy.
GEOG 1230 Lecture 2 Types and Sources of Geographical Data.
The NGS Roadshow Bath Geodemographic Modelling on the NGS Andy Turner
A Data Curation Application Using DDI: The DAMES Data Curation Tool for Organising Specialist Social Science Data Resources Simon Jones*, Guy Warner*,
Individual and Household Level Estimates Based on 2001 UK Human Population Census Data Andy Turner CSAP Seminar on Microsimulation: Problems and Solutions.
CCG 1 MoSeS Introduction and Progress Report Andy Turner
School of Geography CENTRE FOR SPATIAL ANALYSIS AND POLICY e-Infrastructure for Large-Scale Social Simulation Mark Birkin Andy Turner.
Modelling and Simulation for e-Social Science (MoSeS) Mark Birkin, Martin Clarke, Phil Rees, Andy Turner, Belinda Wu (School of Geography) Haibo Chen (Institute.
An Introduction to Software Visualization Dr. Jonathan I. Maletic Software DevelopMent Laboratory Department of Computer Science Kent State University.
An Introduction to Social Simulation Andy Turner Presentation as part of Social Simulation Tutorial at the.
4/27/2006Education Technology Presentation Visual Grid Tutorial PI: Dr. Bina Ramamurthy Computer Science and Engineering Dept. Graduate Student:
Andy Turner On MoSeS 28 th March 2007 Andy Turner On MoSeS Andy Turner
MOSES: Modelling and Simulation for e-Social Science Mark Birkin, Martin Clarke, Phil Rees School of Geography, University of Leeds Haibo Chen, Institute.
Census.ac.uk Census Area Statistics and Casweb David Rawnsley Census Dissemination Unit (CDU) Mimas University of Manchester.
Constructing Individual Level Population Data for Social Simulation Models Andy Turner Presentation as part.
Centre for Earth Systems Engineering Research Infrastructure Transitions Research Consortium (ITRC) David Alderson & Stuart Barr What is the aim of ITRC?
Supercomputing, Visualization & eScience1 e-Social Science Grid technologies for Social Science: the Seamless Access to Multiple Datasets (SAMD) project.
SICSA student induction day, 2009Slide 1 Social Simulation Tutorial Session 6: Introduction to grids and cloud computing International Symposium on Grid.
FDES Meeting NYC 8-10 November 2010 The interface between core environmental statistics and other information systems: which interaction is important?
Make the World a Better Place through Reproducible Research Roger D. Peng Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Wall.
Exploring Metropolitan Dynamics with an Agent- Based Model Calibrated using Social Network Data Nick Malleson & Mark Birkin School of Geography, University.
Data Mining Process A manifestation of best practices A systematic way to conduct DM projects Different groups has different versions Most common standard.
Health Datasets in Spatial Analyses: The General Overview Lukáš MAREK Department of Geoinformatics, Faculty.
Geodemographic modelling collaboration Alex Voss, Andy Turner Presentation to Academia Sinica Centre for Survey Research
School of something FACULTY OF OTHER School of Geography FACULTY OF EARTH AND ENVIRONMENT MOSES: A Synthetic Spatial Model of UK Cities and Regions Mark.
School of Geography FACULTY OF ENVIRONMENT The Elements of a Computational Infrastructure for Social Simulation Mark Birkin 1, Rob Allan 2, Sean Beckhofer.
1 st December 2003 JIM for CDF 1 JIM and SAMGrid for CDF Mòrag Burgon-Lyon University of Glasgow.
© 2005 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice The China Digital Museum Project.
Innovations in Data Dissemination Thomas L. Mesenbourg, Jr. Acting Director U.S. Census Bureau United Nations Seminar on Innovations in Official Statistics.
New and easier ways of working with aggregate data and geographies from UK censuses Justin Hayes UK Data Service Census Support.
Introduction to Spatial Microsimulation Dr Kirk Harland.
SEEK Welcome Malcolm Atkinson Director 12 th May 2004.
P-GRADE and GEMLCA.
Infrastructures for Social Simulation Rob Procter National e-Infrastructure for Social Simulation ISGC 2010 Social Simulation Tutorial.
SICSA student induction day, 2009Slide 1 Social Simulation Tutorial International Symposium on Grid Computing Taipei, Taiwan, 7 th March 2010.
Metadata Mòrag Burgon-Lyon University of Glasgow.
Sharing experience Attribution (by) Licensees may copy, distribute, display and perform the work and make derivative works based on it only if they give.
International Symposium on Grid Computing (ISGC-07), Taipei - March 26-29, 2007 Of 16 1 A Novel Grid Resource Broker Cum Meta Scheduler - Asvija B System.
Exploring Microsimulation Methodologies for the Estimation of Household Attributes Dimitris Ballas, Graham Clarke, and Ian Turton School of Geography University.
Overview What is geography? What is geographic information?
Toward a common data and command representation for quantum chemistry Malcolm Atkinson Director 5 th April 2004.
The National Grid Service Mike Mineter.
OGC/OGF usage in UK e-Social Science OGF 21, Seattle, USA Paul Townend School of Computing, University of Leeds.
INFSO-RI SA2 ETICS2 first Review Valerio Venturi INFN Bruxelles, 3 April 2009 Infrastructure Support.
The National Grid Service User Accounting System Katie Weeks Science and Technology Facilities Council.
1 Porting applications to the NGS, using the P-GRADE portal and GEMLCA Peter Kacsuk MTA SZTAKI Hungarian Academy of Sciences Centre for.
Open Science Grid OSG Resource and Service Validation and WLCG SAM Interoperability Rob Quick With Content from Arvind Gopu, James Casey, Ian Neilson,
HPC University Requirements Analysis Team Training Analysis Summary Meeting at PSC September Mary Ann Leung, Ph.D.
SCI-BUS project Pre-kick-off meeting University of Westminster Centre for Parallel Computing Tamas Kiss, Stephen Winter, Gabor.
Bob Jones EGEE Technical Director
Spark Presentation.
P-GRADE and GEMLCA.
Developing a daily time step individual level demographic simulation model Andy Turner Presentation slides prepared for a CSAP research cluster meeting,
L. Glimcher, R. Jin, G. Agrawal Presented by: Leo Glimcher
Presentation transcript:

e-Social Science: scaling up social scientific investigations Alex Voss, Andy Turner, Rob Procter National Centre for e-Social Science Gabor Terstyanszky, Gabor Szmetanko, Tamas Kiss Westminster Presentation at ISGC 2009, Taipei, Taiwan, ISGC 2009

Overview Background Introduction MoSeS GENESIS Demographic Modelling –Population Reconstruction (Initialisation) –Dynamic Simulation Experience Organisation Scaling Issues Future Work Acknowledgements

Background Much social science does not use advanced ICT but emergence of new analytical methods is driven by: –Increased availability of data about social phenomena –Issues with data management and integration –Challenges to analyse social phenomena at scale –Challenges to inform practical policy and decision making (e.g., evidence-based policy making) National Centre for e-Social Science (NCeSS) in the UK is investigating ways to respond to these challenges. EUAsiaGrid is supporting e-Social Science amongst other application domains

Introduction Virtual worlds rich in detail are being developed –Digital representations (of parts) of Earth are being developed –Necessarily generalised models –Interact with the ‘real’ world –Socio-economic There can always be more detail –Higher spatial and temporal resolution –More and more detailed attributes –Geography and social science is no different to any other type of science in this respect We are all geographers to some extent and we all interact in some way with the object of study

MoSeS Modeling and simulation approaches for social science First phase research node of NCeSS Core contemporary demographic model of the UK based on UK census data and other datasets Using agent-based simulation to project population forward in time by 25 years Explicitly model births, deaths, migration, changes in health status etc… Applications in transportation research, health and social care planning and business applications

GENESIS Uses and builds on MoSeS A team involving experts in geovisualisation from UCL Two development strands –Theoretic models based on restriction free data –Models seeded with more restricted access data More theoretical –Computational limits –Investigating what visualisations are useful –Considering how to do validation models Less emphasis on developing specific applications –Applications being considered in transportation planning –Respond ad hoc to what is in the public interest Daily activity models

Demographic Modelling I Generation of an individual level population data for the UK –Based on 2001 census data –Works with ‘public release’ versions of census that are restricted, Census Aggregate Statistics (CAS) at Output Area Level 1% of population (anonymisation) –Reconstructed data has same attributes as real population and same number of individuals but is still anonymised –Uses a genetic algorithm to select a well fitting set of sample of anonymised records to assign to an output area –Need for attributes in the SAR to be matched with those in the CAS This is often complicated because of different categories –Aggregation to a lowest common categorisation

Demographic Modelling II Dynamic modelling Daily activity modelling –Commuting –Retail modelling –Transportation Population Forecasting –Annual time step –Birth –Death –Migration

Experiences Integrating existing code into grid environment required some changes to source code –management of input arguments –code scalability –log management –error handling Finding the right input size and parameters for testing to keep execution times low Making sense of execution failures –lack of ways to debug code in distributed environments

Experiences II Step-wise process works well, –ensures we encounter problems piece by piece –allows us to comply with data protection / licensing Population reconstruction is resource intensive –may run up against limits on wall clock time Importance of ‘at elbow’ support –but hindered by data protection/licensing issues Licensing means we need to limit execution to UK resources Setting up VO to support secure sharing of data

Organisation

Simulations –Need to find ways to map to different architectures, both HPC and HTC –Need to deal with large memory requirements and limitations imposed by OS, JVM and Java libraries –Exploring Terracotta Distributing computation Virtual heap space Dependability Advice would be very welcome… Scaling Issues I

Population model size and sophistication –From town size to country size (and beyond) –Number of variables –Number of constraints Number of cores used –To reduce runtime –Need to go beyond using only one site Community –Open development – needs tool support –Number of users – requires hardening of code & documentation Scaling Issues II

Future Work Next steps until code runs in Taiwan with Taiwanese data –Proof of concept execution on Quanta cluster at ASGC –Definition of data outputs from Develop submission to exploit multiple NGS nodes and EGEE Compute Elements Improving data and code staging Moving from population reconstruction to supporting the simulation process Integration into ‘science gateway’ for the social sciences and developing a repository for models

Acknowledgements National Centre for e-Social Science –MoSeS Node: Mark Birkin (PI) –GENeSIS Node: Mike Batty (PI) –NCeSS Hub: Peter Halfpenny and Rob Procter EUAsiaGrid Consortium –Marco Paganoni (Project Director) CPC at Westminster University –Gabor Szmetanko –Gabor Terstyanszky –Tamas Kiss GridPP –Jens Jensen and Jeremy Coles National Grid Service –Jason Lander and Shiv Kaushal (Leeds), Steven Young (Oxford), Mike Jones (Manchester)