GEODE - Durban ISA RC33, July 2006: Utilising a Grid Enabled Occupational Data Environment (GEODE – www.geode.stir.ac.uk)


Utilising a Grid Enabled Occupational Data Environment (GEODE)
Paper presented to the XVIth ISA World Congress, Durban, July 2006, RC33 session 07, ‘New Technologies and Data Collection in the Social Sciences’
Paul Lambert, Larry Tan, Ken Turner, & Vernon Gayle, University of Stirling
Ken Prandy, Cardiff University
Richard Sinnott, University of Glasgow

‘The Grid’ and New Technologies of Data Collection
‘The Grid’ and ‘eScience’:
1. Online coordination of electronic resources and collaborations (distributed computing)
   - Large scale
   - Collaborative
   - Heterogeneous
2. Standard protocols / information management systems
UK eSocial Science:
1) Investment in assessing / implementing technology
2) Computationally demanding data analysis
3) Qualitative and quantitative data collection technologies
4) **Data sharing, processing and access**

GEODE: Survey records’ occupational data
The importance of occupational micro-data(!)
Collecting occupational data:
1) Initial occupational records (textual description)
2) Processing occupational records:
   Text descriptions →(1) Standardised Occupational Unit Groups (OUGs) →(2) Substantive occupational summary (e.g., social class code)
Good practice: preservation of the original, OUG and substantive variables
NSIs favour transparent occupational data coding (1) and translation systems (2)
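The two processing steps above can be sketched as a pair of look-ups. This is only an illustration under stated assumptions: the tables `TEXT_TO_OUG` and `OUG_TO_CLASS`, the OUG codes and the class labels are all made up, and real text coding (e.g. with CASCOT) is far more involved than an exact-match dictionary. The point it shows is the "good practice" item: the original text, the OUG and the substantive variable are all kept on the record.

```python
# Hypothetical look-up tables: codes and labels are invented for illustration.
TEXT_TO_OUG = {
    "primary school teacher": "2315",
    "bus driver": "8213",
}
OUG_TO_CLASS = {
    "2315": "II",
    "8213": "VIIa",
}


def code_occupation(record):
    """Return a copy of the record with OUG and class added; the text is preserved."""
    text = record["occ_text"]
    # Step (1): text description -> Occupational Unit Group
    oug = TEXT_TO_OUG.get(text.strip().lower())
    # Step (2): OUG -> substantive summary (here, a social class code)
    social_class = OUG_TO_CLASS.get(oug) if oug is not None else None
    return {**record, "oug": oug, "social_class": social_class}


survey = [
    {"id": 1, "occ_text": "Primary school teacher"},
    {"id": 2, "occ_text": "Bus driver"},
    {"id": 3, "occ_text": "Astronaut"},  # not in the look-up: stays uncoded
]
coded = [code_occupation(r) for r in survey]
```

Records that cannot be coded keep `None` in both derived fields rather than being dropped, so the original description remains available for later recoding.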

Occupational data collection and processing
(1) Text records → OUG data
    Currently:
    - Text coding software (e.g. CASCOT)
    - Manual look-up
    GEODE:
    - Linkage to existing resources
    - Further facilities possible but not planned (users typically have adequate resources)
(2) OUG data → summary indicators
    Currently:
    - Numerous aggregate occupational information resources
    - Bespoke data programming requirements
    GEODE:
    - Core provision: management and access of these data resources
    - Service to large volumes of users

Some illustrative occupational information resources
Resource                                | Index units      | # distinct files (average size kb) | Updates?
CAMSIS                                  | Local OUG*(e.s.) | 200 (100)                          | y
CAMSIS value labels                     | Local OUG        | 50 (50)                            | n
ISEI tools, home.fsw.vu.nl/~ganzeboom   | Int. OUG         | 20 (50)                            | y
E-Sec matrices                          | Int. OUG*(e.s.)  | 20 (200)                           | n
Hakim gender seg codes (Hakim 1998)     | Local OUG        | 2 (paper)                          | n

What’s the problem?
Indexed mainly by Occupational Unit Group (OUG). But…
- Numerous alternative occupational data files (time; country; format)
- Alternative OUG schemes; other index factors (‘employment status’)
- Inconsistent translations to social classifications: ‘by file or by fiat’
- Dynamic updates to occupational data resources
- Low uptake of existing occupational information resources
- Strict security constraints on users’ micro-social survey data
[Diagram: External user (micro-social data: id, oug, sex, …) + Occ info (index file, aggregate: oug, CS-M, CS-F, EGP) → User’s output (micro-social data: id, oug, CS, …); example EGP values I, II, VIIa]
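A minimal sketch of the linkage pictured on this slide: a user's micro-data file is matched to an aggregate index file by OUG, and the sex-specific score is picked up for each person. The column names follow the slide (id, oug, sex; oug, CS-M, CS-F, EGP), but the lower-case spellings, the codes and the scores below are invented for illustration, and this is not GEODE's own implementation.

```python
import csv
import io

# Hypothetical user micro-data (one row per respondent).
user_csv = """id,oug,sex
1,2315,F
2,8213,M
3,9999,M
"""

# Hypothetical aggregate index file (one row per OUG).
index_csv = """oug,cs_m,cs_f,egp
2315,60.1,58.4,II
8213,35.2,33.9,VIIa
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def link(users, index_rows):
    """Attach the sex-specific score (cs) and EGP class to each user row."""
    by_oug = {row["oug"]: row for row in index_rows}
    linked = []
    for user in users:
        hit = by_oug.get(user["oug"])
        if hit is None:
            # OUG not covered by this index file: keep the row, leave outputs empty
            linked.append({**user, "cs": None, "egp": None})
        else:
            score = hit["cs_m"] if user["sex"] == "M" else hit["cs_f"]
            linked.append({**user, "cs": float(score), "egp": hit["egp"]})
    return linked


linked = link(read_rows(user_csv), read_rows(index_csv))
```

Even this toy case shows two of the listed problems: the match fails silently for an OUG outside the index scheme, and a second index factor (here sex) changes which value is returned.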

GEODE: Grid Enabled Occupational Data Environment
Strategy:
1) Occupational data index service (depository)
   i. Semantic data curation (DDI)
   ii. Data storage (OGSA-DAI)
   iii. Data indexing / access (OGSA-DAI)
2) User-friendly ‘portal’ access
   - Entry to an international virtual organisation for data depositors and users (GridSphere, GT4, OGSA-DAI)
   - Facilitate linking occupational information to users’ datasets (OGSA-DAI)
(Initial focus on CAMSIS resources)

GEODE architecture [diagram slide; figure not captured in the transcript]

Occupational information depository
1.1) Semantic curation of occupational information
- Establish a ‘GEODE-M’ metadata subset (.xml)
  - Founded on Michigan Data Documentation Initiative
  - Minimise curation requirements
  - Web proforma entry [via Portal using GridSphere]
Metadata items: Release date; Country; Time period; Author; Format; Missing data; Data extensions; OUG variable; Other identifier variables; Output variables
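To make the metadata items concrete, here is a sketch of how one record for an occupational information file might be serialised to XML. The element names (`geode_m`, `release_date`, and so on) and all values are hypothetical: the slide gives only the item labels, and these are not actual DDI or GEODE-M element names.

```python
import xml.etree.ElementTree as ET


def build_record(fields):
    """Build a flat XML record with one child element per metadata item."""
    root = ET.Element("geode_m")  # hypothetical root element name
    for name, value in fields.items():
        ET.SubElement(root, name).text = str(value)
    return root


# Invented example values for the ten items listed on the slide.
record = build_record({
    "release_date": "2006-07-01",
    "country": "GB",
    "time_period": "1991",
    "author": "Example depositor",
    "format": "plain text",
    "missing_data": "-9",
    "data_extensions": "none",
    "oug_variable": "oug",
    "other_identifier_variables": "empstat",
    "output_variables": "cs_m cs_f",
})
xml_text = ET.tostring(record, encoding="unicode")
```

A flat record like this is about the minimum a web proforma could collect, which fits the stated aim of minimising curation requirements.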

Occupational information depository
1.2) Storing occupational information resources
- GEODE-M documentation (2 stages)
- Storage: OGSA-DAI framework to link index files (dynamic)
Considerations:
- All data stored at GEODE vs linkage to external data
- Proprietary software (plain text / SPSS / STATA)
- Rectangular index file? Plurality of supply
- Universality or specificity?

Occupational information depository
1.3) Virtual Organisation for Occupational Information Depository
- MDS (via GT4) to manage VO access to and distribution of occupational information resources
  - International virtual community
  - Dynamic data supply
- OGSA-DAI: efficient data indexing / searching / connecting
Grid: create a community where members have abstract access to heterogeneous resources securely, and achieve wider collaboration

2) Access to Occupational Data
2.1) File linkage mechanisms
Micro-social data (A) ↔ Occupational information resources (B)
Users’ complex survey data (e.g. multiple occupational records):
- Multiple occupational variables on (A)
- Strict security constraints on (A)
- Inconsistent OUG formats on (A)
Approach:
- Prototype linkages (e.g. CAMSIS) require full access to (A)
- Cater to limited access to (A): investigate digital certification (X.509) to allow restricted data transfer, A_[OUGs] + A_[context]
- Requirements analysis: minimal user certification process; avoid application installation by users
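The restricted transfer idea, A_[OUGs] + A_[context], can be illustrated as follows: only the OUG and context columns leave the user's machine, and the returned scores are merged back locally. Everything here is an assumption for illustration: the column names, the `lookup_service` stand-in for a remote call, and the scores. Certification (X.509) is outside the sketch.

```python
# Hypothetical micro-data (A); "income" stands for any sensitive variable.
micro_data = [
    {"id": 1, "oug": "2315", "empstat": "employee", "income": 31000},
    {"id": 2, "oug": "8213", "empstat": "self-employed", "income": 24000},
]

# Only these columns are transferred: an identifier, the OUG, and context.
TRANSFER_COLUMNS = ("id", "oug", "empstat")


def restricted_extract(rows, columns=TRANSFER_COLUMNS):
    """Project each row onto the columns that may be transferred."""
    return [{c: row[c] for c in columns} for row in rows]


def lookup_service(extract, index):
    """Stand-in for the remote service: score per id from (OUG, employment status)."""
    return {row["id"]: index.get((row["oug"], row["empstat"])) for row in extract}


def merge_back(rows, scores):
    """Attach returned scores to the full local rows, matched on id."""
    return [{**row, "cs": scores.get(row["id"])} for row in rows]


# Invented index keyed by OUG and employment status.
index = {("2315", "employee"): 59.0, ("8213", "self-employed"): 36.5}

extract = restricted_extract(micro_data)
scores = lookup_service(extract, index)
result = merge_back(micro_data, scores)
```

The sensitive columns never appear in `extract`, so the service sees only what it needs to perform the match.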

GEODE portal access
2.2) Analytical queries
Process analytical tasks on aggregate occupational information resources
- Summary data
  - Coverage searches
  - Summary statistics
? Consider more complex analyses?
  - CAMSIS derivations
  - Involve interactive data management tasks
  - [cf. Nesstar / Data Web]
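As a rough illustration of the two summary-data queries, the sketch below computes a coverage figure (what share of a user's distinct OUG codes an index file can translate) and basic summary statistics for the index scores. The function names, codes and scores are invented; the slide does not specify how the portal computes either query.

```python
from statistics import mean


def coverage(user_ougs, index):
    """Share of the user's distinct OUG codes that appear in the index file."""
    distinct = set(user_ougs)
    if not distinct:
        return 0.0
    return len(distinct & set(index)) / len(distinct)


def summarise(index):
    """Basic summary statistics for the scores held in an index file."""
    scores = list(index.values())
    return {
        "n": len(scores),
        "min": min(scores),
        "max": max(scores),
        "mean": mean(scores),
    }


# Hypothetical index file (OUG -> score) and user OUG column.
index = {"2315": 60.0, "8213": 35.0, "4150": 46.0}
user_ougs = ["2315", "8213", "9999", "2315"]

cov = coverage(user_ougs, index)
stats = summarise(index)
```

Both queries run on the aggregate resource plus a list of codes, so neither needs access to the rest of the user's micro-data.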

Summary: GEODE services
Data collection service
- Hinges upon curation of occupational information
- User-friendly depository for occupational information resources
Data processing service
- User-friendly file matching facilities
- Use of Grid to address file security concerns
- Improved standards in occupational information utilisation
- Generalisability to other information services, e.g., geographical; educational
eSocial Science
- Piloting of OGSA-DAI (with messy application)
- Promotion of eScience facilities
- Promising role with data construction process