Accessing Biodiversity Resources in Computational Environments from Workflow Application J. S. Pahwa, R. J. White, A. C. Jones, M. Burgess, W. A. Gray,

Presentation transcript:

Accessing Biodiversity Resources in Computational Environments from Workflow Application
J. S. Pahwa, R. J. White, A. C. Jones, M. Burgess, W. A. Gray, N. J. Fiddian, T. Sutton, P. Brewer, C. Yesson, N. Caithness, A. Culham, F. A. Bisby, M. Scoble, P. Williams and S. Bhagwat
WORKS 2006, Paris

Overview
- The Biodiversity World (BDW) Project
- The three exemplars chosen for BDW
- BDW Architectural Components
  a) Resource Wrappers
  b) BiodiversityWorld-GRID Interface (BGI) Communications Layer
  c) BDW Datatypes
  d) The Metadata Repository (MTR)
- Using BDW for bioclimatic modelling
- Access to computational resources in BDW environment
- Further Work & Conclusions

The BDW System
- A framework for biodiversity problem-solving that provides access to widely dispersed, disparate data sources and analytical tools
- Intended particularly for analysis and modelling of biodiversity patterns
- Provides access to resources originally designed for use in isolation
- Resources may be composed into complex workflows

BDW Exemplars
A. Biodiversity richness analysis and conservation evaluation
B. Bioclimatic modelling and global climate change
C. Phylogenetic analysis and biogeography

Biodiversity Richness Analysis and Conservation Evaluation
Aim: analysis of biodiversity richness patterns for a particular taxon (e.g. group of species) around the world
The BDW System enables:
- Taxonomic verification using the Species 2000 Catalogue of Life service
- Composition of distribution datasets for the chosen taxon from various sources around the world
- Use of the WorldMap System to visualise the distribution datasets, and help identify priority areas for biodiversity conservation

Bioclimatic Modelling and Global Climate Change
Aim: understand the impact of global climate change on the distribution and diversity of plant & animal species
- Identify climatic & ecological conditions under which a single species lives, extrapolating from known occurrences
- Hence calculate a potentially wider set of areas where the species might occur, or predict future distribution under anticipated climatic conditions
- A bioclimatic modelling workflow example follows later

Phylogenetic Analysis and Biogeography
Aim: discover ancestral relationships between groups of organisms using methods of phylogenetic analysis
- Estimate ages of species
- Use estimates of historical climate to produce plausible estimates of geographical distributions
- Assess historical relationships between changing climate and development of new species

The BDW System provides (1):
- A flexible and extensible problem solving environment (PSE)
- Means of:
  - bringing together heterogeneous, globally distributed, biodiversity-related resources & analytical tools
  - assembling resources into workflows to perform complex scientific analyses
- Consistent mechanisms to achieve interoperability of system components

The BDW System provides (2):
- Uniform interfaces for heterogeneous resources (resource wrappers)
- Mechanism for data packaging & transfer
- Compatibility with the Triana Workflow System for assembling and executing workflows
- Web Services-based Grid middleware for accessing remote computational resources

The BDW System Architecture

BDW architectural components (1)
Resource Wrappers
- Provide a consistent interface to local & remote resources, and a standard resource access/invocation mechanism
- Insulate the core BDWorld System from resource heterogeneity
- Wrap various kinds of resources and analytical tools, and can be deployed in a Grid/Web Services environment
- Give consistent form to retrieved data by encapsulating them in BDWorld data types
- Resources wrapped include AVH, GBIF, OpenModeller, etc.
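The wrapper idea can be illustrated with a minimal sketch. The slides do not show the actual BDW wrapper API, so every name here (ResourceWrapper, invoke, getOccurrences, the OccurrenceSet envelope, the sample records) is a hypothetical stand-in chosen for illustration only.

```python
from abc import ABC, abstractmethod


class ResourceWrapper(ABC):
    """Hypothetical uniform interface; not the actual BDW API."""

    @abstractmethod
    def operations(self) -> list:
        """Names of the operations this wrapper exposes."""

    @abstractmethod
    def invoke(self, operation: str, **params) -> dict:
        """Run one operation and return the result in a common envelope."""


class InMemoryOccurrenceWrapper(ResourceWrapper):
    """Stand-in for a wrapped occurrence source; the records are made up."""

    def __init__(self, records):
        self._records = records

    def operations(self):
        return ["getOccurrences"]

    def invoke(self, operation, **params):
        if operation not in self.operations():
            raise ValueError(f"unsupported operation: {operation}")
        species = params["species"]
        points = [
            (r["lat"], r["lon"]) for r in self._records if r["species"] == species
        ]
        # Every wrapper returns the same envelope, whatever the native format was.
        return {"datatype": "OccurrenceSet", "species": species, "points": points}


wrapper = InMemoryOccurrenceWrapper([
    {"species": "Trifolium patens", "lat": 46.1, "lon": 14.5},
    {"species": "Other species", "lat": 10.0, "lon": 20.0},
])
result = wrapper.invoke("getOccurrences", species="Trifolium patens")
```

A caller that only knows the abstract interface can use any wrapped resource the same way, which is the insulation from heterogeneity that the slide describes.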

Resource Wrapper Architecture

BDW architectural components (2)
BDW-GRID Interface (BGI) Layer
- Provides standard mechanisms for invoking operations on heterogeneous resources
- Acts as an integrated mechanism for accessing all resource wrappers
- Isolates resource wrapper implementation to a separate layer to enable the use of web services/grid technologies

BDW architectural components (3)
BDW Datatypes
- Encapsulate different types of data and sub-datatypes for transporting data between end points
- Can be transformed into XML representations which can be easily serialised
- Flexible enough to encapsulate user-defined XML documents or data in a string representation
- Extensible; new datatypes can be incorporated
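As a rough illustration of a datatype that can be turned into XML and back, the sketch below uses Python's standard xml.etree.ElementTree module. The OccurrenceSet class and its element and attribute names are assumptions; the real BDW datatype schema is not given in the slides.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class OccurrenceSet:
    """Hypothetical datatype; the real BDW schema is not shown in the slides."""

    species: str
    points: list = field(default_factory=list)  # (lat, lon) pairs

    def to_xml(self) -> str:
        # Build an XML tree and serialise it to a plain string for transport.
        root = ET.Element("OccurrenceSet", species=self.species)
        for lat, lon in self.points:
            ET.SubElement(root, "Point", lat=str(lat), lon=str(lon))
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text: str) -> "OccurrenceSet":
        # Rebuild the datatype at the receiving end point.
        root = ET.fromstring(text)
        points = [
            (float(p.get("lat")), float(p.get("lon"))) for p in root.findall("Point")
        ]
        return cls(species=root.get("species"), points=points)


original = OccurrenceSet("Trifolium patens", [(46.1, 14.5), (47.0, 15.2)])
xml_text = original.to_xml()
restored = OccurrenceSet.from_xml(xml_text)
```

Adding a new datatype in this style means adding another class with the same to_xml/from_xml pair, which is one plausible reading of the extensibility point above.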

BDW Datatypes

BDW architectural components (4)
BDW Metadata Repository
- A specialised BDWorld resource
- Provides information such as:
  - Available resources
  - Operations supported by each resource
  - Data types used by operations
  - Location of resource wrapper
- Stores semantic information in the BDWorld ontology, to answer questions such as:
  - ‘Which resources can provide me with species data?’
  - ‘Which available operations can accept the outputs from a specific operation?’
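The two example questions can be sketched as lookups over a small registry. This is only an illustration: a plain dictionary stands in for the BDWorld ontology, and the operation names and datatype names are invented for the example.

```python
# Hypothetical registry entries: operation -> (resource, input type, output type).
# A real repository would hold this in an ontology, not a dictionary.
OPERATIONS = {
    "checkName": ("Catalogue of Life", "ScientificName", "AcceptedName"),
    "getOccurrences": ("GBIF", "AcceptedName", "OccurrenceSet"),
    "buildModel": ("openModeller", "OccurrenceSet", "NicheModel"),
    "projectModel": ("openModeller", "NicheModel", "DistributionMap"),
}


def resources_providing(datatype):
    """Which resources have an operation whose output is the given datatype?"""
    return sorted({
        resource
        for resource, _input, output in OPERATIONS.values()
        if output == datatype
    })


def operations_accepting_output_of(operation):
    """Which operations take as input what the given operation produces?"""
    produced = OPERATIONS[operation][2]
    return sorted(
        name
        for name, (_resource, accepted, _output) in OPERATIONS.items()
        if accepted == produced
    )


providers = resources_providing("OccurrenceSet")
followers = operations_accepting_output_of("getOccurrences")
```

Answers of the second kind are what let a workflow editor suggest which task can legally follow another.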

Bioclimatic Modelling (1)
- By using the known localities of a species, a climate preference profile is produced by cross-referencing with present-day climate data
- This climate preference profile is then used to locate other areas where such a climate exists, indicating areas climatically suitable for the species

Bioclimatic Modelling (2)
- Using present-day climate: assess areas under threat from invasive species, or those that may benefit from the introduction of a new crop
- Using climate predictions for the future: assess possible effects of global climate change on the distribution of study species
- Using climate predictions for the past: assess changes caused by natural factors in the past
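The profile-then-project idea can be sketched with the simplest possible model, a rectangular climate envelope (the minimum and maximum of each variable at the known localities). This is a deliberately simplified stand-in and is not the GARP algorithm used in the project; the grid cells, variable names and climate values are all made up.

```python
def climate_profile(localities, climate):
    """Min/max of each climate variable over the cells where the species is recorded."""
    variables = climate[localities[0]].keys()
    return {
        var: (
            min(climate[cell][var] for cell in localities),
            max(climate[cell][var] for cell in localities),
        )
        for var in variables
    }


def suitable_cells(profile, climate):
    """Cells whose climate falls inside the profile for every variable."""
    return sorted(
        cell
        for cell, values in climate.items()
        if all(lo <= values[var] <= hi for var, (lo, hi) in profile.items())
    )


# Invented present-day and future climate tables, keyed by grid cell.
present = {
    "A": {"temp_c": 10.0, "rain_mm": 800},
    "B": {"temp_c": 12.0, "rain_mm": 900},
    "C": {"temp_c": 11.0, "rain_mm": 850},
    "D": {"temp_c": 25.0, "rain_mm": 200},
    "E": {"temp_c": 8.0, "rain_mm": 950},
}
future = {
    "A": {"temp_c": 13.5, "rain_mm": 700},
    "B": {"temp_c": 14.0, "rain_mm": 750},
    "C": {"temp_c": 12.0, "rain_mm": 820},
    "D": {"temp_c": 27.0, "rain_mm": 150},
    "E": {"temp_c": 11.0, "rain_mm": 880},
}

profile = climate_profile(["A", "B"], present)  # species recorded in cells A and B
suitable_now = suitable_cells(profile, present)
suitable_future = suitable_cells(profile, future)
```

The same profile is applied to different climate tables, which mirrors the three uses listed above: present-day, predicted future and reconstructed past climate.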

Bioclimatic Modelling Workflow performed by Triana workflow package in BDW system

Example model output for the clover species Trifolium patens Schreber (a member of the bean family). The map shows areas (shaded regions across Central and Eastern Europe, South America, Asia and Australia) predicted to be suitable for the species in the 2050s, using the bioclimatic modelling algorithm GARP and the Hadley Centre climate model with the SRES A1F climate scenario.

The Current BDW Architecture:
Enables execution of BDW workflow tasks on remote nodes, but with limited scope:
- Does not give the user sufficient control and flexibility.
- Does not provide the functionality of distributing user jobs across several nodes.
- Depends on libraries on the client side.

The new BDW System architecture (1):
- Provides the user with access to:
  - Biodiversity resources.
  - Computational resources.
- Uses the existing mechanism of invoking operations on remote resources via resource wrapper web services.
- Also uses Condor middleware to utilise computational resources and distribute workload across available nodes.

The new BDW System architecture (2):
- Provides access to the Condor pool via the web service interface.
- Gives the user the flexibility to choose an available computational node, using the Ganglia cluster monitoring toolkit.
- Enables matching of a workflow task with preferred resource(s).
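The matching step can be pictured roughly as follows. In the real system Condor performs matchmaking and Ganglia supplies the node metrics; this sketch calls neither and uses plain dictionaries with invented node names, metric names and values.

```python
def pick_node(task, nodes):
    """Return the name of the least-loaded node that meets the task's requirements.

    Returns None when no node qualifies. The field names are hypothetical.
    """
    preferred = task.get("preferred")
    candidates = [
        node
        for node in nodes
        if node["free_memory_mb"] >= task["min_memory_mb"]
        and (preferred is None or node["name"] in preferred)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda node: node["load"])["name"]


# Made-up snapshot of the kind of metrics a cluster monitor might report.
nodes = [
    {"name": "node1", "load": 0.9, "free_memory_mb": 4096},
    {"name": "node2", "load": 0.2, "free_memory_mb": 1024},
    {"name": "node3", "load": 0.4, "free_memory_mb": 8192},
]

automatic = pick_node({"name": "buildModel", "min_memory_mb": 2048}, nodes)
user_choice = pick_node(
    {"name": "buildModel", "min_memory_mb": 2048, "preferred": ["node1"]}, nodes
)
```

The optional preferred list models the user's own choice of node, while the default path picks automatically from whatever nodes qualify.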

The new BDW System architecture (2):

Conclusions and Further Work
- BDW brings together varied, distributed resources and analytical tools so that biodiversity researchers can analyse biodiversity patterns
- Disparate resources can be accessed in the Web Service-enabled BDW PSE
- The BDW PSE provides uniform access to heterogeneous resources
- BDW allows linking of tools and resources in a workflow to automate different activities of an experiment
- Three current exemplar study areas
- The new BDW architecture also provides access to computational resources
- Security: Shibboleth/chroot

Acknowledgements
- BDW team
- Species 2000
- OpenModeller Community (including CRIA)
- BBSRC
- …