BDWorld Alex Gray, Andrew Jones, Frank Bisby, Alastair Culham, Alex Gray, Nick Fiddian, Andrew Jones, Malcolm Scoble, Paul Valdes, Richard White, Peter.

Presentation transcript:

BDWorld
Alex Gray, Andrew Jones
Frank Bisby, Alastair Culham, Alex Gray, Nick Fiddian, Andrew Jones, Malcolm Scoble, Paul Valdes, Richard White, Peter Brewer, Oliver Bromley, Neil Caithness, Nick Pittas, Tim Sutton, Xuebiao Xu
BiodiversityWorld: a GRID-based problem solving environment for global biodiversity

BDWorld THE CHALLENGE
Some difficult biodiversity questions:
How should conservation efforts be concentrated?
– (example of Biodiversity Richness & Conservation Evaluation)
Where might a species be expected to occur, under present or predicted climatic conditions?
– (example of Bioclimatic Modelling and Climate Change)
Is geography a good predictor of the relationship between lineages (e.g. are more closely related species found near each other)?
– (example of Phylogenetic Analysis & Biogeography)

BDWorld BDWorld origins
LITCHI: improving the quality of the Catalogue of Life
SPICE: establishing a Catalogue of Life
GRAB: proof-of-concept demonstrator

BDWorld CURRENT STATUS OF ORIGINS
LITCHI: needs re-engineering
SPICE: needs extension of data
GRAB: demonstrator works
Continuing in new projects
– ENBI, EUROCAT, BDWorld

BDWorld Our vision
Biodiversity Problem Solving Environment
– Heterogeneous, diverse resources
– Flexible workflows
– Main challenges centre on metadata, interoperability, ontologies, etc.
– High-performance computing is secondary (though relevant)
– ‘Collaboratorium’: a collaborative environment in which each resource community can collaborate with the others
– ‘Excessively distributed resources’: biodiversity data systems are scattered across all countries and across multiple agencies within each country

BDWorld Need for a Catalogue of Life
Scientific names change
Synonyms: common plus previous
E.g. Faba faba is a synonym of Vicia faba
Species 2000 provides a distributed catalogue
– Each ‘sector’ is maintained by suitable taxonomic specialists

BDWorld Basic uses of catalogue
Checking taxonomy
Access or store data about species
– Observations etc.
– Links accepted name & synonyms
– Exhaustive retrieval: use all names
– Intelligent linkage
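A minimal sketch of what "exhaustive retrieval" could look like: a query under any one name is expanded to the accepted name plus all synonyms before the data are searched. The `Taxon` class, the function names and the observation records are hypothetical illustrations, not part of SPICE or the Catalogue of Life; only the Vicia faba / Faba faba synonym pair comes from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Taxon:
    """One catalogue entry: an accepted name plus its known synonyms."""
    accepted_name: str
    synonyms: list[str] = field(default_factory=list)

    def all_names(self) -> list[str]:
        return [self.accepted_name, *self.synonyms]


def build_index(taxa):
    """Map every name (accepted or synonym) to its Taxon, ignoring case."""
    index = {}
    for taxon in taxa:
        for name in taxon.all_names():
            index[name.lower()] = taxon
    return index


def exhaustive_retrieve(name, index, records):
    """Return records filed under ANY name of the taxon that `name` refers to."""
    taxon = index.get(name.lower())
    if taxon is None:
        return []
    wanted = {n.lower() for n in taxon.all_names()}
    return [r for r in records if r["name"].lower() in wanted]


# Hypothetical data: the records below are invented for illustration.
catalogue = [Taxon("Vicia faba", ["Faba faba"])]
observations = [
    {"name": "Vicia faba", "note": "record A"},
    {"name": "Faba faba", "note": "record B"},
    {"name": "Pisum sativum", "note": "record C"},
]
index = build_index(catalogue)
hits = exhaustive_retrieve("Faba faba", index, observations)
```

Querying by the synonym still finds the record stored under the accepted name, which is the point of linking accepted names and synonyms in one catalogue.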

BDWorld GRAB (GRid And Biodiversity)
6-month DTI-funded demonstrator project
Cardiff University
– Investigators: Alex Gray, Andrew Jones & Nick Fiddian
– Research associates: John Robinson & Jonathan Giddy
Project aim:
– illustrate the GRID’s potential for collaborative research, discovering & using diverse biodiversity-related databases

BDWorld Test scenario
Find species in the catalogue
Retrieve species information:
– Synonyms & common names
– Geography
– Images
Retrieve climate information
Search for species within specified climate envelope
Do the above iteratively if desired
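The "search for species within specified climate envelope" step might be sketched as follows. This assumes each species is summarised by the range of temperature and annual precipitation it is recorded under, and that "within" means the whole range fits inside the envelope; both are assumptions for illustration, and the species labels and figures are invented rather than taken from GRAB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClimateRange:
    """Temperature (deg C) and annual precipitation (mm) bounds."""
    min_temp_c: float
    max_temp_c: float
    min_precip_mm: float
    max_precip_mm: float

    def within(self, envelope: "ClimateRange") -> bool:
        """True if this range lies entirely inside `envelope`."""
        return (envelope.min_temp_c <= self.min_temp_c
                and self.max_temp_c <= envelope.max_temp_c
                and envelope.min_precip_mm <= self.min_precip_mm
                and self.max_precip_mm <= envelope.max_precip_mm)


def species_within(envelope, species_climate):
    """Names of species whose climate range fits inside the envelope."""
    return sorted(name for name, rng in species_climate.items()
                  if rng.within(envelope))


# Hypothetical species and values, for illustration only.
species_climate = {
    "species A": ClimateRange(18, 30, 800, 1600),
    "species B": ClimateRange(5, 22, 400, 900),
    "species C": ClimateRange(20, 28, 1000, 1400),
}
envelope = ClimateRange(15, 32, 700, 1800)
matches = species_within(envelope, species_climate)
```

Because the scenario is iterative, the result of one search could be fed back in, for example by narrowing the envelope and searching again.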

BDWorld GRAB resource types
[Diagram: GRAB interface; GRAB resource clients; Catalogue of life, SIS, ..., Climate]
Catalogue of life
– Scientific & common names
Species Information System (SIS)
– Images; geography
Climate
– Max/min temperature; annual precipitation

BDWorld Typical GRAB display Web browser ‘front-end’ to the GRAB server Applet monitoring communication between GRAB server and GRAB databases

BDWorld Using Globus …
We have used Globus to give us:
– Invokable services (GRAM) and retrieval of results (GASS)
– Security (single log-on – GSI)
– (Elementary!) resource discovery; exploitation of metadata (MDS)
Potentially:
– Seamless interface to computationally intensive modelling; load balancing, etc.

BDWorld Sample metadata (for SIS)
Search for a particular kind of database:
  (&(objectClass=GrabTaxonDatabase)(Grab-Taxon-name=ILDIS))
MDS data:
  {dn=Grab-Taxon-name=ILDIS, Mds-Vo-name=local, o=Grid
  objectClass=GrabTaxonDatabase
  Grab-Taxon-name=ILDIS
  Mds-keepto= Z
  Mds-validto= Z
  Mds-validfrom= Z
  Grab-Taxon-contact=grab.biol.soton.ac.uk
  Grab-Taxon-higherTaxon=Leguminosae
  Grab-Taxon-executable= /home/globus/mybin/ILDISImageInterfaceServer}
Could easily alter to search for database by higherTaxon instead of name
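To illustrate how the search could be switched from name to higherTaxon, here is a small sketch that builds the LDAP filter string from attribute/value pairs. The helper names are hypothetical and it only constructs the filter text; it does not contact an MDS server. The attribute names and values are the ones shown on the slide, and the escaping follows the standard LDAP filter string rules.

```python
def escape_filter_value(value: str) -> str:
    """Escape characters that are special inside an LDAP filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def taxon_database_filter(attributes: dict[str, str]) -> str:
    """Build an AND filter for GrabTaxonDatabase entries matching all attributes."""
    clauses = "".join(
        f"({attr}={escape_filter_value(val)})" for attr, val in attributes.items()
    )
    return f"(&(objectClass=GrabTaxonDatabase){clauses})"


# The filter from the slide, searching by database name ...
by_name = taxon_database_filter({"Grab-Taxon-name": "ILDIS"})
# ... and the variant the slide suggests, searching by higher taxon.
by_higher_taxon = taxon_database_filter({"Grab-Taxon-higherTaxon": "Leguminosae"})
```

Only the attribute in the second clause changes, which is why the slide describes the alteration as easy.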

BDWorld Next steps: BiodiversityWorld (BDWorld) on the GRID
– Universities of Reading, Cardiff and Southampton; Natural History Museum (BBSRC-funded)
– Development of appropriate middleware, linking to:
   existing partial catalogue of life (SPICE)
   thematic data sources, and
   analytic tools
– 3 exemplar application areas:
   biodiversity richness analysis
   bioclimatic modelling and climate change
   phylogenetic and biogeography analysis
In parallel, EU-funded Species 2000 Europa will be augmenting and improving the Catalogue of Life

BDWorld Some relevant resource types
Data sources:
– Species 2000 ‘Catalogue of Life’
– Species Information Sources (SISs)
   Species geography
   Descriptive data
   Specimen distribution
– Geographical
   Boundaries of geographical & political units
   Climate surfaces
– Genetic sequences
Analytic tools:
– Biodiversity richness assessment – various metrics
– Bioclimatic modelling – bioclimatic ‘envelope’ generation
– Phylogenetic analysis (generation of phylogenetic trees)

BDWorld Some challenges …
Finding the resources
Knowing how to use these heterogeneous resources
– Originally constructed for various reasons
– Often little thought was given to standards or interoperability
One important specific issue: using appropriate scientific name for SIS queries …

BDWorld [Architecture diagram]
Components: Taxonomic index (SPICE Catalogue of Life); GSD; Thematic data source; Abiotic data source; Analytic tools; Proxies; User; Local tools; Problem Solving Environment user interface
BioD-GRID Ontology:
– Metadata
– Intelligent links
– Resource & analytic tool descriptions
– Maintenance tools
Problem Solving Environment:
– Broker agents
– Facilitator agents
– Presentation agents

BDWorld [Workflow diagram: START, STAGE 1, STAGE 2, STAGE 3]
Species 2000 Catalogue of Life (distributed array of GSDs)
– Enquiry: name(s)
– Returns list of accepted taxa, synonyms and common names
Distributed array of thematic data sources
– Enquiry: select ‘data’ for ‘taxon set’
– Returns dataset composed of homologous responses from multiple thematic data sources
Analytical toolbox; reference to abiotic datasets
Presentation and storage of results
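The enquiry pattern in this diagram (resolve a name to a taxon set, then put the same question to several thematic sources and combine the answers) could be sketched roughly as below. This is one reading of the diagram, not the project's middleware: the function names, the source identifiers and the returned values are hypothetical, and each "source" is just a Python callable standing in for a remote database.

```python
def resolve_names(enquiry, catalogue):
    """Catalogue step: return every name of the taxon that `enquiry` belongs to."""
    for accepted, synonyms in catalogue.items():
        names = {accepted, *synonyms}
        if enquiry in names:
            return names
    return set()


def query_sources(taxon_set, sources):
    """Thematic step: ask each source about each name and tag answers by source."""
    dataset = []
    for source_id, lookup in sources.items():
        for name in sorted(taxon_set):
            for value in lookup(name):
                dataset.append({"source": source_id, "name": name, "value": value})
    return dataset


# Hypothetical catalogue and sources, invented for illustration.
catalogue = {"Vicia faba": ["Faba faba"]}
sources = {
    "source_1": lambda name: {"Vicia faba": ["value 1"]}.get(name, []),
    "source_2": lambda name: {"Faba faba": ["value 2"],
                              "Vicia faba": ["value 3"]}.get(name, []),
}
taxon_set = resolve_names("Faba faba", catalogue)
dataset = query_sources(taxon_set, sources)
```

Tagging each response with its source keeps the combined dataset traceable when it is later presented or stored.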

BDWorld Case Study - Leucaena leucocephala
Leucaena leucocephala (Lam.) De Wit
Native of Central America
Widely introduced around the tropics
Widely utilised around the globe for:
– Wood
– Forage
– Soil enrichment and erosion control
Regarded as an invasive weed in some areas

BDWorld Point data from various herbaria

BDWorld Distribution data from ILDIS database

BDWorld GARP prediction of climatic suitability

BDWorld Leucaena leucocephala – future predictions
Hadley Circulation Model - HadCM3 – IS92a Scenario
“Population rises to 11.3 billion by 2100 and economic growth averages 2.3% per annum between 1990 and 2100 with a mix of conventional and renewable energy sources being used.”
[Two map panels, each labelled “Global view”]

BDWorld Big Questions for EVBL and decision making support
Consequences from rapid change to the Common Agricultural Policy
Consequences from climate change
Consequences from population movement
What protected areas would be needed to bring species loss to a halt over 10 years?

BDWorld CURRENT STATUS
Early days
Iterative development – Stage 1
Defining basic metadata
Initial workflow
Resource discovery & linkage
Richer metadata – descriptive, provenance
Ontology

BDWorld Initial test workflow
[Diagram labels: SPICE Localities Climate Space Model Base Maps Climate Prediction]
Submit scientific name; retrieve accepted name & synonyms for species
Retrieve distribution maps for species of interest
Climate surfaces
Model of climatic conditions where species is currently found
Possibly different climate surfaces (e.g. predicted climate)
World or regional maps
Prediction of suitable regions for species of interest
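The modelling and prediction steps of this workflow can be illustrated with a deliberately simple min/max envelope: the model records the range of each climate variable where the species occurs, and the prediction marks cells of a (possibly different) climate surface that fall inside every range. This is a stand-in technique chosen for brevity, not GARP or the project's actual model, and the cell identifiers and climate values are invented.

```python
def fit_envelope(localities, climate):
    """Climate-space model: per-variable (min, max) over cells where the species occurs."""
    observed = [climate[cell] for cell in localities if cell in climate]
    if not observed:
        raise ValueError("no climate data for any locality")
    variables = observed[0].keys()
    return {v: (min(o[v] for o in observed), max(o[v] for o in observed))
            for v in variables}


def predict_suitable(envelope, climate):
    """Prediction: cells whose every variable lies inside the fitted envelope."""
    return sorted(
        cell for cell, values in climate.items()
        if all(lo <= values[v] <= hi for v, (lo, hi) in envelope.items())
    )


# Hypothetical climate surfaces keyed by grid-cell id (values invented).
present = {
    "c1": {"temp_c": 24, "precip_mm": 1200},
    "c2": {"temp_c": 26, "precip_mm": 1500},
    "c3": {"temp_c": 10, "precip_mm": 600},
    "c4": {"temp_c": 25, "precip_mm": 1300},
}
predicted = {
    "c1": {"temp_c": 29, "precip_mm": 1100},
    "c2": {"temp_c": 27, "precip_mm": 1400},
    "c3": {"temp_c": 24, "precip_mm": 1250},
    "c4": {"temp_c": 26, "precip_mm": 1450},
}
localities = ["c1", "c2"]  # cells where the species is currently recorded

envelope = fit_envelope(localities, present)
suitable_now = predict_suitable(envelope, present)
suitable_future = predict_suitable(envelope, predicted)
```

Fitting on one surface and predicting on another is what lets the same workflow answer both the "present" and the "predicted climate" versions of the question.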

BDWorld Metadata “key area”
(h) Human – intended for user
(m) Machine – intended for software
Functions of metadata – descriptions of:
– data
– process/tool
– interface protocols
– workflow
– representations

BDWorld Types of metadata (i)
Descriptive – a description of elements
– h – file of climate data
– m – relational database
Navigational – how to find elements
– h – where data is held
– m – URL
Representational – how elements are held
– h – units of representation (e.g. metric)
– m – style of representation (e.g. real number)
Identification – unique descriptor
– h – data set name
– m – file name

BDWorld Types of metadata (ii)
Quality & reputation
– h – description of quality procedures
– m – integrity checks
Presentation – display details
– h – styles of display (e.g. visual, tabular)
– m – details needed for display
Provenance
– h – description of elements and how they were created
– m – details of software processes used
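One way to picture the seven metadata types, each with a human (h) and a machine (m) facet, is as a small record structure. The class and field names below are hypothetical, as are the example URL and file name; the type names and the h/m split follow the two slides above.

```python
from dataclasses import dataclass
from enum import Enum


class MetadataType(Enum):
    DESCRIPTIVE = "descriptive"
    NAVIGATIONAL = "navigational"
    REPRESENTATIONAL = "representational"
    IDENTIFICATION = "identification"
    QUALITY = "quality & reputation"
    PRESENTATION = "presentation"
    PROVENANCE = "provenance"


@dataclass(frozen=True)
class MetadataItem:
    """One metadata entry with a human-readable and a machine-readable facet."""
    kind: MetadataType
    human: str    # (h) intended for the user
    machine: str  # (m) intended for software


def machine_view(items):
    """Collect only the machine facets, keyed by metadata type."""
    return {item.kind.value: item.machine for item in items}


# Hypothetical description of a climate data set (values invented).
climate_dataset = [
    MetadataItem(MetadataType.DESCRIPTIVE, "file of climate data", "relational database"),
    MetadataItem(MetadataType.NAVIGATIONAL, "held at the data centre",
                 "https://example.org/climate"),
    MetadataItem(MetadataType.IDENTIFICATION, "Example climate surfaces", "climate.db"),
]
for_software = machine_view(climate_dataset)
```

Keeping both facets on the same record means a workflow engine and a user interface can read the same metadata entry, each taking the part meant for it.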

BDWorld Role/use of metadata
Descriptive
Create electronic book for user
Create workflows
– necessary transformations
– provenances
– interoperability
Locate appropriate elements
Restart/do processing

BDWorld Use of ontology
Location of:
– terms
– data resources
– available processes
– transformation tools
– available workflows
– styles of representation
