Semantically-Assisted Geospatial Workflow Design. Gobe Hobona, David Fairbairn, Philip James. ACM GIS, 8th November, Seattle
Overview: Background; OGC Web Services; Workflow Enactment; A Role for Semantics; Prototype Implementation; Conclusions
Background: Open Geospatial Consortium (OGC); JISC Grid/OGC Collision Programme (Security: SEE-GEO; Workflow: SAW-GEO); OGC Interoperability Experiments (OWS-4 and beyond). Challenge: how to support the user in constructing workflows that address a variety of problems?
OGC Web Services: Web Map Services (WMS) generate geovisualisations/maps from any geo-data source; Web Feature Services (WFS) disseminate vector geospatial data; Web Coverage Services (WCS) disseminate raster geospatial data; Web Processing Services (WPS) run geocomputational models or geospatial operations on user-supplied datasets.
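As a minimal sketch of how a client might address these four service types, the snippet below builds key-value-pair GetCapabilities requests. The endpoint `https://example.org/ows` and the helper name `get_capabilities_url` are assumptions for illustration only; real deployments usually expose a separate endpoint per service and may require a `version` parameter.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://example.org/ows"


def get_capabilities_url(base_url: str, service: str) -> str:
    """Build a key-value-pair GetCapabilities request for one OGC service type."""
    params = {"service": service, "request": "GetCapabilities"}
    return f"{base_url}?{urlencode(params)}"


# One request per service type named on the slide.
urls = {s: get_capabilities_url(BASE_URL, s) for s in ("WMS", "WFS", "WCS", "WPS")}

for service, url in urls.items():
    print(service, url)
```

The response to each request is an XML capabilities document describing the resources (layers, feature types, coverages or processes) that the service offers.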
Workflow Enactment: Recognised by ISO 19119. Several options exist for workflow enactment: SCUFL, BPEL, Kepler etc. BPEL was selected because it is an OASIS Standard (WS-BPEL 2.0); it has multi-vendor support, including IBM, Sun Microsystems, ActiveEndpoints, Oracle etc.; and open source enactors are available.
An Example Geospatial Workflow: based on the OGC OWS-4 GeoProcessing Workflow scenario. [Diagram labels: client; Workflow Enactor; WPS 1 (Generalise); WPS 2 (Clip); WFS]
Possible Applications for Geo-Workflows: Emergency management, where each activity depends on the result of a previous activity; Geographic modelling, where several steps are needed before a final model is produced, e.g. ESRI ModelBuilder; Climate change scenarios, where several routes through a workflow are possible depending on the state of certain variables.
A Role for Semantics in Orchestration [Diagram labels: SOA; Problem Domain: Concept 1, Concept 2, Concept 3, Concept 4; Services, Resources, Workflow enactor: Concept 1, Concept 3, Concept 5; match types: EXACT, SUBSUMPTION]
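Calculating Workflow Similarity (1) [Diagram: a concept hierarchy rooted at Thing, containing the concepts A, B, C, D, E, I, J, K, L, M. Key: X denotes a concept; one arrow style denotes a path linking two concepts, where the concept at the arrow-head subsumes the other; another arrow style denotes a path through several other concepts.]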
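The edge-counting idea behind the hierarchy above can be sketched with a breadth-first search. The subsumption links below are a toy assumption (the slide's actual hierarchy is not recoverable from the text), and treating links as undirected when counting edges is also an assumption rather than the authors' stated method.

```python
from collections import deque

# Toy subsumption links (child -> parent), assumed for illustration only.
SUBSUMED_BY = {"A": "Thing", "I": "Thing", "B": "A", "J": "I"}


def build_neighbours(subsumed_by):
    """Turn child -> parent links into an undirected adjacency map."""
    neighbours = {}
    for child, parent in subsumed_by.items():
        neighbours.setdefault(child, set()).add(parent)
        neighbours.setdefault(parent, set()).add(child)
    return neighbours


def edge_count(neighbours, start, goal):
    """Return the number of edges on the shortest path, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


NEIGHBOURS = build_neighbours(SUBSUMED_BY)
print(edge_count(NEIGHBOURS, "B", "A"))  # direct subsumption link
print(edge_count(NEIGHBOURS, "B", "J"))  # path through Thing
```

An edge count of zero corresponds to an exact match, and larger counts indicate concepts that are further apart in the hierarchy.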
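Calculating Workflow Similarity (2)
Each column is a workflow; columns run from high similarity (the request, leftmost) to low similarity (rightmost).
            High similarity (request)        Low similarity
Input       A    A    A    A    A    I
Activity    B    B    B    B    J    J
Activity    C    C    C    K    K    K
Activity    D    D    L    L    L    L
Output      E    M    M    M    M    M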
Proposed Formula: α is an application-specific weight applied to each activity in the workflow; n is the number of activities in the requested workflow; P_k is the number of edges between the concept representing the kth activity in the workflow and the concept tagging a candidate service.
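The formula itself did not survive in this text, so the snippet below does not reproduce it. It is only a sketch of one way the three quantities defined above could be combined: a weighted mean of the edge counts, where zero means every activity matched exactly. The function name `workflow_distance`, the per-activity weight list and the averaging are all assumptions.

```python
def workflow_distance(edge_counts, weights=None):
    """Assumed aggregate: mean of alpha_k * P_k over the n requested activities.

    edge_counts: P_k for each activity k in the requested workflow.
    weights: alpha for each activity; defaults to 1.0 for all (an assumption).
    """
    n = len(edge_counts)
    if n == 0:
        raise ValueError("the requested workflow must contain at least one activity")
    if weights is None:
        weights = [1.0] * n
    if len(weights) != n:
        raise ValueError("one weight is required per activity")
    return sum(alpha * p for alpha, p in zip(weights, edge_counts)) / n


# Hypothetical edge counts for two candidates against a three-activity request.
candidates = {"exact": [0, 0, 0], "partial": [0, 2, 4]}
ranked = sorted(candidates, key=lambda name: workflow_distance(candidates[name]))
print(ranked)
```

Under this assumed form, a lower value means a closer candidate, so sorting in ascending order places the best match first.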
A Geospatially Aware Ontology: Earth and environmental problem domains present problems of space, requiring a geo-aware ontology: SWEET*. [Diagram labels: Earth Realm, Physical Process, Physical Property, Non-Living Substances, Living Substances, Data, Human Activity, Numerics, Natural Phenomena, Space, Time, Units] * Raskin, R. G. and Pan, M. J. Knowledge representation in the semantic web for Earth and environmental terminology (SWEET). Computers & Geosciences, 31, 9 (2005),
SWEET
Implementation: SWEET ontology loaded from OWL documents into PostgreSQL; metadata held in a conventional DBMS with a catalogue service interface; for each query, semantically related concepts are found using Jena; additional methods implemented to calculate the number of edges.
Architecture [Diagram labels: Client-side: Eclipse IDE, SAW-GEO Plug-In, BPEL Editor; Server-side: Catalogue Service, Metadata, Ontology, Jena]
Response from Search [Screenshot labels: Search Concept; number of edges between the Search Concept and each tag]
Ontology and Suggested Flow views plugged into ActiveBPEL Designer
Evaluation: 15 OGC web services with 180 resources, compiled from a Google search for GetCapabilities documents; resources tagged with references to OWL concepts; tags assigned according to specialist words in resource titles; resource titles obtained from GetCapabilities responses; average response time for discovery and edge count; greatest cost where several resources are tagged with concepts far from the search concept; possibilities for parallelisation, as each concept can be searched for on a separate machine.
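The tagging step described above, where tags follow from specialist words in resource titles, might look roughly like the sketch below. The vocabulary, the concept names and the function `tag_resource` are hypothetical; they are not the SWEET identifiers or the matching rules used in the evaluation.

```python
import re

# Hypothetical mapping from specialist words to concept names (not SWEET identifiers).
VOCABULARY = {
    "temperature": "Temperature",
    "rainfall": "Precipitation",
    "precipitation": "Precipitation",
    "elevation": "Elevation",
}


def tag_resource(title, vocabulary):
    """Return the concepts whose specialist words occur in a resource title."""
    words = re.findall(r"[a-z]+", title.lower())
    tags = []
    for word in words:
        concept = vocabulary.get(word)
        # Keep first-occurrence order and avoid duplicate tags.
        if concept is not None and concept not in tags:
            tags.append(concept)
    return tags


print(tag_resource("Mean Annual Rainfall and Temperature", VOCABULARY))
print(tag_resource("Road network", VOCABULARY))
```

Titles with no specialist words receive no tags, so such resources would not be found by a concept-based search.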
Conclusions and Future Work: The proposed formula offers an algorithmic approach for comparing linear workflows; complete automation is limited by variations between the properties of service inputs and outputs; future work should investigate non-linear workflows and the inclusion of conditional activities; a test corpus is needed for evaluating catalogues of OGC web services. Thank You