Service Oriented Architecture for Geographic Information Systems Supporting Real Time Data Grids Galip Aydin Department Of Computer Science Indiana University.

Presentation transcript:

Service Oriented Architecture for Geographic Information Systems Supporting Real Time Data Grids
Galip Aydin, Department of Computer Science, Indiana University
1/15/2007

Geographic Information Systems
- A Geographic Information System (GIS) is a system for creating, storing, sharing, analyzing, manipulating and displaying spatial data and associated attributes.
- GIS evolved from mainframe GIS to desktop GIS to distributed GIS.
- Modern GIS require:
  - Distributed data access for spatial databases
  - Use of remote analysis, simulation or visualization tools
- Problems with traditional distributed GIS approaches:
  - Distributed nature of the geo-data: various client-server models, databases, HTTP, FTP, RDBs, XML DBs etc.
  - Data format problems, conversion overheads
  - Data processing issues, hardware and software requirements, COM+/ActiveX and CORBA/IIOP frameworks

Open Geographic Standards
- Open GIS standards bodies aim to make geographic information and services neutral and available across any network, application, or platform.
- Two major standards bodies: OGC and ISO/TC211, the former being the more popular.
- OGC specifications are widely accepted:
  - Data format specs: GML, SensorML, O&M
  - Service specs: WFS, WMS, WCS
- OGC services are HTTP GET/POST based, with limited data transport capabilities.
- They are request-response services, suited to centralized, synchronous applications.

PBO and CRTN GPS Stations
[Figures: Plate Boundary Observatory (PBO) GPS stations in North America; California Real-Time GPS Network (CRTN)]

Requirements for a GIS/Sensor Grid
- Service orchestration capabilities
  - Complex problems require GIS applications to collaborate.
  - Coupling data sources to scientific applications
- Data transport requirements
- Proliferation of sensors
  - Ability to analyze data on the fly, continuous streaming support, scalable systems for addition of new sensors
- High-performance, high-rate messaging
  - Real-time data access, rapid response systems, crisis management etc.
- From the Grid perspective, the motivations are:
  - To apply general Grid/distributed computing principles to GIS
  - To investigate how to integrate geophysical and other scientific applications with data sources

Motivating Use Cases
- Very successful and highly acclaimed earthquake science applications
- Pattern Informatics (PI), UC Davis: earthquake forecasting code developed by Prof. John Rundle (UC Davis) and collaborators; uses seismic archives.
- Regularized Dynamic Annealing Hidden Markov Method (RDAHMM), NASA/JPL: time series analysis code that can be applied to GPS and seismic archives, for both real-time and archival data.
- SOPAC GPS Networks provide real-time messages, UCSD/SIO: 8 networks covering 80 stations produce 1 Hz high-resolution data. The signatures of GPS sensors are used in earthquake forecasting.
- Interdependent Energy Infrastructure Simulation System (IEISS), LANL: models infrastructure networks (e.g. electric power systems and natural gas pipelines) and simulates their physical behavior and the interdependencies between systems.

Research Issues 1
- Applying Web Service principles to GIS data services
  - Orchestration of services, workflows. We need services suitable for large data sets and for cases where quick response is required.
- High-performance support in GIS services
  - The performance problem must be addressed in a complete and general framework supporting different data requirements.
- Interoperability
  - The system should bridge the GIS and Web Service communities by adapting standards from both.
  - Other GIS applications should be able to consume data without costly format conversions.

Research Issues 2
- Scalability
  - The system should be able to handle high-volume, high-rate data transport and processing.
  - Plugging in new sensors, data sources or geo-processing applications should not degrade the system's overall performance.
- Flexibility and extensibility
  - How to develop real-time services to process sensor data on the fly.
  - Ability to add new filters without system failures.
- Quality of Service issues
  - Is the latency introduced by services in processing real-time sensor data acceptable?

SOA for GIS – Geophysical Data Grid
- To create a GIS Data Grid (Geophysical Grid) architecture we utilize:
  - Web Services to realize a Service Oriented Architecture
  - OGC data formats and application interfaces to achieve interoperability at both data and service levels
- GIS Data Grid features:
  - Depending on the source, geospatial data can be archival or real-time. The architecture provides standard control and access interfaces for both types.
  - Supports alternate transport and representation schemes; uses a topic-based messaging infrastructure for data and message exchange.
  - Streaming and non-streaming services to access archived data.
  - Real-time and near-real-time filter services for accessing sensor metadata and sensor measurements.

GIS Grid Usage Model – Earthquake Science
- Supporting geophysical repositories and real-time sensors is essential:
  - To analyze a typical earthquake, it is important to have access to precise measurements of the initial earthquakes and aftershocks.
  - To support earthquake forecasting and the time and spatial positions of the forecasts.
- PI can be used with existing data.
- RDAHMM can be used with the real-time data.
- Earth science is moving from a data-poor field to a data-rich one. We will have thousands of sensors spread around the world (e.g. GPS sensors, InSAR satellites).

GIS Grid Components
- Filter Services for real-time data support
- OGC Web Feature Service (WFS) for archival data support
  - Web Service version
  - Streaming version, which introduces data and control channel separation
    - All control goes through SOAP messages; data is transferred by a variety of transport mechanisms which are implied by the control message.
- Publish-subscribe system for message and data exchange
- UDDI-based service registry (by Mehmet Aktas)

Geophysical Data Grid Architecture
[Figure: Archival Data Grid and Real-Time Data Grid]

GIS Grid Part 1 - Real-Time Data Services
- Sensors and sensor networks are being deployed for measuring various geophysical entities.
- Sensors and GIS are closely related. Sensor measurements are used by GIS for statistical or analytical purposes.
- With the proliferation of sensors, data collection and processing paradigms are changing.
- Most scientific geo-applications are designed to work with archived data.
- Critical infrastructure systems and crisis management environments require:
  - fast and accurate access to real-time sources
  - a flexible, pluggable architecture for coupling geo-processing applications with the data

SensorGrid Architecture
- Major components:
  - Real-time filters
  - Publish-subscribe system
  - Information Service
- Filters can be run as Web Services to create workflows.
- Filter chains can be deployed for complex processing.
- Streaming messaging provides high-performance transfer options.

Real-Time Filters
- Real-time data processing is supported by deploying filters around a publish/subscribe messaging system.
- The filters extend a generic class, from which they inherit publish and subscribe capabilities.
- They can be connected in parallel or in series as chains to solve complex problems.
[Figure: a filter takes an input signal and produces an output signal; filters shown in parallel operation and in serial operation]
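The filter idea on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SensorGrid code: `Broker` is a toy in-memory stand-in for the publish/subscribe system, and `ScaleFilter`, `ThresholdFilter` and the `/demo/...` topic names are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory topic broker standing in for a real publish/subscribe system."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver synchronously, in subscription order.
        for callback in list(self._subscribers[topic]):
            callback(message)


class Filter:
    """Generic filter: consumes one topic and publishes its results on another."""

    def __init__(self, broker, in_topic, out_topic):
        self.broker = broker
        self.out_topic = out_topic
        broker.subscribe(in_topic, self._on_message)

    def process(self, message):
        """Override in subclasses; return None to drop the message."""
        return message

    def _on_message(self, message):
        result = self.process(message)
        if result is not None:
            self.broker.publish(self.out_topic, result)


class ScaleFilter(Filter):
    """Hypothetical filter: multiplies an integer reading by 1000."""

    def process(self, message):
        return message * 1000


class ThresholdFilter(Filter):
    """Hypothetical filter: passes only readings above 5000."""

    def process(self, message):
        return message if message > 5000 else None


broker = Broker()
serial_out, parallel_out = [], []

# Serial chain: /demo/raw -> ScaleFilter -> /demo/scaled -> ThresholdFilter -> /demo/alerts
ScaleFilter(broker, "/demo/raw", "/demo/scaled")
ThresholdFilter(broker, "/demo/scaled", "/demo/alerts")
broker.subscribe("/demo/alerts", serial_out.append)

# Parallel branch: a pass-through filter reads the same raw topic independently.
Filter(broker, "/demo/raw", "/demo/copy")
broker.subscribe("/demo/copy", parallel_out.append)

for reading in (2, 10):
    broker.publish("/demo/raw", reading)

print(serial_out, parallel_out)
```

Because every filter only knows its input and output topics, adding a stage means subscribing one more object to an existing topic; the upstream filters are not modified.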

Use Case - GPS Sensors
- GPS is used to identify long-term tectonic deformation and static displacements. SCIGN has 250 real-time GPS stations.
- SOPAC GPS networks:
  - 8 networks covering 80 stations produce 1 Hz high-resolution data.
  - Socket-based real-time access in the binary RYO format is available.
  - We developed filters to provide real-time streaming access in multiple formats (RYO, ASCII, GML).
  - OHIO principle (a general principle required by DOD) and chain of filters.
- Our architecture:
  - Uses publish/subscribe-based NaradaBrokering for managing real-time GPS streams
  - Utilizes topics for hierarchical organization of the sensors
  - Deploys successive data filters ranging from format translators to data analysis codes
  - Could potentially be used to run RDAHMM clones to monitor state changes in the entire GPS network
- We are a partner in a pioneering project to use real-time GPS data for the first time in this context.

Processing Real-Time GPS Streams
[Figure: a complete sensor message processing path, including a data analysis application. Labels: GPS Networks, Scripps RTD Server, Raw Data, RYO Ports, ryo2nb, NB Server, ryo2ascii, ascii2gml, ascii2pos, Single Station, Displacement Filter, Station Health Filter, RDAHMM Filter]
Topics: /SOPAC/GPS/CRTN01/RYO, /SOPAC/GPS/CRTN01/ASCII, /SOPAC/GPS/CRTN01/POS, /SOPAC/GPS/CRTN01/DSME

Application Integration with Real-Time Filters
- The Station Monitor Filter records real-time positions for 10 minutes and calculates position changes. The Graph Plotter Application creates a visual representation of the positions.
- The RDAHMM Filter records real-time positions for 10 minutes and invokes the RDAHMM application, which determines state changes in the XYZ signal. The Graph Plotter Application creates a visual representation of the RDAHMM output.
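A windowed filter of this kind might look like the sketch below. The slide does not say how position changes are computed, so the choice here (straight-line distance between the first and last sample of each 10-minute window) is an assumption, as are the names `Position` and `WindowedDisplacement`.

```python
import math
from dataclasses import dataclass


@dataclass
class Position:
    t: float  # timestamp in seconds
    x: float
    y: float
    z: float


class WindowedDisplacement:
    """Buffers samples and reports one displacement value per full window."""

    def __init__(self, window_seconds=600.0):
        self.window_seconds = window_seconds
        self._samples = []

    def add(self, p):
        """Add a sample; return the displacement once the window is full, else None."""
        self._samples.append(p)
        first = self._samples[0]
        if p.t - first.t < self.window_seconds:
            return None
        # Assumed metric: Euclidean distance between first and last position.
        displacement = math.dist((first.x, first.y, first.z), (p.x, p.y, p.z))
        self._samples = []  # start a fresh window
        return displacement


# Feed 1 Hz samples drifting 1 mm per second along x (synthetic data).
monitor = WindowedDisplacement(window_seconds=600.0)
for second in range(601):
    change = monitor.add(Position(float(second), 0.001 * second, 0.0, 0.0))
    if change is not None:
        print(f"displacement over window: {change:.3f}")
```

An analysis filter such as the RDAHMM one could reuse the same buffering and hand the whole window to an external application instead of computing a distance.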

Recording and Replaying Sensor Streams
- Filters can be used to record and replay scenarios, such as earthquakes in the GPS case.
- We developed RYO Recorder and RYO Publisher filters.
- The RYO Recorder creates daily archives of the GPS streams.
- The RYO Publisher can be used to replay a full day or selected segments of the records.
- We replayed the 2004 Southern California Earthquake using the Parkfield GPS network archive.
- These filters are used in the performance and scalability tests.

SensorGrid Performance Tests
- Two major goals: system stability and scalability
  - Ensuring stability of the distributed Filter Services for continuous operation.
  - Finding the maximum number of publishers (sensors) and clients that can be supported with a single broker.
  - Investigating whether the system scales for larger numbers of sensors and clients.

Test Methodology
- The test system consists of a NaradaBrokering server and a three-filter chain for publishing, converting and receiving RYO messages.
- We take 4 timings to calculate mean end-to-end delivery times of GPS measurements.
- The tests were run for at least 24 hours.
- GridFarm servers are used in these tests.
$T_{\text{transfer}} = (T_2 - T_1) + (T_4 - T_3)$
[Figure labels: RYO Publisher, NB Server, RYO To ASCII Converter, Simple Filter]
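As a worked example of the transfer-time formula on this slide, the sketch below computes the per-message and mean values. The slide does not define the four timestamps; the reading assumed here is that T1 and T2 bracket the first hop (publisher to converter) and T3 and T4 the second (converter to receiving filter), so the conversion time T3 - T2 is excluded. The sample numbers are made up.

```python
from statistics import mean


def transfer_time(t1, t2, t3, t4):
    """Sum of the two hop times; the time between t2 and t3 is not counted."""
    return (t2 - t1) + (t4 - t3)


def mean_transfer_time(timings):
    """Mean transfer time over an iterable of (t1, t2, t3, t4) tuples."""
    return mean(transfer_time(*row) for row in timings)


# Synthetic timestamps in milliseconds, for illustration only.
samples = [
    (0, 3, 5, 9),      # hops of 3 ms and 4 ms
    (10, 12, 13, 16),  # hops of 2 ms and 3 ms
]
print(mean_transfer_time(samples))
```

If all four timestamps were taken on different machines, clock offsets would enter the result, so each difference is best taken between readings of the same clock or of synchronized ones.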

1 – System Stability Test
- The basic system with three filters and one broker.
- The figure shows average results for every 30 minutes.
- The average transfer time shows that continuous operation does not degrade system performance.

2 – Multiple Publishers Test
- We add more GPS networks by running more publishers.
- The results show that 1000 publishers can be supported with no performance loss. The ceiling of 1000 is an operating system limit.
[Figure labels: RYO Publisher 1, RYO Publisher 2, ..., RYO Publisher n; NB Server; RYO To ASCII Converter; Simple Filter; Topic 1A, Topic 1B, Topic 2, ..., Topic n]

3 – Multiple Clients Test
- We add more clients by running multiple Simple Filters which subscribe to the same ASCII topic.
- The system can support as many as 1000 clients with a very small decrease in performance.
[Figure labels: RYO Publisher 1; NB Server; RYO To ASCII Converter; Simple Filter 1, Simple Filter 2, ..., Simple Filter n; Topic 1A, Topic 1B; 1000 Clients]

Extending Scalability
- The limit of the basic system appears to be 1000 clients or publishers.
- This is due to an operating system restriction on open file descriptors (1024 for Red Hat Linux), which can be raised by changing OS parameters.
- To overcome this limit we create NaradaBrokering networks by linking multiple brokers. NB supports scalable linkage of brokers for building tree-like architectures.
- We run 2 brokers to support 1500 clients.
- The number of brokers can be increased indefinitely, so we can potentially support any number of publishers and subscribers.
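The two numbers on this slide can be checked with a short script. This is a sketch, not part of the original tests: it reads the per-process descriptor limit with Python's Unix-only `resource` module, and `brokers_needed` is a hypothetical helper that assumes a fixed client budget per broker (750, the figure used in the multiple-brokers test) and ignores the connections brokers use to link to each other.

```python
import resource


def open_file_limit():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def brokers_needed(total_clients, clients_per_broker=750):
    """Smallest number of brokers so that no broker exceeds its client budget."""
    if clients_per_broker <= 0:
        raise ValueError("clients_per_broker must be positive")
    return -(-total_clients // clients_per_broker)  # integer ceiling division


soft_limit, hard_limit = open_file_limit()
print(f"open file descriptors: soft={soft_limit}, hard={hard_limit}")
print(f"brokers for 1500 clients: {brokers_needed(1500)}")
```

Each client connection consumes a descriptor on the broker host, which is why a soft limit of 1024 caps a single broker at roughly 1000 connections.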

4 – Multiple Brokers Test
- NaradaBrokering allows creation of broker networks.
- We create a two-broker network.
- Messages published to the first broker can be received from the second broker.
- We take timings on each broker.
- We connect 750 clients to each broker and run for 24 hours. We chose 750 clients to stay well below the saturation limit.
- The results show that the performance is very good and similar to the single-broker test.
[Figure labels: RYO Publisher; NB Server 1, NB Server 2; RYO To ASCII Converter; Simple Filter 1, Simple Filter 2, ..., Simple Filter 750; Simple Filter 751, Simple Filter 752, ..., Simple Filter 1500; Topic 1A, Topic 1B]

4 – Multiple Brokers Test (results figure)

Real-Time Filters Test Results
- The RYO Publisher filter runs at 1 Hz and publishes a 24-hour archive of the CRTN_01 GPS network, which contains 9 GPS stations.
- The single-broker configuration can support 1000 clients or publishers (GPS networks or individual stations).
- The system can be scaled up by creating NaradaBrokering broker networks.
- Message order was preserved in all tests.

GIS Grid Part 2 - Archival Data Grid
- Web Feature Service is the default OGC specification for vector data.
- We have built a Web Service version of WFS for accessing geospatial data in distributed databases.
- Requirements:
  - Various feature data should be stored in the databases
  - Queries are in OGC Common Query Language (GML) format
  - Results are GML Feature Collections
  - Operations to support are Get Capabilities, Describe Feature Types, Get Features
- To connect to multiple databases we have implemented a DB federation scheme.
  - Adding features is easy using XML configuration files
  - We have implemented OGC Filter Encoding for query translation
  - Dynamic capability generation allows federation of the services
- The first Web Service version of WFS has been successfully used in several scientific workflows with other services (WMS, HPSearch, UDDI).
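Query translation from OGC Filter Encoding to SQL can be illustrated with a small subset of the filter operators. This is a sketch under assumptions, not the WFS implementation described here: it handles only comparison operators and `And`/`Or`, the column names `name` and `length_km` are hypothetical, and it emits `%s` placeholders in the DB-API "format" style so literals are never pasted into the SQL text.

```python
import xml.etree.ElementTree as ET

OGC = "{http://www.opengis.net/ogc}"  # namespace used by OGC Filter Encoding 1.x

COMPARISONS = {
    "PropertyIsEqualTo": "=",
    "PropertyIsNotEqualTo": "<>",
    "PropertyIsLessThan": "<",
    "PropertyIsGreaterThan": ">",
    "PropertyIsLessThanOrEqualTo": "<=",
    "PropertyIsGreaterThanOrEqualTo": ">=",
}
LOGICAL = {"And": "AND", "Or": "OR"}


def _translate(node, allowed_columns, params):
    tag = node.tag[len(OGC):] if node.tag.startswith(OGC) else node.tag
    if tag in COMPARISONS:
        column = node.findtext(OGC + "PropertyName")
        literal = node.findtext(OGC + "Literal")
        # Only whitelisted column names may appear in the SQL text.
        if column not in allowed_columns:
            raise ValueError(f"unknown property: {column!r}")
        params.append(literal)
        return f"{column} {COMPARISONS[tag]} %s"
    if tag in LOGICAL:
        parts = [_translate(child, allowed_columns, params) for child in node]
        return "(" + f" {LOGICAL[tag]} ".join(parts) + ")"
    raise ValueError(f"unsupported filter element: {tag}")


def filter_to_where(filter_xml, allowed_columns):
    """Return (where_clause, params) for a Filter document in the supported subset."""
    root = ET.fromstring(filter_xml.strip())
    children = list(root)
    if root.tag != OGC + "Filter" or len(children) != 1:
        raise ValueError("expected an ogc:Filter with exactly one child")
    params = []
    return _translate(children[0], allowed_columns, params), params


EXAMPLE = """
<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
  <ogc:And>
    <ogc:PropertyIsEqualTo>
      <ogc:PropertyName>name</ogc:PropertyName>
      <ogc:Literal>San Andreas</ogc:Literal>
    </ogc:PropertyIsEqualTo>
    <ogc:PropertyIsGreaterThan>
      <ogc:PropertyName>length_km</ogc:PropertyName>
      <ogc:Literal>100</ogc:Literal>
    </ogc:PropertyIsGreaterThan>
  </ogc:And>
</ogc:Filter>
"""

where_clause, where_params = filter_to_where(EXAMPLE, {"name", "length_km"})
print(where_clause, where_params)
```

Spatial operators such as bounding-box tests are left out; they would need a mapping to whatever spatial functions the target database provides.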

WFS Performance Improvements: Streaming WFS
- Issues with the Web Service version of the WFS:
  - Synchronous request-response style
  - Handling non-trivial data transfers, large data requests, SOAP overhead
  - XML encoding: GML encoding increases the size of the geospatial data, which increases transfer times or may cause exceptions
- To improve performance of the WFS:
  - We used a publish/subscribe messaging system for high-performance data transfer. The result is similar to WFS but separates the data and control channels, which allows one-to-many data distribution.
  - We used a streaming database connection (MySQL) for faster retrieval of query results and lower GML creation overhead.
  - Binary XML frameworks are integrated to reduce XML payload size, which improves transfer times. We used the BNUX and Fast Infoset frameworks in our tests.
  - Binding data transfer to the publish/subscribe messaging system reduces SOAP overhead.
  - Database processing, GML creation and data transport are all streaming.
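The control/data channel separation can be sketched as follows. This is an illustration under assumptions, not the Streaming WFS code: the control reply is a plain dictionary rather than a SOAP message, `Broker` is a toy in-memory stand-in for the messaging system, `encode_feature` emits a placeholder string rather than real GML, and all names and topics are hypothetical.

```python
from collections import defaultdict
from itertools import count

END_OF_STREAM = "<END>"  # hypothetical marker telling clients the result set is complete


class Broker:
    """Toy in-memory topic broker standing in for the messaging system."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self._subscribers[topic]):
            callback(message)


def encode_feature(row):
    """Placeholder encoding; a real service would emit a GML feature here."""
    return f"<feature id='{row['id']}' name='{row['name']}'/>"


class StreamingFeatureService:
    def __init__(self, broker, rows):
        self.broker = broker
        self.rows = rows
        self._request_ids = count(1)

    def get_feature(self):
        """Control channel: the reply carries no data, only where to find it."""
        return {"data_topic": f"/demo/wfs/results/{next(self._request_ids)}"}

    def stream_results(self, data_topic):
        """Data channel: publish features one at a time, then an end marker."""
        for row in self.rows:
            self.broker.publish(data_topic, encode_feature(row))
        self.broker.publish(data_topic, END_OF_STREAM)


broker = Broker()
service = StreamingFeatureService(
    broker, [{"id": 1, "name": "fault-a"}, {"id": 2, "name": "fault-b"}]
)

reply = service.get_feature()  # small control message
client_a, client_b = [], []
broker.subscribe(reply["data_topic"], client_a.append)  # one-to-many:
broker.subscribe(reply["data_topic"], client_b.append)  # both clients get every feature
service.stream_results(reply["data_topic"])

print(len(client_a), len(client_b))
```

Because results are published feature by feature, a client can start processing before the query has finished, and additional subscribers cost the service nothing extra.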

GIS Grid Example – IEISS Integration (LANL)
[Figure labels: IEISS, Web Map Service, WMS User Interface, NB Server, NB Server 2, UDDI Registry Service, Context Service, Web Feature Service (Service Interface, NB Interface, GML Builder), MySQL Feature Database]
Credits: WMS, Ahmet Sayar; UDDI and Context Service, Mehmet Aktas

Streaming WFS + AJAX
[Figure: real-time positions on Google Maps]

Streaming WFS Performance Tests
- The goal is to measure the performance of the Streaming WFS with and without the Binary XML integration.
- We test system performance against message size by varying the number of features per request.
- We use the BNUX and Fast Infoset Binary XML frameworks for compressing the GML FeatureCollection documents.
- The BNUX and FI timings include encoding and decoding costs.
[Figure labels: Client App, WSDL, Request Handler, DB Query Builder, DB Manager, GML Builder, Binary XML Encoder, NB Publisher, NB Server, NB Subscriber, Binary XML Decoder]

Streaming WFS Performance Tests (results figure)

Contributions
- Proposed and implemented a service-oriented architecture that provides a common platform supporting both archival and real-time geospatial data in data-centric Grids.
- Integrated Web Services with Open Geographic Standards to support interoperability at both data and application levels.
- Showed that GIS services can be implemented as streaming services.
- Integration of Binary XML frameworks with the streaming services shows performance gains over long network distances.
- Showed that Sensor Grids can be built on top of publish/subscribe middleware.
  - Continuous real-time data support is achieved in a service architecture.
  - Scalable architecture implementation for large numbers of sensor networks.
- Detailed investigation of the scalability and performance of the system.

Acknowledgement
- Mehmet Aktas: UDDI and WS-Context
- Ahmet Sayar: WMS Server and Client
- ZhiGang Qi: SensorGrid Performance Tests
- We thank Prof. Yehuda Bock and his group at SIO for their help with real-time GPS data streams.
- The work described in this presentation is part of the QuakeSim project, which is supported by the Advanced Information Systems Technology Program of NASA's Earth-Sun System Technology Office.
- This collaboration is part of the NASA ACCESS ROSES funded project, Modeling and On-the-fly Solutions in Solid Earth Science.

Additional Slides

Future Work
- Exploring the use of UDP transport for sensor streams, which could potentially improve NB-related performance.
- Investigating real-time sensor workflows with Grid workflow tools such as Taverna.
- A smart selection tool for choosing the best Binary XML format for particular geographic features. This could be based on a Case Based Reasoning (CBR) approach.

Related Work
- Linked Environments for Atmospheric Discovery (LEAD) addresses fundamental IT and meteorology research challenges to create an integrated framework for analyzing and predicting the atmosphere.
- Open-source Project for a Network Data Access Protocol (OPeNDAP) is a framework that aims to simplify all aspects of scientific networking; it allows access to scientific data over the internet from applications that were not specifically designed for that purpose.
- The Real-time Observatories, Applications, and Data management Network (ROADNet) focuses on the challenges of building wireless sensor networks for various types of observations, and on the information management system that will deliver these sensor observations to users in real time.
- The Laboratory for Advanced Information and Technology Standards (LAITS) at George Mason University researches Grids (based on Globus technology) in Earth and space science.

Processing Real-Time GPS Streams
[Figure: a complete sensor message processing path, including a data analysis application. Labels: RTD Server, Raw Data, RYO Ports, ryo2nb, NB Server, ryo2ascii, ascii2gml, ascii2pos, Single Station, Displacement Filter, Station Health Filter, RDAHMM Filter]
Topics: /SOPAC/GPS/CRTN01/RYO, /SOPAC/GPS/CRTN01/ASCII, /SOPAC/GPS/CRTN01/POS, /SOPAC/GPS/CRTN01/DSME