SensorGrid: High Performance Web Service Architecture for Geographic Information Systems. Thesis Proposal, Galip Aydin

Outline
Introduction
Motivations
SensorGrid Architecture
Research Issues and Goals
Contributions

Geographic Information Systems
A geographic information system (GIS) is a system for creating and managing spatial data and associated attributes.
It is a computer system capable of integrating, storing, editing, analyzing, and displaying geographically referenced information.
It is a "smart map" tool that lets users create interactive queries (user-created searches), analyze spatial information, and edit data.
Maps are created by overlaying various geospatial features.

Traditional GIS approach
Mostly desktop applications that require expertise and a large amount of resources.
Centralized client-server models for web-based GIS environments.
Cross-vendor or cross-product interoperability is not possible without costly format conversions.
Most applications consume archived data, but with advances in sensors, new applications that consume real-time data are appearing in abundance.

Traditional GIS approach (contd.)
Limitations:
  Distributed nature of geospatial data.
  Proprietary data formats and service methodologies.
  Lack of interoperable services.
Problems:
  Assembling data from distributed sources.
  Format conversions.
  Amount of resources needed for geoprocessing.

Open GIS Standards
Several standards bodies have started developing data standards and implementation specifications for geospatial and location-based services.
The goal is to make geographic information and services neutral and available across any network, application, or platform.
Two major organizations are the Open Geospatial Consortium (OGC) and ISO/TC211.

OGC
Supports interoperable solutions that "geo-enable" the Web.
Several specifications:
  Geospatial data: Geography Markup Language (GML).
  Sensors: SensorML for metadata; Observations & Measurements (a GML extension) for measurements.
  Services: Web Feature Service, Web Map Service, Web Coverage Service, etc.

Issues with Open Standards
HTTP GET/POST based services; limited data transport capabilities (HTTP, FTP, files etc.).
Not Web Services; tightly coupled, point-to-point communication results in centralized, synchronous applications.
High-end scientific and complex GIS applications require:
  Asynchronous communication models to cope with the high number of participants and long-running codes.
  Transfer of large data between services.
  Coupling of data sources and high performance tools.
  Orchestration of multiple services for solving complex problems.

Motivation 1
Complex problems require GIS applications and services to collaborate.
  Lack of service orchestration capabilities.
  The lack of service-oriented practice leads to distributed practices that are hard to manage, especially when a large number of participants are involved.
Coupling data sources to GIS applications.
  GIS applications use various types of distributed geospatial data sources, and we need a flexible computing environment for seamless integration.

Motivation 2
Data transport requirements
  GIS require large amounts of data to be transported between sources and consumers. Current approaches do not provide a scalable and flexible solution.
High performance
  It is a must, not an option, for most scientific GIS applications. For instance, evaluating pre-seismic real-time messages may lead to early warnings.
Proliferation of sensors
  Sensors introduce new challenges to current GIS applications in terms of data collection, management and processing.

Motivating Examples
Pattern Informatics
  Earthquake forecasting code developed by Prof. John Rundle (UC Davis) and collaborators. Uses seismic archives.
Regularized Dynamic Annealing Hidden Markov Method (RDAHMM)
  Time series analysis code by Dr. Robert Granat (JPL). Can be applied to GPS and seismic archives. Can be applied to real-time data.
Interdependent Energy Infrastructure Simulation System (IEISS)
  Models infrastructure networks (e.g. electric power systems and natural gas pipelines) and simulates their physical behavior and the interdependencies between systems.

SOA for GIS
Utilize Web Services to realize a Service Oriented Architecture, and Open GIS standards for data formats and service interfaces to achieve interoperability.
We have built WS versions of:
  WFS: access to geospatial data in various databases.
  WMS (A. Sayar): visualization of feature data.
  Extended UDDI and WS-Context (M. Aktas): support for dynamic service metadata and a services registry.
Problems with the simple WS version:
  Basic WFS is request-response, not asynchronous.
  Performance: GI services are not designed to handle non-trivial data transfers.
  XML: the size of geospatial data increases with XML encoding.

GIS Data Grids
Data is at the heart of every GIS. Easy and fast access to distributed geospatial data is crucial, especially in times of crisis or disaster.
Points to consider:
  High performance transport.
  Real-time observations from distributed sensors.
  Unified access to geospatial data stored in relational DBs, XML DBs and ESRI Shapefiles.
  Leverage the OGC Web Feature Service to provide standard access and query interfaces.
  Develop a Web Service version of WFS and modify/extend it for high performance.
  Fast population of GML Feature Collections from data in the various DBs.

GIS Data Services
WFS specification: transporting high-volume geospatial data encoded in GML is not trivial with HTTP methods or pure Web Services.
We are researching the use of a publish/subscribe based messaging system for large data transport and fast response.
Issues:
  Support for multiple clients; creating topics on the fly.
  Dynamic session metadata: keeping session state and metadata for each client and request. Use of WS-Context.
  Prioritizing client requests.
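A minimal sketch of two of the issues listed on this slide, creating a result topic on the fly for each request and prioritizing client requests. The slides do not describe an implementation; the class `RequestQueue`, the topic naming scheme, the client names and the query strings are all hypothetical, and a real deployment would hand the topic to the messaging broker rather than keep it in memory.

```python
import heapq
import itertools
import uuid


class RequestQueue:
    """Hypothetical priority queue of client requests (lower number = served first)."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: FIFO within one priority level

    def submit(self, client_id, query, priority):
        # A fresh topic is created for every request, so each client
        # receives only the results of its own query.
        topic = f"/wfs/results/{client_id}/{uuid.uuid4().hex}"
        heapq.heappush(self._heap, (priority, next(self._order), topic, query))
        return topic

    def next_request(self):
        """Pop the most urgent pending request as (topic, query)."""
        _, _, topic, query = heapq.heappop(self._heap)
        return topic, query


queue = RequestQueue()
queue.submit("client-a", "hypothetical bulk query", priority=5)
urgent_topic = queue.submit("client-b", "hypothetical urgent query", priority=1)
topic, query = queue.next_request()  # the priority-1 request is served first
```

The service would publish the GML response on the returned topic, and the client would subscribe to that topic before the request is processed.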

Real-Time Sensors
Sensors are everywhere; they are being deployed as sensor networks for more accurate measurements.
With the proliferation of sensors, data collection and processing paradigms are changing.
Most scientific geo-applications are designed to work with archived data.
Critical infrastructure systems and crisis management environments require fast and accurate access to real-time sources and a flexible, pluggable architecture for geoprocessing of the data.

Use Case - GPS Sensors
GPS station networks are a good example of scientific sensors.
GPS measurements are used for determining seismic events, understanding long-term crustal movement etc.
We have access to SOPAC GPS networks:
  Currently only socket-based RYO format access is available, and it is not utilized.
  We provide real-time streaming access in multiple formats (RYO, ASCII, GML) by using NaradaBrokering topics.
  OHIO and chain of filters.
We are investigating the use of topic-based messaging systems for managing real-time data streams.
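The idea of a chain of filters on messaging topics can be sketched as follows. This is an illustrative in-memory stand-in, not the NaradaBrokering API: the `Broker` class, the topic names, the station ID and the record fields are assumptions, the RYO message is assumed to be already decoded into a dictionary, and the GML output is a simplified fragment without namespace declarations.

```python
from collections import defaultdict


class Broker:
    """Minimal in-memory topic-based publish/subscribe broker (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self._subscribers[topic]):
            callback(message)


def attach_filter(broker, in_topic, out_topic, transform):
    """Subscribe to in_topic, transform each message, republish on out_topic."""
    broker.subscribe(in_topic, lambda msg: broker.publish(out_topic, transform(msg)))


def ryo_to_ascii(record):
    # Assumption: the binary RYO message was already decoded into a dict.
    return f"{record['station']} {record['lat']:.4f} {record['lon']:.4f}"


def ascii_to_gml(line):
    station, lat, lon = line.split()
    return (f'<gml:Point gml:id="{station}">'
            f"<gml:pos>{lat} {lon}</gml:pos></gml:Point>")


broker = Broker()
attach_filter(broker, "/SOPAC/GPS/RYO", "/SOPAC/GPS/ASCII", ryo_to_ascii)
attach_filter(broker, "/SOPAC/GPS/ASCII", "/SOPAC/GPS/GML", ascii_to_gml)

received = []
broker.subscribe("/SOPAC/GPS/GML", received.append)
broker.publish("/SOPAC/GPS/RYO", {"station": "STN1", "lat": 33.0, "lon": -117.0})
```

Each filter only knows its input and output topic, so a client can subscribe at whichever stage matches the format it needs, and a new filter can be attached without touching the existing ones.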

SensorGrid Architecture
Support both archived and real-time geospatial data access.
Support alternate transport and representation schemes.
Use a topic-based messaging infrastructure for large-volume data transport.
WS-Context for managing dynamic service metadata.
UDDI-based FTHPIS as the services registry.
Streaming WFS for serving archived data.
Streaming SCS for serving sensor metadata and sensor measurements.

Framework for HP WS
Research improving Web Service performance by using a better transport protocol and XML representation scheme.
Virtualize representation and protocol by binding SOAP to message-oriented middleware.
Handlers will negotiate the protocol and convert messages between different representations.
WS-Context for keeping session metadata related to methodology and specific parameters.

Negotiation Protocol
Design a negotiation protocol for Web Services to negotiate:
  Transport protocol: HTTP over TCP, Parallel TCP, UDP, …
  Efficient representation of XML: BXSA, bnux, BXML, MTOM, Fast Infoset, Millau, XOP, DFDL, Fast Web Services, …
  Other (security etc.)
Try to develop strategies for determining:
  The best available protocol.
  The best representation for a given communication.
We will investigate using or extending WS-Policy to build a negotiation protocol.
We will not develop a binary representation method, but will build a framework that supports multiple binary formats.
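One possible selection strategy, sketched under stated assumptions: each end-point advertises its transports and representations in order of preference, the first option the peer also supports wins, and the conversation falls back to plain XML over HTTP otherwise. The function names, the dictionary layout and the fallback rule are hypothetical; the proposal leaves the actual protocol (possibly built on WS-Policy) open.

```python
DEFAULT_TRANSPORT = "HTTP"
DEFAULT_REPRESENTATION = "XML"


def pick(preferred, offered, default):
    """Return the first entry of `preferred` that the peer also offers."""
    offered = set(offered)
    for option in preferred:
        if option in offered:
            return option
    return default


def negotiate(initiator, responder):
    """Both arguments are dicts with ordered 'transports' and 'representations' lists."""
    return {
        "transport": pick(initiator["transports"],
                          responder["transports"], DEFAULT_TRANSPORT),
        "representation": pick(initiator["representations"],
                               responder["representations"], DEFAULT_REPRESENTATION),
    }


client = {"transports": ["UDP", "Parallel TCP", "HTTP"],
          "representations": ["Fast Infoset", "BXSA", "XML"]}
service = {"transports": ["Parallel TCP", "HTTP"],
           "representations": ["XML"]}
legacy = {"transports": [], "representations": []}

agreed = negotiate(client, service)
fallback = negotiate(client, legacy)
```

Here the initiator's preference order decides ties; a fuller strategy could also weigh message size or measured link quality when determining the best available protocol.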

Research Issues 1
Applying Web Service principles to GIS data services
  We have built a WS version of WFS.
  It is not suitable for large data sets or where a quick response is required.
High performance
  The system should support HP data transport for GIS services.
Interoperability
  The system should bridge the GIS and Web Service communities by adapting standards from both.
  Other GIS applications should be able to consume data without costly format conversions.
Security

Research Issues 2
Scalability
  The system should be able to handle high-volume, high-rate data transport and processing.
  Plugging in new sensors, data sources or geoprocessing applications should not degrade the system's overall performance.
Flexibility and extensibility
  Setting architectural principles for real-time filters that process sensor data on the fly.
  Ability to add new filters without system failures.
Quality of Service
  Is the latency introduced by filter chains in processing real-time sensor data acceptable?
  Is the system fault tolerant?

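Scaling Measurements
Network                                        Time     RYO         ASCII       GML
1 SOPAC network (SDCRTN, 9 stations)           1 sec    1.5 KB      4.03 KB     48.7 KB
1 SOPAC network (SDCRTN, 9 stations)           1 hr     5.31 MB     14.18 MB    171.31 MB
1 SOPAC network (SDCRTN, 9 stations)           1 day    127.44 MB   340.38 MB   4.01 GB
1 SOPAC network (SDCRTN, 9 stations)           1 month  3.8 GB      9.97 GB     123.3 GB
1 SOPAC network (SDCRTN, 9 stations)           1 yr     45.8 GB     119.67 GB   1.41 TB
Entire SOPAC network, 5 networks (47 stations) 1 yr     229 GB      598.35 GB   7.05 TB
Entire SCIGN network (250 stations)            1 yr     1.23 TB     16.18 TB    160 TB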
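The longer-interval rows follow from the per-second sizes by straightforward scaling, which the sketch below reproduces for the 9-station network. It assumes a constant message rate and binary units (1024 KB per MB); the slide states neither, and its per-second figures are rounded, so the computed values land close to the table's hourly and daily rows rather than matching them digit for digit.

```python
KB_PER_MB = 1024  # assumption: binary units; the slide does not state the convention

# Per-second message sizes in KB, taken from the "1 sec" row of the table.
PER_SECOND_KB = {"RYO": 1.5, "ASCII": 4.03, "GML": 48.7}


def volume_mb(kb_per_second, seconds):
    """Data volume in MB accumulated over `seconds` at a constant rate."""
    return kb_per_second * seconds / KB_PER_MB


hourly = {fmt: volume_mb(rate, 3600) for fmt, rate in PER_SECOND_KB.items()}
daily_gml_gb = volume_mb(PER_SECOND_KB["GML"], 86400) / 1024

for fmt, mb in hourly.items():
    print(f"{fmt}: {mb:.2f} MB per hour")
print(f"GML: {daily_gml_gb:.2f} GB per day")
```

The same GPS stream is roughly thirty times larger in GML than in RYO, which is the reason the proposal treats XML representation and transport as performance issues.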

Research Goals
Design a High Performance Web Service architecture for distributed GIS services to support archived and real-time geospatial data.
Build GIS Data Services for coupling scientific applications with various types of distributed geospatial databases.
Implement Web Service versions of:
  Web Feature Service, for archived data.
  Sensor Collection Service, for real-time geospatial data and sensor metadata.
Utilize a publish-subscribe based messaging infrastructure to deploy distributed filters for processing real-time sensor data.
Develop a negotiation protocol for Web Services to support high performance data transport.

Contribution of This Thesis
Merges two important software worlds: GIS and Web Service architectures.
Allows unified access to data by developing services, based on Web Services and Open GIS standards, to access and manage archived and real-time geospatial data.
Develops a novel way of deploying filter chains on a topic-based messaging system for processing real-time streaming sensor data.
Identifies a novel approach for negotiating various characteristics of communication between Web Services for high performance messaging.

Appendix

Sample GML Document
[XML markup not preserved in the transcript; surviving text values: -83,25 -80,31 City Gate #10 CG E J27]

Sample GML visualization

RYO Message Format

High Performance XML I (G. Fox)
There are many approaches to efficient "binary" representations of XML Infosets:
  MTOM, XOP, Attachments, Fast Web Services.
  DFDL is one approach to specifying a binary format.
Assume URI-S labels the Scheme and URI-R labels the realization of the Scheme for a particular message,
  i.e. URI-R defines the specific layout of information in each message.
  DFDL from GGF is quite interesting for this.
Assume we are interested in conversations where a stream of messages is exchanged between two services or between a client and a service, i.e. two end-points.
Assume that we need to communicate fast between end-points that understand scheme URI-S, but must support the conventional representation if one end-point does not understand URI-S.

High Performance XML II (G. Fox)
The first handler Ft=F1 handles the transport protocol; it negotiates with the other end-point to establish a transport conversation which uses either HTTP (default) or a different transport such as UDP with WSRM implementing reliability.
  URI-T specifies the transport choice.
The second handler Fr=F2 handles representation; it negotiates a representation conversation with scheme URI-S and realization URI-R.
  Negotiation identifies the parts of the SOAP header that are present in all messages in a stream and are transmitted ONLY ONCE.
Fr needs to negotiate with the Service and the other handlers, illustrated by F3 and F4 below, to decide what representation they will process.
[Figure: container handlers F1, F2, F3, F4]

High Performance XML III (G. Fox)
Filters controlled by the Conversation Context convert messages between representations, using a permanent context (metadata) catalog to hold the conversation context.
Different message views are possible for each end-point, or even for individual handlers and the service within one end-point.
  The Conversation Context is a fast dynamic metadata service that enables the conversions.
NaradaBrokering will implement Fr and Ft using its support of multiple transports, fast filters and message queuing.
[Figure labels: message headers H1, H4, H3, H2 and Body; Service; Conversation Context (URI-S, URI-R, URI-T); Replicated Message Header; Transported Message; Handler Message View; Service Message View; container handlers Ft, Fr, F3, F4]
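The idea of sending stream-wide header parts only once and rebuilding the full message view from the conversation context might look like the sketch below. Everything here is an assumption made for illustration: the `ConversationContext` class, its method names, the header field names and the values. The slides describe the concept, not an implementation, and in the proposed design this work would sit in the representation handler Fr.

```python
class ConversationContext:
    """Hypothetical store for header fields shared by every message in a stream."""

    def __init__(self):
        self._shared = {}

    def open(self, conversation_id, shared_header):
        # Sent once when the conversation is negotiated.
        self._shared[conversation_id] = dict(shared_header)

    def compact(self, conversation_id, header):
        """Sender side: drop fields whose values are already held in the context."""
        shared = self._shared[conversation_id]
        return {k: v for k, v in header.items()
                if k not in shared or shared[k] != v}

    def expand(self, conversation_id, compact_header):
        """Receiver side: rebuild the full header view for the service."""
        return {**self._shared[conversation_id], **compact_header}


context = ConversationContext()
shared = {"To": "urn:example:service", "ReplyTo": "urn:example:client", "URI-T": "udp"}
context.open("conv-1", shared)

full_header = {**shared, "MessageID": "42"}
on_the_wire = context.compact("conv-1", full_header)   # only per-message fields remain
rebuilt = context.expand("conv-1", on_the_wire)        # what the service sees
```

The transported message carries only the fields that change per message, while the service message view stays identical to the conventional header.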

RDAHMM: GPS Time Series Segmentation (M. Pierce)
Slide courtesy of Robert Granat, JPL.
Complex data with subtle signals is difficult for humans to analyze, leading to gaps in analysis.
HMM segmentation provides an automatic way to focus attention on the most interesting parts of the time series.
GPS displacement (3D), length two years, divided automatically by HMM into 7 classes.
Features:
  Dip due to aquifer drainage (days )
  Hector Mine earthquake (day 626)
  Noisy period at end of time series

Traditional NaradaBrokering Features (G. Fox)
Multiple protocol transport support (in publish-subscribe paradigm with different protocols on each link): Transport protocols supported include TCP, Parallel TCP streams, UDP, Multicast, SSL, HTTP and HTTPS. Communications through authenticating proxies/firewalls & NATs. Network QoS based routing. Allows highest performance transport.
Subscription formats: Subscription can be Strings, Integers, XPath queries, Regular Expressions, SQL and tag=value pairs.
Reliable delivery: Robust and exactly-once delivery in presence of failures.
Ordered delivery: Producer Order and Total Order over a message type. Time Ordered delivery using Grid-wide NTP based absolute time.
Recovery and replay: Recovery from failures and disconnects. Replay of events/messages at any time. Buffering services.
Security: Message-level WS-Security compatible security.
Message payload options: Compression and decompression of payloads. Fragmentation and coalescing of payloads.
Messaging related compliance: Java Message Service (JMS) 1.0.2b compliant. Support for routing P2P JXTA interactions.
Grid feature support: NaradaBrokering enhanced Grid-FTP. Bridge to Globus GT3.
Web Services supported: Implementations of WS-ReliableMessaging, WS-Reliability and WS-Eventing.