Published by Osborne Phelps. Modified over 9 years ago.
1
High Performance Web Service Architecture for Sensors and Geographic Information Systems. Galip Aydin
2
Geographic Information Systems. A Geographic Information System (GIS) is a system for creating, storing, sharing, analyzing, manipulating and displaying spatial data and associated attributes. GIS evolved from mainframe GIS to desktop GIS to distributed GIS. Modern GIS require distributed data access for spatial databases and the ability to use remote analysis, simulation or visualization tools.
3
Traditional Distributed GIS Approach. Problems with traditional approaches: the distributed nature of geo-data (various client-server models, databases, HTTP, FTP, RDBs, XML DBs etc.); data format problems and conversion overheads; data processing issues, hardware and software requirements, and COM+/ActiveX and CORBA/IIOP frameworks. These introduce three challenges: assembling data from distributed repositories; adopting universal standards for format interoperability; and providing interoperable services for better utilization of computational resources.
4
Open Geographic Standards. Open GIS standards bodies aim to make geographic information and services neutral and available across any network, application, or platform. There are two major standards bodies, OGC and ISO/TC211, the former being the more popular. OGC specifications are widely accepted. Data format specs: GML, SensorML, O&M. Service specs: WFS, WMS, WCS. OGC services are HTTP GET/POST based, with limited data transport capabilities (HTTP, FTP, files etc.). They are not Web Services; tightly coupled, point-to-point communication results in centralized, synchronous applications.
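To make the HTTP GET style of OGC services concrete, here is a minimal sketch of how a client might assemble a WFS GetFeature request as key-value pairs. The endpoint `http://example.org/wfs` and the feature type `topp:faults` are hypothetical placeholders, and WFS 1.1.0 is assumed only for illustration; the slides do not say which version was used.

```python
from urllib.parse import urlencode


def build_getfeature_url(base_url, type_name, bbox=None, max_features=None):
    """Build a key-value-pair WFS GetFeature URL (WFS 1.1.0 parameter names assumed)."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "1.1.0",
        "REQUEST": "GetFeature",
        "TYPENAME": type_name,
    }
    if bbox is not None:
        # Bounding box as comma-separated corner coordinates.
        params["BBOX"] = ",".join(str(v) for v in bbox)
    if max_features is not None:
        params["MAXFEATURES"] = str(max_features)
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and feature type, for illustration only.
url = build_getfeature_url(
    "http://example.org/wfs",
    "topp:faults",
    bbox=(-125.0, 32.0, -114.0, 42.0),
    max_features=100,
)
print(url)
```

Each such request is a single synchronous exchange between one client and one server, which is the point-to-point coupling the slide describes.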
5
Motivations. Lack of service orchestration capabilities: complex problems require GIS applications to collaborate. Coupling data sources to scientific applications: data transport requirements. Proliferation of sensors: ability to analyze data on the fly, continuous streaming support, scalable systems for adding new sensors. High-performance and high-rate messaging: real-time data access, rapid response systems, crisis management etc. From the Grid perspective: apply general Grid/distributed computing principles to GIS and investigate how to integrate with geophysical and other scientific applications.
6
Motivating Use Cases. Pattern Informatics: earthquake forecasting code developed by Prof. John Rundle (UC Davis) and collaborators; uses seismic archives. Regularized Dynamic Annealing Hidden Markov Method (RDAHMM): time series analysis code that can be applied to GPS and seismic archives as well as to real-time data. Interdependent Energy Infrastructure Simulation System (IEISS): models infrastructure networks (e.g. electric power systems and natural gas pipelines) and simulates their physical behavior and the interdependencies between systems. SOPAC GPS networks provide real-time messages.
7
Research Issues 1. Applying Web Service principles to GIS data services: orchestration of services and workflows; simple services are not suitable for large data sets or where quick response is required. High-performance support in GIS services. Interoperability: the system should bridge the GIS and Web Service communities by adapting standards from both. Other GIS applications should be able to consume data without costly format conversions.
8
Research Issues 2. Scalability: the system should handle high-volume, high-rate data transport and processing. Plugging in new sensors, data sources or geoprocessing applications should not degrade the system's overall performance. Flexibility and extensibility: how to develop real-time services that process sensor data on the fly; ability to add new filters without system failures. Quality of Service: is the latency introduced by services in processing real-time sensor data acceptable?
9
SOA for GIS – Geophysical Data Grid. We use Web Services to realize a Service Oriented Architecture, and OGC data formats and application interfaces for interoperability at both the data and application levels. GIS Data Grid properties: Based on its source, geospatial data can be classified as archival or real-time; the architecture provides standard control and access interfaces for both types. It supports alternate transport and representation schemes and uses a topic-based messaging infrastructure for large-volume data transport. UDDI-based FTHPIS serves as the services registry. Streaming and non-streaming services provide access to archived data. Real-time and near-real-time services provide access to sensor metadata and sensor measurements.
10
Geophysical Data Grid Architecture [Figure: Archival Data Grid and Real-Time Data Grid]
11
GIS Grid 1 - Archival Data Services. Web Feature Service (WFS) is the default OGC specification for vector data. We built a Web Service version of WFS for accessing geospatial data in distributed databases. The first Web Service version of WFS has been used successfully in several scientific workflows with other services (WMS, HPSearch, FTHPIS). WFS can access multiple distributed databases and can query other WFSs for remote features. Problems with the Web Service version of WFS: Request-response interaction, not asynchronous. Performance: GI services are not designed to handle non-trivial data transfers such as large data requests, and SOAP adds overhead. XML encoding: GML encoding increases the size of the geospatial data, which increases transfer times or may cause exceptions.
12
WFS Performance Improvements: Streaming WFS. To improve the performance of the WFS we made four changes. We used a publish/subscribe messaging system for high-performance data transfer; the result is similar to WFS but separates the data and control channels and allows one-to-many data distribution. We used a streaming database connection (MySQL) for faster retrieval of query results and lower GML creation overhead. We integrated Binary XML frameworks to reduce XML payload size, which improves transfer times. We bound data transfer to the Grid messaging middleware, which reduces SOAP creation overhead.
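The following is a minimal sketch of the streaming idea, not the authors' implementation: query results are read from the database in small batches and each batch is published to a topic as soon as it is ready, so several subscribers receive the same stream. `InMemoryBroker` is a toy stand-in rather than the NaradaBrokering API, `sqlite3` replaces MySQL so the example is self-contained, and the topic name, table layout and end-of-stream marker are assumptions of this sketch.

```python
import sqlite3
from collections import defaultdict


class InMemoryBroker:
    """Toy publish/subscribe broker (illustrative; not NaradaBrokering's API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers[topic]:
            callback(message)


def stream_features(conn, broker, topic, batch_size=2):
    """Read rows incrementally and publish each batch on the data channel."""
    cursor = conn.execute("SELECT name, x, y FROM features ORDER BY name")
    sent = 0
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        broker.publish(topic, rows)
        sent += len(rows)
    broker.publish(topic, None)  # end-of-stream marker (convention of this sketch)
    return sent


# Hypothetical feature table standing in for a spatial database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (name TEXT, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?)",
    [("f1", 1.0, 2.0), ("f2", 3.0, 4.0), ("f3", 5.0, 6.0)],
)

broker = InMemoryBroker()
received_a, received_b = [], []
# Two clients subscribe to the same result topic: one-to-many distribution.
broker.subscribe("wfs/results/42", received_a.append)
broker.subscribe("wfs/results/42", received_b.append)
count = stream_features(conn, broker, "wfs/results/42")
```

In a design like this the request itself (the control channel) only needs to return the topic name, while the feature data travels over the messaging system.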
13
WFS Interaction with services and data sources
14
GIS Grid Example – IEISS Integration. WMS: Ahmet Sayar. UDDI, Context Service: Mehmet Aktas.
15
Streaming WFS Performance. We test the system with up to 10,000 features. The tests measure the performance of the streaming service with and without Binary XML integration. We use the BNUX and Fast Infoset (FI) Binary XML frameworks to compress the GML FeatureCollection documents. The BNUX and FI timings include encoding and decoding costs.
16
GIS Grid 2 - Real-Time Data Services. Sensors and sensor networks are being deployed to measure various geophysical quantities. Sensors and GIS are closely related: sensor measurements are used by GIS for statistical or analytical purposes. With the proliferation of sensors, data collection and processing paradigms are changing. Most scientific geo-applications are designed to work with archived data. Critical infrastructure systems and crisis management environments require fast and accurate access to real-time sources and a flexible, pluggable architecture for geoprocessing of the data.
17
SensorGrid Architecture. Major components: real-time filters, Grid messaging substrate, information service. Filters can be run as Web Services to create workflows. Filter chains can be deployed for complex processing. Streaming messaging provides high-performance transfer options.
18
Real-Time Filters. Real-time data processing is supported by deploying filters around a publish/subscribe messaging system. The filters extend a generic class from which they inherit publish and subscribe capabilities. They can be connected in parallel or in series as chains to solve complex problems. [Figure: a Filter with an Input Signal and an Output Signal]
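The filter design described on this slide can be sketched as follows. This is a simplified illustration under stated assumptions, not the SensorGrid code: `Broker` is a toy in-memory publish/subscribe system, and the class names `Filter`, `ScaleFilter` and `ThresholdFilter` as well as the topic names are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory publish/subscribe broker (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers[topic]:
            callback(message)


class Filter:
    """Generic filter: subscribes to an input topic, publishes to an output topic."""

    def __init__(self, broker, in_topic, out_topic):
        self.broker = broker
        self.out_topic = out_topic
        broker.subscribe(in_topic, self.on_message)

    def process(self, message):
        """Subclasses override this; returning None drops the message."""
        return message

    def on_message(self, message):
        result = self.process(message)
        if result is not None:
            self.broker.publish(self.out_topic, result)


class ScaleFilter(Filter):
    def __init__(self, broker, in_topic, out_topic, factor):
        self.factor = factor
        super().__init__(broker, in_topic, out_topic)

    def process(self, message):
        return message * self.factor


class ThresholdFilter(Filter):
    def __init__(self, broker, in_topic, out_topic, limit):
        self.limit = limit
        super().__init__(broker, in_topic, out_topic)

    def process(self, message):
        return message if message >= self.limit else None


broker = Broker()
# Serial chain: raw -> scaled -> alerts.
ScaleFilter(broker, "raw", "scaled", factor=2)
ThresholdFilter(broker, "scaled", "alerts", limit=10)
# Parallel branch: a second filter consumes the same raw topic independently.
ScaleFilter(broker, "raw", "negated", factor=-1)

alerts, negated = [], []
broker.subscribe("alerts", alerts.append)
broker.subscribe("negated", negated.append)
for value in (3, 5, 7):
    broker.publish("raw", value)
```

Because filters only know topic names, chaining in series means pointing one filter's input at another's output, and running in parallel means subscribing several filters to the same input topic.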
19
Filter Metadata and Chains [Figure: Parallel Operation and Serial Operation]
20
Use Case - GPS Sensors. GPS station networks are a good example of scientific sensors. GPS measurements are used for determining postseismic deformation, understanding long-term crustal movement etc. SOPAC GPS networks: 8 networks covering 80 stations produce 1 Hz high-resolution data. Socket-based real-time access in the binary RYO format is available, but not utilized! We developed filters to provide real-time streaming access in multiple formats (RYO, ASCII, GML). OHIO principle and chains of filters. We use the publish/subscribe-based NaradaBrokering system to manage real-time streams, with topics for hierarchical organization of the sensors.
21
SOPAC Real-Time Filters for GPS Streams
22
Application Integration with Real-Time Filters. The Station Monitor Filter records real-time positions for 10 minutes and calculates position changes; the Graph Plotter Application creates a visual representation of the positions. The RDAHMM Filter records real-time positions for 10 minutes and invokes the RDAHMM application, which determines state changes in the XYZ signal; the Graph Plotter Application creates a visual representation of the RDAHMM output.
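As a rough illustration of the Station Monitor idea, the sketch below keeps a sliding window of XYZ positions (600 samples would correspond to 10 minutes at 1 Hz) and reports the displacement between the oldest and newest sample. The class name `StationMonitor` and the choice of straight-line distance as the "position change" are assumptions; the slides do not specify how the change is computed.

```python
import math
from collections import deque


class StationMonitor:
    """Keeps the most recent positions of one station and reports displacement."""

    def __init__(self, window_samples=600):
        # deque with maxlen discards the oldest sample automatically.
        self.window = deque(maxlen=window_samples)

    def add_position(self, x, y, z):
        self.window.append((x, y, z))

    def displacement(self):
        """Straight-line distance between the oldest and newest position in the window."""
        if len(self.window) < 2:
            return 0.0
        return math.dist(self.window[0], self.window[-1])


# Small window so the behaviour is easy to follow by hand.
monitor = StationMonitor(window_samples=3)
monitor.add_position(0.0, 0.0, 0.0)
monitor.add_position(0.0, 0.0, 0.0)
monitor.add_position(3.0, 4.0, 0.0)
moved = monitor.displacement()

# Two more identical samples push the original position out of the window.
monitor.add_position(3.0, 4.0, 0.0)
monitor.add_position(3.0, 4.0, 0.0)
settled = monitor.displacement()
```

A filter built this way could publish the displacement on its own topic for a plotting client to consume.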
23
AJAX and Real-Time positions on Google maps
24
Recording and Replaying Sensor Streams. Filters can be used to record and replay scenarios, such as earthquakes in the GPS case. We developed RYO Recorder and RYO Publisher filters. The RYO Recorder creates daily archives of the GPS streams. The RYO Publisher can replay a full day or selected segments of the records. We replayed the 2004 Southern California Earthquake using the Parkfield GPS network archive.
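A record-and-replay pair can be sketched as below. This is an illustration only: `StreamRecorder` and `replay` are hypothetical names, messages are kept in memory rather than in daily archive files, and the payloads are placeholder bytes rather than real RYO messages.

```python
class StreamRecorder:
    """Stores each message together with the time it arrived."""

    def __init__(self):
        self.records = []

    def on_message(self, timestamp, payload):
        self.records.append((timestamp, payload))


def replay(records, start, end, publish, sleep=None, speed=1.0):
    """Republish records with start <= timestamp < end, preserving their spacing.

    `sleep` is called with the gap between consecutive messages (divided by
    `speed`); pass time.sleep for real-time playback or leave it as None to
    replay as fast as possible.
    """
    previous = None
    count = 0
    for timestamp, payload in records:
        if not (start <= timestamp < end):
            continue
        if sleep is not None and previous is not None:
            sleep((timestamp - previous) / speed)
        publish(payload)
        previous = timestamp
        count += 1
    return count


recorder = StreamRecorder()
for second in range(5):  # five messages at 1 Hz
    recorder.on_message(float(second), b"msg-%d" % second)

replayed, delays = [], []
# Replay only the segment from t=1 to t=4, collecting the delays instead of sleeping.
n = replay(recorder.records, 1.0, 4.0, replayed.append, sleep=delays.append)
```

Passing the sleep function as a parameter keeps the replay logic testable and makes faster-than-real-time playback a matter of changing `speed`.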
25
SensorGrid Performance Tests. Two major goals: system stability and scalability. Ensure the stability of the distributed filter services under continuous operation. Find the maximum number of publishers (sensors) and clients that a single broker can support. Investigate whether the system scales to large numbers of sensors and clients.
26
Test Methodology. The test system consists of a NaradaBrokering server and a three-filter chain for publishing, converting and receiving RYO messages. We take 4 timings to determine mean end-to-end delivery times of GPS measurements. The tests were run for at least 24 hours. The GridFarm001-008 servers are used in these tests. $T_{\mathrm{transfer}} = (T_2 - T_1) + (T_4 - T_3)$
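The transfer-time formula can be evaluated as in the short sketch below. The interpretation of the four timings is an assumption, since the slide does not define them: here T1 and T2 bracket the first hop and T3 and T4 the second hop, so the time spent inside the converting filter between T2 and T3 is excluded. The sample values are made up for illustration.

```python
from statistics import mean


def transfer_time(t1, t2, t3, t4):
    """T_transfer = (T2 - T1) + (T4 - T3): the sum of the two hop times."""
    return (t2 - t1) + (t4 - t3)


def mean_transfer_time(samples):
    """Average transfer time over many (T1, T2, T3, T4) tuples."""
    return mean(transfer_time(*sample) for sample in samples)


# Made-up timings in milliseconds for two messages.
samples = [
    (0, 5, 20, 28),        # hops of 5 ms and 8 ms
    (100, 104, 110, 113),  # hops of 4 ms and 3 ms
]
average = mean_transfer_time(samples)
```

Summing differences taken on the same machine, rather than subtracting T1 from T4 directly, avoids depending on synchronized clocks if each pair of timings is taken on one host; whether the original tests relied on that is not stated.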
27
1 – System Stability Test. The basic system with three filters and one broker. The figure shows average results for every 30 minutes. The average transfer time shows that continuous operation does not degrade system performance.
28
2 – Multiple Publishers Test. We add more GPS networks by running more publishers. The results show that 1000 publishers can be supported with no performance loss. The ceiling of 1000 is imposed by the operating system.
29
3 – Multiple Clients Test. We add more clients by running multiple Simple Filters that subscribe to the same ASCII topic. The system can support as many as 1000 clients with very little performance decrease. [Figure: adding clients, up to 1000 clients]
30
Extending Scalability. The limit of the basic system appears to be 1000 clients or publishers. This is due to an operating system restriction on open file descriptors (1024 for Red Hat Linux). To overcome this limit we create NaradaBrokering networks by linking multiple brokers. We run 2 brokers to support 1500 clients. More brokers can be added, so the system can potentially support any number of publishers and subscribers.
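The capacity arithmetic behind this slide can be sketched as follows. The per-broker limit of 1000 connections is taken from the slides; the helper names are hypothetical, and the calculation ignores any descriptors consumed by the links between brokers.

```python
import math


def brokers_needed(connections, per_broker_limit=1000):
    """Smallest number of brokers that can hold the given number of connections."""
    if connections < 0 or per_broker_limit <= 0:
        raise ValueError("connections must be >= 0 and per_broker_limit > 0")
    return math.ceil(connections / per_broker_limit)


def split_evenly(connections, brokers):
    """Distribute connections across brokers as evenly as possible."""
    base, extra = divmod(connections, brokers)
    return [base + (1 if i < extra else 0) for i in range(brokers)]


needed = brokers_needed(1500)              # the 1500-client case from the slide
assignment = split_evenly(1500, needed)    # how many clients each broker takes
```

On a Unix system the actual descriptor limit for the broker process can be inspected with `resource.getrlimit(resource.RLIMIT_NOFILE)` from the Python standard library, or raised by the administrator, which would change the per-broker figure used above.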
31
4 – Multiple Brokers Test. Messages published to the first broker can be received from the second broker. We take timings on each broker. We connect 750 clients to each broker and run for 24 hours. The results show that performance remains similar to the single-broker test.
32
4 – Multiple Brokers Test [Figure: 750 clients]
33
Real-Time Filters Test Results. The RYO Publisher filter runs at 1 Hz and publishes a 24-hour archive of the CRTN_01 GPS network, which contains 9 GPS stations. The single-broker configuration can support 1000 clients or publishers (1000 GPS networks, i.e. 9000 individual stations). The system can be scaled up by creating NaradaBrokering broker networks. Message order was preserved in all tests.
34
Contributions. A SOA approach to create a common platform that supports both archival and real-time geospatial data in data-centric Grids. Merging Web Services and Open Geographic Standards to support interoperability at both the data and application levels. We have shown that GIS services can be implemented as streaming services. Integration of Binary XML frameworks with the streaming services shows performance gains over long network distances. We have shown that Sensor Grids can be built on top of publish/subscribe middleware. Real-time continuous data support is realized in a service architecture. A scalable architecture implementation for large numbers of sensor networks.