Integrating Geographical Information Systems and Grid Applications
Marlon Pierce
Contributions: Yili Gong, Ahmet Sayar, Galip Aydin, Mehmet Aktas, Harshawardhan Gadgil, Zhigang Qi
Community Grids Lab, Indiana University
Project Funding: NASA AIST
Some Project Organizational Details
We need a project-wide mailing list or two. I can set these up really quickly at IU. Code repositories? I have started using SourceForge and SVN for several projects and have been generally happy. I think it's good for the visibility of the project and good to show program managers. SF also has project management stuff like Bugzilla-style bug trackers. But the licensing model may not work for JPL. I’ve also become a recent convert to using Wikis for group-editable web pages. We should do this or its equivalent. I have one at but it is trivial to make a new one on also. Very low maintenance.
Something New: Using the TeraGrid
The NSF TeraGrid is an administrative federation of supercomputing facilities across the country: SDSC, NCSA, IU, ANL, PU, ORNL, TACC, PSC.
Four useful TG facts: Almost any US researcher can apply to get 30,000 hours (somewhat painful web forms to fill out), and you can get more hours if you apply. The researcher can share the allocation with others (a one-page form; I used to give John, Gleb, Terry, and others accounts). All TG machines try to have the same software environments. All come with Globus installed.
Problems with TeraGrid
TeraGrid is still broken up into fiefdoms: Articles of Confederation instead of a Constitution. There is no way to do the following query: “Dear TG, I want to run the following GeoFEST simulation. It will require the following resources. Please submit to the best available machine. Love, Marlon.” You still have to log in to a specific machine and submit to its specific queuing system (PBS, LoadLeveler, LSF, etc.).
Our Solution: Condor-G
Condor is a famous scheduler/cycle scavenger from U Wisconsin. To use it, run the Condor software on all nodes. It has a “matchmaker” component that matches a user’s request to available resources, described by “ClassAds” in Condor-speak. Condor-G is a bit different: it is a Condor client interface that can submit Globus jobs. Globus in turn can hide differences between queuing systems. You only need Condor-G installed on one machine, which can be anywhere. Both Condor and Condor-G have a Web Service interface called Birdbath. We have built portlets out of these things.
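The matchmaking idea above can be illustrated with a toy sketch. This is not Condor's ClassAd language or API; the attribute names (`FreeCpus`, `Memory`) and the machine names are hypothetical, and requirements and rank are plain Python functions standing in for ClassAd expressions.

```python
def match(job, machines):
    """Return the best machine ad for a job, or None if nothing qualifies."""
    # Keep only the machines that satisfy the job's requirements.
    candidates = [m for m in machines if job["requirements"](m)]
    if not candidates:
        return None
    # Among the qualifying machines, prefer the one with the highest rank.
    return max(candidates, key=job["rank"])


# Hypothetical machine ads (illustrative values only).
machines = [
    {"Name": "machine-a", "FreeCpus": 16, "Memory": 32},
    {"Name": "machine-b", "FreeCpus": 256, "Memory": 64},
    {"Name": "machine-c", "FreeCpus": 512, "Memory": 8},
]

# Hypothetical job ad: needs 64 free CPUs and 16 units of memory,
# and prefers whichever qualifying machine has the most free CPUs.
job = {
    "requirements": lambda m: m["FreeCpus"] >= 64 and m["Memory"] >= 16,
    "rank": lambda m: m["FreeCpus"],
}

best = match(job, machines)
```

Here `machine-c` has the most free CPUs but fails the memory requirement, so the job is matched to `machine-b`.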
[Figure: “Condor Only” versus “Condor-G and Globus”. Labels: (Portal) Client, Condor Master, Condor, Condor-G, Globus, LSF, PBS.]
What’s the Problem?
The problem is that the Condor matchmaker only works for Condor. Condor daemons on various machines report back to the collector at regular intervals. Condor-G needs an external information provider, since Condor is only installed in one place. We are solving this problem by using GPIR (a resource monitoring tool installed on the TG) to construct ClassAds and publish them to the matchmaker. We have prototyped this for GeoFEST, but need to take it to some sort of “production” level.
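As a rough sketch of the "construct ClassAds from monitoring data" step, the snippet below renders a monitoring record as ClassAd-style `Name = value` lines. The record layout and attribute names (`FreeCpus`, `QueueSystem`) are assumptions made for illustration; they are not GPIR's actual schema, and publishing the ad to a collector is not shown.

```python
def to_classad(record):
    """Render a monitoring record (a dict) as ClassAd-style 'Name = value' lines."""
    lines = ['MyType = "Machine"']
    for key, value in sorted(record.items()):
        if isinstance(value, bool):
            # Check bool before int, because bool is a subclass of int.
            rendered = "true" if value else "false"
        elif isinstance(value, (int, float)):
            rendered = str(value)
        else:
            # Everything else is emitted as a quoted string.
            rendered = '"%s"' % value
        lines.append("%s = %s" % (key, rendered))
    return "\n".join(lines)


# Hypothetical monitoring record; field names are illustrative only.
record = {"Name": "machine-a.example.org", "FreeCpus": 128, "QueueSystem": "PBS"}
ad = to_classad(record)
```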
Bigger Research Issue: Generalized Matchmaking
Condor matchmaking is only good for running jobs. More generally, you want to do Web Service matchmaking on a Grid. The request may be “find me the best machine to run GeoFEST”, or it may be “find me a QuakeTables service with Australian faults”. Workflow also needs matchmaking, and matchmaking should be decoupled from workflow execution.
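One way to picture generalized matchmaking that is decoupled from workflow execution is sketched below. The registry entries, URLs, host names, and field names are all hypothetical; the point is only that workflow steps name what they need, and a separate matchmaker resolves each step to a concrete service before anything runs.

```python
# Hypothetical service registry (illustrative entries only).
SERVICES = [
    {"kind": "QuakeTables", "region": "California", "url": "http://example.org/qt-ca"},
    {"kind": "QuakeTables", "region": "Australia", "url": "http://example.org/qt-au"},
    {"kind": "compute", "code": "GeoFEST", "host": "machine-a", "free_cpus": 64},
    {"kind": "compute", "code": "GeoFEST", "host": "machine-b", "free_cpus": 512},
]


def find_service(registry, rank=None, **constraints):
    """Return the service whose metadata equals every constraint, or None."""
    matches = [
        s for s in registry
        if all(s.get(key) == value for key, value in constraints.items())
    ]
    if not matches:
        return None
    # Without a rank function, take the first match; otherwise take the best.
    return matches[0] if rank is None else max(matches, key=rank)


def plan_workflow(steps, registry):
    """Resolve abstract steps to concrete services; execution is a separate stage."""
    return [find_service(registry, **step) for step in steps]


steps = [
    {"kind": "QuakeTables", "region": "Australia"},
    {"kind": "compute", "code": "GeoFEST", "rank": lambda s: s["free_cpus"]},
]
plan = plan_workflow(steps, SERVICES)
```

Because the plan is just a list of resolved services, the same workflow description could be re-matched against a different registry without changing the execution engine.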
Workflow and Matchmaking
[Figure labels: User Layer, Workflow, Matchmaking, QuakeTables Service, QuakeTables California, QuakeTables Australia, VC Service, NCSA’s Cobalt, IU’s Big Red.]
QuakeSim/SERVO IT/CS Development Overview
Portlet-based portal components allow different portlets to be exchanged between projects. Form-based portlets --> Interactive Maps. These are clients to services.
Sensor Grid: Topic-based publish-subscribe systems support operations on streaming data.
Web services allow request/response style access to data and codes: GIS services (WMS, WFS); “Execution grid” services for running codes and moving files; Information services (WS-Context) and Web Service workflow (HPSearch).
Portlets and Portals
Portlets are a standard way for Java web applications to be shared between different portal containers. A portlet may be a web application, such as a Google map client, that I want to put into a container. It will inherit login, access control, layout management, etc. We will show some demos for RDAHMM and ST-Filter later. We use JavaServer Faces for development, so there may be some solvable interoperability issues. The main point is that portlets allow REASON and QuakeSim to exchange user interface components. We still need to develop client libraries and Web Services.
Sensor Grid Overview
QuakeTables and Web Feature Service provide access to archival data: faults, GPS time series, seismic records. Our Sensor Grid architecture supports access to real-time data. It is integrated with all 70 stations of CRTN. It consists of chains of filters communicating on a network through a publish/subscribe broker. Each filter does a single task and passes the data along. Filters are also web services, but the communication is currently proprietary. It could be adapted to use SOAP and the Axis 2 one-way communication model, but this is an academic exercise. Filters can be applications, like RDAHMM. Scripps collaborators have a prototype command line client if you want to pipe and grep. Or you can develop your own stream sink.
SensorGrid Architecture
Real-Time Services for GPS Observations
Real-time data processing is supported by employing filters around a publish/subscribe messaging system. The filters are small applications extended from a generic Filter class to inherit publish and subscribe capabilities.
[Figure: Input Signal, Filter, Output Signal.]
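The filter pattern described above can be sketched with an in-memory stand-in for the broker. This is not the project's Filter class or the NaradaBrokering API; the `Broker` class, the topic names, and the comma-separated message format are assumptions made for illustration.

```python
from collections import defaultdict


class Broker:
    """Minimal in-memory topic-based publish/subscribe broker (a stand-in only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self._subscribers[topic]):
            callback(message)


class Filter:
    """Generic filter: subscribes to one topic, publishes results to another."""

    def __init__(self, broker, in_topic, out_topic):
        self.broker = broker
        self.out_topic = out_topic
        broker.subscribe(in_topic, self._on_message)

    def _on_message(self, message):
        result = self.process(message)
        if result is not None:  # returning None drops the message
            self.broker.publish(self.out_topic, result)

    def process(self, message):
        raise NotImplementedError


class ParseFilter(Filter):
    """Turn a hypothetical 'station,east,north' text line into a dict."""

    def process(self, message):
        station, east, north = message.split(",")
        return {"station": station, "east": float(east), "north": float(north)}


class ThresholdFilter(Filter):
    """Pass along only positions whose east offset exceeds 1.0 in magnitude."""

    def process(self, message):
        return message if abs(message["east"]) > 1.0 else None


# Chain: raw -> ParseFilter -> parsed -> ThresholdFilter -> alerts
broker = Broker()
ParseFilter(broker, "raw", "parsed")
ThresholdFilter(broker, "parsed", "alerts")

alerts = []  # a simple stream sink
broker.subscribe("alerts", alerts.append)
broker.publish("raw", "STN1,0.2,0.1")
broker.publish("raw", "STN2,1.5,0.3")
```

Each filter does a single task, so a new processing step can be added by subscribing another filter to an existing topic without touching the ones already in the chain.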
Filter Chains
NaradaBrokering Topics
Real-Time positions on Google maps
Real-Time Station Position Changes
RDAHMM + Real-Time GPS Integration