Building Grid Portlets with GTLAB
Mehmet A. Nacar and Marlon E. Pierce
Community Grids Lab, Indiana University

TeraGrid represents a large and growing collection of supercomputing and data storage resources.
  20 petabytes of storage, 280 teraflops of computing power
  Unified accounting and allocation system

The “Grid” in TeraGrid: View from Space
The Coordinated TeraGrid Software and Services (CTSS) represents a common software/middleware stack on TeraGrid machines.
Globus and other parts of CTSS give a common programming environment for remotely interacting with TeraGrid resources.
Services include:
  GRAM: resource access
  GridFTP: data management
  MyProxy: remote authentication infrastructure
  Information Services (GPIR, QBETS, MDS, etc.)

Science Portals and Gateways
Science Gateways and Web portals build on the CTSS stack.
  They aggregate clients and user interfaces.
  Example: VLAB portal
  TeraGrid Science Gateway program:
Many Java-based gateways are based on the portlet component model.
  Portlets are reusable portal parts that can be shared between development groups.
  The Open Grid Computing Environments (OGCE) project provides Grid portlets encapsulating common features.
Our work to support VLAB is to build reusable libraries for building portlets: GTLAB.

Motivation
OGCE Grid portlets typically wrap each single Grid capability in a separate portlet.
  GridFTP --> GridFTP Portlet
Gateway portlets encapsulate sophisticated but specialized functionality.
  Example: submitting PWSCF jobs
We need a middle way.
  We need a component model for portlets: reusable portlet parts.
Java Server Faces (JSF) is our starting point.
  It removes dependencies on the Servlet API.
  Backing beans are just beans, so they can be reused more easily outside of web and portlet applications.
  JSF also provides an extensible framework (tag libraries).
The Apache JSF portlet bridge allows you to convert standalone JSF applications (development phase) into portlets (deployment phase).

Introduction to GTLAB
GTLAB encapsulates clients to common Grid services as XML tag libraries and backing Java beans.
  Portlet developers embed the tags in their portlet pages to invoke common tasks.
  The tags specify the composite action that should occur when a user hits the submit button.
This allows portal developers to concentrate on the user interface components.
Tags can be arranged in directed acyclic graphs (dependency chains).
  These represent simple workflows.
GTLAB is based on extensions to Java Server Faces and the Java CoG Kit.

GTLAB Example
Grid tags are associated with Grid services via Grid beans.
Grid beans wrap the Java CoG Kit (version 4).
We show an example JSF page section below.
This allows you to develop new Grid portlets with no additional Java code.

    <o:jobsubmit id="task"
                 hostname="cobalt.ncsa.teragrid.org"
                 provider="GT4"
                 executable="/bin/ls"
                 stdout="tmp/result"
                 stderr="tmp/error" />

Grid Tags, Associated Grid Beans, and Features
  ComponentBuilderBean: Creating components, job handlers, submitting jobs. This is visually rendered as HTML.
  MonitorBean: Handling monitoring page actions.
  MultitaskBean: Constructing simple workflows.
  MultitaskBean: Defining dependencies among sub-jobs.
  MyproxyBean: Retrieving MyProxy credentials.
  FileOperationBean: Providing GridFTP operations.
  JobSubmitBean: Providing GRAM job submissions.
  FileTransferBean: Providing GridFTP file transfer.
  ResourceBean (other JSF UI tags): Describes common properties among all tags and beans. Passes values given by standard visual JSF components.

Complex Operations
GTLAB can be used to associate multiple Grid tasks with a single action click.
  We call this a “multitask”.
  This is a form of workflow (DAG).
We build on top of CoG workflow capabilities.
  We are investigating how to abstract this to use other workflow engines.
Each multitask should be associated with a submit button or command link.
  This allows many multitasks in a single JSF form.
  In some cases it is useful to bind several different multitasks to the same user input parameters.

Encoding DAGs in Portlets
Multitask provides a simple Directed Acyclic Graph (DAG).
This example demonstrates a composite Grid job using a multi-staged multitask.
GTLAB handles the lifecycle of the DAG within the JSF application.

DAG Example JSF Page
This encodes the DAG on the previous page.

    <o:fileoperation id="taskA"
                     command="mkdir"
                     hostname="cobalt.ncsa.teragrid.org"
                     path="/home/manacar/tmp/" />
    <o:filetransfer id="taskB"
                    from="gridftp://gf1.ucs.indiana.edu:2811/home/manacar/input_file"
                    to="gridftp://cobalt.ncsa.teragrid.org:2811/home/manacar/input_file" />
    <o:jobsubmit id="taskC"
                 hostname="cobalt.ncsa.teragrid.org"
                 provider="GT4"
                 executable="/bin/execute"
                 stdin="tmp/input_file"
                 stdout="tmp/result"
                 stderr="tmp/error" />
    <o:filetransfer id="taskD"
                    from="gridftp://cobalt.ncsa.teragrid.org:2811/home/manacar/tmp/result"
                    to="gridftp://gf1.ucs.indiana.edu:2811/home/manacar/result" />

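To make the idea of a dependency chain concrete, the following is a minimal Java sketch of how a DAG of task ids can be turned into an execution order with a topological sort. It is an illustration only: the class `TaskDag` and its methods are hypothetical and are not part of GTLAB or the Java CoG Kit, and the chain taskA, taskB, taskC, taskD is an assumption read off the data flow of the four tags (make a directory, stage input, run, fetch output).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not a GTLAB or CoG Kit class.
class TaskDag {
    // For each task id, the ids of the tasks it must wait for.
    private final Map<String, List<String>> dependsOn = new LinkedHashMap<>();

    void addTask(String id, String... prerequisites) {
        dependsOn.put(id, Arrays.asList(prerequisites));
    }

    // Kahn's algorithm: repeatedly emit tasks whose prerequisites are all done.
    List<String> executionOrder() {
        Map<String, Integer> remaining = new LinkedHashMap<>();
        Map<String, List<String>> dependents = new LinkedHashMap<>();
        for (String id : dependsOn.keySet()) {
            remaining.put(id, 0);
            dependents.put(id, new ArrayList<>());
        }
        for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
            for (String prerequisite : entry.getValue()) {
                if (!dependsOn.containsKey(prerequisite)) {
                    throw new IllegalArgumentException("Unknown task: " + prerequisite);
                }
                dependents.get(prerequisite).add(entry.getKey());
                remaining.merge(entry.getKey(), 1, Integer::sum);
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : remaining.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String id = ready.remove();
            order.add(id);
            for (String next : dependents.get(id)) {
                if (remaining.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != dependsOn.size()) {
            throw new IllegalStateException("Dependencies contain a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        TaskDag dag = new TaskDag();
        dag.addTask("taskA");           // mkdir
        dag.addTask("taskB", "taskA");  // stage input (assumed to follow taskA)
        dag.addTask("taskC", "taskB");  // run the job (assumed to follow taskB)
        dag.addTask("taskD", "taskC");  // fetch output (assumed to follow taskC)
        System.out.println(dag.executionOrder()); // prints [taskA, taskB, taskC, taskD]
    }
}
```

A workflow engine would submit each task as it becomes ready rather than computing the whole order up front, but the dependency bookkeeping is the same.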
A Simple Use Case
You (the portal developer) have a code that you want to run on the TeraGrid through your portal.
You will need to develop these parts:
  A user interface layout in JSF/JSP so the user can provide input information and select a host to run on.
  The JSF/JSP file will also include the GTLAB tags on the following slide.
  Some Java code to construct an input file.
You will NOT have to reimplement all of the Grid code that describes your action methods.

JSF Page with Grid Tags

    <o:jobsubmit id="make"
                 arguments="/home/manacar/disloc-work"
                 executable="/bin/mkdir"
                 hostname="gf1.ucs.indiana.edu"
                 provider="GT2"
                 stdout="/home/manacar/tmp/out-make"/>
    <o:jobsubmit id="disloc"
                 arguments="/home/gateway/GEMCodes/Disloc/input.txt /home/manacar/disloc-work/disloc.out"
                 executable="/…/disloc"
                 hostname="gf1.ucs.indiana.edu"
                 provider="GT2"
                 stdout="/home/manacar/disloc-work/out-disloc"/>

Note that the specific values would typically come from the user’s form inputs through the Resource Bean.

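The use case also calls for some Java code to construct an input file. A minimal sketch of that piece might look like the following. Everything here is an assumption for illustration: the class `InputFileWriter`, the parameter names `paramA` and `paramB`, and the one "name value" pair per line layout are hypothetical placeholders, not the real input format of Disloc or any other application.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of GTLAB.
class InputFileWriter {
    // Writes one "name value" line per form parameter, in insertion order.
    // The layout is a placeholder; a real code defines its own input format.
    static Path write(Path target, Map<String, String> formValues) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> entry : formValues.entrySet()) {
            lines.add(entry.getKey() + " " + entry.getValue());
        }
        return Files.write(target, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> formValues = new LinkedHashMap<>();
        formValues.put("paramA", "1.5");    // placeholder form input
        formValues.put("paramB", "cobalt"); // placeholder form input
        Path target = Files.createTempFile("input", ".txt");
        write(target, formValues);
        System.out.println("Wrote " + target);
    }
}
```

The resulting file would then be staged to the execution host, for example with a file transfer tag, before the job submission tag runs.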
Integration with JSF Form Elements
Developers embed Grid tag snippets into the JSF page.
These components are non-visual and are not displayed in HTML.
The Resource bean bridges the form inputs and the GTLAB framework.

    <o:multitask id="multi" persistent="true" taskname="#{resource.taskname}" />

Dynamic values for Grid tag attributes are provided by the Resource bean.

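Because backing beans are plain Java beans, the bridging role can be pictured with a very small class. The sketch below is hypothetical and is not GTLAB's actual Resource bean: the class name `ResourceBeanSketch` and the `hostname` property are assumptions. The only point it relies on is standard JSF behavior, namely that an expression such as `#{resource.taskname}` resolves to the `getTaskname()`/`setTaskname()` pair of whichever managed bean is registered under the name `resource`.

```java
import java.io.Serializable;

// Hypothetical illustration of a form-backing bean; not GTLAB's ResourceBean.
// To register it as a JSF managed bean it would need to be a public class
// in its own file with a public no-argument constructor.
class ResourceBeanSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private String taskname = "";
    private String hostname = ""; // assumed extra property, for illustration

    // Read by JSF when a tag attribute uses #{resource.taskname}.
    public String getTaskname() {
        return taskname;
    }

    // Called by JSF when an input component bound to #{resource.taskname} is submitted.
    public void setTaskname(String taskname) {
        this.taskname = taskname;
    }

    public String getHostname() {
        return hostname;
    }

    public void setHostname(String hostname) {
        this.hostname = hostname;
    }

    public static void main(String[] args) {
        ResourceBeanSketch resource = new ResourceBeanSketch();
        resource.setTaskname("disloc-run");         // what a form submission would do
        System.out.println(resource.getTaskname()); // what a tag attribute would read
    }
}
```

Since the class has no dependency on the Servlet or portlet APIs, the same bean can be exercised in a unit test or a command-line program.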
Tracking and Managing Jobs
GTLAB manages the lifecycles of jobs and monitors their status.
  Grid operations are usually batch processes.
  We provide a callback mechanism to follow up on the jobs.
GTLAB creates callback handlers for jobs and persistently stores them via object serialization.
  They can later be de-serialized.
GTLAB handlers manage job events such as stopping, canceling, or resuming running jobs.
GTLAB provides an archive for job metadata and allows managing the archive.
  The handler tag helps to organize the user’s job repository.

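Storing a handler via object serialization and restoring it later can be sketched with the standard Java serialization classes. This is a minimal illustration under stated assumptions: `JobRecord` and `JobRecordStore` are hypothetical names, the fields shown are placeholders for whatever a real handler tracks, and the bytes are kept in memory here rather than written to GTLAB's actual store.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a job handler's persistent state.
class JobRecord implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String jobId;
    private final String hostname;
    private final String status;

    JobRecord(String jobId, String hostname, String status) {
        this.jobId = jobId;
        this.hostname = hostname;
        this.status = status;
    }

    String getJobId() { return jobId; }
    String getHostname() { return hostname; }
    String getStatus() { return status; }
}

// Hypothetical helper showing the serialize / de-serialize round trip.
class JobRecordStore {
    static byte[] serialize(JobRecord record) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(record);
        }
        return bytes.toByteArray();
    }

    static JobRecord deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (JobRecord) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        JobRecord submitted = new JobRecord("job-1", "cobalt.ncsa.teragrid.org", "SUBMITTED");
        byte[] stored = serialize(submitted);    // persist when the job is submitted
        JobRecord restored = deserialize(stored); // restore in a later session
        System.out.println(restored.getJobId() + " " + restored.getStatus());
    }
}
```

In a portal the byte array would be written to a file or database so that a user can return in a later session and still monitor or cancel the job.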
Managing Metadata Repositories
Job metadata is stored in metadata repositories.
  Metadata includes submission parameters and files, the execution host and its information, and output parameters, files, and their locations.
Job data is stored on the specified servers.
  As a result, data files are transferred using GridFTP.
  Metadata about the data files, such as URLs or GridFTP locations, is saved in the metadata repository.
Portal users see job metadata that links to the exact location of the data.
The metadata repository is built using a WS-Context repository.
Users see job metadata under time-stamped hierarchical repository labels:
  /vlab/userA/sessionC/ /12:22/jobID

Summary and Conclusions
We briefly described the TeraGrid, Grid middleware, and Science Gateways.
Motivation for our work: portlet-based gateways need to be composed out of reusable parts.
Our solution: GTLAB tag libraries and backing beans.
  XML tag libraries backed by Java beans that encapsulate common (Globus) Grid tasks.
  They extend the JSF framework.
GTLAB tags can be arranged into DAGs to express composite tasks.

More Information
GTLAB version 1.0 Beta release available at
Contact Marlon:
Contact Mehmet:

Relation to Web 2.0
Web 2.0 can be divided into several major categories, primarily:
  Rich client interfaces (RCI) such as AJAX
  REST-style Web services (typically transmitting RSS or Atom XML)
GTLAB and RCI
  GTLAB does not directly address this either way.
  RCI can be achieved through JSF implementations (which generate the necessary JavaScript).
  Or you can mix and match, using the YUI or Scriptaculous JavaScript libraries with your JSF.
GTLAB and REST
  Conceivably we could also build REST clients for consuming and manipulating RSS feeds (cf. Yahoo Pipes).
  This would require a new workflow engine implementation.
For a more thorough survey, see

Related Work
Grid Portlets 1.3 of GridSphere
  The developers are now trying to decouple it from GridSphere as a separate project called Vine (Portlet Vine).
  Grid Portlets 1.3 provides an API and UI tags to build Grid portlets.
RSF (Reasonable Server Faces)
  Derived from JSF, but it separates HTML pages and backing beans.
  RSF provides non-visual components, unlike JSF.
  Beans can be held in Spring-like containers, with bean lifecycles managed by Spring.
OGCE portlets
  Packages Velocity, JSP, and JSF portlets.
  Provides portlet packages for several Grid applications such as Globus, Condor, SRB, and GPIR.