Francisco Hernandez, Purushotham Bangalore, Jeff Gray, and Kevin Reilly Department of Computer and Information Sciences University of Alabama at Birmingham Birmingham, AL, USA
2 Overview Provide an abstract, high-level layer for modeling Grid workflows. Automate the specification of Grid workflows. Generate Globus-specific code from the graphical models with the help of the Java CoG Kit.
3 Outline Related Work Domain-Specific Modeling Meta-Model Modeling Process Interpreter Limitations Future Work Conclusions
4 Related Work (1) The idea of composing applications from reusable components is not new (e.g., Webflow, Unicore, DAGMan, Symphony, Triana). Workflows have gained increased attention for composing a flow of tasks in a Grid environment (e.g., GridAnt).
5 Related Work (2) Amin et al. [1] propose a technology- and architecture-independent abstraction layer that provides interoperability across multiple Grid implementations, resulting in an Open Grid Computing Environment (OGCE). The concept is comparable to using meta-models that abstract the underlying Grid technologies, but it is realized at a lower level of abstraction. [1] Amin, K., Hategan, M., von Laszewski, G., and Zulezec, N., “Abstracting the Grid,” Proceedings of the 12th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2004), February 2004, La Coruña, Spain.
6 Domain-Specific Modeling (1) Domain-specific modeling (DSM) is a technology that focuses on higher levels of abstraction in the problem space and avoids low-level details of the solution space. It allows a user to manipulate graphical models of the problem at hand. A special type of generator called a model interpreter translates the models into executable specifications that are used to synthesize software automatically. GME is a domain-specific modeling environment that can be configured and adapted from meta-level specifications that describe the domain.
7 Domain-Specific Modeling (2) DSM has been useful in automating different kinds of applications in which the environment is dynamic and tightly integrated with the physical environment, including: –embedded systems [1], –automotive manufacturing [2], –complex QoS applications [3]. [1] Neema, S., Bapty, T., Gray, J., and Gokhale, A., “Generators for Synthesis of QoS Adaptation in Distributed Real-Time Embedded Systems,” First ACM SIGPLAN/SIGSOFT Conference on Generative Programming and Component Engineering (GPCE ’02), Springer-Verlag LNCS 2487, Pittsburgh, PA, October 6-8, 2002. [2] Long, E., Misra, A., and Sztipanovits, J., “Increasing Productivity at Saturn,” IEEE Computer, August 1998. [3] Bapty, T., Neema, S., and Gray, J., “Model-Integrated Computing For Composition of Complex QoS Applications Using The Generic Modeling Environment (GME),” OMG Workshop on Real-Time and Embedded Distributed Object Computing, Washington, DC, July 15-18, 2002.
8 Domain-Specific Modeling (3) [Figure: modeling layers, from general to specific to instance: Meta-Meta-Model, Domain Meta-Model, Domain Models. Interpreter 1 and Interpreter 2 each lead to an Application. Arrow labels: Specify, Construct, Generate.]
9 Meta-Model (1) Workflows describe the execution of complex applications built from individual application components. The basis of the meta-model is the way in which a user specifies a sequence of tasks in an application’s workflow. Example tasks: –Upload a file –Download a file –Execute a job
10 Meta-Model (2) The meta-model is based on experimental knowledge of the domain. Four aspects are needed to define the meta-model: –Resources –Transfer end-points –Job specifications –Workflows
11 Meta-Model (3) [Figure: meta-model diagrams for the Workflows and Resources aspects.]
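The four aspects can be pictured as plain data types. The sketch below is only an illustration in Java and is not the GME meta-model used in this work: every class, field, and method name is hypothetical, and the only detail taken from the slides is that a workflow is a sequence of file transfers and jobs that run on resources.

```java
import java.util.ArrayList;
import java.util.List;

// All names are hypothetical; this is an illustration, not the GME meta-model.
final class MetaModelSketch {

    /** Resource aspect: a host that takes part in the workflow. */
    static final class Resource {
        final String name;
        Resource(String name) { this.name = name; }
    }

    /** Transfer end-point aspect: a file location on a resource. */
    static final class EndPoint {
        final Resource host;
        final String path;
        EndPoint(Resource host, String path) { this.host = host; this.path = path; }
        @Override public String toString() { return host.name + ":" + path; }
    }

    /** A workflow step: either a file transfer or a job. */
    interface Task {
        String describe();
    }

    static final class Transfer implements Task {
        final EndPoint source;
        final EndPoint destination;
        Transfer(EndPoint source, EndPoint destination) {
            this.source = source;
            this.destination = destination;
        }
        @Override public String describe() {
            return "transfer " + source + " -> " + destination;
        }
    }

    /** Job specification aspect: what to run and where. */
    static final class Job implements Task {
        final String name;
        final Resource host;
        Job(String name, Resource host) { this.name = name; this.host = host; }
        @Override public String describe() {
            return "job " + name + " on " + host.name;
        }
    }

    /** Workflow aspect: an ordered sequence of tasks. */
    static final class Workflow {
        private final List<Task> tasks = new ArrayList<>();

        Workflow then(Task task) {
            tasks.add(task);
            return this;
        }

        List<String> describe() {
            List<String> lines = new ArrayList<>();
            for (Task task : tasks) {
                lines.add(task.describe());
            }
            return lines;
        }
    }
}
```

Keeping resources separate from tasks means a host is described once and then referenced by every transfer end-point and job that uses it.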
12 Modeling Process (1) [Figure: model screenshot with elements labeled hernandf.] The model specifies the location of the user’s security credentials. The location of the data file should be specified for each end-point in a file transfer. The credentials authorize the use of the remote hosts (cherokeeData and cherokeeCompute).
13 Modeling Process (2) The generator creates an RSL string from the attributes specified by the user, in this case for the job HMM. The user initiates the execution of the application by first uploading the raw input file. Finally, the output file is downloaded to the local host.
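As a rough illustration of turning job attributes into an RSL string, the sketch below renders a map of attributes in the `&(name=value)(name=value)` form used by Globus RSL. It is not the generator from this work: the class `RslSketch` and the method `toRsl` are hypothetical, the attribute values are illustrative, and the sketch does no quoting or escaping of values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of the Java CoG Kit or of the generator described here.
final class RslSketch {

    /** Renders the attributes as &(name=value)(name=value)... in insertion order. */
    static String toRsl(Map<String, String> attributes) {
        StringBuilder rsl = new StringBuilder("&");
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            rsl.append('(')
               .append(entry.getKey())
               .append('=')
               .append(entry.getValue())
               .append(')');
        }
        return rsl.toString();
    }

    public static void main(String[] args) {
        // Illustrative attribute values for the HMM job.
        Map<String, String> hmm = new LinkedHashMap<>();
        hmm.put("executable", "java");
        hmm.put("arguments", "HMM inHMMFile.txt outHMMFile.txt");
        hmm.put("count", "2");
        System.out.println(toRsl(hmm));
    }
}
```

A `LinkedHashMap` is used so that the attributes appear in the order in which they were set, which keeps the generated string predictable.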
14 Interpreter The interpreter parses the model and generates the control code that manages the application execution. GME provides an API that traverses the internal representation of the models. A model interpreter uses this API to translate the models into an application that manages the execution of the workflow. [Figure: the Workflow Model is read through the GME API as a model in the GME domain (models, atoms, connections); a translator maps it to a model in the Globus domain (jobs, file transfers, etc.); code generation then produces the Grid Application.]
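The traverse-and-translate idea can be sketched without GME. In the Java example below, a plain list of elements stands in for the model, and the interpreter emits one line of control code per element. The GME API is not used, and `Element`, `generate`, and the emitted calls `runTransfer` and `submitJob` are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a model interpreter; it does not use the GME API.
final class InterpreterSketch {

    /** Stand-in for one model element. */
    static final class Element {
        final String kind; // "Transfer" or "Job"
        final String name;
        Element(String kind, String name) { this.kind = kind; this.name = name; }
    }

    /** Visits the elements in workflow order and emits one line of control code each. */
    static List<String> generate(List<Element> workflow) {
        List<String> code = new ArrayList<>();
        for (Element element : workflow) {
            switch (element.kind) {
                case "Transfer":
                    // runTransfer is a made-up name for the emitted call.
                    code.add("runTransfer(\"" + element.name + "\");");
                    break;
                case "Job":
                    // submitJob is a made-up name for the emitted call.
                    code.add("submitJob(\"" + element.name + "\");");
                    break;
                default:
                    throw new IllegalArgumentException("Unknown element kind: " + element.kind);
            }
        }
        return code;
    }
}
```

Rejecting unknown element kinds makes a gap between the meta-model and the interpreter visible at generation time instead of producing incomplete code.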
15 Example of generated code
    // create the RSL string
    GlobusRSL hmmRSL = new GlobusRSL();

    hmmRSL.setArg("HMM inHMMFile.txt outHMMFile.txt");
    hmmRSL.setEnvironmentVariables("(INPUT_DIR=/lhome/hernandf) (OUTPUT_DIR=/lhome/hernandf)");
    hmmRSL.setStdOut("/lhome/hernandf/sttOutHMM.txt");
    hmmRSL.setNumProc(2);
    hmmRSL.setDir("/usr/bin");
    hmmRSL.setExec("java");
16 Limitations Work on the modeling environment is in its initial phase. Currently, the environment can handle only a limited set of sequential tasks. Generating specific code for each workflow task causes scalability problems. Not all of the Globus capabilities are currently supported by the meta-model.
17 Future Work (1) Address the scalability problem by generating a reusable workflow engine and generating the appropriate configurations from the graphical models. Modify the meta-model to support capabilities such as: –Hierarchical workflows –Task parallelism –Checkpointing and error recovery –Querying Grid information services
18 Future Work (2) Generate different output specifications: –Grid Services –GridAnt –PyGlobus –New version of the Java CoG Kit
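One way to picture a reusable workflow engine is a fixed piece of code that runs whatever list of steps it is given, so that only the configuration is generated from the model. The Java sketch below is an assumption about what such an engine could look like, not a design from this work; `EngineSketch`, `Step`, `register`, and `run` are hypothetical names, and the handlers return a log line instead of contacting a Grid.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical engine; the step types and handlers are illustrative only.
final class EngineSketch {

    /** One configured step: a task type plus the argument passed to its handler. */
    static final class Step {
        final String type;
        final String argument;
        Step(String type, String argument) { this.type = type; this.argument = argument; }
    }

    private final Map<String, Function<String, String>> handlers = new HashMap<>();

    /** Registers the code that executes every step of the given type. */
    void register(String type, Function<String, String> handler) {
        handlers.put(type, handler);
    }

    /** Runs the configured steps in order and collects what each handler reports. */
    List<String> run(List<Step> configuration) {
        List<String> log = new ArrayList<>();
        for (Step step : configuration) {
            Function<String, String> handler = handlers.get(step.type);
            if (handler == null) {
                throw new IllegalStateException("No handler registered for " + step.type);
            }
            log.add(handler.apply(step.argument));
        }
        return log;
    }
}
```

With this split, adding a task to a workflow changes only the generated list of steps, while the engine and its handlers stay the same size.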
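Generating several outputs from one model amounts to running different generators over the same data. The Java sketch below shows the idea with two illustrative targets, an RSL-style string and a line of Java source in the style of the generated code shown earlier. It does not produce Grid Services, GridAnt, or PyGlobus output, and `ArtifactsSketch`, `JobModel`, and `Generator` are hypothetical names.

```java
// Hypothetical types; the two output formats are chosen only for illustration.
final class ArtifactsSketch {

    /** The single source of truth: one job as captured in the model. */
    static final class JobModel {
        final String name;
        final String executable;
        final int processes;
        JobModel(String name, String executable, int processes) {
            this.name = name;
            this.executable = executable;
            this.processes = processes;
        }
    }

    /** A generator turns the same model into one kind of output artifact. */
    interface Generator {
        String generate(JobModel job);
    }

    /** Emits an RSL-style string. */
    static final Generator RSL = job ->
        "&(executable=" + job.executable + ")(count=" + job.processes + ")";

    /** Emits Java source that configures a job object. */
    static final Generator JAVA = job ->
        job.name + "RSL.setExec(\"" + job.executable + "\"); "
            + job.name + "RSL.setNumProc(" + job.processes + ");";
}
```

Supporting another output format then means adding one more `Generator`, with no change to the model.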
19 Conclusions (1) The benefits of using domain-specific modeling techniques for creating Grid workflows are: –Domain modeling removes the accidental complexities of creating workflows in a Grid by focusing on higher levels of abstraction in the problem space rather than the solution space. –Modeling tools and their interpreters make it faster to change workflow details, because it is easier to manipulate and change domain models than the associated code. –Model-driven techniques can generate multiple artifacts from the same model, so different output representations can be generated from the same domain knowledge.
20 Conclusions (2) Using these techniques, a user manipulates graphical models that represent the different components of the Globus Toolkit. From these models the user generates the corresponding Java code that manages the execution of the workflow. This work is an attempt to abstract the Grid environment into a high-level layer such that the essence is not bound to a specific Grid environment.
21 Thank you