Deployment of Flows
  Loretta Auvil
  National Center for Supercomputing Applications
  University of Illinois at Urbana-Champaign
  lauvil@illinois.edu
Outline Hands-On
Meandre: ZigZag Script Language
  ZigZag is a simple language for describing data-intensive flows.
    Modeled on Python for simplicity.
    ZigZag is a declarative language for expressing the directed graphs that describe flows.
  Command-line tools compile and execute ZigZag files.
    A compiler transforms a ZigZag program (.zz) into a Meandre archive unit (.mau).
    MAUs can then be executed by a Meandre engine.
  The SEASR project and its Meandre infrastructure are sponsored by The Andrew W. Mellon Foundation.
Meandre: ZigZag Script Language
  Example: the flow below pushes two strings that get concatenated and printed to the console.
  [Flow diagram]
Meandre: ZigZag Script Language
  ZigZag code that represents the example flow:

    #
    # Imports the three required components and creates the component aliases
    import <http://localhost:1714/public/services/demo_repository.rdf>
    alias <http://test.org/component/push_string> as PUSH
    alias <http://test.org/component/concatenate-strings> as CONCAT
    alias <http://test.org/component/print-object> as PRINT

    # Creates four instances for the flow
    push_hello, push_world, concat, print = PUSH(), PUSH(), CONCAT(), PRINT()

    # Sets up the properties of the instances
    push_hello.message, push_world.message = "Hello ", "world!"

    # Describes the data-intensive flow
    @phres, @pwres = push_hello(), push_world()
    @cres = concat( string_one: phres.string; string_two: pwres.string )
    print( object: cres.concatenated_string )
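For readers more familiar with Python, the data movement in the script above can be approximated in ordinary code. This is only an analogy: the functions `push`, `concat`, and `print_object` are hypothetical stand-ins, not Meandre components or APIs, and a real flow is executed by the Meandre engine rather than by direct function calls.

```python
# Plain-Python analogy of the example flow; none of these names are Meandre APIs.

def push(message):
    """Stand-in for a PUSH instance: emits its 'message' property on port 'string'."""
    return {"string": message}


def concat(string_one, string_two):
    """Stand-in for CONCAT: joins its two inputs."""
    return {"concatenated_string": string_one + string_two}


def print_object(obj):
    """Stand-in for PRINT: writes the object to the console and returns it."""
    print(obj)
    return obj


# Wire the instances together in the same order as the ZigZag script.
phres = push("Hello ")
pwres = push("world!")
cres = concat(string_one=phres["string"], string_two=pwres["string"])
result = print_object(cres["concatenated_string"])
```

Each dictionary key plays the role of a named output port, which is why the ZigZag script refers to `phres.string` and `cres.concatenated_string`.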
Meandre: ZigZag Script Language
  Automatic Parallelization
    Multiple instances of a component can be run in parallel to boost throughput.
    A specialized operator in ZigZag causes multiple instances of a given component to be used.
    Consider the simple flow shown in the diagram.
    The dataflow declaration would look like:

    #
    # Describes the data-intensive flow
    @pu = push()
    @pt = pass( string:pu.string )
    print( object:pt.string )
Meandre: ZigZag Script Language
  Automatic Parallelization
    Adding the operator [+AUTO] to the middle component
      [+AUTO] tells the ZigZag compiler to parallelize the "pass" component instance by the number of cores available on the system.
      [+AUTO] may also be written [+N], where N is a numeric value to use, for example [+10].

    # Describes the data-intensive flow
    #
    @pu = push()
    @pt = pass( string:pu.string ) [+AUTO]
    print( object:pt.string )
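The idea of sizing the number of parallel instances from the core count, or from an explicit N, can be sketched in Python. This is an illustration under assumptions, not how the ZigZag compiler works internally: `pass_through` and `run_parallel` are hypothetical names, and the sketch makes no claim about the ordering guarantees Meandre gives.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pass_through(item):
    """Stand-in for the 'pass' component: forwards its input unchanged."""
    return item


def run_parallel(component, items, instances=None):
    """Run `component` over `items` using a pool of worker instances.

    instances=None mimics [+AUTO] (one worker per available core);
    an integer mimics [+N].
    """
    workers = instances or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order.
        return list(pool.map(component, items))


pushed = ["string-%d" % i for i in range(8)]
auto_results = run_parallel(pass_through, pushed)        # analogous to [+AUTO]
fixed_results = run_parallel(pass_through, pushed, 10)   # analogous to [+10]
```

The only difference between the two calls is where the worker count comes from, which mirrors the difference between [+AUTO] and [+N].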
Meandre: ZigZag Script Language
  Automatic Parallelization
    Adding the operator [+4] would result in a directed graph with four instances of the pass component.

    # Describes the data-intensive flow
    #
    @pu = push()
    @pt = pass( string:pu.string ) [+4]
    print( object:pt.string )

    The same flow written with [+4!]:

    # Describes the data-intensive flow
    #
    @pu = push()
    @pt = pass( string:pu.string ) [+4!]
    print( object:pt.string )
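To picture what replicating the middle component does to the directed graph, the following Python sketch expands a node into N copies with fan-out and fan-in edges. It is a simplified assumption about the shape of the result: the graph the ZigZag compiler actually generates may contain additional components for distributing and collecting data, and `parallelize` is a hypothetical helper.

```python
def parallelize(edges, node, n):
    """Return a new edge list in which `node` is replaced by n copies.

    Every edge into `node` fans out to all copies, and every edge out of
    `node` fans in from all copies.
    """
    copies = ["%s_%d" % (node, i) for i in range(n)]
    expanded = []
    for src, dst in edges:
        if dst == node:
            expanded.extend((src, c) for c in copies)
        elif src == node:
            expanded.extend((c, dst) for c in copies)
        else:
            expanded.append((src, dst))
    return expanded


# The three-component flow from the slide: push -> pass -> print
flow = [("push", "pass"), ("pass", "print")]
graph = parallelize(flow, "pass", 4)
for src, dst in graph:
    print(src, "->", dst)
```

With N = 4 the two original edges become eight: four from push to the copies and four from the copies to print.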
Meandre: Flows to MAU
  Flows can be executed using their RDF descriptors.
  Flows can be compiled into MAU.
  MAU is:
    Self-contained representation
    Ready for execution
    Portable
    The base of flow execution in grid environments
Meandre: The Architecture
  The design of the Meandre architecture follows three directives:
    provide a robust and transparent scalable solution, from a laptop to large-scale clusters
    create a unified solution for batch and interactive tasks
    encourage reusing and sharing components
  To meet these goals, the architecture relies on four stacked layers and builds on top of service-oriented architectures (SOA).
Meandre: Basic Single Server
Meandre MDX: Cloud Computing
  Servers can be instantiated on demand and disposed of when done or on demand.
  A cluster is formed by at least one server.
  The Meandre Distributed Exchange (MDX) orchestrates operational integrity by managing cluster configuration and membership using a shared database resource.
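The notion of tracking cluster membership through a shared database can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the `servers` table, the helper functions, and the host names are hypothetical and do not reflect MDX's actual schema or API; an in-memory SQLite database stands in for the shared resource.

```python
import sqlite3


def open_registry(path=":memory:"):
    """Open the (hypothetical) shared membership table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS servers ("
        "host TEXT, port INTEGER, PRIMARY KEY (host, port))"
    )
    return conn


def join(conn, host, port):
    """Register a server that was instantiated on demand."""
    conn.execute("INSERT OR IGNORE INTO servers VALUES (?, ?)", (host, port))
    conn.commit()


def leave(conn, host, port):
    """Remove a server that has been disposed of."""
    conn.execute("DELETE FROM servers WHERE host = ? AND port = ?", (host, port))
    conn.commit()


def members(conn):
    """List the servers currently forming the cluster."""
    return conn.execute(
        "SELECT host, port FROM servers ORDER BY host, port"
    ).fetchall()


registry = open_registry()
join(registry, "node-a", 1714)   # hypothetical host names
join(registry, "node-b", 1714)
leave(registry, "node-a", 1714)
cluster = members(registry)
```

After one server leaves, the cluster still exists with the single remaining member, matching the statement that a cluster is formed by at least one server.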
Meandre MDX: The Picture
  MDX Backbone
Meandre MDX: The Architecture
  Virtualization infrastructure
    Provides uniform access to the underlying execution environment. It relies on virtualization of machines and the use of Java for hardware abstraction.
  IO standardization
    A unified layer provides access to shared data stores, distributed file systems, specialized metadata stores, and other service-oriented architecture gateways.
Meandre MDX: The Architecture
  Data-intensive flow infrastructure
    Provides the basic Meandre execution engine for data-intensive flows, component repositories and discovery mechanisms, extensible plugins, and web user interfaces (webUIs).
  Interaction layer
    Can provide self-contained applications via webUIs, create plugins for third-party services, interact with the embedding application that relies on the Meandre engine, or provide services to the cloud.
Demonstration
  Usage of ZigZag
  Compiling and executing flows using ZigZag
  Usage of ZigZag for Zotero-enabled flows
  Usage of ZigZag for Fedora flows
Learning Exercises
  Open an existing ZigZag flow
  Convert your flow from yesterday to ZigZag
  Compile the script
  Execute the script
Discussion Questions
  Which environment would you most likely use, the Meandre Workbench or the ZigZag scripting language?