Installation - Plus
Loretta Auvil
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
lauvil@illinois.edu
Outline
- Installation
- Meandre servers and clusters
- Development Tools: Eclipse Plugin
- Hands-On
Considerations
- Do you want to use SEASR-powered services?
  - You may not need to install anything (besides a browser)
- Do you want to run analytics on your laptop?
  - Quick 3-step process
- Do you want to provide SEASR-powered services?
  - Start simple
  - Scale as needed
  - Deploy all the extra goodies
Using SEASR-Powered Services
- SEASR provides some demo services
- Requires a browser
- You can access them from
  - Community Hub, to execute a flow
    http://seasr.org
  - Zotero, to analyze your collections with existing flows
  - Meandre Server Client, to execute a flow, or tune properties and execute a flow
    Hosted at http://demo.seasr.org:1714
  - Meandre Workbench, to execute a flow, or tune properties and execute a flow, or create a flow
    Hosted at http://demo.seasr.org:1712
I Need To Run SEASR on My Laptop
- I want to run on my laptop (server)
  - I have copyrighted information
  - I have a collection for analysis that is too big to be moved
  - I just want to test it and have fun with it
- Getting a Meandre server up and running in 3 steps
  1. Install Java
     http://www.java.com/en/download/
  2. Download Meandre into a new directory
     http://seasr.org/meandre/download/
  3. Use "Start-Infrastructure" or type "java -jar meandre-server-1.4.9.jar"
- Access your new installation at
  http://localhost:1714/public/services/ping.html
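As a rough illustration of the last step, the following Python sketch checks whether a freshly started server answers on the ping page quoted above. The function names `ping_url` and `server_is_up` are hypothetical helpers, not part of Meandre, and treating HTTP 200 as "up" is an assumption.

```python
from urllib.error import URLError
from urllib.request import urlopen


def ping_url(host: str = "localhost", port: int = 1714) -> str:
    """Build the ping URL quoted on the slide for a given host and port."""
    return f"http://{host}:{port}/public/services/ping.html"


def server_is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout.

    Assumption: a 200 response on the ping page means the server is running.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    url = ping_url()
    print(url, "->", "up" if server_is_up(url) else "not reachable")
```

Run it after starting the server; if it reports "not reachable", check that Java is installed and that nothing else is using port 1714.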
Specialized Downloadable Bundles
- On the SEASR/Meandre download site
  http://seasr.org/meandre/download
- Installation bundles available for:
  - Mac OS
  - Linux
  - Windows
- Bundles contain:
  - Zip file that includes executable files
  - Set of demo components and flows
- Requires Java (1.5 or greater) to be installed
Bundles Include
- The bundle comes with
  - Meandre Server
  - ZigZag console/compiler/runtime
  - Meandre Workbench (also provided as a war file)
- Provides simple scripts to
  - Start/stop the Meandre server
  - Start/stop the Meandre Workbench
What About Setting Up My Own Server?
- You can also deploy the bundles on a server using the same approach
Customization of My Server
- This will support
  - Moderated traffic
  - Persistent web services provided from this server
- Application Server
  - Workbench can be deployed alongside the Meandre Server in the embedded Jetty application server
  - Workbench can be deployed in your favorite application server using the .war file
- Database Options
  - Meandre uses an embedded Derby as the database
  - Meandre can also be set up to use MySQL for the database
Backend Using Derby Database
- Default meandre-config-store.xml

    <entry key="DB_USER"></entry>
    <entry key="DB_DRIVER_CLASS">org.apache.derby.jdbc.EmbeddedDriver</entry>
    <entry key="DB">Derby</entry>
    <entry key="DB_PASSWD"></entry>
    <entry key="DB_URL">jdbc:derby:./MeandreStore;create=true;logDevice=./DerbyLog</entry>
Backend Using MySQL
- Change meandre-config-store.xml to

    <entry key="DB_USER">USERNAME</entry>
    <entry key="DB_DRIVER_CLASS">com.mysql.jdbc.Driver</entry>
    <entry key="DB">MySQL</entry>
    <entry key="DB_PASSWD">PASSWORD</entry>
    <entry key="DB_URL"><![CDATA[jdbc:mysql://your-server.com/YOURDB?useUnicode=yes&characterEncoding=utf8&autoReconnect=true]]></entry>

- Changing from Derby to MySQL
  1. Stop the server
  2. Change the meandre-config-store.xml file
  3. Restart the server
- Your server is now backed by MySQL
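To avoid hand-editing mistakes in the JDBC URL, the five entries can be generated. The Python sketch below renders them for a given host, database, and account; `mysql_store_entries` and `parse_entries` are hypothetical helpers, and the `<properties>` wrapper used in the check is an assumption about the surrounding file, which the slide does not show. The sketch escapes `&` as `&amp;` instead of using a CDATA section; both forms parse to the same value.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def mysql_store_entries(host: str, database: str, user: str, password: str) -> str:
    """Render the five <entry> lines from the slide for a MySQL backend."""
    db_url = (
        f"jdbc:mysql://{host}/{database}"
        "?useUnicode=yes&characterEncoding=utf8&autoReconnect=true"
    )
    values = {
        "DB_USER": user,
        "DB_DRIVER_CLASS": "com.mysql.jdbc.Driver",
        "DB": "MySQL",
        "DB_PASSWD": password,
        "DB_URL": db_url,
    }
    # escape() turns '&' into '&amp;', an alternative to the CDATA wrapper.
    return "\n".join(
        f'<entry key="{key}">{escape(value)}</entry>' for key, value in values.items()
    )


def parse_entries(fragment: str) -> dict:
    """Parse rendered entries back into a dict to confirm they are well-formed.

    Assumption: the entries sit inside a single root element; <properties>
    is used here only for the check.
    """
    root = ET.fromstring(f"<properties>{fragment}</properties>")
    return {entry.get("key"): entry.text for entry in root.findall("entry")}


entries = mysql_store_entries("your-server.com", "YOURDB", "USERNAME", "PASSWORD")
print(entries)
```

Paste the printed lines over the Derby entries while the server is stopped, then restart it.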
Scaling Up
- Two possible routes
  - Deploy a farm of self-contained services (via ZigZag)
  - Use the Meandre Cluster solution
- Both require your sysadmin/netadmin to provide a highly available load balancer (some virtual appliances are available)
- To create a cluster
  - Use the previous MySQL setup
  - Point all the servers to the same database
  - The server interface pages will allow you to monitor the servers
Installing the Workbench
- Use the installation bundles
- Or use the war file
  - Install your favorite application server
  - Deploy the war file to the application server
Installing the Community Hub
- Deploy the WordPress plugin
  - Unzip the file into the WordPress plugins directory
Community Hub Customization (1)
- Makes flows available for exploration and execution
  - Renders the description of the flow information
  - Provides a simple execute button to allow visitors to run the flow
- Exploration via WordPress shortcodes that expose the Keyword Cloud functionality

    [MeandreTagCloud store='http://demo.seasr.org:1714/public/services/repository.rdf']
    [MeandreListSelectedTags]
    [MeandreListFlowsByTags store='http://demo.seasr.org:1714/public/services/repository.rdf']
Community Hub Customization (2)
- Pages and posts
  - Add the WordPress shortcodes to display the meta information and execute buttons

    [MeandreDescribeFlow]
    [MeandreListFlowsByFlowTags]

  - Add custom fields for specifying meta information
    - FlowURI: specifies the URI of the flow
    - ImageURI: specifies a URI to associate with this flow
    - StoreURI: specifies the URI of the flow RDF
    - CustomFlowURI: specifies the URI of the flow that allows users to input their data
    - ExecuteURI: specifies the URI of the server where the flow should run
Eclipse Plugin for Developers
- On the SEASR/Meandre download site
  http://seasr.org/meandre/download/
- Steps for installation
  1. Exit Eclipse
  2. Download the zip file into the Eclipse/dropins directory
  3. Unzip the file
  4. Restart Eclipse
Developers: Eclipse Plugin
- Uploads components to the Meandre Server
- Lists components installed
- Allows for removal of components
- Shows additional data of interest to a programmer
Meandre: The Architecture
- The design of the Meandre architecture follows three directives:
  - provide a robust and transparently scalable solution, from a laptop to large-scale clusters
  - create a unified solution for batch and interactive tasks
  - encourage reusing and sharing components
- To meet these goals, the architecture relies on four stacked layers and builds on top of service-oriented architectures (SOA)
Meandre: Basic Single Server
Meandre MDX: Cloud Computing
- Servers can be instantiated on demand
- Servers can be disposed of when done or on demand
- A cluster is formed by at least one server
- The Meandre Distributed Exchange (MDX) orchestrates operational integrity by managing cluster configuration and membership using a shared database resource
Meandre MDX: The Picture
[Figure: MDX Backbone]
Meandre MDX: The Architecture
- Virtualization infrastructure
  - Provides uniform access to the underlying execution environment. It relies on virtualization of machines and the use of Java for hardware abstraction.
- IO standardization
  - A unified layer provides access to shared data stores, distributed file systems, specialized metadata stores, and other service-oriented architecture gateways.
Meandre MDX: The Architecture (continued)
- Data-intensive flow infrastructure
  - Provides the basic Meandre execution engine for data-intensive flows, component repositories and discovery mechanisms, extensible plugins, and web user interfaces (webUIs).
- Interaction layer
  - Can provide self-contained applications via webUIs, create plugins for third-party services, interact with the embedding application that relies on the Meandre engine, or provide services to the cloud.
Meandre: ZigZag Script Language
- ZigZag is a simple language for describing data-intensive flows
  - Modeled on Python for simplicity
  - ZigZag is a declarative language for expressing the directed graphs that describe flows
- Command-line tools allow ZigZag files to be compiled and executed
  - A compiler is provided to transform a ZigZag program (.zz) into a Meandre archive unit (.mau)
  - MAU files can then be executed by a Meandre engine
Meandre: ZigZag Script Language
- As an example, the flow below pushes two strings that get concatenated and printed to the console
- [Figure: flow diagram]
Meandre: ZigZag Script Language
- ZigZag code that represents the example flow:

    #
    # Imports the three required components and creates the component aliases
    #
    import <http://localhost:1714/public/services/demo_repository.rdf>
    alias <http://test.org/component/push_string> as PUSH
    alias <http://test.org/component/concatenate-strings> as CONCAT
    alias <http://test.org/component/print-object> as PRINT
    #
    # Creates four instances for the flow
    #
    push_hello, push_world, concat, print = PUSH(), PUSH(), CONCAT(), PRINT()
    #
    # Sets up the properties of the instances
    #
    push_hello.message, push_world.message = "Hello ", "world!"
    #
    # Describes the data-intensive flow
    #
    @phres, @pwres = push_hello(), push_world()
    @cres = concat( string_one: phres.string; string_two: pwres.string )
    print( object: cres.concatenated_string )
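For readers more familiar with Python, the wiring of this flow can be mimicked with plain functions. This is only an analogy under stated assumptions: the three functions below are hypothetical stand-ins for the PUSH, CONCAT, and PRINT components, each returning a dict keyed by its output port name, and they run sequentially rather than inside a Meandre engine. The print component is named `print_object` to avoid shadowing Python's built-in `print`.

```python
def push(message: str) -> dict:
    """Stand-in for PUSH: emits its 'message' property on the 'string' port."""
    return {"string": message}


def concat(string_one: str, string_two: str) -> dict:
    """Stand-in for CONCAT: joins its two inputs on 'concatenated_string'."""
    return {"concatenated_string": string_one + string_two}


def print_object(obj: str) -> str:
    """Stand-in for PRINT: writes the object to the console and returns it."""
    print(obj)
    return obj


# Same wiring as the ZigZag flow: two pushes feed concat, which feeds print.
phres = push("Hello ")
pwres = push("world!")
cres = concat(string_one=phres["string"], string_two=pwres["string"])
result = print_object(cres["concatenated_string"])  # prints: Hello world!
```

The ZigZag version declares the same graph; the difference is that ZigZag only describes the connections and leaves scheduling to the engine.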
Meandre: ZigZag Script Language
- Automatic Parallelization
  - Multiple instances of a component can be run in parallel to boost throughput
  - A specialized operator is available in ZigZag to cause multiple instances of a given component to be used
- Consider the simple flow example shown in the diagram
- The dataflow declaration would look like

    #
    # Describes the data-intensive flow
    #
    @pu = push()
    @pt = pass( string:pu.string )
    print( object:pt.string )
Meandre: ZigZag Script Language
- Automatic Parallelization
  - Adding the operator [+AUTO] to the middle component
  - [+AUTO] tells the ZigZag compiler to parallelize the "pass" component instance by the number of cores available on the system
  - [+AUTO] may also be written [+N], where N is a numeric value to use, for example [+10]

    #
    # Describes the data-intensive flow
    #
    @pu = push()
    @pt = pass( string:pu.string ) [+AUTO]
    print( object:pt.string )
Meandre: ZigZag Script Language
- Automatic Parallelization
  - Adding the operator [+4] would result in a directed graph with four parallel instances of the "pass" component

    #
    # Describes the data-intensive flow
    #
    @pu = push()
    @pt = pass( string:pu.string ) [+4]
    print( object:pt.string )

  - The same flow using the [+4!] form of the operator

    #
    # Describes the data-intensive flow
    #
    @pu = push()
    @pt = pass( string:pu.string ) [+4!]
    print( object:pt.string )
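The idea behind [+N] and [+AUTO] can be sketched in Python with a worker pool: the middle step is replicated N times, or once per available core when no number is given. This is an analogy, not how the ZigZag compiler is implemented. `pass_through` and `run_flow` are hypothetical names, and `Executor.map` happens to return results in input order, which says nothing about ordering in ZigZag. The `!` form shown above is not modeled here.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List, Optional


def pass_through(item: str) -> str:
    """Stand-in for the 'pass' component: forwards its input unchanged."""
    return item


def run_flow(items: Iterable[str], workers: Optional[int] = None) -> List[str]:
    """Push items through N parallel copies of pass_through.

    workers=None plays the role of [+AUTO]: one worker per available core.
    workers=4 plays the role of [+4].
    """
    count = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=count) as pool:
        # map() distributes items across the pool and collects the results.
        return list(pool.map(pass_through, items))


if __name__ == "__main__":
    for line in run_flow(["a", "b", "c"], workers=4):
        print(line)
```

As with the ZigZag operator, the surrounding flow does not change; only the number of instances of the middle step does.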
Scaling Genetic Algorithms with Meandre
- Intel 2.8 GHz quad-core, 4 GB RAM
- Average of 20 runs
And Beyond with Hadoop
- 60 dual quad-core Xeons with 8 GB RAM
- Gigabit Ethernet
- Resource exhaustion
Meandre: Flows to MAU
- Flows can be executed using their RDF descriptors
- Flows can be compiled into MAU
- MAU is:
  - Self-contained representation
  - Ready for execution
  - Portable
  - The base of flow execution in grid environments
Compile and Run MAU
- Compile ZigZag to MAU, creating my_file.mau

    java -jar ~/meandre/zzc-1.4.9.jar my_file.zz

- Run the MAU file

    java -jar ~/meandre/zzre-1.4.9.jar -port 1816 my_file.mau
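The two commands can be chained from a script. The Python sketch below builds exactly the command lines shown above and runs them in order; `MEANDRE_HOME`, `compile_command`, `run_command`, and `compile_and_run` are hypothetical names, and the assumption that the compiler writes the .mau file next to the .zz file follows the slide's "creating my_file.mau" but is not otherwise verified.

```python
import subprocess
from pathlib import Path

# Assumption: the jars live in ~/meandre, as in the commands on the slide.
MEANDRE_HOME = Path.home() / "meandre"
VERSION = "1.4.9"


def compile_command(zz_file: str) -> list:
    """Command line that compiles a ZigZag script into a MAU file."""
    return ["java", "-jar", str(MEANDRE_HOME / f"zzc-{VERSION}.jar"), zz_file]


def run_command(mau_file: str, port: int = 1816) -> list:
    """Command line that executes a MAU file on the given port."""
    jar = str(MEANDRE_HOME / f"zzre-{VERSION}.jar")
    return ["java", "-jar", jar, "-port", str(port), mau_file]


def compile_and_run(zz_file: str, port: int = 1816) -> None:
    """Compile the script, then run the resulting MAU; stop on any failure."""
    subprocess.run(compile_command(zz_file), check=True)
    # Assumption: my_file.zz produces my_file.mau in the same directory.
    mau_file = str(Path(zz_file).with_suffix(".mau"))
    subprocess.run(run_command(mau_file, port), check=True)


if __name__ == "__main__":
    print(" ".join(compile_command("my_file.zz")))
    print(" ".join(run_command("my_file.mau")))
```

Calling `compile_and_run("my_file.zz")` requires Java and the two jars to be present; the main block only prints the commands so they can be inspected first.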
Demonstration
- Installation of Meandre
- Meandre Eclipse Plugin
- JIRA, Confluence, Bamboo: what they are and what we use them for
- Usage of ZigZag
  - Compiling and executing flows using ZigZag
  - Usage of ZigZag for Zotero-enabled flows
  - Usage of ZigZag for Fedora flows
Learning Exercises
- Open an existing ZigZag flow
- Convert your flow from yesterday to ZigZag
  - Compile the script
  - Execute the script
- Have participants download and install SEASR on their personal computers
- Have participants sign up for accounts to access the SEASR suite of Atlassian tools
- Use JIRA to log a support request
Discussion Questions
- What challenges (if any) would scholars have installing the SEASR software?
- Do you see your institution's IT department running the SEASR environment, or would it be your research group?
- Which environment would you most likely use: the Meandre Workbench or the ZigZag scripting language?