WG on Python and WG on Workflows. www.egi.eu: EGI-Engage is co-funded by the Horizon 2020 Framework Programme of the European Union under grant number 654142.


1 WG on Python and WG on Workflows
Miguel Caballer, Ignacio Blanquer (*); Alain Franc, Jean-Marc Frigerio (**)
(*) Universitat Politècnica de València
(**) INRA – French National Institute for Agricultural Research
www.egi.eu: EGI-Engage is co-funded by the Horizon 2020 Framework Programme of the European Union under grant number 654142.

2 05/04/2016 Contents
WG on Workflows
– Workflow-based web portals.
– Experience with Galaxy (written in Python).
– Virtual elastic cluster approach.
– Other requirements?
WG on Python
– A language that is versatile, easy to learn, and offers good performance.
– Object-oriented, with many libraries.
– Suited to scientific computing: NumPy, SciPy, Matplotlib, scikit-learn, etc.
– Allows writing a DSL that is called from a notebook.
Lifewatch-CC Amsterdam 6/4/2016

3 WG on Workflows
Goal: port workflow-based portals to FedCloud.
Potential scenarios (combinable)
– Deploy all components (front end or workflow engine, plus working nodes) on the cloud, or only the working nodes.
– Deploy the working nodes on the cloud statically or dynamically.

4 WG on Workflows: Challenges
Shared directories
– Typically, the front end and the working nodes need to share a directory.
– NFS is currently used, although this could prevent cross-site configurations.
Data access
– Copying data across the network can be prohibitively slow.
– Check “Big Data friendly” approaches.
Reconfiguration
– Adding or removing nodes requires reconfiguring the existing nodes.
– This should be automated.
Elasticity
– Automatic decisions on reconfiguration are desirable.
– This could require high-level services or new orchestrators.
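The reconfiguration challenge can be illustrated with a minimal sketch: when the set of working nodes changes, work out which nodes were added or removed and regenerate the host entries that every remaining node needs. The function names, node names and IP addresses below are hypothetical, and the hosts-file layout is only one example of a file that might need regenerating; this is not how EC3 or Galaxy perform the step.

```python
def plan_reconfiguration(old_nodes, new_nodes):
    """Return (added, removed) node names between two cluster states.

    Both arguments are dicts mapping a node name to its IP address.
    """
    added = sorted(set(new_nodes) - set(old_nodes))
    removed = sorted(set(old_nodes) - set(new_nodes))
    return added, removed


def render_hosts(nodes):
    """Render hosts-file style lines ("IP<TAB>name") for all nodes."""
    return "".join(f"{ip}\t{name}\n" for name, ip in sorted(nodes.items()))


# Hypothetical cluster states before and after a scale-up.
before = {"wn1": "10.0.0.2"}
after = {"wn1": "10.0.0.2", "wn2": "10.0.0.3"}

added, removed = plan_reconfiguration(before, after)
hosts_fragment = render_hosts(after)
```

In an automated setup, a non-empty `added` or `removed` list would be the trigger for pushing the regenerated file to every node.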

5 Experiences: Galaxy
– Galaxy can be installed locally on a single machine or as a front end to a batch queue.
– It exposes a web interface and executes all interactions (including data uploads) as jobs in a batch queue.
– It requires a directory shared between the working nodes and the front end.
– It supports separate storage areas for different users, managed through the portal.

6 Galaxy Architecture
[Diagram: front-end portal, batch queue (Torque, SGE, SLURM), working nodes (WN) and shared disk. Labels: availability of processing tools (e.g. Bowtie2, samtools, blast) and reference data; tool interface specification files.]

7 Galaxy on the cloud
Galaxy on the cloud (https://wiki.galaxyproject.org/Cloud) is a branch that uses CloudMan (https://wiki.galaxyproject.org/CloudMan) to automate the deployment.
It can be easily deployed on Amazon AWS (pre-configured VMIs already exist).
– It can work with OpenStack and OpenNebula, but you need to create the VMI yourself.
– However, it still requires manual configuration of the individual VMs, and elasticity is not fully automatic.
– No support for OCCI.

8 Virtual Elastic Cluster: Components
EC3 (Elastic Compute Cluster – www.grycap.upv.es/ec3) and IM (Infrastructure Manager – www.grycap.upv.es/im) are tools compatible with the EGI Federated Cloud that facilitate the configuration and management of virtual clusters.
– Full automation of the configuration process.
– Automatic elasticity based on the LRMS queue size.
– Compatible with multiple providers (including the OCCI interface).
– They can use plain “vanilla” Linux images and configure them on the fly, so the same configuration recipe works for different cloud providers.
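Elasticity driven by the LRMS queue size can be sketched as a small decision function: given the number of queued and running jobs, compute how many working nodes are wanted and return the difference from the current count. The function name, the `slots_per_node` parameter and the node limits are assumptions made for illustration; EC3's actual policy may differ.

```python
import math


def node_delta(queued_jobs, running_jobs, active_nodes,
               slots_per_node=1, min_nodes=0, max_nodes=10):
    """Return how many nodes to add (positive) or remove (negative).

    Assumes every job needs one slot and every node offers
    `slots_per_node` slots; both are simplifications.
    """
    demand = queued_jobs + running_jobs
    wanted = math.ceil(demand / slots_per_node)
    # Clamp the target to the allowed cluster size.
    target = max(min_nodes, min(max_nodes, wanted))
    return target - active_nodes


scale_up = node_delta(queued_jobs=5, running_jobs=2, active_nodes=2)
scale_down = node_delta(queued_jobs=0, running_jobs=0, active_nodes=3)
capped = node_delta(queued_jobs=100, running_jobs=0, active_nodes=2,
                    slots_per_node=4, max_nodes=10)
```

A real orchestrator would typically add a grace period before removing idle nodes, so that short gaps between jobs do not cause repeated deployments.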

9 Virtual Elastic Cluster – EC3

10 Virtual Elastic Cluster: Architecture for Galaxy
It uses a single VMI (plain Ubuntu 14.04 LTS).
It requires the ec3 client and the Galaxy configuration recipes.
– They can be downloaded from https://github.com/grycap/ec3.
– Currently supported for demonstration: the main Galaxy framework, bowtie2, samtools and reference data (Drosophila melanogaster).
More tools can be added to the automation system (preferred) or manually after the deployment.
– The version for INRA also includes 3 processing tools from the previous workflow.
– This includes the back-end processing tools and the XML interface files for the front end.
Credentials must be provided in the proper file.

11 Advances since Bari meeting
Extended testing with EGI FedCloud
– Some “official” VMIs were not properly configured, so “scp” could not be used. This blocked the automatic configuration with Ansible.
– Proxy expiration prevented more resources from being added to the queue. Existing VMs are not terminated, but without valid credentials no new VMs can be started.
A solution based on myproxy was implemented so that the cluster can renew its credentials if a long-lived proxy is stored in a repository.
– Available from the EC3 SaaS.
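The renewal logic can be pictured as a periodic check of the remaining proxy lifetime. The sketch below only decides whether a renewal is due; it does not contact a MyProxy server, and the function name and the one-hour safety margin are assumptions rather than details of the implemented solution.

```python
from datetime import datetime, timedelta, timezone


def needs_renewal(expires_at, now, margin=timedelta(hours=1)):
    """Return True when the proxy expires within `margin` of `now`."""
    return expires_at - now <= margin


# Hypothetical timestamps used only to exercise the check.
now = datetime(2016, 4, 5, 12, 0, tzinfo=timezone.utc)
almost_expired = needs_renewal(now + timedelta(minutes=30), now)
still_valid = needs_renewal(now + timedelta(hours=6), now)
```

When the check returns True, the cluster would fetch a fresh short-lived proxy from the repository before requesting new VMs.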

12 WG Python: derive a DSL (Domain Specific Language) for data analysis
[Diagram labels: data structure, function, Domain Specific Language]

13 Python:
fasta_dat = read(fastafile)
distance_mat = disseq(fasta_dat)
comp_mat = mds(distance_mat, n_axis=3)
i, j = 1, 2
plot(comp_mat, axis_1=i, axis_2=j)
DSL:
read fasta fastafile
do disseq
do mds 3
plot
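One way such a DSL could be mapped onto Python calls is a small line-by-line interpreter that threads the result of each step into the next. The sketch below is an assumption about how this might work, not the working group's implementation: `read`, `disseq`, `mds` and `plot` are hypothetical stand-ins that only record that they were called, and `run_pipeline` is an illustrative name.

```python
# Hypothetical stand-ins for the real analysis functions; each one only
# records its call so the control flow can be inspected.
trace = []


def read(path, fmt):
    trace.append(f"read {fmt} {path}")
    return "sequences"


def disseq(sequences):
    trace.append("disseq")
    return "distance_matrix"


def mds(distances, n_axis):
    trace.append(f"mds n_axis={n_axis}")
    return "components"


def plot(components):
    trace.append("plot")


# Operations reachable through the DSL verb "do".
OPERATIONS = {"disseq": disseq, "mds": mds}


def run_pipeline(script):
    """Interpret the DSL line by line, passing each result to the next step."""
    value = None
    for line in script.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        verb, args = tokens[0], tokens[1:]
        if verb == "read":
            fmt, path = args
            value = read(path, fmt)
        elif verb == "do":
            name, params = args[0], [int(a) for a in args[1:]]
            value = OPERATIONS[name](value, *params)
        elif verb == "plot":
            plot(value)
        else:
            raise ValueError(f"unknown DSL verb: {verb!r}")
    return value


script = """
read fasta fastafile
do disseq
do mds 3
plot
"""
result = run_pipeline(script)
```

Keeping the operations in a dictionary means a new analysis step can be exposed to the DSL by registering one more function, without touching the interpreter loop.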

15 Flowchart between interfaces, libraries and computing elements
[Diagram labels: scientific libraries (Python, R), XML files, Galaxy server, CM2 library, Jupyter notebook, EC3, federated cloud, EGI grid, cluster (MPI), ...]

16 Why a notebook with a DSL?
– A concise way to design workflows.
– One row is one Galaxy workflow.
– One file is one pipeline.
– Much higher flexibility.

18 Thank you for your attention. Questions?
www.egi.eu
This work by Parties of the EGI-Engage Consortium is licensed under a Creative Commons Attribution 4.0 International License.

19 Video demo: www.youtube.com/watch?v=qJz5HRsApSI

