Design your e-infrastructure!
https://indico.egi.eu/indico/event/2895/
Use case: PhenoMeNal
Break-out group coordinator: Enol Fernández (EGI.eu)
Group members
- Marco Capuccini (UU)
- Uros Stevanovic (KIT)
- Radim Pesa (MU)
- Giovanni Morelli (CINECA-EUDAT)
- Enol Fernández (EGI.eu)
- Second break-out: Ricardo Graciani (UB)
First break-out: Background and Users
Who will be the user? Can the users be characterised? How many are they?
Users: medical doctors, biologists, bioinformaticians
Characteristics:
- Inexperienced users: do not create analyses, but run them (medical doctors, biologists)
- Power users: workflow/script creators (bioinformaticians)
- Data producers: need computing skills
The number of users depends on the success of the platform and on how easy it is to access.
What will be the value of the infrastructure for them? What will the system exactly deliver to them?
- PhenoMeNal will simplify the execution of metabolomics analyses for medical doctors and biologists
- Parallelising code to reduce execution times (typically 1 week for an analysis)
- Providing software that can be deployed from hospitals to cloud infrastructures
- EGI-EUDAT can help to build a virtual infrastructure that connects several data centres, thereby:
  - Enabling sharing of data and analysis pipelines
  - Providing long-term preservation?
  - Providing user authorization/authentication
  - Bringing the service closer to the final user
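The parallelisation point above can be illustrated with a minimal sketch, assuming an analysis splits into independent per-sample steps. The function `analyse_sample` is a hypothetical stand-in (it only computes a mean intensity) and is not part of any PhenoMeNal pipeline; `run_parallel` and its parameters are likewise illustrative names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def analyse_sample(intensities: Sequence[float]) -> float:
    """Hypothetical per-sample step: return the mean intensity of one sample."""
    return sum(intensities) / len(intensities)


def run_parallel(
    samples: Sequence[Sequence[float]],
    analyse: Callable[[Sequence[float]], float] = analyse_sample,
    max_workers: int = 4,
    executor_cls=ThreadPoolExecutor,
) -> List[float]:
    """Apply `analyse` to every sample concurrently, preserving input order."""
    with executor_cls(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(analyse, samples))


if __name__ == "__main__":
    samples = [[1.0, 2.0, 3.0], [10.0, 20.0], [5.0]]
    print(run_parallel(samples))  # prints [2.0, 15.0, 5.0]
```

Threads are used here only to keep the sketch simple. For CPU-bound steps, passing `executor_cls=ProcessPoolExecutor` (also from `concurrent.futures`) would be the more likely choice, and on a cluster the same pattern would be delegated to a workflow engine rather than a local pool.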
How should they use the system?
- Inexperienced/power users: web interface (Jupyter/Galaxy)
- Data producers: command-line access
What's the timeline for development, testing and large-scale operation? (Consecutive releases can/should be considered.)
- Development: Q1 2017
- Testing: Q3 2017
- Large-scale operation: Q1 2018
Second break-out: Design and implementation plan
What should the first version include? The most basic product prototype imaginable that already brings value to the users (the so-called Minimum Viable Product, MVP).
- The first version will deploy the virtual infrastructure on a single site where PoC workflows can be run
- GCE and OpenStack as targets
Which components/services already exist in this architecture?
- MANTL deployed with Terraform
- Supported on GCE and a local OpenStack installation
- Workflows executed in MANTL as microservices
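As a rough illustration of one deployment definition targeting either cloud, the sketch below renders a Terraform provider block in Terraform's JSON syntax (a `.tf.json` file) for the two targets named above. It is not MANTL's actual Terraform code. The helper names, project, region, URL and tenant values are placeholders assumed for the example; only the provider names `google` and `openstack` and the argument names shown are taken from Terraform itself.

```python
import json
from typing import Dict


def provider_block(target: str) -> Dict[str, dict]:
    """Return a Terraform provider block for the chosen target cloud."""
    if target == "gce":
        # Placeholder project and region; replace with real values.
        return {"google": {"project": "example-project", "region": "europe-west1"}}
    if target == "openstack":
        # Placeholder Keystone endpoint and tenant; replace with real values.
        return {
            "openstack": {
                "auth_url": "https://keystone.example.org:5000/v3",
                "tenant_name": "example-tenant",
            }
        }
    # Only the two targets named in the slides are handled in this sketch.
    raise ValueError(f"unsupported target: {target!r}")


def render_config(target: str) -> str:
    """Serialise the provider block as the content of a .tf.json file."""
    return json.dumps({"provider": provider_block(target)}, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render_config("openstack"))
```

Any other target raises `ValueError` in this sketch, since it only knows the two providers listed on the slide.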
Which components/services are under development (and by whom)?
- PoC under development
- EBI is developing the Galaxy pipeline
- UU is developing the Jupyter/Spark pipeline
Which components/services should still be brought into the system? Can EGI/EUDAT partners do it?
- Deployment into "EGI" resources
  - Requires Terraform support on EGI
  - Quick option: add EGI AAI support to the OpenStack driver in Terraform
  - Ideal solution: add a new OCCI driver in Terraform
- User authentication/authorization
  - EGI/EUDAT/AARC can bring their expertise to help build a solution
  - MANTL may bring this feature at some point
Are there gaps in the EGI/EUDAT service catalogues that should be filled to realise the use case? Which service provider could fill the gap?
Next Steps
Integrate EGI Federated Cloud with Terraform.
- Quick way: add EGI authentication to the Terraform OpenStack plugin
- More complex way: add OCCI support to Terraform