Published by Arline Bennett. Modified over 6 years ago.
1
Processing of Images: Orchestrating an Elastic Cloud (https://youtu
Presented by Ignacio Blanquer, UPV (Spain)
INDIGO Summit: EGI-INDIGO workshop on community application support
Catania, 10th May 2017
RIA
2
Medical Imaging Biobanks
[Figure: project request workflow. Labels: External subproj. team; Description; Resource needs; SW dependencies; Data Request; Access Request; Scientific & Technical Committee; Regional PACS; Anonymised data; Data Manager; Tech. Manager; Data request; Updated SW deps & resource needs; Evaluate access; Infrastructure Brokering; SW Configuration; Submit; Provision; SW deps]
BIMCV (a EuroBioImaging ESFRI node) manages a population database covering an area of 5 million people.
BIMCV will receive applications for projects, e.g. training a set of models for the automatic segmentation of bone tissue in osteoporotic women above the age of 70, including healthy control subjects.
BIMCV will provide research data for those projects and an infrastructure to register them.
However, BIMCV has limited resources to handle the computations of the different pilots.
BIMCV is seeking a model that provides services rather than data and allows a closer follow-up of the activities.
ceib.san.gva.es/bimcv
EGI INDIGO Summit Catania - 10/5/2017
3
Use Case Requirements
EB#1 Persistent (but medium-term) data storage volumes with standard POSIX file access.
EB#2 Access control lists (ACLs) on data access.
EB#3 Execution of data-driven and computing-intensive workflows.
EB#4 Availability of customised software.
EB#5 Deployment of own software.
EB#6 Adaptation of resources to the workload.
EB#7 Terminal access to the resources.
EB#8 Online access to data.
EB#9 Management of users and groups.
EB#10 Long-term availability of results.
EB#11 Provenance and repeatability of experiments.
Slide labels: TOSCA, SLURM
4
Simplified architecture of the solution
Specification of the dependencies as Ansible YAML and of the application topologies as TOSCA documents.
Upload of images and IPR-protected software to a OneData volume.
Deployment of an elastic cluster whose VMs mount the OneData volume.
Execution of jobs in the SLURM queue, embedded in containers that locally mount the shared volume and run the biomarker job.
[Figure: architecture diagram with steps 1 to 4. Labels: Image Data Bank; Extract; Proj. Volume; Research Repository; Upload; OneData; Nifty & Health Metadata; Submit; Orchestrator; IaaS; mount; Deploy; Local Disk; Project Technical Manager; TOSCA; IM; pull; Processing code; Own containers; WN; Front/end; Web Portal; End-users; (CLUES); Generic INDIGO components; EUBIOSTEO INDIGO pilot; Developer]
5
Orchestrator architecture
It provides an entry point for deploying virtual infrastructures described as TOSCA templates.
It matches configuration requests with the available infrastructures, taking into account:
Availability of VM/container images.
Availability of resources.
Specific site requests.
Authorization.
January 2017 INDIGO-DataCloud Big Data
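The matching step can be illustrated with a minimal sketch that filters candidate sites by the four criteria listed above. This is not the orchestrator's actual logic: the `Site` and `Request` types, their fields, and the site names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Site:
    # Hypothetical description of a cloud site (not an orchestrator type).
    name: str
    images: Set[str]
    free_cpus: int
    authorized_groups: Set[str] = field(default_factory=set)


@dataclass
class Request:
    # Hypothetical deployment request derived from a TOSCA template.
    image: str
    cpus: int
    user_group: str
    preferred_site: Optional[str] = None


def candidate_sites(request: Request, sites: List[Site]) -> List[Site]:
    """Return the sites that satisfy all four criteria."""
    matches = []
    for site in sites:
        if request.image not in site.images:
            continue  # VM/container image not available
        if site.free_cpus < request.cpus:
            continue  # not enough free resources
        if request.preferred_site and site.name != request.preferred_site:
            continue  # a specific site was requested and this is not it
        if request.user_group not in site.authorized_groups:
            continue  # user is not authorized at this site
        matches.append(site)
    return matches


sites = [
    Site("site-a", {"ubuntu-16.04"}, free_cpus=8, authorized_groups={"biobank"}),
    Site("site-b", {"ubuntu-16.04"}, free_cpus=2, authorized_groups={"biobank"}),
    Site("site-c", {"centos-7"}, free_cpus=16, authorized_groups={"biobank"}),
]
request = Request(image="ubuntu-16.04", cpus=4, user_group="biobank")
selected = [site.name for site in candidate_sites(request, sites)]
print(selected)
```

In this toy data set only `site-a` passes every check: `site-b` lacks free CPUs and `site-c` does not offer the requested image.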
6
Detailed architecture of the Cluster
[Figure: cluster diagram. Labels: slurm; mount; Virtual Machine; No privileges required on Docker; ubuntu 16]
7
CLUES and Elastic Queues
Deploy and contextualize the front end.
Submit jobs to the batch queue.
Add worker nodes on demand.
[Figure: CLUES elasticity diagram. Labels: Use case admin; Cluster user; Submit job; TOSCA; Ansible; Orchestrator; Cloud connector; Infrastructure Manager (IM); Cloud plugin; Create Deployment; Update Deployment; Front-End; CLUES; LRMS; Worker Node; LRMS Client; contextualizer]
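The "add worker nodes on demand" behaviour can be pictured with a small scaling rule. The sketch below is an assumption-laden illustration, not the algorithm CLUES implements: it assumes one job per node, and the function names and parameters are hypothetical.

```python
def nodes_to_add(pending_jobs: int, idle_nodes: int,
                 running_nodes: int, max_nodes: int) -> int:
    """Worker nodes to request so queued jobs can start.

    Assumes one job per node and never exceeds max_nodes.
    """
    needed = max(0, pending_jobs - idle_nodes)
    headroom = max(0, max_nodes - running_nodes)
    return min(needed, headroom)


def nodes_to_remove(pending_jobs: int, idle_nodes: int,
                    running_nodes: int, min_nodes: int) -> int:
    """Idle worker nodes that can be released while the queue is empty."""
    if pending_jobs > 0:
        return 0
    return min(idle_nodes, max(0, running_nodes - min_nodes))


# Five queued jobs, one idle node, two nodes running, hard limit of four nodes.
print(nodes_to_add(pending_jobs=5, idle_nodes=1, running_nodes=2, max_nodes=4))
# Empty queue, two idle nodes, no minimum to keep alive.
print(nodes_to_remove(pending_jobs=0, idle_nodes=2, running_nodes=2, min_nodes=0))
```

The upper limit plays the role of the maximum number of instances set in the TOSCA template; the lower limit is an extra assumption added for the example.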
8
DEPLOYING INFRASTRUCTURES
What do you need?
1) Valid IAM credentials.
2) A customised TOSCA document describing the infrastructure.
3) A REST client (orchent / curl).
4) An SSH client.
What do you get?
1) A front end with a SLURM batch queue and two worker nodes deployed.
2) The credentials for accessing the front end.
3) A self-managed batch queue that can grow up to a predefined maximum number of worker nodes.
4) Complete configuration of the software dependencies defined in the TOSCA YAML for the worker nodes (including Docker and OneClient volume mounting, among others).
9
Step 1: Obtaining the IAM Token
CLI-based approach:
curl -s -L \
  -d client_id=7873d62e-bf8d-4a1b-51b4-a9e6b7afb172 \
  -d client_secret=ALy1LCRoEQA8tpVuOkEDVIr0cNNZecdNCiJ2PKA4HUvmCqyfKlqIQGg8C21Mh1tPgyhH1v98YVdQTOx2JaYf1gw \
  -d grant_type=password \
  -d username=indigo-user \
  -d password=M6dPPnf0GiJ7Ba \
  -d scope="openid address phone profile offline_access"
Web-based approach:
Integrating distributed data infrastructures with INDIGO-DataCloud
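The same form-encoded request can be assembled programmatically. The sketch below only builds the request body and reads the token out of a response; it does not contact any server, because the token endpoint URL is not given on the slide. The credential values are placeholders, and `access_token` is the standard OAuth 2.0 response field, assumed here to be what IAM returns.

```python
import json
from urllib.parse import parse_qs, urlencode


def build_token_request(client_id: str, client_secret: str,
                        username: str, password: str) -> bytes:
    """Form-encode the same fields the curl command sends with -d."""
    form = {
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "password",
        "username": username,
        "password": password,
        "scope": "openid address phone profile offline_access",
    }
    return urlencode(form).encode("ascii")


def extract_access_token(response_text: str) -> str:
    """Read the token from a JSON token response (standard OAuth 2.0 field)."""
    return json.loads(response_text)["access_token"]


# Placeholder credentials; replace with the ones issued for your IAM client.
body = build_token_request("my-client-id", "my-client-secret",
                           "my-user", "my-password")
print(parse_qs(body.decode("ascii"))["grant_type"])
```

The resulting token is what the later steps expect in `ORCHENT_TOKEN`.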
10
Step 2: TOSCA template - Arguments https://github
The document describes a topology of one front end and a variable number of nodes (up to an upper limit). The arguments describe the properties of the nodes, the inputs and the outputs of the deployment process.
Template sections shown on the slide: Inputs, Nodes, Outputs
11
Step 2: TOSCA template – ElasticCluster Nodes
elastic_cluster_front_end: tosca.nodes.indigo.ElasticCluster
Requires two other nodes: front end and worker nodes.
slurm_front_end: tosca.nodes.indigo.LRMS.FrontEnd.Slurm
wn_node: tosca.nodes.indigo.LRMS.WorkerNode.Slurm
slurm_server: tosca.nodes.indigo.Compute
slurm_wn: tosca.nodes.indigo.Compute
12
Step 2: TOSCA template – Frontend
Callouts on the template: SLURM-based ElasticCluster; public IP and DNS name; Ubuntu 16
13
Step 2: TOSCA template – Worker nodes
Callouts on the template: maximum number of instances (from input variables); Ubuntu 16
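To make the shape of such a template concrete, the sketch below mirrors it as a Python dictionary and checks that every requirement points at a node that exists. The node names and types are the ones listed on the slides, and `cluster_ip` and `cluster_creds` are the outputs shown in Step 4. Everything else is an assumption for illustration: the input name `wn_num`, the requirement names `lrms`, `wn` and `host`, and the choice of which node hosts which.

```python
template = {
    "tosca_definitions_version": "tosca_simple_yaml_1_0",
    "topology_template": {
        # Hypothetical input controlling the maximum number of worker nodes.
        "inputs": {"wn_num": {"type": "integer", "default": 2}},
        "node_templates": {
            "elastic_cluster_front_end": {
                "type": "tosca.nodes.indigo.ElasticCluster",
                # Assumed requirement names; the slide only says two nodes are required.
                "requirements": [{"lrms": "slurm_front_end"}, {"wn": "wn_node"}],
            },
            "slurm_front_end": {
                "type": "tosca.nodes.indigo.LRMS.FrontEnd.Slurm",
                "requirements": [{"host": "slurm_server"}],
            },
            "wn_node": {
                "type": "tosca.nodes.indigo.LRMS.WorkerNode.Slurm",
                "requirements": [{"host": "slurm_wn"}],
            },
            "slurm_server": {"type": "tosca.nodes.indigo.Compute"},
            "slurm_wn": {"type": "tosca.nodes.indigo.Compute"},
        },
        "outputs": {
            "cluster_ip": {"description": "public IP of the front end"},
            "cluster_creds": {"description": "credentials for the front end"},
        },
    },
}


def unresolved_requirements(tpl: dict) -> list:
    """List (node, requirement, target) triples whose target node is missing."""
    nodes = tpl["topology_template"]["node_templates"]
    missing = []
    for name, node in nodes.items():
        for requirement in node.get("requirements", []):
            for req_name, target in requirement.items():
                if target not in nodes:
                    missing.append((name, req_name, target))
    return missing


print(unresolved_requirements(template))
```

A check like this catches a misspelled node name before the template is submitted; the real template is a YAML document, and the dictionary form is used here only to keep the example self-contained.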
14
Step 3: Infrastructure Deployment
The orchestrator exposes a REST API that can be invoked by any REST client (curl, Postman, etc.) or through orchent.
Orchent can be obtained and compiled from source (github.com/indigo-dc/orchent) or deployed as a Docker container (indigodatacloud/orchent).
Configuration: ORCHENT_TOKEN=XXXX ORCHENT_URL=XXXX
[Figure: orchent commands and their inputs and outputs. Labels: depcreate; depls; depshow; depupdate; deptemplate; depdel; resls; resshow; TOSCA; depID; resID; Status, keys; Res Info; *]
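For readers who prefer a generic REST client over orchent, the sketch below prepares a deployment-creation request without sending it. It is a sketch under stated assumptions: the `/deployments` path, the `template` and `parameters` payload keys, the `wn_num` parameter and the example base URL are not taken from the slides and should be checked against the orchestrator's API documentation.

```python
import json
import urllib.request


def build_create_request(base_url: str, token: str,
                         template_text: str, parameters: dict) -> urllib.request.Request:
    """Prepare (but do not send) a request that submits a TOSCA template."""
    payload = json.dumps({
        "template": template_text,   # assumed payload key
        "parameters": parameters,    # assumed payload key
    }).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/deployments",  # assumed endpoint path
        data=payload,
        method="POST",
    )
    request.add_header("Authorization", "Bearer " + token)
    request.add_header("Content-Type", "application/json")
    return request


# Hypothetical values standing in for ORCHENT_URL and ORCHENT_TOKEN.
req = build_create_request(
    "https://orchestrator.example.org/orchestrator/",
    "XXXX",
    "tosca_definitions_version: tosca_simple_yaml_1_0\n",
    {"wn_num": 2},
)
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing `req` to `urllib.request.urlopen`; that call is left out so the example runs without network access.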
15
Step 4: Accessing the cluster
$ orchent --url=$ORCHENT_URL depshow 174fe82a a884-3e99e8e6d8fb
Deployment [174fe82a a884-3e99e8e6d8fb]:
  status: CREATE_COMPLETE
  creation time: T10:
  update time: T11:
  ...
  outputs:
  {
    "cluster_creds": {
      "token": "-----BEGIN RSA PRIVATE KEY-----\nMIIEpAIBAAKCAQEA2FUzhwwKBpqX5RUF19O7A+fZN3BhxVh4bJSeiQverJ11+THE\nuhM/q43cc9sBEyMuSt9zOdImS66qEXBG71Shj3sji2K+GZtwRU30u21SR19dC6tk\nbWC+xvLUmpoLRUOfGmLWf+Q8kRI6uzpOzuILVhXdBoYWJUDJtJfDH51fC2x/701s\n8qjdBk1q5m7hH77NE1Ek2yopSu5eKIfcXINlV1XachpdYWlvyuKZ7YgNCVZ3eHOnUZbTFzWvZKuJq3ULaii3vr1iI87A1HXBYtNa8T2WLRMxIggFlbO/gJ0r\n/rsMviPM6pGVbvspxzuKLOGAqZo7W9ABcn9GxwIDAQABAoIBAArM9i2f5EBAJ6VA\nT3JfF88yHB...9xQl\ngFAXXjE/0bY4nKcBiXamBBougJNrjXTuhtz64lon7wzFiWrzvviu8mSq1RhAQpx1\n6LwLNhOWqA1sVimealgrQ5k/b6PHmvBi/6jkAyyjWkJ1+nX0TGUE+w==\n-----END RSA PRIVATE KEY-----\n",
      "token_type": "private_key",
      "user": "root"
    },
    "cluster_ip": " "
  }
  links:
    self [
    resources [
    template [
Callouts on the slide: Deployment Status; Private Key; Default username; Public IP
16
Step 4: Accessing the cluster (II)
Create the private key file (printf '%b' expands the \n escapes into real line breaks):
printf '%b\n' "-----BEGIN RSA PRIVATE KEY-----\n...GUE+w==\n-----END RSA PRIVATE KEY-----" > key.pem && chmod 0600 key.pem
Use the IP and username to ssh in:
ssh -i key.pem <user>@<cluster_ip>
Welcome to Ubuntu LTS (GNU/Linux generic x86_64) ...
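Copying the key by hand is error-prone because the `\n` sequences must become real line breaks. As an alternative, the sketch below reads the `outputs` JSON of a deployment and writes the key file directly. It assumes the outputs have the shape shown on the previous slide (`cluster_creds` with `token`, `token_type` and `user`, plus `cluster_ip`); the sample key and the IP address are dummy values, and the function name is hypothetical.

```python
import json
import os
import tempfile


def save_private_key(outputs_json: str, path: str) -> tuple:
    """Write the deployment's private key to path; return (user, ip)."""
    outputs = json.loads(outputs_json)  # JSON decoding turns \n into newlines
    creds = outputs["cluster_creds"]
    if creds.get("token_type") != "private_key":
        raise ValueError("expected a private_key credential")
    # Create the file readable and writable by the owner only, as ssh requires.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(creds["token"])
    os.chmod(path, 0o600)
    return creds["user"], outputs["cluster_ip"].strip()


# Dummy outputs with a fake key body and a documentation-range IP address.
sample_outputs = (
    '{"cluster_creds": {"token": "-----BEGIN RSA PRIVATE KEY-----\\n'
    'AAAA\\n-----END RSA PRIVATE KEY-----\\n", '
    '"token_type": "private_key", "user": "root"}, '
    '"cluster_ip": "192.0.2.10"}'
)
key_path = os.path.join(tempfile.mkdtemp(), "key.pem")
user, ip = save_private_key(sample_outputs, key_path)
print("ssh -i {} {}@{}".format(key_path, user, ip))
```

The printed line is the ssh command to run against the front end once real deployment outputs are used in place of the dummy ones.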
17
Recap for the Demo
You had:
An account on the IAM service.
A OneData space supported by at least one provider (not required for elastic clusters in general).
The orchent CLI.
During the demo we will:
Check the main attributes of the specific TOSCA template we created.
Self-provision the front end and two worker nodes through the orchestrator and, from the TOSCA template alone, automatically configure them to support SLURM and Docker and to mount the OneData space.
Log into the front end and check that the batch queue is properly configured.
18
Demo Video
Show the user interface.
Show how to launch the application (simplified), and where.
Show how the application is executed, if relevant.
Show the output.
19
Summary: Benefits of the INDIGO-DC solutions
Orchestrator, orchent & IM are key components in our solution:
The infrastructure configuration is coded in a standard TOSCA template, so it can be deployed in most IaaS clouds.
Through the orchent CLI, we submit an infrastructure request to the orchestrator, which interacts with the Infrastructure Manager (IM) to translate the TOSCA specification into IaaS-specific API calls.
We avoid manual installation and can handle different (even incompatible) configurations simultaneously.
Success metrics: higher availability, reduced maintenance effort, no performance penalty, higher isolation.
CLUES is a key component in our solution:
It provides the self-management of a cluster with an elastic queue.
CLUES is coded as part of the TOSCA template, so its management becomes transparent to the user, who just has to specify the proper template.
CLUES exposes a SLURM batch queue that can be easily integrated into any application.
Success metrics: no need for upfront deployment of a fixed number of nodes, which reduces costs or improves usability.
“INDIGO EuBiOsteo pilot is indeed very relevant for the interoperability with different cloud infrastructures we want to achieve with our biomarkers platform” – Angel Alberich, researcher of HUPLF and CEO of QUIBIM.
20
We are ready to share our experience!
Better Software for Better Science.