WLCG experiments in the EGI FedCloud through VAC/VCycle
Cristóvão Cordeiro, on behalf of the WLCG Resource Integration team
CERN-IT
EGI Conference 2015, 19/05/2015
Architectural model
from Pilot Jobs to Pilot VMs
Architectural model
main areas of work
- Image Management
- Capacity Management
- Monitoring
- Accounting
- Pilot Job Framework
- Data Access and Networking
- Quota Management
- Supporting Services
Image Management
providing the job environment with CernVM
- An OS via CVMFS
  - HTTP replication of a reference file system
  - CVMFS is already a requirement for deploying software
- Facilitating distributed image management
  - One common image for all environments and VOs
  - No need to build experiment software-OS-platform image combinations
- Instance configuration relies completely on contextualization
  - N different scenarios = N different contextualizations
  - But just one image!
- Using the modified version provided by EGI
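The "one image, N contextualizations" idea can be sketched as follows. This is a minimal illustration, not the actual CernVM contextualization format: the image name, the template fields and the scenario names are all hypothetical, and the CVMFS repository names are used only as examples.

```python
from string import Template

# Hypothetical identifier for the single image shared by all VOs and sites.
COMMON_IMAGE = "cernvm-common-image"

# Hypothetical user-data template; real contextualization content differs.
USER_DATA_TEMPLATE = Template(
    "vo=$vo\n"
    "site=$site\n"
    "cvmfs_repositories=$repos\n"
)

# One entry per scenario: only the contextualization parameters change.
SCENARIOS = {
    "atlas-site-a": {"vo": "atlas", "site": "site-a", "repos": "atlas.cern.ch"},
    "cms-site-b": {"vo": "cms", "site": "site-b", "repos": "cms.cern.ch"},
}


def build_request(scenario_name):
    """Return a launch request: the common image plus scenario-specific user data."""
    params = SCENARIOS[scenario_name]
    return {
        "image": COMMON_IMAGE,
        "user_data": USER_DATA_TEMPLATE.substitute(params),
    }


launch_requests = {name: build_request(name) for name in SCENARIOS}

for name, request in launch_requests.items():
    print(name, "->", request["image"])
    print(request["user_data"])
```

Adding a new scenario in this sketch means adding one entry to `SCENARIOS`; no image is rebuilt.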
Capacity Management
provisioning with VCycle
- Why VCycle?
  - Easy to set up
  - It is not VO oriented
  - It delivers capacity (scale up/down)
- Pilot VMs have defined lifecycles
  - VCycle is responsible for starting/stopping VMs and delivering user contextualization data
- Currently provisioning for 6 EGI FedCloud sites (5 ATLAS + 1 CMS)
- Managing the of resources is not so trivial though…
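The lifecycle-driven provisioning described above can be illustrated with a small planning function. This is only a sketch under stated assumptions and not VCycle's actual code: the `PilotVM` class, the `plan_cycle` function, the state names and the single `max_lifetime` limit are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PilotVM:
    name: str
    started_at: float  # start time, in seconds since some reference point
    state: str         # assumed states: "running" or "finished"


def plan_cycle(vms, target, max_lifetime, now):
    """Decide which VMs to stop and how many new ones to start in one cycle.

    A VM is stopped when it has finished its work or exceeded its lifetime.
    New VMs are started only to fill the gap up to the target capacity, so
    lowering the target scales down as existing VMs reach the end of their
    lifecycle.
    """
    to_stop = [
        vm.name
        for vm in vms
        if vm.state == "finished" or now - vm.started_at > max_lifetime
    ]
    surviving = len(vms) - len(to_stop)
    to_start = max(0, target - surviving)
    return to_stop, to_start


example_vms = [
    PilotVM("vm-a", started_at=0, state="running"),
    PilotVM("vm-b", started_at=900, state="running"),
    PilotVM("vm-c", started_at=950, state="finished"),
]
stop_list, start_count = plan_cycle(example_vms, target=3, max_lifetime=500, now=1000)
print("stop:", stop_list, "start:", start_count)
```

In a real deployment each started VM would also receive its user contextualization data at this point; that step is omitted here.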
Monitoring
basic monitoring with Ganglia
- Provides additional information for determining the VM behaviour
- Easy to deploy, enough default metrics and not VO dependent
- Acts as a reliable source of information for accounting
Accounting
consumer/provider accounting comparison
- Providers usually generate monthly "invoices"
  - For EGI, provider accounting is taken from http://accounting.egi.eu/egi.php
- Ganglia-based accounting gives us consumer accounting
  - What, where, when for VM resources
- Recording resource usage makes it possible to detect issues and inefficiencies
- By using the same metrics as the "invoice", one can compare consumer and provider accounting
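The consumer/provider comparison can be sketched as below. This is a minimal illustration, assuming both sides have already been reduced to the same metric per site (CPU-hours here); the site names, the numbers and the 10% tolerance are invented for the example and are not measured results.

```python
def compare_accounting(consumer, provider, tolerance=0.1):
    """Compare per-site usage as seen by the consumer and by the provider.

    Both arguments map a site name to usage in the same unit. A site is
    flagged when the two values differ by more than `tolerance`, relative
    to the larger of the two.
    """
    report = {}
    for site in sorted(set(consumer) | set(provider)):
        c = consumer.get(site, 0.0)
        p = provider.get(site, 0.0)
        reference = max(c, p)
        rel_diff = abs(c - p) / reference if reference else 0.0
        report[site] = {
            "consumer": c,
            "provider": p,
            "rel_diff": rel_diff,
            "flagged": rel_diff > tolerance,
        }
    return report


# Invented example values (CPU-hours), not real accounting data.
consumer_usage = {"site-a": 100.0, "site-b": 50.0}
provider_usage = {"site-a": 95.0, "site-b": 80.0}

accounting_report = compare_accounting(consumer_usage, provider_usage)
for site, row in accounting_report.items():
    print(site, row)
```

A flagged site is only a prompt for investigation: the discrepancy may come from either side's records.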
Final Considerations
data access/networking, quotas and supporting services
- Data access and networking
  - Have so far focused on non-data-intensive workloads
- Quota Management
  - Currently each provider has fixed limits
  - Investigating how to design a flexible and transparent sharing of resources between VOs and providers
- Supporting Services
  - What else is required?
  - E.g. investigating the setup of Squid caches on each provider, reducing the amount of network traffic caused by the heavy usage of CVMFS