Claudia+OpenNebula: Driving Cloud Services into the Cloud
Juan A. Cáceres (caceres@tid.es)
StratusLab Kick-off meeting, Orsay, 14-15 June 2010
Telefónica I+D
Index
01 The RESERVOIR IaaS Model
02 SGE Use Case
03 The Claudia Platform
01 The RESERVOIR Model
RESERVOIR Cloud Reference Architecture
  - Service Providers: submit the Service Manifest (OVF+) through the SMI (TCloud)
  - Service Manager: Claudia
  - VMI (TCloud/OCCI): interface between the Service Manager and the VEEM
  - VEEM (VEE Manager): OpenNebula
  - VHI: interface between the VEEM and the VEE Hosts
  - VEE Hosts (hypervisor, VSJC)
  - RESERVOIR Site: hosts the Service Manager, the VEEMs and the VEE Hosts
RESERVOIR's Service deployment model
Service Manifest (OVF) = Logical Architecture + Service Elasticity Rules + SLA Definition + Deployment Directives
  - Logical Architecture (diagram): components C1 to C5 with instance cardinalities (*, 1, 1-2), connected to the Internet
  - Service Elasticity Rules:
      C1 (2 CPU, 1 GB, 10 GB disk)
      Load(C3) = 3 * Load(C1)
      CPU(C1) = Users(C1) / 1000
      Replicas(C1) = RequestPerSecond(C1) / 500
      …
  - SLA Definition:
      SLA(C1) = Gold
      SLA(C2) = Bronze
      Users(C1) = 1000
      …
  - Deployment Directives:
      Deploy(C11) = {Domain1, Domain 3, Domain z}
      SLA(RED) = GOLD
      CPU(C11) = 2
      SPEED(RED) = 5MBS
VEEs deployed across Reservoir Site 1 and Reservoir Site 2 (SLA levels shown: Gold, Silver, Platinum):
  - C11: 2 CPU, 1 GB memory, 10 GB disk
  - C12: 1 CPU, 0.5 GB memory, 5 GB disk
  - C2: 1 CPU, 1 GB memory, 8 GB disk
  - C3: 4 CPU, 4 GB memory, 50 GB disk
  - C4: 2 CPU, 1 GB memory, 10 GB disk
  - C5: 10 CPU, 6 GB memory, 100 GB disk
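The elasticity rules on this slide are plain arithmetic over service KPIs, so they can be illustrated with a few lines of Python. This is only a sketch: the function names are hypothetical, and rounding up to a whole unit with a minimum of one is an assumption the slide does not state.

```python
import math


def cpus_for_users(users: int, users_per_cpu: int = 1000) -> int:
    """CPU(C1) = Users(C1) / 1000, rounded up (assumption), never below 1."""
    return max(1, math.ceil(users / users_per_cpu))


def replicas_for_load(requests_per_second: float, rps_per_replica: int = 500) -> int:
    """Replicas(C1) = RequestPerSecond(C1) / 500, rounded up (assumption)."""
    return max(1, math.ceil(requests_per_second / rps_per_replica))


def load_c3(load_c1: float) -> float:
    """Load(C3) = 3 * Load(C1)."""
    return 3 * load_c1
```

With the slide's value Users(C1) = 1000, `cpus_for_users(1000)` gives a single CPU; a load of 1200 requests per second would ask for three replicas of C1 under the rounding assumption.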
02 The SGE Use Case
Dynamic Scalability of the SGE Cluster
Claudia deploys the SGE Cluster and manages the dynamic scalability of worker nodes
  - KPI = pending job queue size in the Master node
  - Elasticity Rules:
      if queue size / worker nodes > 20 then createReplica(WorkerVEE)
      if queue size / worker nodes < 15 then removeReplica(WorkerVEE)
OpenNebula allocates/de-allocates on-demand SGE Cluster nodes and VLAN connections
Diagram labels: SGE Clients, Jobs, SGE Master, SGE Worker1, SGE Worker2, … SGE Worker N, SCALE
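The two elasticity rules can be sketched as a small decision function. This is an illustration only, not Claudia's rule engine: the function name, the returned action strings and the `min_workers` floor are hypothetical, and the lower threshold is assumed to trigger the removal of a worker. The gap between 15 and 20 acts as a hysteresis band that keeps the cluster from oscillating.

```python
SCALE_UP_THRESHOLD = 20    # pending jobs per worker above which a worker is added
SCALE_DOWN_THRESHOLD = 15  # pending jobs per worker below which a worker is removed


def decide_scaling(queue_size: int, worker_nodes: int, min_workers: int = 1) -> str:
    """Return 'scale_up', 'scale_down' or 'hold' for the SGE worker pool."""
    if worker_nodes < 1:
        raise ValueError("at least one worker node is required")
    jobs_per_worker = queue_size / worker_nodes
    if jobs_per_worker > SCALE_UP_THRESHOLD:
        return "scale_up"
    # Never shrink below the floor (hypothetical safeguard, not on the slide).
    if jobs_per_worker < SCALE_DOWN_THRESHOLD and worker_nodes > min_workers:
        return "scale_down"
    return "hold"
```

For example, 45 pending jobs on 2 workers (22.5 per worker) yields `scale_up`, while 36 jobs on 2 workers (18 per worker) falls inside the band and yields `hold`.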
Scalability example (animation sequence)
  1. Initial state: SGE Clients, SGE Master, SGE Worker1
  2. SCALE UP: SGE Worker 2 is added (SGE Clients, SGE Master, SGE Worker1, SGE Worker 2)
  3. SCALE DOWN: the cluster returns to SGE Clients, SGE Master, SGE Worker1
03 The Claudia Platform
Claudia Architecture
  - Cloud Dashboard (EzWeb GUI)
  - TCloud API (REST)
  - Claudia components: Monitoring (WASUP), Service Lifecycle Manager, Scalability & Optimization, Business Model Manager
  - Federation/Interoperability (TCloud, OCCI, EC2, vCloud, …)
  - Public Infrastructure Cloud (Amazon, Flexiscale, GoGrid …)
  - Private Virtual Infrastructure Manager (OpenNebula)
Key Functionalities & Components
Deployment and scalability control of services in an IaaS Cloud
  - OVF-based service definition (OVF Manager component)
  - Multi-tier service architecture
  - Required virtual resources specification (VMs, VANs, storage, …)
  - Elasticity rules
  - SLA restrictions
  - Deployment directives
IaaS Cloud API
  - Reference implementation of the TCloud API (extending VMware's vCloud, submitted to DMTF)
  - OVF-based service and virtual resources definition
  - Operations for provisioning, managing and monitoring services
  - OCCI compliant
IaaS Cloud Dashboard
  - User management
  - Monitoring and control GUI
  - Based on the EzWeb mashup platform
Monitoring Service
  - Implementation of the TCloud monitoring API
  - Event registry
  - Event aggregation
  - Alarm generation
  - Based on the EzWeb/WASUP platform
Key Functionalities & Components (II)
Service Lifecycle Manager
  - Deployment, scalability control and un-deployment of services
  - Dynamic Service Lifecycle Management (available in Q4 2010)
  - Extended lifecycle for non-running lifecycle phases (e.g. maintenance, development)
Scalability & Optimization
  - Elasticity Rules: SLA protection
  - Business Rules: cost control
  - Automatic discovery of Elasticity Rules (available in Q4 2010)
Billing Engine
  - Generation of bills based on the actual usage of resources
  - User-specific billing rules
  - New version available in Q1 2011
Business Model
  - CIM-based catalogue of resources and costs (available in Q4 2010)
Interoperability & Federation
  - Integration through OCCI, TCloud or specific drivers with different local infrastructure managers (OpenNebula, Eucalyptus, VMware vSphere (on the roadmap), …) and public clouds (Amazon, GoGrid, …)