Predrag Buncic (CERN/PH-SFT) Virtualizing LHC Applications
Workshop on adapting applications and computing services to multi-core and virtualization, CERN, June
Virtualization R&D
Provide a complete, portable and easy-to-configure user environment for developing and running LHC data analysis locally and on the Grid, independent of the physical software and hardware platform (Linux, Windows, MacOS):
- Code check-out, editing, compilation, small local tests, debugging, …
- Grid submission, data access, …
- Event displays, interactive data analysis, …
- Suspend, resume, …
Decouple the application lifecycle from the evolution of the system infrastructure.
Reduce the effort needed to install, maintain and keep up to date the experiment software.
Web site:
From Application to Appliance
Starting from experiment software… …ending with custom Linux specialised for a given task
Build types:
- Installable CD/DVD
- Stub Image
- Raw Filesystem Image
- Netboot Image
- Compressed Tar File
- Demo CD/DVD (Live CD/DVD)
- Raw Hard Disk Image
- VMware® Virtual Appliance
- VMware® ESX Server Virtual Appliance
- Microsoft® VHD Virtual Appliance
- Xen Enterprise Virtual Appliance
- Virtual Iron Virtual Appliance
- Parallels Virtual Appliance
- Amazon Machine Image
- Update CD/DVD
- Appliance Installable ISO
As easy as 1, 2, 3
1. Log in to the Web interface
2. Create a user account
3. Select experiment, appliance flavor and preferences
"Thin" Virtual Machine
- The experiments package a lot of code but really use only a fraction of it at runtime
- CernVM downloads what is needed and puts it in the cache
- Does not require a persistent network connection (offline mode)
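The download-on-demand idea can be illustrated with a minimal Python sketch. It is not the CVMFS implementation: the class `OnDemandCache`, the in-memory `REPOSITORY` and the helper functions are hypothetical names chosen for illustration, and a real client would fetch over HTTP rather than from a dictionary.

```python
import hashlib
from pathlib import Path
from typing import Callable

# Hypothetical stand-in for the remote software repository.
REPOSITORY = {
    "/sw/lib/libCore.so": b"core",
    "/sw/lib/libHist.so": b"hist",
}


def fetch_from_server(path: str) -> bytes:
    """Pretend network download (assumption: real code would use HTTP)."""
    return REPOSITORY[path]


def offline(path: str) -> bytes:
    """Simulates a missing network connection."""
    raise ConnectionError("no network")


class OnDemandCache:
    """Fetch a file only on first access, then serve it from local disk."""

    def __init__(self, cache_dir: str, fetch: Callable[[str], bytes]):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch
        self.downloads = 0  # number of files actually downloaded

    def _cache_path(self, path: str) -> Path:
        # Name cache entries by a hash of the repository path.
        return self.cache_dir / hashlib.sha1(path.encode("utf-8")).hexdigest()

    def read(self, path: str) -> bytes:
        cached = self._cache_path(path)
        if cached.exists():          # cache hit: no network needed
            return cached.read_bytes()
        data = self.fetch(path)      # cache miss: download once
        self.downloads += 1
        cached.write_bytes(data)
        return data
```

Files that were read once remain available after the fetch function is swapped for `offline`, which mirrors the offline mode mentioned on the slide; files never accessed are never downloaded.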
Publishing Releases
1. Each experiment is given a VM to install and test their software using their own installation tools
2. Publishing is an atomic operation
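The slides do not say how atomic publishing is implemented. One common way to obtain this property on a POSIX file system is to write the new catalog to a temporary file and rename it over the old one, so that readers see either the old release or the new one, never a partial state. The sketch below assumes a hypothetical `catalog.json` file and hypothetical function names.

```python
import json
import os
import tempfile
from pathlib import Path


def publish_catalog(repo_dir: str, release: str, files: list) -> None:
    """Replace catalog.json in one step via write-to-temp plus rename."""
    repo = Path(repo_dir)
    repo.mkdir(parents=True, exist_ok=True)
    # The temporary file must live in the same directory (same file system)
    # for the final rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=repo, prefix=".catalog-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump({"release": release, "files": sorted(files)}, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk first
        os.replace(tmp_name, repo / "catalog.json")  # atomic switch
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)  # do not leave partial files behind
        raise


def current_release(repo_dir: str) -> str:
    """Return the release name that clients currently see."""
    with open(Path(repo_dir) / "catalog.json") as f:
        return json.load(f)["release"]
```

A failed publish leaves the previous catalog untouched, which is the behaviour one would want when an experiment's installation step breaks halfway through.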
CernVM Infrastructure
Service Level Status
Where are our users? ~800 different IP addresses
Download statistics
Download image types
CVMFS Performance
AFS vs CVMFS
In this setup we measure the time penalty $\Delta t = t_{\mathrm{AFS,CVMFS}} - t_{\mathrm{local}}$ resulting from having the application binaries, search paths, and include paths reside on a network file system while running the ROOT stressHEPIX benchmark.
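The time penalty defined above is a plain difference of wall-clock times. A minimal sketch follows; the function name and the numbers are placeholders for illustration only and are not measurements from this study.

```python
def time_penalty(t_network_fs: float, t_local: float) -> float:
    """Extra run time (seconds) caused by reading software from a
    network file system (AFS or CVMFS) instead of the local disk."""
    return t_network_fs - t_local


# Placeholder timings in seconds, NOT benchmark results.
example_penalty = time_penalty(t_network_fs=12.5, t_local=10.0)
```

A penalty close to zero means the network file system adds almost no overhead for that run, which is what one would expect once the cache is warm.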
Results
We compare the performance of AFS (1.4.8), the current version of CVMFS, and a development version of CVMFS that includes extra optimizations.
CVMFS shows consistently better performance than AFS in the 'cold cache' case, irrespective of latency or bandwidth constraints.
Scaling up…
[Diagram: CernVM, proxy servers, HTTP servers]
Proxy and slave servers could be deployed at strategic locations to reduce latency and provide redundancy.
LAN & WAN
[Diagram: CernVM, proxy server, HTTP servers, Content Distribution Network]
WAN: Use existing Content Delivery Networks to remove the single point of failure
- Amazon CloudFront
- Coral CDN
LAN: Use a P2P-like mechanism for discovery of nearby CernVMs and cache sharing between them. No need to set up proxy servers manually (but they could still be used where they exist).
Release status
- Stable release: available for download
- Development release: available now for download from http://rbuilder.cern.ch/project/cernvm-devel/releases
- Can be run on
  - Linux (VMware Player, VirtualBox, KVM)
  - Windows (VMware Player, VirtualBox)
  - Mac (VMware Fusion, VirtualBox, Parallels)
- Appliance can be configured and used with ALICE, LHCb, ATLAS, CMS and LCD software frameworks
- Future releases will probably come in two editions
  - Basic (text development environment, suitable for ssh login, ~200MB)
  - Desktop (full desktop environment, works on VMware & VirtualBox, ~500MB)
What's next?
- Support for existing CernVM users
- Helping new ones get on board
  - LCD: Linear Collider Detector studies
  - NA61 (software and data access preservation)
- Continuing development of CVMFS
  - New version feature complete
  - Should go into production release by the end of summer
- Moving on to CernVM 2.0
  - Will be based on upstream SL5 (unlike the current version, based on RH4)
  - Release by the end of the year
- Continuing to develop tools (CoPilot) to ease deployment of CernVM in cloud-like environments
  - Nimbus, EC2
  - BOINC: platform for computing using volunteered resources
- Possibly investigating deployment on the Grid
GRID
CernVM as job hosting environment
- Ideally, users would like to run their applications on the grid (or cloud) infrastructure in exactly the same conditions in which they were developed
- CernVM already provides a development environment and can be deployed on the cloud (EC2)
  - One image supports all four LHC experiments
  - Easily extensible to other communities
Possible deployment scenario
- Hypervisor deployed on physical nodes running instances of CernVM
  - Number of instances, as well as instance parameters (# of cores, memory), under control of the experiment software admin
- Virtual machine images thin provisioned from shared storage
  - Required to be able to move VMs between physical nodes
- VMs run on a private network
  - No incoming connectivity to VMs
  - Only limited outgoing connectivity via gateway node
  - Outgoing HTTP connectivity via caching proxy
- Access to storage via tactical storage proxy
  - Equivalent to HTTP proxy for data files
[Diagram: CernVM, HTTP Proxy, Tactical Storage Proxy, Shared Storage, NAT Gateway]
Advantages
- Simple, and should be able to fulfill the needs of the experiments
  - Exactly the same environment for development and job execution
- As long as Pilot Jobs can be run in a VM, there should be no difference between this model and what currently happens on the grid
- Software can be efficiently installed using CVMFS
  - HTTP proxy assures very fast access to software even if the VM cache is cleared
- Can accommodate multi-core jobs
Addressing site concerns
This scenario should answer site concerns:
- VMs are deployed on a private network, with no incoming connectivity
  - No shortage of IP numbers
- Tactical storage proxy should provide a mechanism for VMs to efficiently access the bulk of the data files
  - Possible implementation using xrootd
- Depending on the hypervisor choice, monitoring of VMs may or may not differ from current practices
- CernVM has an integrated rAA agent to assure appliance software updates
  - However, these should be less frequent, as fewer components are installed on the system, and far less critical, as the VMs run on a private network
- All remaining network activity (beyond data access and HTTP) can be monitored and policy enforced on the gateway node
Variations…
- In the simplest case, VMs are simply started according to a predefined share per experiment
- An alternative is to deploy a VM provisioning infrastructure that instantiates VMs according to the request and specification made by an authorized person in the experiment
  - Nimbus, Open Nebula or vendor tools (vSphere…)
  - Gives more freedom to experiments to express their optional requirements in terms of memory, number of jobs
- If we trust what vendors are telling us, we can over-commit resources and let the hypervisor and/or management tools do their job in optimizing resource usage
Batch or not to batch now?
- For this to work we need a mechanism to force people to shut down their VMs once they are not in use
  - On EC2 this is simple: Amazon charges your credit card for every CPU (wall time) hour, and this is usually sufficient incentive
- Do we need an economy model for scheduling VMs?
  - Should we start thinking about accounting in terms of VM slot hours instead of real CPU hours?
  - Wouldn't it make sense to start thinking in terms of advanced slot reservations instead of sending batch jobs to the queue?
- Hypervisors can again help to some extent
  - Unused VMs can be parked and resources used by others if we over-commit them
  - They can spread and balance the load over available physical resources
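The difference between accounting in VM slot hours and in real CPU hours can be made concrete with a small sketch. It assumes, purely for illustration, that one slot corresponds to one core held for one hour of wall time; the slides do not define the unit, and the class name and figures below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VmUsage:
    cores: int          # cores reserved for the VM
    wall_hours: float   # how long the VM was kept running
    cpu_hours: float    # CPU time actually consumed, summed over cores


def slot_hours(usage: VmUsage) -> float:
    """Charge for what was reserved (assumption: 1 slot = 1 core)."""
    return usage.cores * usage.wall_hours


def cpu_efficiency(usage: VmUsage) -> float:
    """Fraction of the reserved capacity that was really used."""
    return usage.cpu_hours / slot_hours(usage)


# Illustrative numbers only: a 4-core VM left running for 10 hours
# while doing 6 CPU hours of real work.
idle_vm = VmUsage(cores=4, wall_hours=10.0, cpu_hours=6.0)
```

Under CPU-hour accounting this VM costs 6 hours, while under slot-hour accounting it costs 40, so slot-based accounting gives the same incentive to shut down unused VMs that EC2 billing does.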
Conclusions
- Lots of interest from LHC experiments and huge momentum in industry
  - ATLAS, LHCb, CMS, ALICE, LCD
- Hypervisors are nowadays available for free (Linux, Mac and Windows)
  - But management tools and support are not
- CernVM approach solves the problem of efficient software distribution
  - Using its own dedicated file system
  - One (thin) image fits all
- Initially developed as a user interface for laptop/desktop
  - Already deployable on the cloud (EC2, Nimbus)
  - Can be deployed on managed (and unmanaged) infrastructure without necessarily compromising site security
- Deployment on the grid or in the computer centre environment requires changes to some of the current practices and thinking
  - Utilize private networks to avoid shortage of IP numbers and to hide VMs from the public internet
  - Use proxies/caches wherever possible
  - Move away from traditional batch job scheduling towards advanced slot reservations for VMs, and carry out the accounting in the same terms