NA61/NA49 virtualisation: status and plans
Dag Toppe Larsen
Budapest, 14.05.2012

NA61/NA49 meeting, Budapest, 14.05.2012

Outline
- Quick reminder of CernVM
- Tasks
- Each task in detail
- Roadmap
- CernVM installation
- Input needed

CernVM
- CernVM is a Linux distribution designed specifically for virtual machines (VMs)
- Based on SLC (currently SLC5)
- Compressed image size ~300 MB
- Both 32-bit and 64-bit versions
- Additional software
  - “Standard” software via the Conary package manager
  - Experiment software via CVMFS
- Contextualisation: images are adapted to experiment requirements during boot
- Data preservation: all images are permanently preserved

CVMFS
- Distributed read-only file system for CernVM (plays the same role as AFS does for LXPLUS)
- Can also be used by “real” machines (e.g. LXPLUS, grid)
- Files are compressed and distributed via HTTP
  - Global availability
  - Central server, site replication via standard HTTP proxies
- Files are decompressed and cached on the (CernVM) computer
  - Can run without Internet access if all needed files are cached
- Mainly for experiment software, but also for other “static” data (e.g. calibration data)
- Each experiment has a repository that stores all versions of its software
- Common software (e.g. ROOT) is available from the SFT repository

Data preservation
- As technology evolves, it is no longer possible to run legacy software on modern platforms
- Must be preserved and accessible:
  - Experiment data
  - Experiment software
  - Operating environment (operating system, libraries, compilers, hardware)
- Just preserving data and software is not enough
- Virtualisation may preserve the operating environment

CernVM data preservation
- “Solution”:
  - Experiment data stored on Castor
  - Experiment software versions stored on CVMFS
    - HTTP is a “lasting” technology
  - Operating environments stored as CernVM image versions
- Thus, a legacy version of CernVM can be started as a VM, running a legacy version of the experiment software
- Forward-looking approach (we start preserving now)

Tasks
- Make experiment software available
- Facilitate batch processing
- Validate outputs
- On-demand virtual clusters
- Production reconstruction
- Reference cloud cluster
- Data bookkeeping web interface

Make experiment software available
- NA61/NA49 software must be available on CVMFS for CernVM to process data
- NA61
  - Legacy software chain installed; changes to be fed back to SVN
  - Shine: experts are installing the production version on CVMFS
  - A CernVM environment has to be set up
    - Automatic install of necessary packages via Conary
    - SVN checkout should compile “out of the box”
  - May be better to use the 32-bit CernVM image
- NA49
  - Software has been installed (validation needed)

Facilitate batch processing
- LXPLUS uses the PBS batch system
- CernVM uses the Condor batch system
- “Philosophical” differences
  - PBS has one job script per job
  - Condor has a common job description file with parameters for each job
- Existing PBS scripts have been ported to Condor
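
The Condor style described above (one common job description, with parameters per job) can be illustrated with a minimal Python sketch. It is not the ported NA61 scripts: the executable name `reconstruct.sh`, the output file names and the `run` macro are assumptions chosen for illustration. Only standard submit-description keywords are used (`universe`, `executable`, `arguments`, `output`, `error`, `log`, `queue`).

```python
from typing import Iterable


def condor_submit_description(executable: str, runs: Iterable[int]) -> str:
    """Return one Condor submit description that queues one job per run.

    The settings at the top are shared by all jobs; only the user-defined
    ``run`` macro changes between the ``queue`` statements.
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",  # hypothetical wrapper script
        "arguments = $(run)",          # the run number is passed to the script
        "output = reco_$(run).out",    # hypothetical file naming
        "error = reco_$(run).err",
        "log = reco.log",
    ]
    for run in runs:
        lines.append(f"run = {run}")   # per-job parameter
        lines.append("queue")          # submit one job with the current settings
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(condor_submit_description("reconstruct.sh", [8688, 8689]))
```

With PBS, the same two runs would instead need two separate job scripts. Here a single file carries both jobs and is handed to `condor_submit` once.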

Output validation – status
- Run 8688 has been processed on both CernVM/CVMFS and LXPLUS/AFS, using software version v2r7g
- According to the analysis by Grzegorz, there are relatively large discrepancies (larger than for SLC4 -> SLC5)
- Surprising, since the software is the same, and CernVM is Scientific Linux 5 (just repackaged)
- Can there be issues with the calibration files?
- Or with some of the changes made to get the software working on CVMFS?

Output validation – plan
- Have requested a new reconstruction on LXPLUS/CVMFS
  - Will make it possible to separate the effect of CernVM/LXPLUS from that of CVMFS/AFS (three-way comparison)
- Shine is production-ready now
  - Should we “forget” the legacy chain and focus on Shine?
  - On the other hand: the NA49 reconstruction may show the same discrepancy as the NA61 legacy chain; good reason to still investigate the source
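
The logic of the three-way comparison can be sketched in Python. This is an illustration only, not the validation analysis itself: the quantity name and the numbers in the usage example are invented, and the summaries are assumed to be simple per-quantity averages. LXPLUS/AFS and LXPLUS/CVMFS differ only in the file system, while LXPLUS/CVMFS and CernVM/CVMFS differ only in the platform, so the two differences separate the two effects.

```python
from typing import Dict

Summary = Dict[str, float]  # quantity name -> summary value (e.g. a mean)


def split_discrepancy(
    lxplus_afs: Summary,
    lxplus_cvmfs: Summary,
    cernvm_cvmfs: Summary,
) -> Dict[str, Dict[str, float]]:
    """Attribute the difference in each quantity to file system or platform."""
    shared = set(lxplus_afs) & set(lxplus_cvmfs) & set(cernvm_cvmfs)
    result: Dict[str, Dict[str, float]] = {}
    for name in sorted(shared):
        result[name] = {
            # Same platform (LXPLUS), different file system (AFS vs CVMFS)
            "file_system": lxplus_cvmfs[name] - lxplus_afs[name],
            # Same file system (CVMFS), different platform (LXPLUS vs CernVM)
            "platform": cernvm_cvmfs[name] - lxplus_cvmfs[name],
        }
    return result


if __name__ == "__main__":
    # Invented numbers, for illustration only
    print(split_discrepancy(
        {"mean_tracks": 10.0},
        {"mean_tracks": 10.0},
        {"mean_tracks": 10.5},
    ))
```

If the `file_system` differences are zero and the `platform` differences are not, the discrepancy comes from CernVM rather than from the move to CVMFS, and vice versa.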

On-demand virtual clusters
- A cluster may need VMs of different configurations, depending on the type of jobs
  - Memory, CernVM version, experiment software, etc.
- Thus, there is a need for dynamic creation/destruction of virtual clusters
- Created a command-line script for creating virtual clusters
- Later to be controlled by the data bookkeeping web interface
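
A minimal Python sketch of the idea, assuming a cluster is described by one VM configuration and a VM count. It is not the command-line script mentioned above: `VMTemplate`, `plan_cluster` and the version string in the usage example are hypothetical, and the sketch only builds the list of VM requests without contacting any cloud service.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class VMTemplate:
    """Configuration shared by all VMs of one virtual cluster (hypothetical)."""
    memory_mb: int
    cernvm_version: str
    experiment_sw: str


def plan_cluster(name: str, template: VMTemplate, n_vms: int) -> List[Dict[str, object]]:
    """Expand a template into one request per VM; nothing is submitted here."""
    if n_vms < 1:
        raise ValueError("a cluster needs at least one VM")
    return [
        {
            "name": f"{name}-{index:03d}",
            "memory_mb": template.memory_mb,
            "cernvm_version": template.cernvm_version,
            "experiment_sw": template.experiment_sw,
        }
        for index in range(n_vms)
    ]


if __name__ == "__main__":
    # "x.y" is a placeholder, not a real CernVM version
    template = VMTemplate(memory_mb=2048, cernvm_version="x.y", experiment_sw="v2r7g")
    for request in plan_cluster("reco", template, 3):
        print(request)
```

Destroying the cluster would be the reverse operation over the same list of names, which is why keeping the plan as plain data is convenient.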

Production reconstruction
- Production reconstruction is the next step
- A cluster of “decent” size is needed
  - Need to submit ~50 VMs to process a large data set
- To run on LXCLOUD (experimental CERN service)
- Awaiting conclusion from the software validation step

Reference cloud cluster
- The virtual machines require a cluster of physical hosts
- A reference cloud cluster has been created
- Detailed documentation will simplify the process of replicating it at other sites
- Based on OpenNebula (popular cloud framework)
- KVM hypervisor
- Provides an Amazon EC2 interface (de facto standard for cloud management)

Data bookkeeping web interface
- A web interface for bookkeeping of the data is to be created
  - List all existing data with status (e.g. software versions used for processing)
  - Easy selection of data for (re)processing with a selected OS and software version
  - A virtual on-demand cluster is created
  - After processing, data are written back to Castor
- Either based on existing frameworks, or on new development
- Uses the EC2 interface for cloud management
  - Allows for great flexibility in the choice of processing site
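
The bookkeeping idea (list data with the software versions already used, then select data for (re)processing with a chosen OS and software version) can be sketched as a small Python data model. This is an assumption about how such records could look, not the design of the planned interface: `RunRecord`, `select_for_processing`, `mark_processed` and the version strings in the usage example are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set, Tuple

Processing = Tuple[str, str]  # (OS image version, software version)


@dataclass
class RunRecord:
    """Bookkeeping entry for one run (hypothetical structure)."""
    run: int
    processed_with: Set[Processing] = field(default_factory=set)


def select_for_processing(
    records: Iterable[RunRecord], os_version: str, sw_version: str
) -> List[int]:
    """Return the runs not yet processed with this OS/software combination."""
    wanted = (os_version, sw_version)
    return [record.run for record in records if wanted not in record.processed_with]


def mark_processed(record: RunRecord, os_version: str, sw_version: str) -> None:
    """Record that a run has been processed with this combination."""
    record.processed_with.add((os_version, sw_version))


if __name__ == "__main__":
    # Placeholder version strings, for illustration only
    records = [RunRecord(8688, {("image-A", "v2r7g")}), RunRecord(8689)]
    print(select_for_processing(records, "image-A", "v2r7g"))
```

The selected runs would then be handed to an on-demand cluster, and `mark_processed` would be called once the output has been written back.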

Bookkeeping outlook
- Most important/urgent task
  - Funding ends at the end of October
- All dependencies (software, cloud, dynamic clusters) are in place
  - Only the bookkeeping system itself is missing
- Optimistic about completion by the end of October
  - But should be wary of possible sources of delay (validation?)

Roadmap

Task | Status/done | Remaining | Expected
NA61 software installation | Legacy framework | Shine framework | End of May
NA49 software installation | Software installed | Data validation | To be determined
Facilitate batch system | OK | |
Validate outputs | Created reference data sets for validation | Understand source of discrepancies |
On-demand virtual cluster | | |
Production reconstruction | Set-up ready | Awaiting conclusion of validation discrepancies |
Reference cloud cluster | Cluster working | Documentation | End of July
Data bookkeeping web interface | Initial planning | Evaluate frameworks; “First” version; “Final” version | End of October

Next steps
- Parallel task 1
  - Understand validation discrepancies
  - Run large-scale processing on CernVM
- Parallel task 2 (critical path)
  - Data bookkeeping web interface for CernVM processing
  - Run large-scale processing using CernVM/web interface
- Transfer to NA61

CernVM for development
- CernVM makes it possible to run the production version of the legacy software/Shine on a laptop without a local install
- It is also possible to compile Shine from SVN on CernVM “out of the box” once the proper NA61 environment is set up
- It is also possible to mount the NA61 software from CVMFS directly on a laptop (but software dependencies may have to be resolved by the user)

CernVM installation on laptop
- Install a hypervisor of your choice, e.g. VirtualBox: https://www.virtualbox.org/
- Download a matching CernVM desktop image: http://cernvm.cern.ch/portal/downloads
- Open http://<ipaddress>:8004 in your web browser (user=admin, password=password)
- Select the NA61 and PH-SFT software repositories
- Reboot
- You are now ready to use NA61 software in CernVM on your laptop!
- More information: http://cernvm.cern.ch/portal/cvmconfiguration

My schedule
- I will be around for assistance with installing CernVM on laptops
- However, I plan to leave Wednesday morning, since I have a CHEP poster to print and to make sure it is brought to CHEP (I will not go to CHEP, only my poster will)
- This is “negotiable”, e.g. if there is big demand for CernVM installs, Shine installation work or important discussions

Input needed
- NA49 validation
- Shine installation
- NA61 legacy validation discrepancies
- How to practically arrange for large-scale reconstruction
- Issues related to data bookkeeping
- Please keep virtualisation (CernVM/CVMFS) in mind when making plans ...