Virtualisation for NA49/NA61
Dag Toppe Larsen, UiB/CERN, Zagreb
Outline
- Recapitulation
- Why virtualisation?
- CERNVM
- CVMFS
- Reference cloud
- NA49/NA61 data processing
- Status
- Next steps
- Outlook
Why virtualisation?
- Data preservation
- Very flexible
- Avoids the complexity of the grid
- Easy software distribution
- Processing not constrained to CERN
- Takes advantage of the new LXCLOUD
- Takes advantage of commercial clouds, e.g. Amazon EC2
- Development can happen in the same VM the data will be processed on, which should reduce failing jobs
CERNVM: introduction
- Dedicated Linux distribution for virtual machines
- Currently based on SLC5
- Newer, updated versions will be made available for new software
- Old versions will remain available for legacy analysis software
- Supports all common hypervisors
- Supports Amazon EC2 clouds
CERNVM: layout
CVMFS: introduction
- Distributed file system based on HTTP
- Read-only
- Distributes binary files, so there is no need to compile and install locally
- All libraries and software that cannot be expected on a "standard" Linux system should be distributed this way
- Each experiment has one or more persons responsible for providing updates and resolving dependencies
CVMFS: software repositories
- Several repositories mounted under /cvmfs/
- Each repository typically corresponds to one "experiment" (or other "entity")
- Experiments have "localised" names, e.g. /cvmfs/na61.cern.ch/
- Common software lives in separate repositories, e.g. ROOT in /cvmfs/sft.cern.ch/
- Several versions of software may be distributed in parallel; the user chooses which version to run
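The parallel-versions point can be sketched in shell. The repository layout below is mocked in a temporary directory (the real tree under /cvmfs/na61.cern.ch/ may be organised differently); the idea is that a job selects a version simply by path:

```shell
# Illustrative sketch: CVMFS publishes several software versions side by
# side, and the user picks one by path. The layout here is a local mock;
# directory names under the repository are assumptions, not the real tree.
repo=$(mktemp -d)/na61.cern.ch
mkdir -p "$repo/software/v1.0/bin" "$repo/software/v1.1/bin"

# A job pins the version it was validated against:
NA61_VERSION=v1.0
export PATH="$repo/software/$NA61_VERSION/bin:$PATH"
echo "using $repo/software/$NA61_VERSION"
```

Because all versions stay published, an old analysis can keep running against the exact version it was validated with while newer jobs use the latest one.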
Reference cloud: introduction
- Small CERNVM reference private cloud
- Condor batch system
- OpenNebula management
- Amazon EC2 interface
- Serves as a reference installation for other clouds
- Detailed, simple step-by-step instructions for replication at other sites will be provided
- Attempt to make installations "uniform"
- Site customisation remains possible for monitoring, etc.
Reference cloud: OpenNebula framework
- Popular framework for managing virtual machines
- Supports the most common hypervisors
- Our choice: KVM/QEMU, which is fast and does not require modifications to the guest OS
- Amazon EC2 interface
- Can include VMs from other clouds, and provide hosts to other clouds
- Web management interface
Reference cloud: Amazon EC2 interface
- EC2 is the commercial cloud offered by Amazon
- EC2 also defines an interface for managing VMs
- This has become the de-facto interface for all clouds, including private ones
- Using the EC2 interface therefore allows great flexibility in launching VMs on both private and commercial clouds
Reference cloud: public vs. private clouds
Reference cloud: Elasticfox web user interface
- VM management through the browser
- Can configure/start/stop VM instances and add/remove VM images
- Works through the Amazon EC2 interface
- A similar interface is needed for data processing
NA49/NA61 processing: status
- CVMFS software installation:
  - Software for NA61 installed
  - Issues with some set-up file options?
  - Can also be used for processing on LXPLUS/LXBATCH
  - No need to adapt scripts (except for the environment)
  - NA49 software in progress
- Processing on CERNVM:
  - Currently, reconstruction can be run "by hand"
  - A batch system exists; scripts are being adapted
NA49/NA61 processing: NA61 CVMFS installation
- Available under /cvmfs/na61.cern.ch/
  - On CERNVM virtual machines
  - On "ordinary" computers with CVMFS installed, including LXPLUS/LXBATCH
- Script to set up the environment:
  . /cvmfs/na61.cern.ch/library/etc/na61_env.sh
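A job wrapper would source that script before starting reconstruction. A minimal sketch, assuming CVMFS may or may not be mounted on the node (the guard keeps the wrapper usable on machines without it; the wrapper name and everything after the sourcing are illustrative):

```shell
# Sketch of a job wrapper: load the NA61 environment from CVMFS before
# running reconstruction. Only the env-script path comes from the slides;
# the guard and messages are illustrative additions.
ENV_SCRIPT=/cvmfs/na61.cern.ch/library/etc/na61_env.sh

if [ -f "$ENV_SCRIPT" ]; then
    . "$ENV_SCRIPT"    # sets up PATH, libraries, etc. for the NA61 software
    echo "NA61 environment loaded"
else
    echo "CVMFS not mounted: $ENV_SCRIPT not found" >&2
fi
```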
NA49/NA61 processing: next steps
- Two main tasks:
  - Software validation
  - CERNVM processing set-up
- These can be done largely in parallel
- Suggested steps on the following slides:
  - Use LXBATCH for initial software validation
  - Then validate on CERNVM
  - Then set up the production system
NA49/NA61 processing: next steps
- Step 1a: software validation on LXBATCH
  - Select a reference data set already processed on LXBATCH using the software on AFS
  - Reprocess the data on LXBATCH, but using the software on CVMFS (instead of AFS)
  - Compare the output from the CVMFS and AFS software
  - Correct any problems
- This decouples issues in the CVMFS software installation from the CERNVM set-up
- Ready to start this step now
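The comparison at the heart of step 1a can be sketched as below. The file names are placeholders standing in for the real reconstruction output of the two runs:

```shell
# Sketch of the step-1a check: the same reference data reprocessed with
# the AFS and CVMFS software stacks should produce identical output.
# Here two placeholder files stand in for the real reconstruction output.
out_afs=$(mktemp)
out_cvmfs=$(mktemp)
echo "run-1234 reconstructed" > "$out_afs"
echo "run-1234 reconstructed" > "$out_cvmfs"

if cmp -s "$out_afs" "$out_cvmfs"; then
    echo "outputs identical: CVMFS installation validated"
else
    echo "outputs differ: investigate the CVMFS installation" >&2
fi
```

In practice a byte-level cmp may be too strict for ROOT files (timestamps, etc.); a physics-level comparison of the reconstructed quantities would then replace it.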
NA49/NA61 processing: next steps
- Step 1b: CERNVM set-up; convert all processing scripts
  - LXPLUS/LXBATCH uses PBS, CERNVM uses Condor
  - Remove AFS references
  - Castor Kerberos authentication (distribute a kinit keytab file?)
- In progress; soon ready for reconstruction
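Converting from PBS to Condor mostly means replacing job scripts with Condor submit descriptions. A minimal sketch, with illustrative executable and file names:

```shell
# Sketch: a minimal Condor submit description replacing an old PBS job
# script. run_reco.sh and the run argument are illustrative names.
cat > reco.sub <<'EOF'
universe   = vanilla
executable = run_reco.sh
arguments  = run-1234
output     = reco.out
error      = reco.err
log        = reco.log
queue
EOF

# On a CERNVM head node the job would then be submitted with:
# condor_submit reco.sub
```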
NA49/NA61 processing: next steps
- Step 2a: software validation on LXBATCH
  - Use the CVMFS software for "normal" processing on LXBATCH (instead of AFS)
- Step 2b: CERNVM set-up
  - Select a reference data set already processed on LXBATCH using the software on AFS
  - Reprocess the same data using CERNVM on the test/reference cloud, using the software on CVMFS
  - Compare the output from CERNVM and LXBATCH
  - Correct any problems; CVMFS issues should already have been found in step 1a
NA49/NA61 processing: next steps
- Step 3: production processing on LXCLOUD
  - LXCLOUD is the new cloud service offered by CERN IT
  - Experience from the reference cloud is directly transferable
  - If needed, processing facilities based on the reference cloud can be set up at sites other than CERN
NA49/NA61 processing: what is needed
- To successfully adapt processing to CERNVM, some input is needed:
  - An overview of (all) processing types performed on LXBATCH
    - Scripts in use
    - How to set up/configure them
    - Also analysis (not only reconstruction)?
  - Reference data sets to compare output against
  - Who is responsible for the various (software) components, and who can/wants to participate in carrying out the steps on the previous slides?
NA49/NA61 processing: web user interface
- A web user interface for managing VM instances/images exists
- Needed: a processing-centred web user interface
  - What data to process and what type of analysis
  - List of available data and status; request processing
  - Specify versions of software and VM
  - Specify requirements for processing nodes
- Both VM and processing management in the same interface, or two separate interfaces?
- A generic experiment interface?
NA49/NA61 processing: web user interface
- "Step 4": CERNVM processing does not depend on it, but it would much improve the user experience
- Considering extending an existing VM-management tool to also manage data/processing
Status/outlook
- Reference cloud up and running
- NA61 software available on CVMFS
- NA49 software soon available on CVMFS
- Software validation ready to begin
- Data processing on CERNVM: currently by hand, via the batch system soon
- Needed: a better understanding of the different processing tasks and who is responsible for what
- Needed: a processing-centred web interface
Backup
Data preservation: motivation
- Preserve the historic record
- Even after an experiment's end-of-life, data reprocessing might be desirable if future experiments reach incompatible results
- Many past experiments have already lost this possibility
Data preservation: challenges
- Two parts: data & software
- Data: preserved by migration to newer storage technologies
- Software: more complicated
  - Just preserving source code/binaries is not enough
  - Software is strongly coupled to the OS/library/compiler version (the software environment)
  - The software environment is strongly coupled to hardware, and the platform will eventually become unavailable
  - Porting to a new platform requires a big effort
Data preservation: solution
- Possible solution: virtualisation
- "Freeze" the hardware in software
- Legacy analysis software can run in VMs on the legacy Linux versions it was originally developed for
- The software environment is preserved; no need to modify code
- Comes for free if processing is already done on VMs
CERNVM: use cases
- Two main use cases:
  - Computing centre
    - Images for head and batch nodes
    - Includes the Condor batch system
  - Personal computers
    - Desktop (GUI) and basic (CL) images
    - "Personal" use
- Code can be developed (desktop image) in a similar environment/platform to the one it will be processed on (batch node image)
CERNVM: contextualisation
- All CERNVM instances are initially identical
- Experiment-specific software configuration/set-up is introduced via contextualisation
- Two types:
  - CD-ROM image: mainly site-specific configuration
  - EC2 user data: mainly experiment-specific configuration
- Executed during start-up of the VM
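As a sketch of the EC2 user-data route: the user data is a script the instance runs once at first boot. Here it enables the NA61 CVMFS repository via the standard CVMFS client configuration file; the file name user-data.sh and the exact contextualisation steps are illustrative:

```shell
# Sketch of EC2 user data contextualising a CERNVM instance for NA61.
# The script is handed to the instance at launch and executed at first
# boot; the single configuration line shown here is illustrative.
cat > user-data.sh <<'EOF'
#!/bin/sh
# Executed by the instance at first boot:
# enable the NA61 repository in the CVMFS client configuration
echo 'CVMFS_REPOSITORIES=na61.cern.ch' >> /etc/cvmfs/default.local
EOF
```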
CVMFS: design
- Compressed files on an HTTP server
- Downloaded, decompressed and cached locally on first use
- Possible to run the software without an Internet connection
- A hierarchy of standard HTTP proxy servers distributes the load
- Can also be used by non-VMs, e.g. LXPLUS/LXBATCH, other clusters, personal laptops
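The download-once, cache-locally behaviour can be illustrated with a toy model. The echo stands in for the real HTTP download and decompression, and the file name is made up; real CVMFS does this transparently per file:

```shell
# Toy model of the CVMFS access pattern: a file is fetched on first
# access and served from the local cache afterwards, which is also why
# the software keeps working offline. Names here are illustrative.
cache=$(mktemp -d)

fetch() {
    f="$cache/$1"
    if [ ! -f "$f" ]; then
        echo "cache miss: downloading $1"
        echo "contents of $1" > "$f"   # stands in for: download + decompress
    else
        echo "cache hit: $1"
    fi
}

fetch libreco.so   # first use: fetched over HTTP
fetch libreco.so   # later uses: local cache, no network needed
```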
Reference cloud: virtual distributed Condor cluster
- Based on VMs in the cloud
- Can be distributed over several sites
- Even if nodes are at different sites, they appear to be in the same cluster
- A tier 1 can include VMs provided by tier 2s in its virtual Condor cluster
  - This can save much work, as the tier 2s do not need to set up job management themselves
- Other possibility: a local CERNVM batch system running local jobs (like a normal cluster)