1
Building a distributed software environment for CDF within the ESLEA framework
V. Bartsch, M. Lancaster – University College London
2
CDF experiment
– located at Fermilab, close to Chicago
– proton/anti-proton collisions at the Tevatron at a centre-of-mass energy of 1.96 TeV
– CDF is a multipurpose detector with discovery potential for the Higgs, studies of b physics and measurements of Standard Model parameters
– integrated luminosity of about 1 fb^-1 per year
3
Principle of data analysis
– raw data: 40 MB/s, 2 TB/day
– reconstruction (assign particle momenta, tracks etc.): raw data -> reco data
– user selection: reco data -> user data
– MC: Monte Carlo simulation of the events
– user analysis: performed by ~800 physicists in ~60 institutes
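A rough cross-check of the quoted rates (a back-of-the-envelope calculation added for illustration; the duty-cycle figure is derived here, not taken from the slide):

```python
# Back-of-the-envelope check of the CDF raw-data rates quoted above.
# The 40 MB/s peak logging rate and the 2 TB/day stored volume are from the
# slide; the implied duty cycle below is only an estimate.

peak_rate_mb_s = 40                      # MB/s out of the DAQ
seconds_per_day = 24 * 60 * 60

volume_at_full_duty = peak_rate_mb_s * seconds_per_day / 1e6   # TB/day
implied_duty_cycle = 2.0 / volume_at_full_duty                 # quoted 2 TB/day

print(f"Continuous logging at 40 MB/s gives ~{volume_at_full_duty:.1f} TB/day")
print(f"2 TB/day thus corresponds to a duty cycle of ~{implied_duty_cycle:.0%}")
```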
4
CDF – data handling requirements
– the experiment has ~800 physicists, of which ~50 are in the UK
– the experiment produces large amounts of data, which are stored in the US:
  – ~1000 TB per year
  – ~2000 TB stored to date, expected to rise to 10,000 TB by 2008
– UK physicists need to be able to:
  – copy datasets (~0.5-10 TB) quickly to the UK (see the transfer-time sketch below)
  – create MC data within the UK for other UK physicists and other CDF physicists worldwide
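To make the "quickly" requirement concrete, a minimal sketch of the transfer times involved (the link speeds and the assumption of a fully utilised link are illustrative, not figures from the talk):

```python
# Illustrative transfer times for copying CDF datasets to the UK.
# Dataset sizes (0.5-10 TB) are from the slide; the link speeds and the
# assumption of a fully utilised link are hypothetical, for scale only.

def transfer_hours(size_tb: float, link_gbps: float) -> float:
    """Hours to move size_tb terabytes over a link of link_gbps gigabit/s."""
    size_bits = size_tb * 1e12 * 8           # TB -> bits (decimal TB)
    return size_bits / (link_gbps * 1e9) / 3600

for size_tb in (0.5, 10.0):
    for link_gbps in (0.1, 1.0, 10.0):       # 100 Mb/s, 1 Gb/s, 10 Gb/s lightpath
        print(f"{size_tb:5.1f} TB over {link_gbps:4.1f} Gb/s: "
              f"{transfer_hours(size_tb, link_gbps):7.1f} h")
```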
5
Data handling numbers
– CDF nowadays produces ~1 PB/year, expected to rise to 10 PB by 2008
– Fermilab alone is serving about 18 TB/day

Raw data             590 TB
Reconstructed data   660 TB
MC                   280 TB
Total               1530 TB

[Plot: bytes read (TBytes)]
6
CDF batch computing
– 2 types of activities:
  – organized processing: raw data reconstruction, data reduction for different physics groups, MC production
  – user analysis: need to be able to copy datasets (0.5-10 TB)
– both use large amounts of CPU
– the same tools are used for all
7
CDF Grid philosophy
– CDF adopted Grid concepts quite late in its run time, when it already had mature software
– the look & feel of the old data handling system is maintained
– reliability is the main issue
– use the existing infrastructure as a portal and change the software underneath
8
CDF Analysis Farm (CAF)
– submit and forget until receiving a mail
– does all the job handling and negotiation with the data handling system without the user knowing
– a CDF batch job contains a tarball with all the needed scripts, binaries and shared libraries; the output tarball is sent to the output location (see the sketch below)
– users need to authenticate with their Kerberos ticket
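A minimal sketch of such a self-contained job payload (the file names, the tar layout and the helper functions are hypothetical illustrations, not the actual CAF submission interface):

```python
# Toy illustration of a self-contained CAF-style job payload: everything the
# job needs (scripts, binaries, shared libraries) travels in one tarball, and
# the output comes back as a tarball as well. All paths here are hypothetical.

import tarfile
import subprocess
from pathlib import Path

def build_job_tarball(workdir: Path, tarball: Path) -> None:
    """Pack the user's analysis directory into a single job tarball."""
    with tarfile.open(tarball, "w:gz") as tar:
        tar.add(workdir, arcname=".")

def run_job(tarball: Path, scratch: Path, output_tarball: Path) -> None:
    """What a worker node would roughly do: unpack, run, pack the results."""
    scratch.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball, "r:gz") as tar:
        tar.extractall(scratch)
    # The job's entry point is assumed to be an executable script in the tarball.
    subprocess.run(["./run_analysis.sh"], cwd=scratch, check=True)
    with tarfile.open(output_tarball, "w:gz") as tar:
        tar.add(scratch / "output", arcname="output")

if __name__ == "__main__":
    build_job_tarball(Path("my_analysis"), Path("job.tar.gz"))
    # On the farm: run_job(Path("job.tar.gz"), Path("scratch"), Path("out.tar.gz"))
```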
9
CAF – evolution over time
– CDF has used several batch systems and distribution mechanisms:
  – FBSNG
  – Condor
  – Condor with Globus
  – gLite WMS
– the CAF became distributable and able to run on non-dedicated resources
– the gLite WMS helps to run on EGEE sites
– the Grid-based CAFs are used as production systems
10
Condor-based Grid CAF (pull model)
– components: Collector, Negotiator (user priorities), Schedd (user jobs), Globus, Starter on the Grid nodes, glide-ins
– the Negotiator assigns nodes to jobs
– Globus assigns nodes to VOs
– glide-ins pull user jobs onto the Grid nodes (pull model)
11
gLite WMS-based Grid CAF (push model)
– components: Schedd (user jobs), Resource Broker, Globus, Grid nodes
– user jobs are pushed from the Schedd through the Resource Broker and Globus to the Grid nodes (push model)
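A toy sketch contrasting the two scheduling models on the previous two slides (the queue, sites and matching logic are invented for illustration; this is not Condor or gLite code):

```python
# Toy contrast of the two scheduling models:
#  - pull model (Condor glide-ins): a worker that has secured a batch slot
#    asks a central queue for the next job;
#  - push model (gLite WMS): a broker decides up front which site gets each job.
# Everything here (queue, sites, round-robin "brokering") is an invented illustration.

from collections import deque

jobs = deque([{"id": i} for i in range(4)])

def pull_model(workers):
    """Each glide-in-like worker pulls work once it is running on a Grid node."""
    for worker in workers:
        if jobs:
            job = jobs.popleft()                       # selection happens late,
            print(f"{worker} pulled job {job['id']}")  # after the slot is secured

def push_model(jobs_to_send, sites):
    """A broker-like component assigns each job to a site before it runs."""
    for i, job in enumerate(jobs_to_send):
        site = sites[i % len(sites)]                   # trivial round-robin choice
        print(f"job {job['id']} pushed to {site}")

pull_model(["glidein-1", "glidein-2"])
push_model([{"id": 10}, {"id": 11}, {"id": 12}], ["site-A", "site-B"])
```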
12
Pros & Cons Condor based Grid CAF Pros: Globally managed user and job priorities within CDF Broken nodes kill condor daemons, not user jobs Resource selection done after a batch slot is secured Cons: Uses a single service proxy for all jobs to enter Grid sites Requires outgoing connectivity gLite WMS-based Grid CAF Pros: LCG-backed tools No need for external connectivity Grid sites can manage users Cons: No global fair share for CDF
13
gLite WMS-based Grid CAF at FNAL
– CAF worker nodes used to have the CDF software distribution NFS-mounted, but this is not an option in the Grid world
– all production jobs are now self-contained
– trying Parrot to distribute the CDF software over HTTP in analysis jobs
14
Some numbers
[Plots: average usable VMs (virtual machines) and number of jobs on the CAF, for FNAL, remote dedicated resources, the Condor-based Grid CAFs, LCGCaf and FermiGrid]
15
Data handling system SAM
– SAM manages file storage
  – data files are stored in tape systems at FNAL and elsewhere (most use ENSTORE at FNAL)
  – files are cached around the world for fast access
– SAM manages file delivery
  – users at FNAL and remote sites retrieve files transparently out of file storage; SAM handles caching for efficiency
– SAM manages file cataloging
  – the SAM DB holds meta-data for each file, transparent to the user
– SAM manages analysis bookkeeping
  – SAM remembers which files you ran over, which files you processed, which applications you ran, and when and where you ran them
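A toy illustration of the kind of cataloging and bookkeeping SAM provides (the record fields and functions are invented for illustration; this is not the actual SAM schema or API):

```python
# Toy model of SAM-style metadata cataloging and analysis bookkeeping.
# The field names and catalog structure are invented; the real SAM database
# and client interface are not reproduced here.

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class FileRecord:
    name: str
    size_bytes: int
    data_tier: str                                   # e.g. "raw", "reco", "mc"
    locations: list = field(default_factory=list)    # tape plus worldwide caches

@dataclass
class ProcessRecord:
    application: str
    started: datetime
    station: str
    files_consumed: list = field(default_factory=list)

catalog = {}        # file name -> FileRecord   (metadata catalog)
history = []        # list of ProcessRecord     (analysis bookkeeping)

def declare_file(rec: FileRecord) -> None:
    catalog[rec.name] = rec

def consume(process: ProcessRecord, file_name: str) -> FileRecord:
    """Record that an application read a file, as SAM-style bookkeeping would."""
    rec = catalog[file_name]
    process.files_consumed.append(file_name)
    return rec

declare_file(FileRecord("bphys_0001.root", 2_000_000_000, "reco",
                        ["enstore:/fnal", "cache:ucl"]))
job = ProcessRecord("user_analysis_v1", datetime.now(), "ucl-hep")
consume(job, "bphys_0001.root")
history.append(job)
print(job)
```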
16
Worldwide distribution of SAM stations
– FNAL CDF: 10k/20k files declared per day, 15k files consumed per day, 8 TB of files consumed per day
– the main consumption of data is still central; remote use is on the rise
[Map: selected SAM stations, including test deployments]
[Plot: total CDF files delivered to users (~300 TB)]
17
summary & outlook UCL-HEP cluster deployed UCL-CCC cluster still to come need a better integration of SAM and the CAF user feedback needs to be collated