The CMS CERN Analysis Facility (CAF)

Peter Kreuzer (RWTH Aachen), Stephen Gowdy (CERN), Jose Afonso Sanches (UERJ, Brazil), on behalf of the CMS Offline & Computing project
CHEP'09 Conference, Prague (Czech Republic), March 2009

CAF in the Prompt Data Flow

The CAF is dedicated to high-priority workflows and is restricted to high-priority users; it is not usable by every CMS user. Its main use cases are:
- Alignment and calibration
- Trigger/detector diagnostics, monitoring and performance analysis
- Physics monitoring, analysis of express streams, and fast-turnaround high-priority analysis

Express reconstruction is run on O(10%) of the data with fast turnaround, followed by the full reconstruction pass after 24 h. AlCaReco datasets (data for alignment and calibration) reach the CAF as part of the prompt data flow.

[Diagram: prompt data flow from the CMS Detector through the HLT + Storage Manager to the Tier-0 store and on to the CAF, Tier-1 and Tier-2 sites, with monitoring from the CMS Centre; indicated link rates of 600 MB/s and 450 MB/s.]

Computing Infrastructure

- 700 cores accessed via LSF, mostly 8-core worker nodes with 2 GB of memory per core
- Multiple batch queues with fair share and priorities: an express queue for extreme-priority work, slots with 4 GB/job for memory-intensive jobs, and a special batch queue allowing dedicated interactive sessions (up to 20 cores per user); an example batch submission is sketched at the end of this note
- 1.2 PB of disk-only CASTOR storage (216 nodes) with manual space management, a dedicated 50 TB CASTOR user pool under commissioning, and 2 TB of AFS group space
- User list managed by the stakeholders via a web interface
- The standard CMS distributed-analysis tool is available for CAF usage

Resource Ramp-up 2008

- Ramp-up and commissioning in Spring 08
- Cosmic data taking in Fall 08
- A factor x1.8 of additional CPU in 2009

CAF Pool Data Transfers

The CAF can receive transfers from the Tier-0 and from Tier-1 sites. During the Fall 08 run the average input rate was 112 MB/s, and peak disk-to-disk input rates regularly exceeded 2.5 GB/s sustained over one hour (maximum rate in: 3.5 GB/s; plateau rate out: ~2 GB/s).

[Figures: peak and average data transfer rates into the CAF pool, in MB/s, versus time.]

CAF Utilisation

- cmscaf LSF queue: 635 job slots; during Fall 08 data taking the maximum of 635 running jobs was reached, with an average job-slot usage of 67%
- Disk-only CASTOR pool for fast data access; dynamic disk-space monitoring and alarming is needed (see the sketch at the end of this note)
- Data deletions are triggered by the CAF Data Managers using the central CMS Data Management tools

[Figures: running/pending jobs in the batch queue, job slots used per month in 2008, and free space on the CAF CASTOR pool in TB.]

CAF Jobs and Users 2008

- More than 500k jobs/month reached during Fall 08 data taking
- 268 users during 2008; today nearly 300 active CAF users
- Monitoring and controlling user activity is non-trivial

[Figures: number of CAF users and number of CAF jobs per month in 2008.]

An Example of a CAF Workflow: CMS Tracker Alignment

In Fall 2008 CMS ran for 4 weeks continuously and acquired ~300M cosmic events with the magnetic field at B = 3.8 T, a good opportunity to test CAF workflows. The track-based Tracker alignment runs on the CAF in two steps:

- Step 1: track-level analysis. The Primary Alignment Producer runs over the AlCaReco data with the misaligned geometry, parallelised across many CPUs, and writes condensed track data (track-by-track matrix elements).
- Step 2: global Millepede fit of the alignment parameters, producing the alignment constants for the aligned geometry. A dedicated Millepede server supports this memory-intensive fit.

The least-squares problem behind the two steps is sketched below.

Alignment & Calibration Results

[Figure: example result, mean of the residual distributions in the Tracker.]
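The two-step structure above is the standard Millepede approach: a simultaneous least-squares fit of local (per-track) and global (alignment) parameters in which the local parameters are eliminated analytically, so that only the system for the alignment parameters has to be solved. The notation below is not taken from the poster; it is a generic sketch of that formulation.

% Schematic Millepede formulation (generic notation, not from the poster).
% p : global alignment parameters, q_j : local parameters of track j,
% m_ij, sigma_ij : hit measurements and uncertainties, f_ij : track model.
\[
  \chi^2(\mathbf{p},\mathbf{q}_1,\dots,\mathbf{q}_N)
  = \sum_{j=1}^{N} \sum_{i \in \mathrm{hits}(j)}
    \frac{\bigl(m_{ij} - f_{ij}(\mathbf{p},\mathbf{q}_j)\bigr)^2}{\sigma_{ij}^2}
\]
% Linearising f_ij and minimising gives block-structured normal equations.
% Eliminating each track's local block (Schur complement) leaves a reduced
% system for the alignment corrections alone:
\[
  \mathbf{C}'\,\Delta\mathbf{p} = \mathbf{b}', \qquad
  \mathbf{C}' = \sum_{j}\Bigl(\mathbf{C}_j - \mathbf{G}_j\,\boldsymbol{\Gamma}_j^{-1}\,\mathbf{G}_j^{\mathsf T}\Bigr), \quad
  \mathbf{b}' = \sum_{j}\Bigl(\mathbf{b}_j - \mathbf{G}_j\,\boldsymbol{\Gamma}_j^{-1}\,\boldsymbol{\beta}_j\Bigr)
\]
% where C_j, Gamma_j and G_j are the global-global, local-local and
% global-local blocks of track j's normal equations, and b_j, beta_j the
% corresponding right-hand sides.

Step 1 (the Primary Alignment Producer jobs) computes the per-track blocks and therefore parallelises naturally across many CAF CPUs; Step 2 sums them and solves the reduced system, whose dimension is the full number of alignment parameters, which is why it runs as a single memory-intensive job on the dedicated Millepede server.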
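As an illustration of CAF batch usage, the following is a minimal sketch of submitting a job to the cmscaf LSF queue from Python. Only the queue name cmscaf comes from the poster; the script name, job name, slot count and memory request are hypothetical placeholders, and real CAF workflows are normally submitted through the standard CMS tools.

import subprocess

def submit_caf_job(script="run_alignment.sh", slots=1, mem_mb=2000):
    """Submit a batch job to the CAF's cmscaf LSF queue (sketch only)."""
    cmd = [
        "bsub",
        "-q", "cmscaf",                   # CAF batch queue (635 job slots)
        "-n", str(slots),                 # number of job slots requested
        "-R", "rusage[mem=%d]" % mem_mb,  # memory request in MB (hypothetical value)
        "-J", "caf_example_job",          # job name (hypothetical)
        "-o", "caf_job_%J.log",           # LSF output file, %J expands to the job ID
        script,                           # user payload script (hypothetical)
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()          # e.g. "Job <12345> is submitted to queue <cmscaf>."

if __name__ == "__main__":
    print(submit_caf_job())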
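The need for dynamic disk-space monitoring and alarming on the disk-only pool can be illustrated with a minimal free-space check. The 1.2 PB pool size is from the poster; the alarm threshold, the way the used space is obtained and the notification are assumptions, not the actual CAF monitoring tools.

# Minimal sketch of a free-space check for a disk-only pool.
POOL_SIZE_TB = 1200.0        # ~1.2 PB disk-only CAF CASTOR pool (from the poster)
ALARM_THRESHOLD = 0.10       # alarm below 10% free space (assumed threshold)

def check_free_space(used_tb):
    """Print an alarm if the pool is close to full (illustrative only)."""
    free_tb = POOL_SIZE_TB - used_tb
    free_fraction = free_tb / POOL_SIZE_TB
    if free_fraction < ALARM_THRESHOLD:
        # In reality the CAF Data Managers are notified and trigger
        # deletions with the central CMS Data Management tools.
        print("ALARM: only %.0f TB (%.0f%%) free on the CAF pool" % (free_tb, 100 * free_fraction))
    else:
        print("OK: %.0f TB (%.0f%%) free on the CAF pool" % (free_tb, 100 * free_fraction))

if __name__ == "__main__":
    check_free_space(used_tb=1150.0)   # example value that triggers the alarm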