Published by Ida Corradini. Modified over 6 years ago.
1
LHCb Computing Model and Data Handling
Angelo Carbone
5th Italian Workshop on p-p Physics at the LHC
31st January 2008
2
LHCb Computing Model (I)
DAQ parameters
- ~2000 events/s
- 35 kB/event
- ~70 MB/s recording rate at T0 (CERN tape)
[Diagram: RAW data at CERN; RAW data distributed to CNAF, PIC, RAL, IN2P3, GRIDKA, NIKHEF]
Reconstruction (output: rDST)
- Performed at the 7 sites (Tier0 + 6 Tier1s)
- Local storage of rDST (purely reconstruction output)
5th Italian Workshop on p-p Physics at the LHC, Angelo Carbone, Perugia, 30 January - 2 February
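The recording rate quoted above follows directly from the event rate and the event size. A minimal check, assuming decimal units (1 MB = 1000 kB); the function and constant names are illustrative only:

```python
# Figures quoted on the slide.
EVENT_RATE_HZ = 2000   # ~2000 events/s
EVENT_SIZE_KB = 35     # 35 kB per event


def recording_rate_mb_per_s(event_rate_hz, event_size_kb):
    """Return the RAW recording rate in MB/s, assuming 1 MB = 1000 kB."""
    return event_rate_hz * event_size_kb / 1000


rate = recording_rate_mb_per_s(EVENT_RATE_HZ, EVENT_SIZE_KB)
print(f"RAW recording rate: ~{rate:.0f} MB/s")  # RAW recording rate: ~70 MB/s
```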
3
LHCb Computing Model (II)
[Diagram: DST replicated in each T1 (CERN, PIC, IN2P3, NIKHEF, CNAF, RAL, GRIDKA)]
Stripping (pre-selection), output: DST
- Reduces the data sample with specific pre-selection algorithms
- Performed at the site where the rDST are produced
- Output DST replicated to the 7 sites
Analysis
- Performed at the 7 sites (CERN + 6 Tier1s)
MC production
- Performed at Tier2s; the Italian collaboration has 1 Tier2 at CNAF
4
Workload, production and data management tools
DIRAC: Distributed Infrastructure with Remote Agent Control
Workload Management System
- Central task queue
- Use of pilot agents
Data Management System
- Data transfers with full error recovery
- Automatic data distribution (Tier0 to Tier1s)
Production tools
- Complex job workflows (multi-application)
- Automatic, data-driven job submission for processing
- Jobs are submitted where the data are (no forced transfers)
5
DIRAC WMS
- Realizes the PULL scheduling paradigm
- Offers support for both centrally managed productions and individual user jobs
Pilot agent paradigm
- LCG jobs are pilot jobs for the DIRAC WMS
- Allows checking of the environment before job scheduling
- On the WN: matches a job in the central task queue
- Terminates gracefully if no work is available
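The pull paradigm described above can be illustrated with a toy model. This is only a sketch and not the DIRAC implementation: the `Job`, `TaskQueue` and `run_pilot` names and the platform strings are hypothetical, and the "environment check" is reduced to comparing a single platform label.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    required_platform: str  # hypothetical stand-in for real job requirements


class TaskQueue:
    """Toy central task queue: jobs wait here until a pilot asks for work."""

    def __init__(self):
        self._jobs = []

    def submit(self, job):
        self._jobs.append(job)

    def match(self, platform):
        """Hand out the first waiting job this platform can run, else None."""
        for index, job in enumerate(self._jobs):
            if job.required_platform == platform:
                return self._jobs.pop(index)
        return None


def run_pilot(queue, node_platform):
    """Pilot on a worker node: report its environment and pull a matching job."""
    job = queue.match(node_platform)
    if job is None:
        return "no work available, pilot exits"
    return f"running job {job.job_id} on {node_platform}"


queue = TaskQueue()
queue.submit(Job(1, "platform-a"))
queue.submit(Job(2, "platform-b"))

first_result = run_pilot(queue, "platform-b")
second_result = run_pilot(queue, "platform-b")
print(first_result)   # running job 2 on platform-b
print(second_result)  # no work available, pilot exits
```

The point of the design is that the payload is chosen only after the pilot is already running on the worker node, so a job is never sent to a node whose environment has not been checked.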
6
Data management system
Different tools are used for data transfer
FTS (gLite File Transfer Service)
- A multi-experiment service, used to balance usage of site resources
- DIRAC has a transfer DB service to manage the transfer requests (e.g. RAW and DST distribution to the Tier-1s)
- Bulk transfers are submitted to FTS
lcg-utils
- Directly copies the file to the SE (e.g. MC production)
- DIRAC provides a distributed failover in case the selected SE is not available
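The failover idea in the last point can be sketched as trying a list of storage elements in order until one accepts the file. This is an illustration under stated assumptions, not DIRAC code: `upload_with_failover`, `fake_copy`, the SE names and the LFN are all hypothetical, and a real copy would go through a transfer tool rather than a Python callback.

```python
def upload_with_failover(lfn, storage_elements, copy_file):
    """Try each SE in order; return the name of the first one that accepts the file."""
    errors = {}
    for se in storage_elements:
        try:
            copy_file(lfn, se)
            return se
        except IOError as exc:
            # Remember why this SE failed and move on to the next candidate.
            errors[se] = str(exc)
    raise RuntimeError(f"all storage elements failed for {lfn}: {errors}")


def fake_copy(lfn, se):
    """Hypothetical stand-in for a real copy command: SE-A is 'down'."""
    if se == "SE-A":
        raise IOError(f"{se} is not available")


chosen = upload_with_failover("/lhcb/MC/example.dst", ["SE-A", "SE-B"], fake_copy)
print(chosen)  # SE-B
```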
7
LHCb MC production DC06
- 700 million events simulated since May 2006
- 1.5 M events/day through the DC06 period
- 1.5 million jobs successfully executed
- Up to jobs running simultaneously (figure below)
- Over 120 sites used worldwide
8
Reconstruction and stripping DC06
- 100 million events have been reconstructed
- 200,000 files recalled from tape
- 10 million events stripped
- 10,000 jobs submitted
- Up to 1200 jobs running simultaneously
- All the Tier-1 sites used
[Figure label: CNAF]
- Computing model successfully tested
- Data management is the big issue
9
Computing centre contribution
[Figures: LHCb CPU usage 2007; LHCb data processing 2007]
Data from
10
Distributed Data Analysis: Ganga
Ganga (Gaudi/Athena and Grid Alliance) is an easy-to-use frontend for job definition and management
- Application: specifies the software to be run (Gaudi, ROOT)
- Backend: specifies the processing system (for LHCb: DIRAC)
- Input dataset: uses only Logical File Names (LFNs)
- Output dataset
- Splitters: divide a job into subjobs that can be processed in parallel
- Merging: combines the resulting outputs
Used by LHCb and ATLAS
11
Job Splitting
- GANGA provides, if requested, automatic job parallelization
- Only the splitting rule needs to be specified
- Automatic data-driven submission of sub-jobs
[Diagram labels: main job, list of LFN, GANGA, list of PFN, LFC catalog, sub-jobs, CNAF, RAL, CERN, IN2P3, GRIDKA, PIC, NL-T1]
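A data-driven split of this kind can be sketched in plain Python. This is not the Ganga API: `split_by_site`, the replica catalogue dictionary and the LFNs are hypothetical, and the sketch makes the simplifying assumption that each file is processed at the first site listed for it.

```python
from collections import defaultdict


def split_by_site(lfns, replica_catalog, files_per_subjob):
    """Group LFNs by hosting site, then cut each group into fixed-size sub-jobs."""
    by_site = defaultdict(list)
    for lfn in lfns:
        # Simplification: use the first replica listed in the catalogue.
        site = replica_catalog[lfn][0]
        by_site[site].append(lfn)

    subjobs = []
    for site, files in sorted(by_site.items()):
        for start in range(0, len(files), files_per_subjob):
            subjobs.append((site, files[start:start + files_per_subjob]))
    return subjobs


# Hypothetical catalogue: LFN -> sites holding a replica.
catalog = {
    "/lhcb/f1.dst": ["CNAF"],
    "/lhcb/f2.dst": ["CNAF"],
    "/lhcb/f3.dst": ["CNAF"],
    "/lhcb/f4.dst": ["RAL"],
}

subjobs = split_by_site(list(catalog), catalog, files_per_subjob=2)
for site, files in subjobs:
    print(site, files)
```

With this input the main job becomes three sub-jobs: two at CNAF (two files and one file) and one at RAL.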
12
Ganga usage (LHCb)
- 65 unique users in the last month
- A total of 2500 sessions in the last month
13
BACKUP
14
CCRC’08 February Activities
- Maintain the equivalent of 2 weeks of 2008 data taking
  - Assuming a 50% machine cycle efficiency
- Raw data distribution from pit → T0 centre
- Raw data distribution from T0 → T1 centres
- Reconstruction of raw data at CERN & T1 centres (RAW → rDST)
- Stripping of data at CERN & T1 centres (rDST → DST)
- Distribution of DST data to all other centres
  - 1 copy of DST data replicated at each Tier-1
15
CCRC’08 numbers at CNAF (February)
A total of ~15 TB of data to be written
- T1D0: 7.2 TB, CASTOR-2, SRM 2.2
- T0D1: 6.8 TB, StoRM, SRM 2.2
- T1D1: 1.2 TB, StoRM, SRM 2.2
Overall bandwidth for each "way" < 10 MB/s
330 simultaneous jobs/day

Job type   Duration (hrs)   Input files
Recons     24               2 GB          ~20 kB/s
Strip      6                8 GB          ~100 kB/s
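The ~15 TB total is the sum of the three storage-class allocations listed above. A quick check (the dictionary keys are labels chosen for this sketch only):

```python
# Allocations in TB, as quoted on the slide.
storage_tb = {
    "T1D0 (CASTOR-2)": 7.2,
    "T0D1 (StoRM)": 6.8,
    "T1D1 (StoRM)": 1.2,
}

total_tb = sum(storage_tb.values())
print(f"Total to be written at CNAF: {total_tb:.1f} TB")  # 15.2 TB
```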
16
CCRC’08 activities across the sites
Breakdown of processing activities (CPU needs)

Site          Fraction (%)
CERN          14
FZK
IN2P3         22
CNAF          15
NIKHEF/SARA   17
PIC           8
RAL           10

No other activities, except user analysis
17
Reconstruction
[Diagram labels: 12.6 MB/s RAW; 2.7 MB/s rDST; 12.6 MB/s RAW; 5.4 MB/s RAW]
18
Stripping
[Diagram labels: 2.7 MB/s rDST; 6.3 MB/s DST; 1.1 MB/s DST; 5.4 MB/s RAW]