1 The ALICE Tier-2’s in Italy Roberto Barbera (*) Univ. of Catania and INFN Workshop CCR INFN 2006 Otranto, 08.06.2006 (*) Many thanks to A. Dainese, D.

1 The ALICE Tier-2’s in Italy
Roberto Barbera (*), Univ. of Catania and INFN
Workshop CCR INFN 2006, Otranto, 08.06.2006
(*) Many thanks to A. Dainese, D. Di Bari, S. Lusso, and M. Masera for providing slides and information for this presentation.

2 Outline
- The ALICE computing model and its parameters
- ALICE and the Grid(s)
  - Layout
  - Implementation
  - Recent results
- ALICE Tier-2’s in Italy
  - Catania
  - Torino
  - Bari
  - LNL-PD
- Summary and conclusions

3 The ALICE computing model (1/2)
pp
- Quasi-online data distribution and first reconstruction at T0
- Further reconstructions at T1’s
AA
- Calibration, alignment and pilot reconstructions during data taking
- Data distribution and first reconstruction at T0 during the four months after AA
- Further reconstructions at T1’s
One copy of RAW at T0 and one distributed at T1’s

4 The ALICE computing model (2/2)
- T0: first-pass reconstruction; storage of one copy of RAW, calibration data and first-pass ESD’s
- T1: reconstructions and scheduled analysis; storage of the second collective copy of RAW and one copy of all data to be kept; disk replicas of ESD’s and AOD’s
- T2: simulation and end-user analysis; disk replicas of ESD’s and AOD’s

5 Parameters of the ALICE computing model
Parameter                                      Unit  pp     PbPb
T1                                             #     7
T2                                             #     23
Size raw                                       MB    0.2x5  12.5
Recording rate                                 Hz    100
ESD                                            MB    0.04   2.50
AOD                                            kB    4      250
Event Catalogue                                kB    10
Running time                                   s     10^7   10^6
Events / y                                     #     10^9   10^8
Reconstruction passes (av)                     #     3
RAW duplication                                #     2
AOD/ESD duplication                            #     2
Scheduled analysis passes / rec ev / y (av)    #     3
Chaotic analysis passes / rec ev / y (av)      #     20
Rows with a single value list it once, as on the slide.
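As a rough cross-check of the parameters above, the sketch below recomputes the yearly event counts and the size of one copy of RAW and ESD data. It rests on assumptions not stated on the slide: the pp raw size "0.2x5" is read as 0.2 MB x 5 = 1.0 MB, the single 100 Hz recording rate is applied to both pp and PbPb, and decimal units (1 TB = 10^6 MB) are used. The volumes cover a single copy and a single reconstruction pass only.

```python
# Values transcribed from the parameter slide.
# Assumption: "0.2x5" for the pp raw size means 0.2 MB x 5 = 1.0 MB.
# Assumption: the single 100 Hz recording rate applies to both pp and PbPb.
PARAMS = {
    "pp":   {"rate_hz": 100, "run_time_s": 1e7, "raw_mb": 0.2 * 5, "esd_mb": 0.04},
    "PbPb": {"rate_hz": 100, "run_time_s": 1e6, "raw_mb": 12.5,    "esd_mb": 2.50},
}
MB_PER_TB = 1e6  # decimal units (assumption)


def yearly_budget(p):
    """Events per year and the size of ONE copy of RAW and ESD, in TB."""
    events = p["rate_hz"] * p["run_time_s"]
    return {
        "events": events,
        "raw_tb": events * p["raw_mb"] / MB_PER_TB,
        "esd_tb": events * p["esd_mb"] / MB_PER_TB,
    }


BUDGET = {name: yearly_budget(p) for name, p in PARAMS.items()}

for name, b in BUDGET.items():
    print(f"{name}: {b['events']:.0e} events/y, "
          f"RAW {b['raw_tb']:.0f} TB, ESD {b['esd_tb']:.0f} TB")
```

Under these assumptions the recording rate times the running time reproduces the "Events / y" row (10^9 for pp, 10^8 for PbPb). The duplication factors and the number of reconstruction passes from the table would multiply these single-copy figures.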

6 ALICE & the Grid(s)
Legend: TQ = Task Queue (central job DB); CAT = Central Catalogue
Diagram labels: ALICE user; computing framework (ROOT, ALIROOT); ALICE TQ; ALICE CAT; ALICE Agents & Daemons; Resources; NU Grid Resources; OSG Resources

7 Implementation: the “VO-Box”
Diagram labels: LCG Site; LCG CE; WN; JobAgent; LCG SE; LCG RB; TQ; VO-Box; SCA; SA; Job request; LFC (SURL registration); File Catalogue (LFN registration); PackMan; Request configuration
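The diagram labels suggest a pull model: a JobAgent on a worker node sends a job request to the TQ, then registers its output. The toy sketch below illustrates only that idea. All class names, method names, and the LFN/SURL strings are hypothetical stand-ins invented for illustration. None of this is the AliEn or LCG API, and the real VO-Box services (SCA, SA, PackMan) are not modelled.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TaskQueue:
    """Toy central job DB: jobs wait here until an agent asks for one."""
    waiting: deque = field(default_factory=deque)

    def submit(self, job_id: str) -> None:
        self.waiting.append(job_id)

    def request_job(self):
        """Hand out the oldest waiting job, or None when the queue is empty."""
        return self.waiting.popleft() if self.waiting else None


@dataclass
class JobAgent:
    """Toy agent on a worker node: it pulls work rather than being sent a job."""
    site: str
    lfc: dict             # stand-in for the local catalogue: LFN -> SURL
    file_catalogue: dict  # stand-in for the central catalogue: LFN -> site

    def run_once(self, tq: TaskQueue):
        job_id = tq.request_job()
        if job_id is None:
            return None
        # Made-up names; real logical and storage names look different.
        lfn = f"/toy/{job_id}/output.root"
        surl = f"srm://se.{self.site}.example/{job_id}/output.root"
        self.lfc[lfn] = surl                  # SURL registration
        self.file_catalogue[lfn] = self.site  # LFN registration
        return job_id


tq = TaskQueue()
for n in range(3):
    tq.submit(f"job{n}")

lfc, catalogue = {}, {}
agent = JobAgent("site-a", lfc, catalogue)

done = []
while (job := agent.run_once(tq)) is not None:
    done.append(job)

print(done, len(catalogue))
```

The point of the pull design in this sketch is that the queue never needs to know which worker nodes exist. An agent that starts on any site simply asks for work and exits when none is left.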

8 Who does what?
- Configure, submit and track jobs: user interface with massive production support; job DB (production and user); user and role management
- Install software on sites: package managers
- Distribute and execute jobs: Workload Management System (Broker, L&B, …); Computing Element software; Information Services
- Interactive analysis jobs
- Store and catalogue data: data catalogues (file, replica, metadata, local, …); Storage Element software
- Move data around: File Transfer services and schedulers
- Access data files: I/O services; file management (SRM)
- Monitor all that stuff: transport infrastructure; sensors; web presentation
...on top of that: enforce security!
Labels on the slide: MIXED; PROOF; MonALISA; MIXED; Xrootd

9 Some statistics and results for SC3/PDC05
In the last two months of 2005:
- 22,500 jobs (Pb+Pb and p+p)
- Average CPU time: 8 hours
- Data volume produced: 20 TB (90% CASTOR2 at CERN, 10% remote sites)
Resource Centres participating (22 in total):
- 4 T1: CERN, CNAF, GridKa, CCIN2P3
- 18 T2: Bari, Clermont (FR), GSI (D), Houston (USA), ITEP (RUS), JINR (RUS), KNU (UKR), Muenster (D), NIHAM (RO), OSC (USA), PNPI (RUS), SPbSU (RUS), Prague (CZ), RMKI (HU), SARA (NL), Sejong (SK), Torino, UiB (NO)
Job share per site:
- T1: CERN 19%, CNAF 17% (CPU 20%), GridKa 31%, CCIN2P3 22%
- T2: total of 11%
AliRoot failure rate: 2.5%
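The SC3/PDC05 figures can be sanity-checked with a few lines of arithmetic. The sketch below assumes that the 8-hour average CPU time is per job and uses a 365-day year. Both are assumptions, and the percentages are taken as quoted (they are presumably rounded).

```python
JOBS = 22_500
AVG_CPU_HOURS = 8            # assumed to be the average per job
T1_SHARE = {"CERN": 19, "CNAF": 17, "GridKa": 31, "CCIN2P3": 22}  # percent of jobs
T2_SHARE_TOTAL = 11          # percent of jobs, all T2's together
HOURS_PER_YEAR = 24 * 365    # assumption: 365-day year

total_cpu_hours = JOBS * AVG_CPU_HOURS
total_cpu_years = total_cpu_hours / HOURS_PER_YEAR
share_sum = sum(T1_SHARE.values()) + T2_SHARE_TOTAL
cnaf_jobs = JOBS * T1_SHARE["CNAF"] // 100

print(f"Total CPU time: {total_cpu_hours} h (about {total_cpu_years:.1f} CPU-years)")
print(f"Job shares add up to {share_sum}%")
print(f"Jobs run at CNAF: about {cnaf_jobs}")
```

The quoted shares add up to exactly 100%, so the T1 and T2 figures on the slide are mutually consistent.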

10 Job execution profile during SC3
- 2450 jobs (25% more than the entire lxbatch capacity at CERN)
- Negative slope: AliEn problem during output retrieval. Fixed in a later release!

11 Use of INFN Grid by LHC Exps.: JOB/VO (Sep 2005 - Dec 2005)
Chart labels: Without INFN-T1; ~388000 jobs; ~811000 jobs
Reminder: VO = Virtual Organization (experiment)
ALICE: 8% of the total number of jobs on the national grid

12 Use of INFN Grid by LHC Exps.: CPU/VO (Sep 2005 - Dec 2005)
Chart labels: ~98 years, 2 months, 18 days; ~358 years, 7 months, 11 days; Without INFN-T1
ALICE: 14% of CPU time outside T1

13 ALICE jobs per site
Warning: job agents and real jobs are accounted in the same way.

14 ALICE Tier-2’s in Italy
Four candidates: Bari, Catania, LNL-PD, and Torino (T2 projects available at the URL: http://www.to.infn.it/~masera/TIER2/).
The team of ALICE referees, with representatives of the INFN Management Board, visited all Tier-2 candidates between 10/2005 and 02/2006.
Referees’ decision communicated at a meeting in Rome on 10/03/2006: Catania and Torino approved; Bari and LNL-PD “incubated” (kept on “life support” until real ALICE needs are proved by a real test of the computing model in production mode).

15 Network connectivity of T2’s
Figure label: ALICE Tier-2’s

16 Catania (1/5) – Comp. room
Figure labels: Present installation; Future expansion
Space available for installations: ~160 m²

17 Catania (2/5) - Infrastructure
Figure labels: Traditional System; High Density System

18 Catania (3/5) - CPU
- 150 kSI2k
- SuperMicro dual AMD dual-core 275 with 4 GB RAM in 1U configuration
- IBM LS20 “blades” with dual AMD dual-core 280 with 4 GB RAM (within June)
- LSF 6.1 as LRMS

19 Catania (4/5) - Storage
- 21+ TB with GPFS
- FC-2-SATA systems plus more traditional DAS with EIDE-2-SCSI controllers
- Filesystem: GPFS

20 Catania (5/5) - Statistics
Figure label: Last month’s activity

21 Torino (1/5) – Computing Room

22 Torino (2/5) - Present installation
Present solutions: blade servers (IBM) and 1U biprocessors
Guidelines for the future:
- Minimize space
- Minimize power consumption

23 Torino (3/5) - Resources
CPU
- 38 Intel(R) Xeon(TM) CPU 2.40GHz; 12 Intel(R) Xeon(TM) CPU 3.06GHz
- 45 Intel biprocessors (<=4 years – 14 blades)
DISK
- ~6 TB dedicated to ALICE
- 2 TB shared among various VO’s (Classic-SE)
- 1 dCache SE with an internal disk of ~80 GB for tests
- ~15 TB of disk space for ALICE is going to be commissioned soon. It is a FLX210 with 3 FLC200 expansions from StorageTek
Filesystem
- Ext3 for the Classic-SE; not yet defined for the new storage system
- Tests with xrootd for local and remote access (through proxy) are scheduled
LRMS
- Torque-Maui, the default one coming with the INFN Grid release
Labels on the slide: Open to all VO’s; Dedicated to ALICE (at the moment)

24 Torino (4/5) - Resources
Future evolution:
- Many nodes (~20, the most recent) are being migrated from the ALICE farm to the LCG farm, exploiting the forthcoming upgrade to gLite 3.0
- New WN’s (80 cores, 130 kSI2k), recently bought, will be installed and configured very soon
Networking:
- All WN’s are in a hidden LAN (only outbound connectivity is allowed) and the NATting is done by an Extreme Networks switch
- Almost all connections are Gigabit Ethernet
Monitoring:
- MRTG and NAGIOS for the local control of the farm

25 Torino (5/5) - Usage
Figure labels: Local scheduler; # of LCG jobs; Number of jobs; ALICE central monitoring

26 Bari (1/2)
- Bari is a Tier-2 candidate for both ALICE and CMS.
- Bari also supports other VO’s.
- Priorities are given to the various VO’s in proportion to their budgets for acquiring resources.
- In the last two years Bari has provided resources for ALICE for both PDC04 and SC3, and will provide them for SC4.

27 Bari (2/2)
- One 2 cpu 700 MHz PIII aligrid1.ba.infn.it - HD 40 GB
- One 2 cpu 1 GHz PIII alicegrid2.ba.infn.it - HD 160 GB
- Three 2 cpu Intel Xeon 1.8 GHz alicegrid4 - alicegrid6 (VOBOX) - 3 HD of 80 GB
- One 2 cpu Intel Xeon 1.8 GHz alicegrid3.ba.infn.it - (SE for PDC04) with 0.7 TB of data
- One 2 cpu Intel Xeon 2.4 GHz alicegrid5.ba.infn.it - (SE for Finuda) with 1.5 TB disk space
- Three 2 cpu Intel Xeon 2.4 GHz - HD 80 GB
- One 2 cpu Intel Xeon 2.4 GHz alicegrid7.ba.infn.it - HD 80 GB - software repository + Quattor installation server
- One Opteron 2 dual core 275 - HD 120 GB
- Three 2 cpu Intel Xeon 2.8 GHz - HD 80 GB
- One 2 cpu Intel Xeon 3.0 GHz EM64T - HD 2 array x 2.5 TB (TOT 5 TB) (to be configured with xrootd for SC4)
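For a quick overview of the size of the Bari farm, the inventory above can be tallied as follows. This is only a sketch under two assumptions: "2 cpu" is read as two single-core CPUs, and "Opteron 2 dual core 275" as two dual-core CPUs (four cores). The short descriptions are paraphrased from the list.

```python
# (number of machines, cores per machine, description) transcribed from the Bari list.
# Assumptions: "2 cpu" = 2 single-core CPUs; "Opteron 2 dual core" = 4 cores.
INVENTORY = [
    (1, 2, "PIII 700 MHz"),
    (1, 2, "PIII 1 GHz"),
    (3, 2, "Xeon 1.8 GHz (VOBOX)"),
    (1, 2, "Xeon 1.8 GHz (SE for PDC04)"),
    (1, 2, "Xeon 2.4 GHz (SE for Finuda)"),
    (3, 2, "Xeon 2.4 GHz"),
    (1, 2, "Xeon 2.4 GHz (software repository + Quattor)"),
    (1, 4, "Opteron 275, 2 x dual core"),
    (3, 2, "Xeon 2.8 GHz"),
    (1, 2, "Xeon 3.0 GHz EM64T (disk arrays)"),
]

total_machines = sum(count for count, _, _ in INVENTORY)
total_cores = sum(count * cores for count, cores, _ in INVENTORY)

print(f"{total_machines} machines, {total_cores} cores")
```

Several of these machines are service nodes (VO-Box, storage elements, installation server), so the number of cores actually available to jobs is smaller than the total.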

28 ALICE jobs at Bari (monitored by MonALISA)

29 LNL-PD
Background:
- LNL-PD is an approved Tier-2 for CMS
- Many years of experience in running a T2 prototype for CMS
Size of the existing Tier-2 for CMS:
- CPU: ~200 kSI2k (almost all dual-core “blades”)
- Storage: EIDE-2-SCSI DAS with 3Ware + Storage Area Network
- LRMS: LSF
- Monitoring: Ganglia (local) + GridIce

30 ALICE at LNL-PD
ALICE activities already done:
- ALICE VO-box installed in 02/2006
- Site testing with small productions OK
- Big ALICE production in April-May via LCG
Future activities foreseen for the rest of 2006:
- Participation in PDC06 (~10 kSI2k of dedicated resources + the possibility to use CMS resources, if/when available)
- Installation of an ALICE storage system with xrootd (~1 TB at the beginning)

31 ALICE jobs at LNL-PD (monitored by GridIce)
ALICE, 15 April 2006 – 15 May 2006

32 Common issues
- Need for a common solution for the infrastructure (to benefit from economies of scale).
- Need for an affordable, reliable, and scalable solution for the storage.
- Need for a better organization of distributed support for Tier-2’s.
- Although new technologies (“blades” with low-power CPU’s) help a bit, power consumption at Tier-2 sites is becoming increasingly important from an economic point of view. Strict guidelines and a dedicated budget should be centrally created by INFN Management.

33 The future: PDC06 (June 2006)
Check of the distributed computing model:
- From raw data to ESD
- Data transfers among sites
- Calibration and alignment
- Analysis
SC3 experience has helped a lot to improve AliEn (current version 2.10).
Intense development of AliRoot to include calibration and alignment code for all sub-detectors and to reduce the percentage of run-time failures.
Huge effort of the Italian groups at many sites.

34 Resource ramp-up at INFN Tier-2’s

35 Summary and conclusions
- The ALICE computing model has been finalized and is now ready to face the forthcoming data from LHC.
- INFN has identified the first official Tier-2’s for ALICE.
- For both the design and the day-by-day operation of an LHC Tier-2, a strong collaboration between the Experiments, the INFN Grid Project, the INFN CCR, and the Computing&Network Services at the various INFN Departments is of vital importance.

