LHC Computing Grid: CCIN2P3 Role and Contribution
Ghita Rahal
KISTI-CCIN2P3 Workshop, KISTI, December 1st, 2008
Index
■ LHC Computing Grid
■ LCG-France
■ LCG at CC-IN2P3
■ Infrastructure validation: an example with ALICE
■ General issues
■ Conclusions
Credits to Fabio Hernandez (CC-IN2P3) and Latchezar Betev (ALICE)
Worldwide LCG Collaboration: the LHC Computing Grid
■ Purpose: develop, build and maintain a distributed computing environment for the storage and processing of data from the 4 LHC experiments, and ensure the computing service and the application libraries and tools common to the 4 experiments
■ Resources are contributed by the countries participating in the experiments; commitments are made each October of year N for year N+1, with planning 5 years forward
LHC Data Flow
Raw data generated by the detectors needs to be permanently stored. These figures include neither derived nor simulated data.

Experiment         Data rate [MB/s]
ALICE              100
ATLAS              320
CMS                220
LHCb                50
All experiments    690

Accelerator duty cycle: 14 hours/day, 200 days/year, i.e. about 7 PB of additional raw data per nominal year.
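The 7 PB figure follows directly from the aggregate rate and the duty cycle quoted above. A minimal back-of-the-envelope check in Python (my own sketch, assuming decimal units, i.e. 1 PB = 10^15 bytes):

    # Raw-data rates of the 4 experiments, in MB/s (from the table above)
    rates_mb_s = {"ALICE": 100, "ATLAS": 320, "CMS": 220, "LHCb": 50}
    total_mb_s = sum(rates_mb_s.values())            # 690 MB/s

    # Accelerator duty cycle quoted on the slide
    seconds_per_year = 14 * 3600 * 200               # 14 h/day, 200 days/year

    raw_bytes_per_year = total_mb_s * 1e6 * seconds_per_year
    print(f"{raw_bytes_per_year / 1e15:.1f} PB of raw data per nominal year")   # -> 7.0 PB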
Processing power for LHC data
Computing resource requirements, all LHC experiments, for 2009:

Type of resource    Requirement
CPU                 183 M SpecInt2000
Disk storage        73 PB
Tape storage        71 PB

About 28,000 quad-core Intel Xeon 2.33 GHz (Clovertown) CPUs (14,000 compute nodes), more than 73,000 1 TB disk drives… and 5 MW of electrical power!
Source: WLCG Revised Computing Capacity Requirements, Oct. 2007
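The hardware equivalents quoted above can be reproduced from the requirement table. The sketch below is mine: the per-core SpecInt2000 rating (~1.6 kSI2k for one 2.33 GHz Clovertown core) and the two-CPUs-per-node packaging are assumptions back-derived from the slide's own 28,000-CPU / 14,000-node figures, not numbers given in the source:

    cpu_requirement_si2k = 183e6   # 183 M SpecInt2000 (from the table)
    disk_requirement_tb = 73e3     # 73 PB of disk, expressed in TB

    si2k_per_core = 1634           # assumed rating of one 2.33 GHz Clovertown core
    cores_per_cpu = 4              # quad-core Xeon
    cpus_per_node = 2              # assumed dual-socket compute nodes

    cpus = cpu_requirement_si2k / (si2k_per_core * cores_per_cpu)
    nodes = cpus / cpus_per_node
    disks = disk_requirement_tb / 1.0   # 1 TB disk drives

    print(f"~{cpus:.0f} CPUs, ~{nodes:.0f} nodes, {disks:.0f} disks")
    # -> roughly 28,000 CPUs, 14,000 compute nodes and 73,000 disks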
WLCG Architecture (cont.)
Resource location per tier level (chart): a significant fraction of the resources is distributed over 130+ centres.
Tier-1 centres

Institution     Country          Experiments served with priority
CA-TRIUMF       Canada           ATLAS
ES-PIC          Spain            ATLAS, CMS, LHCb
FR-CCIN2P3      France           ALICE, ATLAS, CMS, LHCb
DE-KIT          Germany          ALICE, ATLAS, CMS, LHCb
IT-INFN-CNAF    Italy            ALICE, ATLAS, CMS, LHCb
NDGF            DK/FI/NO/SE      ALICE, ATLAS
NL-T1           Netherlands      ALICE, ATLAS, LHCb
TW-ASGC         Taiwan           ATLAS, CMS
UK-T1-RAL       United Kingdom   ALICE, ATLAS, CMS, LHCb
US-T1-BNL       USA              ATLAS
US-FNAL-CMS     USA              CMS
Total                            ALICE: 6, ATLAS: 10, CMS: 7, LHCb: 6

Source: WLCG Memorandum of Understanding, 2007/12/07
LCG-France project
Goal:
■ Set up, develop and maintain a WLCG Tier-1 and an Analysis Facility at CC-IN2P3
■ Promote the creation of Tier-2/Tier-3 French sites and coordinate their integration into the WLCG collaboration
Funding:
■ National funding for the Tier-1 and the Analysis Facility
■ Tier-2s and Tier-3s funded by universities, local/regional governments, hosting laboratories, …
Schedule:
■ Started in June 2004
■ 2004-2008: setup and ramp-up phase
■ 2009 onwards: cruise phase
Equipment budget for the Tier-1 and Analysis Facility, 2005-2012: 32 M€
LCG-France sites (map)
■ CC-IN2P3 (Lyon): Tier-1 & Analysis Facility
■ GRIF (Ile-de-France): Tier-2 (APC, CEA/DSM/IRFU, IPNO, LAL, LLR, LPNHE)
■ LAPP (Annecy): Tier-2
■ LPC (Clermont-Ferrand): Tier-2
■ Subatech (Nantes): Tier-2
■ CPPM (Marseille): Tier-3
■ IPHC (Strasbourg): Tier-3
■ IPNL (Lyon): Tier-3
■ LPSC (Grenoble): Tier-3
Source: http://lcg.in2p3.fr
Associated Tier-2s (map)
■ IHEP (Beijing): ATLAS/CMS Tier-2
■ ICEPP (Tokyo): ATLAS Tier-2
■ Romanian ATLAS federation
■ Belgian CMS Tier-2s
All associated with CC-IN2P3 (Lyon).
LCG-France sites
■ Tier-1: CC-IN2P3 (ALICE, ATLAS, CMS, LHCb)
■ Tier-2 / Analysis Facility: Lyon AF (ALICE, ATLAS, CMS, LHCb); GRIF, Paris region (ALICE, ATLAS, CMS, LHCb); LAPP (Annecy); LPC (Clermont-Ferrand); Subatech (Nantes: ALICE)
■ Tier-3: CPPM (Marseille); IPHC (Strasbourg); IPNL (Lyon); LPSC (Grenoble)
Most sites serve the needs of more than one experiment and group of users.
Tier-2s planned contribution vs. the LCG-France target (chart)
Connectivity
Excellent connectivity to other national and international institutions is provided by RENATER. The role of the national academic & research network is instrumental for the effective deployment of the grid infrastructure.
(RENATER topology map: dark fiber, 2.5 Gbit/s and 1 Gbit/s (GE) links, including the link to Genève/CERN.)
Source: Frank Simon, RENATER
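As a rough sanity check (my own sketch; the conversion ignores protocol overhead), the sustained transfer rates discussed later in this talk fit comfortably within the link capacities shown on the RENATER map:

    def gbit_to_mb_per_s(gbit: float) -> float:
        """Convert a line rate in Gbit/s to MB/s (decimal units, no overhead)."""
        return gbit * 1e9 / 8 / 1e6

    for link_gbit in (1.0, 2.5):
        mb_s = gbit_to_mb_per_s(link_gbit)
        print(f"{link_gbit} Gbit/s link ≈ {mb_s:.0f} MB/s ≈ {mb_s * 86400 / 1e6:.1f} TB/day")

    # A 2.5 Gbit/s link (~312 MB/s) leaves ample headroom above the 60 MB/s
    # average replication rate and the 160 MB/s all-experiment goal quoted later.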
LCG-France Tier-1 & AF (chart)
Roughly equivalent to 305 Thumpers (with 1 TB disks) or 34 racks.
LCG-France Tier-1 & AF contribution (chart)
CPU ongoing activity at CC-IN2P3, 2007-2008 (plots for ATLAS, CMS, ALICE and LHCb)
Note: the scale is not the same on all plots.
Resource usage (Tier-1 + AF) (chart)
Resource deployment (chart; annotation: × 6.7)
Resource deployment (cont.) (chart; annotation: × 3.1)
LCG Tier-1: availability & reliability (chart)
Scheduled shutdowns of services on 18/09/2007, 03/11/2007 and 11/03/2008.
Source: WLCG T0 & T1 Site Reliability Reports
LCG Tier-1: availability & reliability (cont.) (chart)
Source: WLCG T0 & T1 Site Reliability Reports
Validation program: goals
Registration of data at T0 and on the grid
T0 → T1 replication
Condition data on the grid
Quasi-online reconstruction:
■ Pass 1 at T0
■ Reprocessing at T1
■ Replication of ESDs: T1 → T2s/CAFs
Quality control
MC production and user analysis at the T2s/CAFs
Data flow and rates (diagram; source: L. Betev)
DAQ → CASTOR2 (rfcp); CASTOR2 → CAF (xrootd); reconstruction at T0 reads via xrootd; CASTOR2 → T1 storage via FTS/GridFTP at 60 MB/s.
First part: ½ of the nominal p+p acquisition rate (DAQ) plus the nominal rate for distribution. Average: 60 MB/s; peak: 3 GB/s.
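Converted to daily volumes (my own sketch; the rates are the ones quoted above, and the round-the-clock assumption is only used to express them per day):

    average_mb_s = 60      # average distribution rate from the slide
    peak_gb_s = 3          # peak rate from the slide

    print(f"average: {average_mb_s * 86400 / 1e6:.1f} TB/day")              # ~5.2 TB/day
    print(f"peak: {peak_gb_s * 1e3:.0f} MB/s, "
          f"i.e. x{peak_gb_s * 1e3 / average_mb_s:.0f} the average rate")   # 3000 MB/s, x50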
CCRC08, 15 February - 10 March
Tests with half the DAQ-to-CASTOR rates: 82 TB in total, 90K files (0.9 GB/file), i.e. 70% of the nominal monthly p+p volume.
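The numbers above are mutually consistent; a small check (my own sketch, where the implied nominal monthly volume and the effective average rate are derived, not given on the slide):

    total_tb = 82
    n_files = 90_000
    window_days = 24                              # roughly 15 Feb - 10 Mar

    gb_per_file = total_tb * 1e3 / n_files        # ~0.91 GB/file, as quoted
    nominal_monthly_tb = total_tb / 0.70          # ~117 TB implied nominal p+p month
    avg_mb_s = total_tb * 1e6 / (window_days * 86400)   # ~40 MB/s effective average

    print(f"{gb_per_file:.2f} GB/file, ~{nominal_monthly_tb:.0f} TB/month, ~{avg_mb_s:.0f} MB/s")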
T0 → T1 replication (plot)
Expected rate: 60 MB/s; the end of data taking is marked.
T0 → T1 replication, all experiments (plot)
CCRC phase 1 and CCRC phase 2.
T0 → CC-IN2P3 (plot)
End of data taking; tests before Run III (May).
T0 → CC-IN2P3, all experiments (plot)
Goal: 160 MB/s, i.e. 14 TB/day.
Note: the expected rates are still unknown for some experiments (and keep changing). This is the goal according to the Megatable, which is the reference document (even if it is no longer maintained).
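The two forms of the goal are the same number; a one-line check (my sketch, decimal units):

    goal_mb_s = 160
    print(f"{goal_mb_s * 86400 / 1e6:.1f} TB/day")   # -> 13.8 TB/day, i.e. ~14 TB/day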
ALICE CCRC08: May period
Detector activities
ALICE offline upgrades:
■ New VO-box installation
■ New AliEn version
■ Tuning of the reconstruction software
■ Exercise of the 'fast lane' calibration/alignment procedure…
Data replication T0 → T1, scheduled according to the ALICE shares
May: all 4 experiments concurrently, Tier-0 → CCIN2P3 (plot)
Goal: 160 MB/s, i.e. 14 TB/day.
Note: the expected rates are still unknown for some experiments (and keep changing). This is the goal according to the Megatable, which is the reference document (even if it is no longer maintained).
CCRC08 post mortem
■ Reliable central data distribution
■ High CC-IN2P3 efficiency and stability (dCache, FTS, …)
■ Good, high performance of the French Tier-2s
■ Demonstrated large safety margins for transfers between T1 and T2s
High priority: analysis farm (1/2)
Time to concentrate on user analysis:
■ Must take place in parallel with the other tasks
■ Unscheduled, bursty access to the data
■ Users expect a fast return of their output
■ Interactivity…
Ongoing activity at CC-IN2P3:
■ Identify the needs
■ Set up a common infrastructure for the 4 LHC experiments
High priority: analysis farm (2/2)
Ongoing activity at CC-IN2P3 (cont'd):
■ Goal: a prototype to be tested at the beginning of 2009
ALICE specifics:
■ A farm design is already under test at CERN; the plan is to deploy one in France according to the same specifications, but shareable with the other experiments
General issues for CC-IN2P3
Improve each component:
■ Storage: higher performance for HPSS and improved interaction with dCache
■ Increase the level of redundancy of the services to reduce human interventions (VO-boxes, LFC, …)
■ Monitoring, monitoring, monitoring…
Manpower: need to reach a higher level of staffing, mainly for storage.
Conclusion
The 2008 challenge has shown the capability of LCG-France to meet the challenges of LHC computing.
It has also shown the need for permanent background testing and monitoring of the worldwide platform.
The reliability of the storage and data-distribution components still needs to be improved.
ALICE computing model
p-p:
■ Quasi-online data distribution and first reconstruction at T0
■ Further reconstruction at the Tier-1s
A-A:
■ Calibration, alignment and pilot reconstruction during data taking
■ Data distribution and first reconstruction at T0
One copy of RAW at T0 and one distributed among the Tier-1s.
ALICE computing model (cont.)
T0:
■ First-pass reconstruction, storage of one copy of RAW
■ Calibration and first-pass ESDs
T1:
■ Storage of a share of the RAW, ESDs and AODs on disk
■ Reconstruction
■ Scheduled analysis
T2:
■ Simulation
■ End-user analysis
■ Copy of ESDs and AODs