The LHC Computing Grid – February 2008
The Challenges of LHC Computing
Dr Ian Bird, LCG Project Leader
6 October 2009, Telecom 2009 Youth Forum
Collisions at the LHC: summary
26 June 2009, Ian Bird, CERN
pp collisions at 14 TeV at a luminosity of 10^34 cm^-2 s^-1
How to extract this (Higgs → 4 muons) from this, with 20 overlapping proton-proton collisions – and this repeats every 25 ns…
A very difficult environment… (compare: Z at LEP, e+e-)
The LHC Computing Challenge
Signal/Noise: 10^-9 (offline)
Data volume: high rate × large number of channels × 4 experiments → 15 PetaBytes of new data each year
Compute power: event complexity × number of events × thousands of users → 100k of (today's) fastest CPUs, 45 PB of disk storage
Worldwide analysis & funding: computing funded locally in major regions & countries; efficient analysis everywhere → GRID technology
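The numbers above can be sanity-checked with a short back-of-envelope calculation; this is purely illustrative arithmetic on the figures quoted in the slides (25 ns bunch spacing, 15 PB/year), not WLCG software:

```python
# Back-of-envelope check of the LHC computing numbers (illustrative only).

NS = 1e-9  # one nanosecond in seconds

# Bunch crossings every 25 ns -> crossing rate
crossing_rate_hz = 1 / (25 * NS)
print(f"Crossing rate: {crossing_rate_hz / 1e6:.0f} MHz")  # 40 MHz

# 15 PB of new data per year -> average sustained recording rate
petabyte = 1e15  # decimal (SI) bytes
seconds_per_year = 365 * 24 * 3600
avg_rate_mb_s = 15 * petabyte / seconds_per_year / 1e6
print(f"Average rate for 15 PB/year: {avg_rate_mb_s:.0f} MB/s")
```

So the trigger system must reduce a 40 MHz collision rate down to an archivable stream of a few hundred MB/s averaged over the year.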
A collision at LHC
The Data Acquisition
Tier 0 at CERN: acquisition, first-pass processing, storage & distribution (data rates of GB/sec for ion running)
Tier 0 – Tier 1 – Tier 2
Tier-0 (CERN): data recording, initial data reconstruction, data distribution
Tier-1 (11 centres): permanent storage, re-processing, analysis
Tier-2 (~130 centres): simulation, end-user analysis
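The tiered division of responsibilities above can be sketched as a simple data model; this is an illustrative summary of the hierarchy described on the slide (the class and variable names are invented for this sketch), not actual WLCG middleware:

```python
# Minimal sketch of the WLCG tier model described above (names illustrative).
from dataclasses import dataclass, field


@dataclass
class Tier:
    name: str
    centres: int                       # approximate number of centres
    roles: list = field(default_factory=list)


WLCG_TIERS = [
    Tier("Tier-0 (CERN)", 1, ["data recording", "initial reconstruction", "distribution"]),
    Tier("Tier-1", 11, ["permanent storage", "re-processing", "analysis"]),
    Tier("Tier-2", 130, ["simulation", "end-user analysis"]),
]

for t in WLCG_TIERS:
    print(f"{t.name}: {t.centres} centre(s) -> {', '.join(t.roles)}")
```

The design point is that work flows outward: raw data is recorded once at Tier-0, curated copies live at the 11 Tier-1s, and the bulk of simulation and end-user analysis is pushed to the many smaller Tier-2 sites.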
(w)LCG – Project and Collaboration
LCG was set up as a project in 2 phases:
– Phase I – Development & planning; prototypes. At the end of this phase the computing Technical Design Reports were delivered (1 for LCG and 1 per experiment)
– Phase II – Deployment & commissioning of the initial services; programme of data and service challenges
During Phase II, the WLCG Collaboration was set up as the mechanism for the longer term:
– Via an MoU – signatories are CERN and the funding agencies
– Sets out conditions and requirements for Tier 0, Tier 1, and Tier 2 services, reliabilities etc ("SLA")
– Specifies resource contributions – 3 year outlook
WLCG Today
Tier 0; 11 Tier 1s; 61 Tier 2 federations (121 Tier 2 sites)
Tier-1 centres: CERN; De-FZK; US-FNAL; Ca-TRIUMF; NDGF; Barcelona/PIC; Lyon/CCIN2P3; US-BNL; UK-RAL; Taipei/ASGC; Amsterdam/NIKHEF-SARA; Bologna/CNAF
Today we have 49 MoU signatories, representing 34 countries: Australia, Austria, Belgium, Brazil, Canada, China, Czech Rep, Denmark, Estonia, Finland, France, Germany, Hungary, Italy, India, Israel, Japan, Rep. Korea, Netherlands, Norway, Pakistan, Poland, Portugal, Romania, Russia, Slovenia, Spain, Sweden, Switzerland, Taipei, Turkey, UK, Ukraine, USA.
Preparation for accelerator start-up
Since 2004 WLCG has been running a series of challenges to demonstrate aspects of the system, with increasing targets for:
– Data throughput
– Workloads
– Service availability and reliability
Recent significant challenges:
– May 2008 – Combined Readiness Challenge: all 4 experiments running realistic work (simulating data taking); demonstrated that we were ready for real data
– June 2009 – Scale Testing: stress and scale testing of all workloads, including massive analysis loads
In essence, the LHC Grid service has been running for several years.
Data transfer
Full experiment rate needed is 650 MB/s
Desire capability to sustain twice that, to allow Tier 1 sites to shut down and recover
Have demonstrated far in excess of that:
– All experiments exceeded required rates for extended periods, and simultaneously
– All Tier 1s have exceeded their target acceptance rates
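The transfer targets above work out as follows; again a purely illustrative calculation on the quoted 650 MB/s figure:

```python
# Illustrative arithmetic on the data-transfer targets quoted above.
target_mb_s = 650    # full experiment rate needed
headroom = 2         # sustain twice the target, for catch-up after a Tier-1 outage

sustained_mb_s = target_mb_s * headroom
print(f"Sustained capability target: {sustained_mb_s} MB/s")  # 1300 MB/s

# Daily volume at the nominal 650 MB/s rate
seconds_per_day = 86_400
daily_tb = target_mb_s * seconds_per_day / 1e6
print(f"Nominal daily volume: {daily_tb:.1f} TB/day")
```

The 2x headroom matters because a Tier-1 that has been offline must absorb both the live stream and its backlog when it returns.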
Grid Activity – distribution of CPU delivered
The distribution of work across Tier 0/Tier 1/Tier 2 really illustrates the importance of the grid system:
– Tier 2 contribution is ~50%
– >85% is external to CERN
[Chart legend: Tier 2 sites; Tier 0 + Tier 1 sites]
First events
WLCG depends on two major science grid infrastructures:
EGEE – Enabling Grids for E-sciencE
OSG – US Open Science Grid
Interoperability & interoperation is vital; significant effort has gone into building the procedures to support it.
EGEE scale (Enabling Grids for E-sciencE, EGEE-III INFSO-RI):
– users; LCPUs (cores)
– 25 PB disk, 39 PB tape
– 12 million jobs/month (+45% in a year)
– 268 sites (+5% in a year)
– 48 countries (+10% in a year)
– 162 VOs (+29% in a year)
(Source: Technical Status, Steven Newhouse, EGEE-III First Review, June)
CERN Computing – in numbers
Computing – CPU:
– 5700 systems = cores (+ planned 3000 systems, cores)
– Used for CPU servers, disk servers, general services
Computing – disk:
– TB on disk drives (+ planned TB on drives)
Computing – tape:
– TB on tape cartridges
– tape slots in robots, 160 tape drives
Computer centre:
– 2.9 MW usable power, + ~1.5 MW for cooling
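The power figures above imply a rough efficiency for the computer centre. As an illustration (the slide does not quote this metric itself), we can compute the Power Usage Effectiveness, the standard ratio of total facility power to IT power:

```python
# Illustrative power arithmetic for the CERN computer-centre figures above.
# PUE (Power Usage Effectiveness) is not quoted on the slide; it is derived
# here from the 2.9 MW usable and ~1.5 MW cooling numbers as an assumption.
it_power_mw = 2.9    # usable power for computing equipment
cooling_mw = 1.5     # approximate additional power for cooling

total_mw = it_power_mw + cooling_mw
pue = total_mw / it_power_mw
print(f"Total draw: {total_mw:.1f} MW, PUE ~ {pue:.2f}")
```

A PUE around 1.5 means that for every watt delivered to the machines, roughly half a watt more is spent on cooling overhead.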