GridPP: From Prototype to Production
David Britton, 21/Sep/06
1. Context: Introduction to GridPP
2. Performance of the GridPP/EGEE/wLCG Grid
3. Some Successes and Challenges
D. Britton 21/Sep/06 GridPP The CERN LHC 4 Large Experiments The world’s most powerful particle accelerator
The LHC Experiments: “One Grid to Rule Them All”?
ALICE
 - heavy ion collisions, to create quark-gluon plasmas
 - 50,000 particles in each collision
LHCb
 - to study the differences between matter and antimatter
 - will detect over 100 million b and b-bar mesons each year
ATLAS
 - General purpose
 - Origin of mass
 - Supersymmetry
 - 2,000 scientists from 34 countries
CMS
 - General purpose
 - 1,800 scientists from over 150 institutes
Why do particle physicists need the Grid?
 - One year’s data from LHC would fill a stack of CDs 20 km high
 - Height comparison: Concorde (15 km), Mt. Blanc (4.8 km)
 - 100 million electronic channels
 - 800 million proton-proton interactions per second
 - Higgs per second
 - 10 PBytes of data a year (10 million GBytes = 14 million CDs)
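The CD figures on this slide can be checked with a short calculation. This is a rough sketch, not the author's own working: the 700 MB disc capacity and 1.2 mm bare-disc thickness are assumptions not stated on the slide, and `cd_stack` is a hypothetical helper name.

```python
def cd_stack(annual_pb, cd_capacity_mb=700.0, cd_thickness_mm=1.2):
    """Return (number of CDs, stack height in km) for a yearly data volume.

    cd_capacity_mb and cd_thickness_mm are assumed values, not from the slide.
    Decimal units are used throughout: 1 PB = 1e15 bytes, 1 MB = 1e6 bytes.
    """
    annual_bytes = annual_pb * 1e15
    n_cds = annual_bytes / (cd_capacity_mb * 1e6)
    stack_km = n_cds * cd_thickness_mm / 1e6  # mm -> km
    return n_cds, stack_km


n_cds, stack_km = cd_stack(10)  # 10 PBytes a year, as quoted on the slide
print(f"{n_cds / 1e6:.1f} million CDs, stack of about {stack_km:.0f} km")
```

Under these assumptions the result is roughly 14 million discs and a stack of about 17 km, the same order as the 20 km quoted.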
Why do particle physicists need the Grid?
Example from LHC: starting from this event… we are looking for this “signature”
Selectivity: 1 in
 - Like looking for 1 person in a thousand world populations
 - Or for a needle in 20 million haystacks
Who are GridPP?
 - 19 UK Universities, CCLRC (RAL & Daresbury)
 - Funded by PPARC
 - GridPP "From Web to Grid"
 - GridPP "From Prototype to Production"
 - GridPP "From Production to Exploitation"
Global Context
[Timeline figure: EDG, EGEE-I, EGEE-II, EGI?; GridPP1, GridPP2, GridPP3; LHC Data Taking]
[Diagram labels: GridPP, EDG, EGEE, LCG, wLCG, (Many); Evolving standards; Developing requirements; Changing costs and budgets; Experience]
Tier Structure
 - Tier 0: Detector, Online system, Offline farm, CERN computer centre
 - Tier 1 (National centres): RAL (UK), France, Germany, Italy, USA
 - Tier 2 (Regional groups): ScotGrid, NorthGrid, SouthGrid, London
 - Tier 3 (Institutes): Glasgow, Edinburgh, Durham
UK Tier-1/A Centre
 - High quality data services
 - National and International Role
 - UK focus for International Grid development
 - 1000 Dual CPU
 - 330 TB Disk
 - 532 TB Tape
 - Grid Operations Centre
UK Tier-2 Centres
 - ScotGrid: Durham, Edinburgh, Glasgow
 - NorthGrid: Daresbury, Lancaster, Liverpool, Manchester, Sheffield
 - SouthGrid: Birmingham, Bristol, Cambridge, Oxford, RAL PPD, Warwick
 - London: Brunel, Imperial, QMUL, RHUL, UCL
Grid Performance
Health of the Grid continuously monitored:
 - Site Functional Test (SFT)
 - Grid Status Monitor (GSTAT)
 - Resource Broker Logging
 - CPU/Storage accounting
 - Migration to “Service Availability Monitoring” (SAM)
Job Slots
Number of published UK job slots from GSTAT. Significant increase in early 2006.
Availability
Average monthly SITE and CPU availability from SFT. Large contribution from UKI.
Availability by UK Site
Usage
Usage over a week (16-27 June 2006) from the Resource Broker Logs.
[Plot annotation: Publishing problem.]
Active Users (EGEE-wide) by LHC experiment
ALICE (8), CMS (150), ATLAS (70), LHCb (40)
Active users at RAL
Number of registered users (exc. DTEAM)
Quarter: 05Q4 06Q1 06Q2 06Q3
Value:
Number of active users (> 10 jobs)
Quarter: 05Q4 06Q1 06Q2 06Q3
Value:
Fraction: 6.2% 11.0%
Virtual Organisations Supported
2005 CPU Usage/Efficiency
2006 CPU Usage/Efficiency
2006 Job Efficiency
Major LHC experiments achieve ~90%.
Tier Centre Efficiency
Successful hours (from RB logs) for each Tier Centre (Jan to Apr 2006).
Here, “Efficiency” = Successful Time / Total Time.
[Plot: Efficiencies for each Tier Centre; 90% level labelled]
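The efficiency ratio defined on this slide (successful time over total time) can be sketched as below. This is a minimal illustration, not the accounting code actually run against the RB logs: the record layout `(centre, hours, succeeded)`, the function name `tier_efficiency`, and the sample records are all hypothetical.

```python
from collections import defaultdict


def tier_efficiency(jobs):
    """Efficiency per centre = successful hours / total hours.

    jobs: iterable of (centre, hours, succeeded) tuples. This layout is an
    assumption made for illustration, not the real RB log format.
    """
    successful = defaultdict(float)
    total = defaultdict(float)
    for centre, hours, succeeded in jobs:
        total[centre] += hours
        if succeeded:
            successful[centre] += hours
    # Centres with zero recorded hours are skipped to avoid dividing by zero.
    return {c: successful[c] / total[c] for c in total if total[c] > 0}


# Invented sample records, purely to exercise the function.
example_jobs = [
    ("Centre-A", 9.0, True),
    ("Centre-A", 1.0, False),
    ("Centre-B", 4.0, True),
    ("Centre-B", 4.0, False),
]
for centre, eff in sorted(tier_efficiency(example_jobs).items()):
    print(f"{centre}: {eff:.0%}")
```

Weighting by hours rather than by job count means one long failed job lowers the figure more than many short ones, which matches the slide's use of "successful hours".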
Storage Accounting
[Plots: Tier-1 and Tier-2s storage, Used vs Unused]
Resources Delivered by the Tier-2 Sites
[Plots: CPU delivered (05Q4 and 06Q1); Disk delivered (05Q4 and 06Q1)]
 - Disk utilisation at Tier-2 sites has been low (and new purchases were strategically delayed).
 - Situation addressed in 2006 (initial one-to-one relationship with LHC experiments).
 - Occupancy rose to ~40% level by Q
Tickets Raised
Scheduled Downtime
Upgrades
Tier-0 to Tier-1 Data Transfers
 - Worldwide data transfers > 950 MB/s for 1 week
 - Peak transfer rate from CERN of > 1.6 GB/s
 - Ongoing experiment transfers as part of current service challenges
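To put the sustained rate in context, the sketch below converts it to a total volume and compares it with the average rate implied by 10 PBytes a year (the figure quoted earlier in the talk). It assumes decimal units (1 MB = 1e6 bytes) and a 365-day year; both function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def volume_tb(rate_mb_per_s, days):
    """Total volume in TB moved at a constant rate (decimal units assumed)."""
    return rate_mb_per_s * 1e6 * days * SECONDS_PER_DAY / 1e12


def average_rate_mb_per_s(petabytes_per_year, days_per_year=365):
    """Average rate in MB/s needed to move a yearly volume continuously."""
    return petabytes_per_year * 1e15 / (days_per_year * SECONDS_PER_DAY) / 1e6


week_volume = volume_tb(950, 7)          # one week at the quoted 950 MB/s
yearly_average = average_rate_mb_per_s(10)  # 10 PBytes a year
print(f"One week at 950 MB/s: about {week_volume:.0f} TB")
print(f"10 PB/year averages about {yearly_average:.0f} MB/s")
```

On these assumptions a week at 950 MB/s moves a little over half a petabyte, and the demonstrated rate is about three times the year-round average implied by 10 PB.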
Tier-1 to Tier-2 Data Transfers
 - UK data transfers > 1000 Mb/s for 3 days
 - Peak transfer rate from RAL of > 1.5 Gb/s
 - Need high data rate transfers to/from RAL as a routine activity
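The rates on this slide are written with a lower-case "b". Assuming that means bits rather than bytes (an assumption, since the slide does not spell it out), the sketch below converts them to bytes per second so they can be compared with the MB/s figures on the Tier-0 slide. The helper name is hypothetical.

```python
def megabits_to_megabytes(rate_mbit_per_s):
    """Convert Mb/s (megabits) to MB/s (megabytes); 8 bits per byte."""
    return rate_mbit_per_s / 8


sustained_mb_s = megabits_to_megabytes(1000)  # > 1000 Mb/s sustained
peak_mb_s = megabits_to_megabytes(1500)       # > 1.5 Gb/s peak, taking 1 Gb = 1000 Mb

# Volume moved in 3 days at the sustained rate, in TB (decimal units assumed).
three_day_volume_tb = sustained_mb_s * 1e6 * 3 * 86_400 / 1e12

print(f"Sustained: {sustained_mb_s} MB/s, peak: {peak_mb_s} MB/s")
print(f"Three days at the sustained rate: about {three_day_volume_tb:.1f} TB")
```

Read this way, the UK sustained rate is 125 MB/s and the RAL peak 187.5 MB/s.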
Summary
 - “From Prototype to Production” is about understanding and improving performance.
 - Monitoring, understanding, and improving the performance of a Grid is, in itself, a Grid challenge.
 - Many tools and metrics have been, and are being, developed to measure and monitor the performance of the GridPP/EGEE/wLCG Grid, and they are now providing feedback.
Many Successes…
95% efficiencies; 3 PB of data; GridPP3 Proposal; Industry; Schools; Physics!; MOU signed; Einstein; Security; Wiki
…and many Challenges
A year later: Progress in all areas but the challenges remain.