GridPP Deployment Status, User Status and Future Outlook Tony Doyle.


Tony Doyle - University of Glasgow, INFNGrid Meeting, 20 December 2006

Introduction
A. What is the deployment status?
B. Is the system usable?
C. What is the future of GridPP?
Wot no middleware?

Middleware
GridPP Middleware is..
–Security
–Network Monitoring
–Information Services
–Grid Data Management
–Storage Interfaces
–Workload Management

Middleware: e.g. LCG monitoring applet
Monitor:
–resource brokers
–virtual organisations: ATLAS, CMS, LHCb, DTeam, Other
SQL queries to logging and book-keeping database

Middleware: e.g. APEL and R-GMA
R-GMA structure used in accounting system (GOCDB)
For gLite the sensors are provided by DGAS via DGAS2APEL
The EGEE portal for accounting data is provided by CESGA

Resources
17/12/06: EGEE total slots => UKI is 6949, ~20% of the total
17/12/06: EGEE jobs running => UKI is 2912, ~14% of jobs
Max EGEE = Max UKI = 8176 (N.B. hyperthreading distorts 1:1 job:CPU core relation – reduces UKI numbers by ~500)
[Table, values not recovered. Rows: Sundays, Total. Columns: STATUS, totalCPU, freeCPU, runJob, waitJob, seAvail TB, seUsed TB, maxCPU, avgCPU]
Steady climb since 2004 towards target of ~10,000 CPU (cores) (~job slots)

Resources: 2006 CPU Usage by Region
Via APEL accounting

Resources
(not all records are being accounted)

Resources: CPU Usage by experiment
Total CPU used: 52,876,788 kSI2k-hours!

Utilisation (estimated, based on gstat job slots/usage)
UKI mirrors overall EGEE utilisation
Average utilisation for Q306: 66%, compared to target of ~70%
CPU utilisation was a T2 issue, but now improving..

Efficiency (measured by UK Tier-1 for all VOs)
~90% CPU efficiency due to i/o bottlenecks is OK
Concern that this is currently ~75%
Each experiment needs to work to improve their system/deployment practice, anticipating e.g. hanging gridftp connections during batch work
[Plot label: target]

Storage (is still an issue for Tier-1 and Tier-2s)
Utilisation is low (~30%) at T2s and accounting [by VO] is not (yet) there

Storage: GOCDB Accounting Display - under development
Looking at data for RAL-LCG2
Storage units are 1TB = 10^6 MB
Tape Used + Disk Used = Total
Sensor Drop Outs have been fixed
[Plot: Total Used Storage (TB), with series Tape Used and Disk Used]
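The unit convention and the sum quoted on this slide can be sketched in a few lines of Python. This is an illustration only: the function names and the sample readings are hypothetical and are not taken from the GOCDB display or from RAL-LCG2 data.

```python
MB_PER_TB = 10**6  # unit convention quoted on the slide: 1 TB = 10^6 MB


def mb_to_tb(megabytes: float) -> float:
    """Convert megabytes to terabytes using 1 TB = 10^6 MB."""
    return megabytes / MB_PER_TB


def total_used_tb(tape_used_mb: float, disk_used_mb: float) -> float:
    """Total used storage in TB, following Tape Used + Disk Used = Total."""
    return mb_to_tb(tape_used_mb) + mb_to_tb(disk_used_mb)


# Hypothetical sample readings in MB (assumed values, not real accounting data).
tape_mb = 150_000_000
disk_mb = 50_000_000
print(total_used_tb(tape_mb, disk_mb))
```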

Storage
SRM at T1: ~200TB of disk (deployment problem in 2006)
–~100% usage (problem for 2006 service challenges)
–Castor 2.1
SRM at all T2s: ~200TB of disk in total
–~30% usage: difficult to calculate
–dCache and DPM v
–Dedicated disk servers advised (storage should be robust)
Need to make sure sites are running the latest GIP plugins (
New GOC storage accounting system being put in place, being deployed at Tier-2s
SRM v2.2 is being implemented: need to test interoperability

File Transfers (individual rates)
Aim: to maintain data transfers at a sustainable level as part of experiment service challenges
Current goals:
–>250Mb/s inbound-only
–>250Mb/s outbound-only
–>200Mb/s inbound and outbound
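To give a feel for what these sustained rates mean in data volume, the sketch below converts them to terabytes per day. It assumes that "Mb/s" means megabits per second (10^6 bits) and uses decimal terabytes; since the goals are lower bounds, so are the resulting volumes. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400
BITS_PER_BYTE = 8


def mbps_to_tb_per_day(rate_mbps: float) -> float:
    """Convert a sustained rate in Mb/s to a volume in TB/day (decimal units)."""
    bytes_per_second = rate_mbps * 1e6 / BITS_PER_BYTE
    return bytes_per_second * SECONDS_PER_DAY / 1e12


# Goal rates as listed on the slide (assumed to be megabits per second).
goals = [("inbound-only", 250), ("outbound-only", 250), ("inbound and outbound", 200)]
for label, rate in goals:
    print(f"{label}: {rate} Mb/s is about {mbps_to_tb_per_day(rate):.2f} TB/day")
```

Under these assumptions, 250 Mb/s sustained for a full day corresponds to roughly 2.7 TB, and 200 Mb/s to roughly 2.2 TB.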

Tier-1 Resource
Approval for new (shared) machine room – ETA Summer
–Space for 300 racks.
Procurement
–March 06: 52 AMD 270 units, 21 disk servers (168TB data capacity)
–FY 06/07: 47 disk servers (282TB disk capacity), 64 twin dual-core Intel Woodcrest 5130 units (550 kSI2k)
–FY 06/07 upcoming: further 210 TB disk capacity plus high-availability systems (redundant PSUs, hot-swappable paired HDDs)
Storage commissioning saga
–Ongoing problems with March kit. Firmware updates have now solved problem. (Disks on Areca 1170 in RAID 6 experienced multiple dropouts during testing of WD drives)
Move to CASTOR
–Very support heavy but made available for CSA06 and performing well
General
–Air-con problems with high temperatures triggering high pressure cut-outs in refrigerator gas circuits (summers are warmer even in the UK...)
–July security incident
–10Gb CERN line in place. Second 10Gb line scheduled in 07Q1
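The procurement figures above imply some simple per-unit numbers. The sketch below only does arithmetic on the values quoted on the slide; the variable names are hypothetical, and the total is a plain sum of the three quoted disk capacities, not an official Tier-1 capacity figure.

```python
# Disk purchases quoted on the slide: (label, number of disk servers, capacity in TB).
disk_purchases = [
    ("March 06", 21, 168),
    ("FY 06/07", 47, 282),
]
UPCOMING_DISK_TB = 210  # "further 210 TB disk capacity" (server count not given)


def tb_per_server(servers: int, capacity_tb: float) -> float:
    """Average capacity per disk server in TB."""
    return capacity_tb / servers


per_server = {label: tb_per_server(n, tb) for label, n, tb in disk_purchases}

# Plain sum of the quoted capacities, including the upcoming purchase.
total_disk_tb = sum(tb for _, _, tb in disk_purchases) + UPCOMING_DISK_TB

# CPU purchase: 64 Woodcrest units rated at 550 kSI2k in total.
ksi2k_per_unit = 550 / 64

print(per_server)
print(total_disk_tb)
print(ksi2k_per_unit)
```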

T2 Resources
e.g. Glasgow: UKI-SCOTGRID-GLASGOW, 800 kSI2k, 100 TB DPM
Needed for LHC start-up
[Image labels: August 28, September 1, October 13, October 23]
IC-HEP: 440 kSI2k, 52 TB dCache
Brunel: 260 kSI2k, 5 TB DPM

T2 Resources
As overheard at one T2 site..
Could also be 2006

A. Usability (Prequel)
GridPP runs a major part of the EGEE/LCG Grid, which supports ~3000 users
The Grid is not (yet) as transparent as end-users want it to be
The underlying overall failure rate is ~10%
User (interface)s, middleware and operational procedures (need to) adapt
Procedures to manage the underlying problems such that system is usable are highlighted

Virtual Organisations
Users are grouped into VOs
–Users/VO varies from 1 to 806 members (and growing..)
Broadly four classes of VO
–LHC experiments
–EGEE supported
–Worldwide (mainly non-LHC particle physics)
–Local/regional e.g. UK PhenoGrid
Sites can choose which VOs to support, subject to MOU/funding commitments
–Most GridPP sites support ~20 VOs
–GridPP nominally allocates 1% of resources to EGEE non-HEP VOs
–GridPP currently contributes 30% of the EGEE CPU resources

User evolution
Number of users of the UK Grid (exc. Deployment Team)
–Quarter: 05Q4 06Q2 06Q3
–Value:
–Many EGEE VOs supported c.f. EGEE target
Number of active users (> 10 jobs per month)
–Quarter: 05Q4 06Q1 06Q2
–Value:
–Fraction: 6.2% 11.0%
Viewpoint: growing fairly rapidly, but not as active as they could be? Depends on the active definition
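The remark that the active fraction "depends on the active definition" can be illustrated with a small sketch. The slide defines an active user as one with more than 10 jobs per month; everything else here (function name, user names, job counts) is hypothetical and not GridPP accounting data.

```python
def active_fraction(jobs_per_user: dict[str, int], min_jobs: int = 10) -> float:
    """Fraction of users who ran more than `min_jobs` jobs in the month."""
    if not jobs_per_user:
        return 0.0
    active = sum(1 for jobs in jobs_per_user.values() if jobs > min_jobs)
    return active / len(jobs_per_user)


# Hypothetical monthly job counts per user (assumed values for illustration).
sample = {"user_a": 0, "user_b": 3, "user_c": 10, "user_d": 11, "user_e": 250}

print(active_fraction(sample))              # slide definition: > 10 jobs per month
print(active_fraction(sample, min_jobs=2))  # a looser definition gives a higher fraction
```

With these made-up counts, the stricter threshold counts two of five users as active and the looser one counts four of five, which is the sensitivity the viewpoint line points at.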

Know your users? UK-enabled VOs
atlas 763 dzero 577 cms 566 dteam 150 lhcb 131 alice 75 bio 65 dteamsgm 41 esr 31 ilc 27 atlassgm 27 alicesgm 21 cmsprg 18 atlasprg 17 fusn 15 zeus 13 dteamprg 13 cmssgm 11 hone 9 pheno 9 geant 7 babar 6 aliceprg 5 lhcbsgm 5 biosgm 3 babarsgm 2 zeussgm 2 t2k 2 geantsgm 2 cedar 1 phenosgm 1 minossgm 1 lhcbprg 1 ilcsgm 1 honesgm 1 cdf

Resource allocation
Assign quotas and priorities to VOs and measure delivery, but further work required on VOMS-roles/groups within each VO
VOMS provides group/role information in the proxy
Tools to control quotas and priorities in site services being developed
–So far only at whole-VO level
–Maui batch scheduler is flexible, easy to map to groups/roles
–Sites set the target shares
–Can publish VO/group-specific values in GLUE schema, hence the RB can use them for scheduling
Accounting tool (APEL) measures CPU use at global level (UK task)
–Storage accounting currently being added
–GridPP monitors storage across UK
–Privacy issues around user-level accounting, being solved by encryption
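The idea of setting target shares per VO and then measuring delivery can be sketched as follows. This is a generic illustration and not the Maui fair-share algorithm, the GLUE schema, or APEL output: the function names, the target shares and the CPU-hour figures are all hypothetical.

```python
def delivered_shares(cpu_hours_by_vo: dict[str, float]) -> dict[str, float]:
    """Fraction of the total CPU hours delivered to each VO."""
    total = sum(cpu_hours_by_vo.values())
    if total == 0:
        return {vo: 0.0 for vo in cpu_hours_by_vo}
    return {vo: hours / total for vo, hours in cpu_hours_by_vo.items()}


def share_deficits(target_shares: dict[str, float],
                   cpu_hours_by_vo: dict[str, float]) -> dict[str, float]:
    """Target share minus delivered share; positive means the VO is under target."""
    delivered = delivered_shares(cpu_hours_by_vo)
    return {vo: target - delivered.get(vo, 0.0)
            for vo, target in target_shares.items()}


# Hypothetical site configuration and accounting totals (assumed values).
targets = {"atlas": 0.5, "cms": 0.25, "lhcb": 0.25}
used = {"atlas": 600.0, "cms": 300.0, "lhcb": 100.0}

deficits = share_deficits(targets, used)
most_underserved = max(deficits, key=deficits.get)
print(deficits)
print(most_underserved)
```

A scheduler working at whole-VO level could use such a deficit to raise the priority of the VO furthest below its target; extending the keys to VOMS groups or roles would follow the same pattern.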

User Support
Becoming vital as the number of users grows
–But modest effort available in the various projects
Global Grid User Support (GGUS) portal at Karlsruhe provides a central ticket interface
–Problems are categorised
–Tickets are classified by an on-duty Ticket Process Manager, and assigned to an appropriate support unit
–UK (GridPP) contributes support effort
GGUS has a web-service interface to ticketing systems at each ROC
–Other support units are local mailing lists
–Mostly best-effort support, working hours only
Currently ~tens of tickets/week
–Manageable, but may not scale much further
–Some tickets slip through the net

Documentation & Training
Need documentation and training for both system managers and users
–Mostly expert users up to now, but user community is expanding
–Induction of new VOs is a particular problem – no peer support
–EGEE is running User Fora for users to share experience. Next in Manchester in May 07 (with OGF)
–EGEE has a dedicated training activity run by NeSC/Edinburgh
Documentation is often a low priority, little dedicated effort
–The rapid pace of change means that material requires constant review
Effort on documentation is now increasing
–GridPP has appointed a documentation officer. GridPP web site, wiki
–Installation manual for admins is good. There is also a wiki for admins to share experience
–Focus is now on user documentation. New EGEE web site – coming soon

Alternative view?
The number of users in the Grid School for the Gifted is ~manageable now
The system may be too complex, requiring too much work by the average user?
Or the (virtual) help desk may not be enough?
Or the documentation may be misleading?
Or..
Having smart users helps (the current ones are)

Future? Timeline – 1
Year-long process to define future LHC exploitation
Phases: Proposal Writing, Proposal Defence
Months: Apr, May, Jun, Jul, Aug, Sep, Oct
31st March – PPARC Call
16th June – GridPP16 at QMUL
13th July – Bid Submitted
6th September – 1st PPRP review
1st November – GridPP17
8th November – PPRP visiting panel
[Timeline markers: CB, OC, CB]

Scenario Planning – Resource Requirements [TB, kSI2k]
GridPP requested a fair share of global requirements, according to experiment requirements
Changes in the LHC schedule prompted a(nother) round of resource planning - presented to CRRB on Oct 24th
New UK resource requirements have been derived and incorporated in the scenario planning, e.g. Tier-1

Input to Scenario Planning – Hardware Costing
Empirical extrapolations with extrapolated (large) uncertainties
Hardware prices have been re-examined following recent Tier-1 purchase
CPU (Woodcrest) was cheaper than expected based on extrapolation of previous 4 years of data

Scenario Planning
An example 70% scenario based on Experiment Inputs [£m]

Back to the Future? Timeline – 2
Months: Nov, Dec, Jan, Feb, Mar, Apr, May
8th Nov – PPRP Visiting Panel
6th Dec – PPRP recommend to SC
PPARC Council, Science Committee, Grants etc.
GridPP2+ outcome (1/9/07-31/3/08) now known: emphasis on operations (modest middleware support)
Anticipates GridPP3 outcome (1/4/08-31/3/11) known in the New Year

Conclusion
A. What is the deployment status? (snapshot) See e.g. Performance of the UK Grid for Particle Physics for more info.
B. Is the system usable? Yes, but more work required from end-user perspective
C. What is the future of GridPP? Operations-led activity, working with EGEE/EGI (EU) and NGS (UK)