Ian Bird, LCG Project Leader. WLCG Collaboration Issues. WLCG Collaboration Board, 24th April 2008.


2 Strategic Issues
- A number of aspects of WLCG where we see the need for some structuring of the dialogue with the Tier 2 federations:
  - Reliabilities
  - Accounting
  - Resource pledges/installed capacity
  - Milestones
- Other issues that are arising:
  - Engagement in EGI/NGI (etc.) for future infrastructures
  - Resource procurement schedules/delays/process
- General aspects of Tier 2 coordination/information flow:
  - Information from the MB, engagement in the GDB
- Technical points – how to discuss these with Tier 2s:
  - Move to SL5/6; pilot jobs; fabric monitoring/tools; what tools do Tier 2s miss?
- What is the voice of the Tier 2s?

3 Recent grid use
- Across all grid infrastructures
- Preparation for, and execution of, CCRC'08 phase 1
- Move of simulations to Tier 2s
- Usage split: Tier 2: 54%; Tier 1: 35%; CERN: 11%
- Federations not yet reporting: Finland, India (IN-INDIACMS-TIFR), Norway, Sweden, Ukraine

4 Accounting for Tier-2s (1)
- Test reporting took place in summer 2007 and formal reporting started from September 2007
- Monthly reports are now produced, circulated for comment and published on the LCG Project Planning website
- Currently 52 of the 57 Federations are reporting accounting data, over a total of 107 sites:
  - Changes are still being signalled for site names, so the situation is not yet fully stable
- Some Federations provided pledge information from 2008 onwards and will be included in the reporting from April
- Follow-up is required with Finland, India, Norway, Sweden and Ukraine to include them in the accounting reporting
- Slide 5 shows the global picture of reporting by country from September 2007 to February 2008
- Slides 6 and 7 show the comparison of MoU pledge with CPU provided, split according to size of pledge

5 Accounting for Tier-2s (2)
[Chart: Tier-2 accounting reporting by country, September 2007 to February 2008]

6 Accounting for Tier-2s (3)
[Chart: comparison of MoU pledge with CPU provided, split by size of pledge]

7 Accounting for Tier-2s (4)
[Chart: comparison of MoU pledge with CPU provided, split by size of pledge]
- What we don't see here is the installed capacity
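As an illustration of what the monthly roll-up in slides 4-7 amounts to, here is a minimal sketch that sums per-site CPU accounting into federation totals and compares them with the MoU pledges. The site names, federation names and figures are all hypothetical, and this is not the actual report-generation code used for the LCG reports:

```python
# Minimal sketch of a Tier-2 accounting roll-up (hypothetical names and
# numbers; not the actual APEL/LCG report format or real pledge values).
from collections import defaultdict

# site -> (federation, CPU delivered in KSI2K-months)
site_usage = {
    "XX-EXAMPLE-SITE-1": ("XX-Example-Federation", 120.0),
    "XX-EXAMPLE-SITE-2": ("XX-Example-Federation", 80.0),
    "YY-EXAMPLE-SITE-1": ("YY-Example-Federation", 200.0),
}

# federation -> MoU CPU pledge (KSI2K)
pledges = {
    "XX-Example-Federation": 250.0,
    "YY-Example-Federation": 180.0,
}

delivered = defaultdict(float)
for site, (federation, cpu) in site_usage.items():
    delivered[federation] += cpu

for federation, pledge in sorted(pledges.items()):
    got = delivered.get(federation, 0.0)
    print(f"{federation}: {got:.0f} delivered vs {pledge:.0f} pledged "
          f"({100.0 * got / pledge:.0f}% of pledge)")
```

As the slide notes, a report like this still says nothing about installed capacity, only about what was pledged and what was used.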


9 Computing Resource Pledge Responsibilities
- Following the pledge revision exercise of Autumn 2007, a reminder of the process is felt necessary.
- At the Autumn C-RRB meeting each Federation is expected to provide:
  - A firm commitment to pledge values for the following year
  - Planned pledge values for the subsequent 4 years
- At the Spring C-RRB meeting each Federation is expected to:
  - Confirm that pledge values for the current year are installed and running a production service, or explain any problems for the current year or changes for future years
- 2 weeks before the next C-RRB on 11/11/08 the following is therefore required:
  - Confirmed 2009 pledge values (confirmation of the already communicated value, or revised upwards)
  - Planned pledge values (confirmation or revision of already communicated values)

10 Tier 0/Tier 1 Site reliability
- Target: sites 91%, and 93% from December; 8 best: 93%, and 95% from December
- See the QR for full status
- Monthly reliability, all sites: Sep 07 89%, Oct 07 86%, Nov 07 92%, Dec 07 87%, Jan 08 89%, Feb 08 84%
- 8 best sites: 93%, 95% and 96% over the same period
- (Table legend: above target / above 90% of target)
- Follow-up process in the MB over many months with individual sites
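As a rough guide to how these monthly figures are computed, the convention usually quoted in WLCG reporting is that availability counts all time in the month, while reliability excuses scheduled downtime; the production numbers come from the SAM test results, but the arithmetic is essentially the sketch below (hypothetical hours, hedged interpretation of the convention):

```python
# Hedged sketch of the usual WLCG availability/reliability arithmetic:
# availability counts all time; reliability excuses scheduled downtime.

def availability(up_hours: float, total_hours: float) -> float:
    """Fraction of the whole month the site passed its tests."""
    return up_hours / total_hours

def reliability(up_hours: float, total_hours: float,
                scheduled_down_hours: float) -> float:
    """As availability, but scheduled downtime is not counted against the site."""
    return up_hours / (total_hours - scheduled_down_hours)

# Hypothetical month: 720 h total, 620 h up, 40 h scheduled downtime.
print(f"availability: {availability(620, 720):.0%}")     # ~86%
print(f"reliability:  {reliability(620, 720, 40):.0%}")  # ~91%
```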

11 Tier 2 Reliabilities
- Reliabilities published regularly since October 2007
- In February, 47 sites had > 90% reliability
- For the Tier 2 sites reporting (89 of 100 sites): overall 76%; top 50% of sites 95%; top 20% of sites 99%
- For Tier 2 sites not reporting, 12 are in the top 20 for CPU delivered
- Share of CPU delivered (Jan 08): top 50% of sites 72%; top 20% of sites 40%; sites > 90% reliable 70%
- How do we address this?
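To make those shares concrete, a small sketch of how they might be computed from per-site accounting records; all the per-site numbers here are hypothetical, chosen only to show the shape of the calculation:

```python
# Sketch: what fraction of total CPU comes from the largest / most reliable
# sites? Hypothetical per-site (cpu_delivered, reliability) records.
sites = [
    (500.0, 0.97), (400.0, 0.95), (300.0, 0.92), (150.0, 0.88),
    (100.0, 0.80), (50.0, 0.75), (40.0, 0.60), (30.0, 0.50),
]

total_cpu = sum(cpu for cpu, _ in sites)

# Top 50% / 20% of sites ranked by CPU delivered.
by_cpu = sorted(sites, key=lambda s: s[0], reverse=True)
for frac in (0.5, 0.2):
    top = by_cpu[: max(1, int(len(by_cpu) * frac))]
    share = sum(cpu for cpu, _ in top) / total_cpu
    print(f"top {frac:.0%} of sites deliver {share:.0%} of CPU")

# Share of CPU delivered by sites with > 90% reliability.
reliable = sum(cpu for cpu, rel in sites if rel > 0.90)
print(f"sites > 90% reliable deliver {reliable / total_cpu:.0%} of CPU")
```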


13 How should the federations be reported - weighted?
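The weighting question can be made concrete: a federation's reliability could be the plain average of its sites' reliabilities, or an average weighted by each site's capacity, and the two can differ substantially when one large site dominates. A hypothetical illustration (none of these numbers come from the slides):

```python
# Hypothetical federation of three sites: (reliability, CPU capacity in KSI2K).
sites = [(0.98, 400.0), (0.70, 50.0), (0.60, 50.0)]

unweighted = sum(r for r, _ in sites) / len(sites)
weighted = sum(r * cpu for r, cpu in sites) / sum(cpu for _, cpu in sites)

print(f"unweighted federation reliability:   {unweighted:.0%}")  # 76%
print(f"CPU-weighted federation reliability: {weighted:.0%}")    # 91%
```

Weighting by capacity reflects how much of the federation's computing a user can actually rely on; an unweighted average treats a 400 KSI2K site and a 50 KSI2K site as equals.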

14 Reliability reporting
- Currently (Feb 08) all Tier 1 and 100 Tier 2 sites report reliabilities
- Recent progress: the MB set up a group to address test equivalence:
  - Agreement on equivalence of NDGF tests with those used at EGEE and all other Tier 1 sites – now in production at NDGF
    - Should also be used for the Nordic Tier 2 sites
  - Similar process with OSG (for US Tier 2 sites): tests only for the CE so far; agreement on equivalence; tests are in production; publication to SAM in progress
    - Missing: SE/SRM testing
  - Expect full production in May 2008 (new milestone introduced)
- Important that we have all Tier 2s regularly tested and reporting
- Important that we have the correct Tier 2 federation contacts to follow up these issues

15 Reporting
- Urgent now that:
  - The remaining Tier 2 federations start reporting on reliabilities and accounting
  - We follow up monthly in checking the published data – we have to understand whether there are problems in the process
  - If the site names are wrong – please tell us what they should be (and how they map to the physical site host names)
- Resource installation:
  - We also need to gather information about installed resources at Tier 2s
- Follow-up process:
  - For Tier 1s this was done monthly in the MB, site by site – manageable but slow; with Tier 2s this process is unwieldy (110+ sites)
  - We need a contact person for each federation, and it would be far more convenient to have a contact for each country

16 Updated Resource Status Summary for May CCRC'08
- For 5 May not all sites will now have their full 2008 CPU pledges available: a total of KSi2K (9600 KSi2K more than in 1Q2008, but a drop of 8000 KSi2K from the February plans). The largest missing amounts are KSi2K at NL-T1 (due November 2008), KSi2K at CNAF (due June), KSi2K at US-CMS (due end May) and KSi2K at US-ATLAS (due early June).
- For disk and tape, many sites will catch up later in the year as need expands: 2008 disk requirements are 23 PB, of which 12.4 PB are expected to be available for 5 May (3 PB more than in 1Q2008 but a drop of 3.1 PB from the February plans); 2008 tape requirements are 24 PB, of which 13.6 PB are expected to be available for 5 May (4.8 PB more than in 1Q2008 but a drop of 1.4 PB from the February plans).
- Disk and tape storage for the May full-scale dress-rehearsal run of CCRC'08 are probably better modelled by requiring 55% (accelerator efficiency) times 30/100 (days running) of the increased resource requirements for 2008/9 over those of 2007/8, i.e. 2.8 PB of disk and 3 PB of tape.
- Globally not a problem, but some sites will not be able to contribute fully to the May CCRC if this model is correct.
- These requirements are to be modified with the specific April 2008 experiment requirements, to be given in the next talks.
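Worked through, the model quoted above scales the 2008/9-over-2007/8 increase in requirements by the assumed accelerator efficiency and the fraction of the running period covered. A sketch of the arithmetic follows; the increase figures are back-computed from the 2.8 PB and 3 PB quoted above, so treat them as illustrative only:

```python
# Sketch of the May CCRC'08 storage model quoted on this slide:
#   need = accelerator efficiency x (days running fraction) x increase in
#          2008/9 requirements over 2007/8.
ACCEL_EFFICIENCY = 0.55    # assumed accelerator efficiency (55%)
RUN_FRACTION = 30 / 100    # "30/100 (days running)" from the slide

def may_ccrc_need(increase_pb: float) -> float:
    """Storage needed for the May run, given the year-on-year increase (PB)."""
    return ACCEL_EFFICIENCY * RUN_FRACTION * increase_pb

# Year-on-year increases back-computed from the quoted 2.8 PB disk and
# 3 PB tape results, so illustrative only (~17 PB disk, ~18 PB tape).
print(f"disk: {may_ccrc_need(17.0):.1f} PB")  # -> 2.8 PB
print(f"tape: {may_ccrc_need(18.2):.1f} PB")  # -> 3.0 PB
```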

17 Summary of Disk Space Plans
- As usual the most critical resource:
  - ASGC: last 300 TB delivery end June
  - CC-IN2P3: last 880 TB planned for September
  - FZK: last 650 TB planned for October (600 ALICE, 50 CMS)
  - CNAF: last 730 TB planned for June/July
  - NDGF: grow as needed, reaching the last 700 TB by Autumn
  - NL-T1: add 800 TB by end May and the last 1450 TB in November
  - PIC: last 370 TB planned for early June
  - RAL: last 800 TB in acceptance, ready for end May
  - TRIUMF: full pledge for the May CCRC
  - US-ATLAS: add 1200 TB by end May and the last 1000 TB in October
  - US-CMS: full pledge for the May CCRC

18 Resource procurement
- This risks becoming a major problem in the coming years
- It is important to work around the procurement processes so that we can be ready for the accelerator running each year
- This has been a problem for almost all Tier 1s
- Is this also an issue for Tier 2s?

19 Milestones
- So far the project has mostly had formal milestones associated with the project as a whole, the Tier 0 and the Tier 1s
- It is now time to start imposing milestones on the Tier 2s for specific issues:
  - e.g. reliability, resource installation, etc.
- Again, it will be important to have the appropriate technical coordinators to report and follow up on these issues

20 Communication
- Apart from the issues raised above:
  - How are the Tier 2s kept informed, and does it work?
  - Flow of information from the Management Board – do Tier 2s read the minutes?
  - Is everyone engaged in the GDB (or even aware that they can be)?
- How can we structure the communication with the great number of Tier 2 sites, so that we have a workable process to communicate problems and follow up (in both directions)?
- How can we aggregate Tier 2 status to report in the LHCC/OB/RRB/CB etc.?
  - Today it is extremely difficult to get an overview of Tier 2 status and problems

21 Miscellaneous technical issues
- Move to new versions of the OS – SL5/SL6
- Pilot jobs/glexec – is it OK for sites to deploy this now?
- Fabric monitoring:
  - Do Tier 2s do this sufficiently?
  - Do they have the tools?
- Security tools – are sites appropriately protected?
- What tools do Tier 2s miss?
- How do Tier 2s keep abreast of these developments?
  - They should participate in the GDB
  - Is more needed?

22 Comments on the EGI design study
- The goal is to have a fairly complete blueprint in June
- Main functions were presented to the NGIs at the Rome workshop in March
- It is essential for WLCG that EGI/NGIs continue to provide support for the production infrastructure after EGEE-III
  - We need to see a clear transition and assurance of appropriate levels of support; the transition will come exactly at the time that LHC services should not be disrupted
- Concerns:
  - The NGIs agreed that a large European production-quality infrastructure is a goal
  - It is not clear that there is agreement on the scope
  - There is reluctance to accept the level of functionality required
  - Tier 1 sites (and existing EGEE expertise) are not well represented by many NGIs
- WLCG representatives must approach their NGI reps and ensure that EGI/NGIs provide the support we need
- These comments apply equally to Tier 2s – they really need to engage with the NGI in their countries

23 EGI/NGI cont.
- While WLCG should work hard to make sure that the EGI design study goes in the right direction, strategically the project must be prepared to plan for a fall-back
- Tier 1s were questioned in the OB – all replied that they had some plan in place if there were no EGI/NGI
  - Albeit with a potential reduction in what they could contribute
- We need to start thinking about what the Tier 2s can do
- It will be clear in June whether the EGI_DS blueprint provides what we need
- Put together a group to begin to look at fallback plans for Tier 2s?

24 Summary
- A number of aspects of WLCG where we see the need for some structuring of the dialogue with the Tier 2 federations
- General aspects of Tier 2 coordination/information flow:
  - Information from the MB, engagement in the GDB
- Technical points:
  - Move to SL5/6; pilot jobs; fabric monitoring/tools; what tools do Tier 2s miss?
- What is the voice of the Tier 2s?
- Do we need a group to start looking at Tier 2 fallback plans if EGI_DS does not deliver?
- And what is the situation in the US with OSG?