1 M.C. Vetterli – WLCG-OB, CERN; October 27, 2008 – #1 Simon Fraser
Status of the WLCG Tier-2 Centres
M.C. Vetterli, Simon Fraser University and TRIUMF
WLCG Overview Board, CERN, October 27th 2008

2 Sources of Information
• Discussions with experiment representatives in July
• APEL monitoring portal: http://www3.egee.cesga.es/gridsite/accounting/CESGA/egee_view.php
• WLCG reliability reports: http://lcg.web.cern.ch/LCG/accounts.htm
• October GDB meeting, dedicated to Tier-2 issues: http://indico.cern.ch/conferenceDisplay.py?confId=20234
• Talks from the last OB & LHCC
Slides labeled with a * are from MV’s LHCC rapporteur talk

3 Tier-2 Performance Summary*
• Overall, the Tier-2s are contributing much more now
• Significant fractions of the Monte Carlo simulations are being done in the T2s for all experiments
• Reliability is better, but still needs to improve
• CCRC’08 exercise is generally considered a success for the Tier-2s

4 Tier-2 Centres in CCRC’08 – General*
• Overall, the Tier-2s and the experiments considered the CCRC’08 exercise to be a success
• The networking/data transfers were tested extensively; some FTS tuning was needed, but it worked out
• Experiments tended to continue other activities in parallel, which is a good test of the system, although the load was not as high as anticipated
• While CMS did include significant user analysis activities, the chaotic use of the Grid by a large number of inexperienced people is still to be tested

5 Tier-2 Issues/Concerns
As of CB and meetings with experiments this summer
• Communications: Do Tier-2s have a voice? Is there a good mechanism for disseminating information?
• Better monitoring: Pledges vs actual vs used
• Hardware acquisitions: What should be bought? kSI2006?
• Tier-2 capacity: Size of datasets? Effect of LHC delay?
• …

6 Tier-2 Issues/Concerns
• Upcoming onslaught of users: Some user analysis tests have been done, but scaling is a concern
• User Support: A ticketing system exists, but it is not really used for user support issues. This affects Tier-2s especially.
• Federated Tier-2s: Tools to federate? Monitoring? (averaging)
• Interoperability of EGEE, OSG, and NDGF should be improved
• Software/Middleware updates: Could be smoother; too frequent
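One possible reading of the "(averaging)" question is how a federation's reliability should be combined from the figures of its member sites. The sketch below contrasts a plain mean with a capacity-weighted mean. The interpretation, the function names, and the numbers are all assumptions made for illustration; nothing here describes how WLCG reliability reports are actually produced.

```python
def simple_mean(reliabilities):
    """Unweighted average of per-site reliabilities (each in 0..1)."""
    values = list(reliabilities)
    return sum(values) / len(values)


def capacity_weighted_mean(sites):
    """Average of per-site reliabilities weighted by site capacity.

    `sites` is an iterable of (reliability, capacity) pairs; the capacity
    unit is arbitrary as long as it is the same for every site.
    """
    pairs = list(sites)
    total_capacity = sum(capacity for _, capacity in pairs)
    return sum(rel * capacity for rel, capacity in pairs) / total_capacity


# Made-up federation: one large reliable site and one small unreliable one.
federation = [(0.99, 900), (0.50, 100)]

print(f"simple mean:   {simple_mean(r for r, _ in federation):.3f}")
print(f"weighted mean: {capacity_weighted_mean(federation):.3f}")
```

With numbers like these, the two averages differ noticeably, so a federation would need to agree on which one its monitoring reports.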

7 Communications for Tier-2s
• Identified by the T2s at the last CB as a serious problem. Interesting to me that many in experiment computing management did not share this concern.
• Should communication be organized according to experiment or to Tier-1 association? There are also differing opinions on this.
• There are two issues:
  - Grid middleware/operations
  - Experiment software
• My view after studying this is that the situation is OK for “tightly coupled” Tier-2s, but not for remote and smaller Tier-2s that are not well coupled to a Tier-1.

8 Communications for Tier-2s
• Many lines of communication do indeed exist.
• Some examples are:
  CMS has two Tier-2 coordinators: Ken Bloom (Nebraska), Giuseppe Bagliesi (INFN)
  - attend all operations meetings
  - feed T2 issues back to the operations group
  - write T2-relevant minutes
  - organize T2 workshops
  ALICE has designated 1 Core Offline person in 3 to have privileged contact with a given T2 site manager
  - weekly coordination meetings
  - Tier-2 federations provide a single contact person
  - A Tier-2 coordinates with its regional Tier-1

9 Communications for Tier-2s
ATLAS uses its cloud structure for communications
- Every Tier-2 is coupled to a Tier-1
- 5 national clouds; others have foreign members (e.g. “Germany” includes Krakow, Prague, Switzerland; Netherlands includes Russia, Israel, Turkey)
- Each cloud has a Tier-2 coordinator
Regional organizations, such as:
+ France Tier-2/3 technical group:
  - coordinates with Tier-1 and with experiments
  - monthly meetings
  - coordinates procurement and site management
+ GRIF: Tier-2 federation of 5 labs around Paris
+ Canada: Weekly teleconferences of technical personnel (T1 & T2) to share information and prepare for upgrades, large production, etc.
+ Many others exist; e.g. in the US and the UK

10 Communications for Tier-2s
• Tier-2 Overview Board reps: Michel Jouvin and Atul Gurtu have just been appointed to the OB to give the Tier-2s a voice there.
• Tier-2 mailing list: Actually exists and is being reviewed for completeness & accuracy
• Tier-2 GDB: The October GDB was dedicated to Tier-2 issues
  + reports from experiments: role of the T2s; communications
  + talks on regional organizations
  + discussion of accounting
  + technical talks on storage, batch systems, middleware
• Seems to have been a success; repeat a couple of times per year?

13 Tier-2 Installed Resources
• But how much of this is a problem of under-use rather than under-contribution?
  - A task force has been set up to extract installed capacities from the Glue schema
• Monthly APEL reports still undergo significant modifications from the first draft.
  - Good because communication with T2s is better
  - Bad because APEL accounting still has problems
  Accounting seems to be very finicky; it breaks when the CE or MON box is upgraded
• How are jobs distributed to the Tier-2s?
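The under-use versus under-contribution question can be made concrete by comparing two ratios per site: installed against pledged capacity, and used against installed capacity. The following is a minimal sketch under stated assumptions: the `SiteAccounting` record, the 0.9 threshold, and the numbers are hypothetical, and no real APEL or Glue schema interface is used.

```python
from dataclasses import dataclass


@dataclass
class SiteAccounting:
    """Hypothetical per-site record; all three values share one unit."""
    name: str
    pledged: float    # capacity the site promised
    installed: float  # capacity actually deployed
    used: float       # capacity consumed by jobs

    def contribution(self) -> float:
        """Installed capacity as a fraction of the pledge."""
        return self.installed / self.pledged

    def utilisation(self) -> float:
        """Used capacity as a fraction of what is installed."""
        return self.used / self.installed


def diagnose(site: SiteAccounting, threshold: float = 0.9) -> str:
    """Classify a site; the threshold is an arbitrary illustrative cut."""
    low_contribution = site.contribution() < threshold
    low_use = site.utilisation() < threshold
    if low_contribution and low_use:
        return "both"
    if low_contribution:
        return "under-contribution"
    if low_use:
        return "under-use"
    return "ok"


# Made-up numbers for illustration only; they are not accounting data.
examples = [
    SiteAccounting("site-a", pledged=1000, installed=1000, used=500),
    SiteAccounting("site-b", pledged=1000, installed=600, used=580),
]
for site in examples:
    print(site.name, diagnose(site))
```

A comparison like this depends on installed capacity being reported reliably, which is why that figure has to be extracted separately from the usage accounting.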

14 Tier-2 Hardware Questions
• How does the LHC delay affect the requirements and pledges for 2009?
  + We are told to go ahead and buy what was planned, but we have already seen some under-use of CPU capacity, and we have seen this for storage as well

15 Tier-2 Hardware Questions
• How does the LHC delay affect the requirements and pledges for 2009?
  + We are told to go ahead and buy what was planned, but we have already seen some under-use of CPU, and we are now starting to see this for storage as well
• We need to use something other than SpecInt2000!
  + this benchmark is totally out-of-date & useless for new CPUs
  + continued delays in SpecHEP can cause sub-optimal decisions

16 Tier-2 Hardware Questions
• Networking to the nodes is now an issue.
  + with 8 cores per node, 1 GigE connection ≈ 16.8 MB/sec/core
  + Tier-2 analysis jobs run on reduced data sets and can do rather simple operations
    - have seen 7.5 MB/sec at ATLAS and much more (x10?)
  + Do we need to go to Infiniband?
  + We certainly need increased capability for the uplinks; we should have a minimum of fully non-blocking GigE to the worker nodes.
• We need more guidance from the experiments
  The next round of purchases is now!
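The per-core figure on this slide is the link rate divided by the number of cores. The sketch below works through that arithmetic; the function names are illustrative. Reading the quoted 16.8 MB/sec/core as 2^30 bit/s shared by 8 cores is an assumption, since the nominal Gigabit Ethernet line rate of 10^9 bit/s gives about 15.6 MB/s per core. The sketch also estimates how much of one link eight jobs would consume at the quoted 7.5 MB/sec and at ten times that rate.

```python
def per_core_share_mb(link_bits_per_s: float, cores: int) -> float:
    """Fair share of one network link per core, in MB/s (1 MB = 10**6 bytes)."""
    return link_bits_per_s / 8 / cores / 1e6


def link_utilisation(mb_per_job: float, jobs: int, link_bits_per_s: float) -> float:
    """Fraction of the link consumed when `jobs` jobs each read `mb_per_job` MB/s."""
    return mb_per_job * 1e6 * 8 * jobs / link_bits_per_s


CORES_PER_NODE = 8    # from the slide
OBSERVED_MB_S = 7.5   # ATLAS rate quoted on the slide
GIGE_NOMINAL = 1e9    # nominal Gigabit Ethernet line rate in bit/s
GIGE_BINARY = 2**30   # assumption: "1 Gbit" read as 2**30 bits

print(f"nominal share: {per_core_share_mb(GIGE_NOMINAL, CORES_PER_NODE):.1f} MB/s/core")
print(f"binary share:  {per_core_share_mb(GIGE_BINARY, CORES_PER_NODE):.1f} MB/s/core")
for rate in (OBSERVED_MB_S, 10 * OBSERVED_MB_S):
    load = link_utilisation(rate, CORES_PER_NODE, GIGE_NOMINAL)
    print(f"{rate:5.1f} MB/s per job -> {load:.0%} of one GigE link")
```

Under these assumptions, eight jobs at the observed rate already take roughly half of a single GigE link, and a tenfold higher rate would oversubscribe it, which is consistent with the slide's questions about Infiniband and uplink capacity.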

17 Summary
• The role of the Tier-2 centres has increased markedly in the last year
• >50% of Monte Carlo simulation is done in the T2s now.
• The CCRC’08 exercise is considered a success by the Tier-2s and by the experiments.
• Availability and reliability are up, but still need improvement.
• Resource acquisition vs pledges is better but still needs work
• Issues for Tier-2s:
  - communication should be (& is being) improved
  - work should ramp up on chaotic user analysis
  - reporting actual resources should be established
  - improved user support is needed

