Presentation transcript: "Top 5 Experiment Issues"

1 Top 5 Experiment Issues

Issue    | ALICE          | ATLAS                                       | CMS                                  | LHCb
Issue #1 | xrootd-CASTOR2 | CASTOR@CERN                                 | CASTOR: functionality & performance  | Data Access from T1 MSS
Issue #2 | xrootd-DPM     | Integration of DDM/FTS/SRM/GridFTP/LFC etc. | SRM I/F: functionality & performance | glexec usage
Issue #3 | FTS Service    | (Lack of) SRM 2.2                           | FTS Service                          | File management
Issue #4 | gLite WMS      | Data Storage Management Tools               | Workload management                  | Deployment procedure
Issue #5 | VOMS           | Stability of the Information System         | Information system                   | -

2

3 Integration of DM Components
 We agree that this is a very complex and important issue that must be addressed with high priority.
 Experiment, site and network service experts must all be involved in the debugging exercise, as all of these are intimately involved – a complete end-to-end solution must be demonstrated with adequate performance.
 We propose a 'back-to-basics' approach, separating the main problems so that they can be investigated (initially) independently and in parallel, and then ramping up.
 We should not forget the role of the host lab in the WLCG model – we must be able to distribute data at the required rates for all VOs and to all sites. We still have not managed this, and we have been trying for a long time!
 MoU: Distribution of an agreed share of the raw data (+ESD) to each Tier1 Centre, in-line with data acquisition.

4 Possible Initial Goals
 Stable transfers of each of the two main (by rate) VOs to an agreed set of common sites.
 Each VO can act as a 'control' for the other – any significant differences between them should be understood (filesize distribution, number of active transfers / streams, other key parameters, etc.).
 Concurrent multi-VO transfers – first raised as a concern by ATLAS – also need to be demonstrated (WLCG milestones).
 The goal should be to obtain stable transfers, with daily/weekly averages by site and by VO at an agreed fraction of the nominal rates (e.g. initially 25%, then 50%, etc.), with a daily and weekly analysis of any fluctuations or other problems.
 Then ramp up in complexity: WLCG milestones, FDR preparations. Once stable transfers have been demonstrated, add complexity in a controlled fashion until full end-to-end testing has been achieved.
 In parallel, the 'heartbeat monitoring' proposed by both ATLAS and CMS should be better defined, with the goal of reaching a common agreement that can be rapidly deployed, initially across the T0 and T1s. This is clearly needed in the short-to-medium term as well as in the long run, in order to provide a stable and reliable service.
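The ramp-up criterion above (daily/weekly averages by site and VO at an agreed fraction of the nominal rate, e.g. 25% then 50%) can be sketched as a simple check. This is an illustrative sketch only; the function name and rate values are assumptions, not part of the WLCG tooling.

```python
def meets_target(daily_rates_mbps, nominal_mbps, fraction=0.25):
    """Return True if the average over the period reaches the agreed
    fraction of the site's nominal rate (rates in MB/s)."""
    avg = sum(daily_rates_mbps) / len(daily_rates_mbps)
    return avg >= fraction * nominal_mbps

# Hypothetical site with a 100 MB/s nominal rate, checked at the 25% level:
print(meets_target([30, 40, 20], 100, fraction=0.25))  # → True (avg 30 >= 25)
```

The same check would be run per site and per VO, first at 25%, then again at 50% as the ramp-up proceeds.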

5 Backup Slides

6 WLCG Service Hierarchy (HEPiX Rome, 05 Apr 2006, LCG, les.robertson@cern.ch)

Tier0 – the accelerator centre
 Data acquisition & initial processing
 Long-term data curation
 Data distribution to Tier-1 centres

Tier1 – "online" to the data acquisition process
 High availability
 Managed mass storage – grid-enabled data service
 All re-processing passes
 Data-heavy analysis
 National, regional support

Tier1 centres: Canada – TRIUMF (Vancouver); France – IN2P3 (Lyon); Germany – Karlsruhe; Italy – CNAF (Bologna); Netherlands – NIKHEF/SARA (Amsterdam); Nordic countries – distributed Tier-1; Spain – PIC (Barcelona); Taiwan – Academia Sinica (Taipei); UK – CLRC (Oxford); US – FermiLab (Illinois), Brookhaven (NY)

Tier2 – ~100 centres in ~40 countries
 Simulation
 End-user analysis – batch and interactive
 Services, including data archive and delivery, from Tier1s

7 Multi-VO Rates (March 26+)

VO    | Mar 26 | Mar 27 | Mar 28 | Mar 29 | Mar 30 | Mar 31 | Apr 1 | Av.
ALICE | -      | 20     | 60     | 50     | 80     | 130    | 100   | 60
ATLAS | 10     | 20     | 60     |        | 80     | 20     | 110   | 50
CMS   | 400    | 500    | >300   |        |        |        |       |
LHCb  | -      | -      | -      | -      | -      | -      | -     | -
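Daily rates like the ones above often have gaps (days with no recorded transfer), so a per-VO average has to skip missing entries. A minimal sketch, assuming the ALICE row reads: no transfer on Mar 26, then 20, 60, 50, 80, 130, 100; the function name is illustrative:

```python
def vo_average(daily_rates):
    """Average a VO's daily rates, skipping days with no recorded
    transfer (represented here as None)."""
    recorded = [r for r in daily_rates if r is not None]
    return sum(recorded) / len(recorded) if recorded else 0.0

alice = [None, 20, 60, 50, 80, 130, 100]  # Mar 26 – Apr 1
print(round(vo_average(alice)))  # → 73 over the six recorded days
```

Note that averaging over recorded days only gives a higher figure than averaging over all calendar days; either convention is defensible, but it must be stated when comparing sites or VOs.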

8 CMS CSA07 Export Targets

Site  | Target (MB/s) | Achieved weekly av. | Mar 26 | Mar 27 | Mar 28 | Mar 29 | Mar 30 | Mar 31 | Apr 1
ASGC  | 26            | 45                  | 30     | 40     | 20     | 50     |        | 70     | 65
CNAF  | 37            | 38                  | 70     | 105    | 60     | 30     | ~      | -      | -
FNAL  | 105           | 82                  | 130    | 60     | 40     | 65     | 110    | 40     | 130
FZK   | 26            | 22                  | 50     |        | 25     | 30     | -      | ~      | ~
IN2P3 | 32            | 76                  | 50     | 125    | 60     | 100    |        | 60     | 40
PIC   | 10            | 45                  | 35     | 60     | 50     |        | 40     | 45     | 35
RAL   | 26            | 49                  | 50     | 65     |        | 20     | 30     | 45     | 70

The table above shows rates read off from GridView plots. The CMS goal is successful transfers on 95% of the challenge days ➨ target rate reached on 50% of the challenge days.
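The CMS acceptance rule quoted above (successful transfers on 95% of challenge days, target rate reached on 50% of them) can be sketched as follows. The function name is hypothetical, and missing days are treated as unsuccessful:

```python
def csa07_pass(daily_rates, target_mbps, success_frac=0.95, target_frac=0.50):
    """Check the two CSA07 criteria over the challenge period:
    transfers succeeded on >= success_frac of days, and the target
    rate was reached on >= target_frac of days."""
    days = len(daily_rates)
    ok_days = sum(1 for r in daily_rates if r is not None and r > 0)
    hit_days = sum(1 for r in daily_rates if r is not None and r >= target_mbps)
    return ok_days >= success_frac * days and hit_days >= target_frac * days

# PIC's week against its 10 MB/s target: every day succeeds and exceeds it.
print(csa07_pass([35, 60, 50, 50, 40, 45, 35], 10))  # → True
```

By contrast, a site that transfers every day but only reaches its target on three of seven days would fail the 50% criterion despite passing the 95% one.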

9 Q1 2007 – Tier0 / Tier1s

1. Demonstrate Tier0-Tier1 data export at 65% of full nominal rates per site using experiment-driven transfers
   – Mixture of disk / tape endpoints as defined by experiment computing models, i.e. 40% tape for ATLAS; transfers driven by the experiments
   – Period of at least one week; daily VO-averages may vary (~normal)
2. Demonstrate Tier0-Tier1 data export at 50% of full nominal rates (as above) in conjunction with T1-T1 / T1-T2 transfers
   – Inter-Tier transfer targets taken from ATLAS DDM tests / CSA06 targets
3. Demonstrate Tier0-Tier1 data export at 35% of full nominal rates (as above) in conjunction with T1-T1 / T1-T2 transfers and Grid production at Tier1s
   – Each file transferred is read at least once by a Grid job
   – Some explicit targets for the WMS at each Tier1 need to be derived from the above
4. Provide SRM v2.2 endpoint(s) that implement(s) all methods defined in the SRM v2.2 MoU, with all critical methods passing tests
   – See attached list; levels of success: threshold, pass, success, (cum laude)

Status of the milestones: the text explains the status of the milestones well. None of the milestones is fully "completed"; therefore specify a new deadline and mention the issue under Outstanding Issues, together with the corrective actions that will be undertaken to complete the milestones.
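Milestones 1-3 each pin the export target to a fraction of a site's full nominal rate (65%, 50%, 35%), the lower fractions reflecting the extra concurrent load in the later milestones. A small sketch of the implied per-site targets; the nominal rate used here is an assumed example, not a real site figure:

```python
# Fractions of full nominal rate required by milestones 1-3 (from the text above).
MILESTONE_FRACTIONS = {1: 0.65, 2: 0.50, 3: 0.35}

def milestone_target(nominal_mbps, milestone):
    """Per-site export target (MB/s) implied by a given milestone."""
    return nominal_mbps * MILESTONE_FRACTIONS[milestone]

# For a hypothetical site with a 200 MB/s nominal rate:
print(round(milestone_target(200.0, 1)))  # → 130
```

Such derived figures are what a daily/weekly GridView readout would be compared against during each milestone period.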

10

