SRM 2.2: tests and site deployment
30 January 2007
Flavia Donno, Maarten Litmaath
IT/GD, CERN

2 Tests executed
- S2 test suite testing availability of endpoints, basic functionality, use cases and boundary conditions, interoperability, plus exhaustive and stress tests (a simplified sketch of such a harness follows below):
  - Availability: ping and a full put cycle (putting and retrieving a file)
  - Basic: basic functionality, checking only return codes and passing all basic input parameters
  - Use cases: testing boundary conditions, exceptions, and real use cases extracted from the middleware clients and experiment applications
  - Interoperability: servers acting as clients, cross-copy operations
  - Exhaustive: checking for long strings, strange characters in input arguments, missing mandatory or optional arguments; output is parsed
  - Stress: parallel tests for stressing the systems, multiple requests, concurrent colliding requests, space exhaustion, etc.
- S2 tests run as a cron job 6 times a day (overnight for US sites)
- In parallel, manual tests from GFAL/lcg-utils, FTS, the DPM test suite, and a test suite executed at LBNL daily
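As a rough illustration of how a categorized test harness of this kind can be organized, here is a simplified Python sketch. It is not the actual S2 suite: the endpoint URL and the check functions are hypothetical placeholders that only simulate outcomes.

```python
# Illustrative sketch (not the S2 suite): run categorized checks against an
# SRM endpoint and tally failures per category, as a cron job might do.
from collections import defaultdict

ENDPOINT = "srm://example-se.cern.ch:8443/srm/managerv2"  # hypothetical endpoint

def check_ping(endpoint):
    # Availability: would issue an SRM ping; here we only simulate success.
    return True

def check_put_get_cycle(endpoint):
    # Availability: would put a small file and read it back.
    return True

def check_basic_return_codes(endpoint):
    # Basic: would call each MoU method and inspect only the return code.
    return True

TESTS = {
    "availability": [check_ping, check_put_get_cycle],
    "basic": [check_basic_return_codes],
}

def run_suite(endpoint):
    failures = defaultdict(int)
    totals = defaultdict(int)
    for category, checks in TESTS.items():
        for check in checks:
            totals[category] += 1
            if not check(endpoint):
                failures[category] += 1
    return {c: (failures[c], totals[c]) for c in totals}

if __name__ == "__main__":
    for category, (failed, total) in run_suite(ENDPOINT).items():
        print(f"{category}: {failed}/{total} failures")
```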

3 Tests executed
[Chart: Basic tests of the MoU SRM methods. Annotations: "MoU SRM methods needed by the end of ...", "Expected by the end of summer", "Needed now only for dCache!"]

4 Tests executed
[Chart: Basic test results over time. Annotations: "Christmas break + WSDL update in S2 test suite", "Some stability; some methods still not implemented", "WSDL update", "LHCC review announced", "Failures almost close to 0"]

5 Tests executed
[Charts: results of the Availability, Use Case, and Interoperability/Cross Copy tests]

6 Tests executed: status of the implementations
- DPM has been rather stable and in good state for the entire period of testing. A few issues were found (some testing conditions caused crashes in the server) and fixed. All MoU methods are implemented except Copy (not needed at this stage).
- DRM and StoRM: good interaction. At the moment all MoU methods are implemented (Copy in PULL mode is not available in StoRM). The implementations are rather stable. Some communication issues with DRM need investigation.
- dCache: very good improvements in the basic tests in the last weeks. The implementation is rather stable. All MoU methods have been implemented (including Copy, which is absolutely needed for dCache) except ExtendFileLifeTime. Timur has promised to implement it as soon as he gets back to the US.
- CASTOR: the implementation has been rather unstable. The problems have been traced to the transfer of requests to the back-end server. It has therefore been difficult to fully exercise the SRM interface even with the basic test suite. A major effort is under way to fix these problems.

7 Plan
- Plan for Q1 2007:
  - Phase 1: from 16 December 2006 until the end of January 2007:
    - Availability and Basic tests
    - Collect and analyze results, update the page with the status of endpoints:
    - Plot results per implementation: number of failures / number of tests executed for all SRM MoU methods (see the sketch below)
    - Report results to the WLCG MB
  - Phase 2: from the beginning until the end of February 2007:
    - Perform tests on use cases (GFAL/lcg-utils/FTS/experiment specific), boundary conditions, and the open issues in the spec that have been agreed on
    - Plot results as for Phase 1 and report to the WLCG MB
  - Phase 3: from 1 March until "satisfaction"/end of March 2007:
    - Add more SRM 2.2 endpoints (some Tier-1s?)
    - Stress testing
    - Plot results as for Phase 2 and report to the WLCG MB
- This plan was discussed during the WLCG workshop. The developers have agreed to work on it as a matter of priority.
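A minimal sketch of how the "failures / tests executed" numbers can be aggregated per implementation and MoU method before plotting. The records and their format are made-up placeholders, not real test results.

```python
# Illustrative sketch: aggregate test outcomes into "failures / tests executed"
# per implementation and SRM MoU method. The records below are fabricated
# placeholders used only to show the bookkeeping.
from collections import defaultdict

records = [
    # (implementation, MoU method, passed?)
    ("DPM",    "srmPrepareToPut", True),
    ("DPM",    "srmPrepareToPut", True),
    ("dCache", "srmCopy",         False),
    ("dCache", "srmCopy",         True),
    ("CASTOR", "srmPrepareToGet", False),
]

totals = defaultdict(int)
failures = defaultdict(int)
for impl, method, passed in records:
    totals[(impl, method)] += 1
    if not passed:
        failures[(impl, method)] += 1

for (impl, method), total in sorted(totals.items()):
    print(f"{impl:8s} {method:18s} {failures[(impl, method)]}/{total} failures")
```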

8 Grid Storage System Deployment (GSSD)
- Working group launched by the GDB to coordinate SRM 2.2 deployment for Tier-1s and Tier-2s
- Mailing list:
- Mandate:
  - Establish a migration plan from SRM v1 to SRM v2 so that the experiments can access the same data through the two endpoints transparently.
  - Coordinate with sites, experiments, and developers the deployment of the various SRM 2.2 implementations and the corresponding Storage Classes.
  - Coordinate the Glue Schema v1.3 deployment for the Storage Element, making sure that the information published is correct.
  - Ensure transparency of data access and the functionalities required by the experiments (see the Baseline Services report).
- People involved: developers (SRM and clients), providers, site admins, experiments

9 Deployment plan
- (A) Collect requirements from the experiments: more details than what is described in the TDRs. The idea is to understand how to configure a specific Tier-1 or Tier-2: storage classes (quality of storage), disk caches (how many, how big, and with which access), storage transition patterns, etc. [Now]
- (B) Understand the current Tier-1 setups: requirements? [Now]
- (C) Get hints from the developers: manual/guidelines? [Now]
- (D) Select production sites as guinea pigs and start testing with the experiments. [Beginning of March to July 2007]
- (E) Assist experiments and sites during the tests (monitoring tools, guidelines in case of failures, cleaning up, etc.). Define mini SC milestones. [March to July 2007]
- (F) Accommodate new needs, not initially foreseen, if necessary.
- (G) Have production SRM 2.2 fully functional (MoU) by September 2007.

10 Deployment plan in practice
- Detailed requirements have already been collected from LHCb, ATLAS, and CMS. Some iteration is still needed for CMS and ATLAS.
- Focus on dCache deployment first. Target sites: GridKA, Lyon, SARA. Experiments: ATLAS and LHCb.
  - Variety of implementations; many issues covered by this exercise
- Compile specific guidelines targeting an experiment: ATLAS and LHCb.
  - Guidelines reviewed by developers and sites, possibly also covering some Tier-2 sites.
- Then work with some of the Tier-2s deploying DPM and repeat the exercise done for dCache. Possible parallel activity.
- Start working with CASTOR if ready: CNAF, RAL.
- Define mini SC milestones with the experiments and test in coordination with the FTS and lcg-utils developers.

11 Deployment plan in practice: the client perspective
- Site admins have to run both SRM v1.1 and SRM v2.2
  - Restrictions: the SRM v1 and v2 endpoints must run on the same host and use the same file path. The endpoints may run on different ports.
- Experiments will be able to access both old and new data through both the new SRM v2.2 endpoints and the old SRM v1.1 endpoints
  - This is guaranteed when using higher-level middleware such as GFAL, lcg-utils, FTS, and the LFC client tools. The endpoint conversion is performed automatically via the information system.
  - The SRM type is retrieved from the information system.
  - If two versions are found for the same endpoint, SRM v2.2 is chosen only if a space token (storage quality) is specified; otherwise SRM v1.1 is the default (see the sketch below).
  - FTS can be configured per channel on the version to use; policies can also be specified ("always use SRM 2.2", "use SRM 2.2 if space token specified", ...)
- It is the task of the GSSD working group to define and coordinate the configuration details for mixed-mode operations.
===>>> It is possible and foreseen to run in mixed mode with SRM v1.1 and SRM v2.2 until SRM v2.2 is proven stable for all implementations.
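A minimal sketch of the version-selection rule described above, written as a hypothetical helper function; it is not the actual GFAL/lcg-utils code, and the example space token is invented.

```python
# Illustrative sketch of the selection rule: when the information system
# publishes both SRM v1.1 and v2.2 for an endpoint, v2.2 is chosen only if
# a space token (storage quality) is specified; otherwise v1.1 is the default.
from typing import Optional, Set

def choose_srm_version(published_versions: Set[str],
                       space_token: Optional[str] = None) -> str:
    """Return the SRM protocol version a client would use."""
    if "2.2" in published_versions and "1.1" in published_versions:
        # Both versions published for the same endpoint.
        return "2.2" if space_token else "1.1"
    if "2.2" in published_versions:
        return "2.2"
    return "1.1"

# Example endpoint publishing both versions in the information system.
print(choose_srm_version({"1.1", "2.2"}))                            # -> 1.1
print(choose_srm_version({"1.1", "2.2"}, space_token="EXAMPLE_TAPE"))  # -> 2.2
```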

12 Conclusions
- Much clearer description of the SRM specifications. All ambiguous behaviours have been made explicit. A few issues were left out for SRM v3 since they do not affect the SRM MoU.
- A well-established and agreed methodology to check the status of the implementations. Boundary conditions and use cases from the upper-layer middleware and experiment applications will be the focus of next month's work. Monthly reports and problem escalation to the WLCG MB.
- A clear plan has been put in place in order to converge. The developers will work on it as a matter of priority.
- Working with sites and experiments on the deployment of SRM 2.2 and the Storage Classes. Specific guidelines for Tier-1 and Tier-2 sites are being compiled.
- It is not unreasonable to expect SRM 2.2 in production by September 2007.
- The migration and backup plans foresee the use of a mixed SRM v1 and v2 environment, where the upper-layer middleware takes care of hiding the details from the users.