SRM 2.2: experiment requirements, status and deployment plans. 6th March 2007. Flavia Donno, INFN and IT/GD, CERN.


1 SRM 2.2: experiment requirements, status and deployment plans
6th March 2007
Flavia Donno, INFN and IT/GD, CERN

2 SRM 2.2 experiment requirements
- In June 2005 the Baseline Service Working Group published a report: http://lcg.web.cern.ch/LCG/peb/bs/BSReport-v1.0.pdf
  - A Storage Element Service was considered mandatory and high priority.
  - The full set of recommended features available by February 2006 but based on the already available SRM v2.1 features
- In Mumbai (February 2006) experiments changed their requirements based on their previous usage of the available service.
- LCG MoU for SRM 2.2 defined in May 2006 at FNAL: http://cd-docdb.fnal.gov/0015/001583/001/SRMLCG-MoU-day2%5B1%5D.pdf
- The experiment requirements were defined:
  - Support for Permanent Files and volatile copies
  - Space Reservation (both static and dynamic) with possibility of releasing space
  - Permission Functions only on directories based on VOMS group/roles
  - Directory Functions
  - Data Transfer Functions: PrepareToGet/Put, Copy
  - File access protocol negotiation
  - VO specific relative paths

3 SRM 2.2 activities
- Weekly meetings to follow the development of the implementations up to December 2006
- The LBNL testing suite, written in Java, was the only one available until September 2006 and was run manually. Only from January 2007 has this test suite been running automatically every day.
  - http://sdm.lbl.gov/srm-tester/v22-progress.html
  - http://sdm.lbl.gov/srm-tester/v22daily.html
- In September 2006 CERN took over from RAL the development of the S2 SRM 2.2 testing suite, enhancing it with a complete test set and with publishing and monitoring tools.
- Reports about the status of the SRM 2.2 test endpoints are given monthly to the LCG MB

4 Study of SRM 2.2 specification
- In September 2006 there were very different interpretations of the spec
- 3 releases of the specification: July, September, December
- A study of the spec (state/activity diagrams) has identified many behaviours not defined by the specs.
- A list of about 50 points was compiled in September 2006.
- Many issues solved. Last 30 points discussed and agreed during the WLCG Workshop. The implementation for those will be ready in June 2007.
- The study of the specifications, the discussions and testing of the open issues have helped ensure consistency between SRM implementations.
- https://twiki.cern.ch/twiki/bin/view/SRMDev/IssuesInTheSpecifications

5 Tests executed
- S2 test suite testing availability of endpoints, basic functionality, use cases and boundary conditions, interoperability, exhaustive and stress tests.
  - Availability: Ping and full put cycle (putting and retrieving a file)
  - Basic: basic functionality checking only return codes and passing all basic input parameters
  - Usecases: testing boundary conditions, exceptions, real use cases extracted from the middleware clients and experiment applications.
  - Interoperability: servers acting as clients, cross copy operations
  - Exhaustive: Checking for long strings, strange characters in input arguments, missing mandatory or optional arguments. Output parsed.
  - Stress: Parallel tests for stressing the systems, multiple requests, concurrent colliding requests, space exhaustion, etc.
- S2 tests cron job running 5 times per day
- In parallel, manual tests from GFAL/lcg-utils, FTS, DPM test suite.
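The "Exhaustive" category above can be illustrated with a small input generator. This is only a sketch of the idea in Python, not the S2 test suite itself: the function name `exhaustive_variants`, the argument names `SURL` and `spaceToken`, and the example endpoint are all hypothetical, and nothing here contacts an SRM server.

```python
# Hypothetical sketch: derive "exhaustive" test inputs from one valid request.
# Argument names and values are illustrative assumptions, not the S2 suite's.

STRANGE = "\x00\t\u00e9 ;|&*?'\""  # control, non-ASCII and shell-like characters


def exhaustive_variants(baseline, mandatory):
    """Return (label, arguments) pairs covering missing arguments,
    very long strings and strange characters for each argument."""
    variants = []
    for name, value in baseline.items():
        # Drop the argument entirely (mandatory or optional).
        kind = "missing-mandatory" if name in mandatory else "missing-optional"
        without = {k: v for k, v in baseline.items() if k != name}
        variants.append((f"{kind}:{name}", without))
        if isinstance(value, str):
            # Replace the argument with a very long string.
            variants.append((f"long-string:{name}", {**baseline, name: "A" * 10000}))
            # Replace the argument with unusual characters.
            variants.append((f"strange-chars:{name}", {**baseline, name: STRANGE}))
    return variants


baseline = {
    "SURL": "srm://example.org/dpm/example.org/home/vo/file",  # made-up endpoint
    "spaceToken": "TOKEN",
}
cases = exhaustive_variants(baseline, mandatory={"SURL"})
for label, _args in cases:
    print(label)
```

Each generated variant would then be sent to an endpoint and the parsed output compared with the behaviour agreed in the specification.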

6 Tests executed
- For now only availability, basic, use case and interoperability tests are executed on a regular basis
- Results published daily on a web page. Latest and history available:
  - https://twiki.cern.ch/twiki/bin/view/SRMDev
- Results of failed and successful tests reported to developers to signal issues. Status page compiled:
  - https://twiki.cern.ch/twiki/bin/view/SRMDev/ImplementationsProblems
- Test results and issues discussed on srm-tester and srm-devel lists
  - https://hpcrdm.lbl.gov/mailman/listinfo/srmtester
  - http://listserv.fnal.gov/archives/srm-devel.html

7 Tests executed
- Copy and ChangeSpaceForFiles: MoU SRM methods needed by the end of 2007. Expected by the end of summer.
- srmCopy needed now only for dCache!

8 Tests executed
- Panels: Availability, UseCase, Interoperability/Cross Copy

9 Testing Plan
- Plan for 1Q of 2007:
  - Phase 1: From 16 Dec 2006 until end of January 2007:
    - Availability and Basic tests
    - Collect and analyze results, update page with status of endpoints: https://twiki.cern.ch/twiki/bin/view/SRMDev/ImplementationsProblems
    - Plot results per implementation: number of failures/number of tests executed for all SRM MoU methods.
    - Report results to WLCG MB.
  - Phase 2: From beginning until end of February 2007:
    - Perform tests on use-cases (GFAL/lcg-utils/FTS/experiment specific), boundary conditions and open issues in the spec that have been agreed on.
    - Plot results as for phase 1 and report to WLCG MB.
  - Phase 3: From 1 March until "satisfaction":
    - Add more SRM 2.2 endpoints (some T1s?)
    - Stress testing
    - Plot results as for phase 2 and report to WLCG MB.
- This plan has been discussed during the WLCG workshop. The developers have agreed to work on this as a matter of priority.
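The per-implementation metric named in Phase 1 (number of failures divided by number of tests executed) could be computed as in the following Python sketch. The record layout, the function name `failure_rates` and the sample records are assumptions made for illustration; the implementation labels are placeholders and the numbers are not real test results.

```python
from collections import defaultdict


def failure_rates(results):
    """results: iterable of (implementation, method, passed) records.
    Returns {implementation: failures / tests executed}."""
    executed = defaultdict(int)
    failed = defaultdict(int)
    for implementation, _method, passed in results:
        executed[implementation] += 1
        if not passed:
            failed[implementation] += 1
    return {impl: failed[impl] / executed[impl] for impl in executed}


# Invented sample records; "impl-A" and "impl-B" are placeholders.
sample = [
    ("impl-A", "srmPing", True),
    ("impl-A", "srmPrepareToPut", True),
    ("impl-B", "srmPing", True),
    ("impl-B", "srmPrepareToPut", False),
]
rates = failure_rates(sample)
print(rates)
```

The resulting dictionary holds one ratio per implementation, which is the quantity the plan proposes to plot and report.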

10 Test results

11 Test results

12 Test results

13 Tests executed: status of the implementations
- DPM: version 1.6.3 available for production. SRM 2.2 features still not officially certified. Implementation stable. All MoU methods implemented. Usecase tests are OK. Copy not available but interoperability tests are OK. Few general issues to be solved.
- DRM and StoRM: Stable implementations. All MoU methods implemented. Copy in PULL mode not available in StoRM but interoperability tests OK. Some Usecase tests still not passing and under investigation.
- dCache: The implementation is rather stable. All MoU methods have been implemented (including Copy, which is absolutely needed for dCache). Interoperability tests not yet working with CASTOR and DRM. Working on some Usecase tests. General issues to be resolved (overwrite flag not supported).
- CASTOR: The implementation is still rather unstable. A lot of progress during the last 3 weeks. Main instability causes found (race conditions, unintended mixing of threads and forks, etc.). Various problems identified and fixed by D. Smith, S. De Witt and G. Lo Presti with the involvement of various people from different IT groups. Various use cases to be resolved.

14 Status of SRM clients
- FTS
  - SRM client code has been unit-tested and integrated into FTS
  - Tested against DPM, dCache and StoRM. CASTOR and DRM tests started.
  - Released to development testbed.
  - Experiments could do tests on the dedicated UI set up for this purpose.
  - New dCache endpoint set up at FNAL for stress test.
- GFAL/lcg-utils
  - New rpms available on test UI.
  - Still using old schema (see next slide).

15 GLUE Schema
- GLUE 1.3 available
  - http://glueschema.forge.cnaf.infn.it/Spec/V13
  - Not everything originally proposed, only the important changes
- LDAP implementation done by Sergio Andreozzi. Available on the test UI.
- Information providers started by Laurence Field. Static Information Providers available on test UI for CASTOR, dCache, DPM, and StoRM.
- Clients need to adapt to new schema.

16 Grid Storage System Deployment (GSSD)
- Working group launched by the GDB to coordinate SRM 2.2 deployment for Tier-1s and Tier-2s
  - https://twiki.cern.ch/twiki/bin/view/LCG/GSSD
  - Mailing list: storage-class-wg@cern.ch
- People involved: developers, site admins, experiments
- Use pre-GDB meetings for discussions: Tier-1s presenting their setup and reporting problems. Some Tier-2s.
- Working groups set up to work on specific issues:
  - SRM v1 to v2 migration plan
  - Experiments input for Storage Classes, transfer rates, data flow patterns (input completed for LHCb and CMS, ATLAS coming)
  - Database entries conversion
  - Monitoring utilities

17 Deployment plan
- (A.) Working with a few candidate Tier-1s: IN2P3, FZK, SARA
- (B.) Dedicated testing hardware.
- (C.) DPM 1.6.3 and dCache 1.8 candidate implementations to test.
- (D.) Have dCache and DPM tested at candidate sites, even if interoperability will probably not be fully tested until other implementations are deployed.
- (E.) Enlarging the testbed to other sites, even in production. Applying the migration plan foreseen for SRMv1-v2.
- (F.) Include other implementations as soon as ready

18 Deployment plan: client perspective and backup plan
- FTS, lcg-utils, GFAL
  - SRM v1.1 will be the default until SRM 2.2 is deployed in production and has proven stable. However, it is possible to configure a different default (v2.2).
- Site admins have to run both SRM v1.1 and SRM v2.2
  - Until SRM v2.2 installation is stable
- SRM type is retrieved from the information system
  - If 2 versions are found for the same endpoint, SRM v2.2 is chosen only if a space token (storage quality) is specified. Otherwise SRM v1.1 is the default.
- FTS can be configured per channel on the version to use; policies can also be specified (“always use SRM 2.2”, “use SRM 2.2 if space token specified”,…)
- Working on the details to make sure SRM v1.1 and SRM v2.2 allow for access to the same data (already possible for dCache and DPM).
===>>> It is possible and foreseen to run in mixed mode with SRM v1.1 and SRM v2.2, until SRM v2.2 is proven stable for all implementations.
===>>> Backup plan: run in mixed mode with SRM v1.1 and SRM v2.2 for the implementations that are ready.
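The version-selection rule described on this slide can be written down as a short Python sketch. It is an illustration of the stated rule under assumptions, not the actual GFAL, lcg-utils or FTS code: the function name `choose_srm_version`, its parameters and the string version labels are hypothetical.

```python
def choose_srm_version(published_versions, space_token=None, default="1.1"):
    """Pick the SRM version for one endpoint.

    published_versions: versions found in the information system.
    space_token: optional space token (storage quality) given by the user.
    default: configured default, "1.1" unless the site changes it.
    """
    versions = set(published_versions)
    if not versions:
        raise ValueError("no SRM version published for this endpoint")
    if len(versions) == 1:
        # Only one version published: nothing to choose.
        return next(iter(versions))
    if space_token and "2.2" in versions:
        # Both versions found: v2.2 is chosen when a space token is specified.
        return "2.2"
    if default in versions:
        # Otherwise fall back to the configured default.
        return default
    raise ValueError("configured default version is not published")


print(choose_srm_version(["1.1", "2.2"]))                         # no token
print(choose_srm_version(["1.1", "2.2"], space_token="MY_TOKEN"))  # with token
```

Passing `default="2.2"` models the case where a site configures v2.2 as the default instead of v1.1.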

19 Conclusions
- Much clearer description of SRM specifications. All ambiguous behaviors made explicit. A few issues left out for SRM v3 since they do not affect the SRM MoU.
- Well established and agreed methodology to check the status of the implementations. Boundary conditions, use cases from the upper layer middleware and experiment applications will be the focus of next month's work. Monthly reports and problems escalation to the WLCG MB.
- A clear plan has been put in place in order to converge.
- We are still not where we planned to be. However, we have a clearer view on how to proceed.
- Working with sites and experiments for the deployment of the SRM 2.2 and Storage Classes. Specific guidelines for Tier-1 and Tier-2 sites are being compiled.
- The plan foresees the use of a mixed environment SRM v1 and v2, where the upper layer middleware takes care of hiding the details from the users.

