LCG Storage workshop at CERN. July Geneva, Switzerland.
BNL’s Experience with dCache 1.8 and SRM V2.2
Carlos Fernando Gamboa, Dantong Yu
RHIC/ATLAS Computing Facility, Brookhaven National Laboratory

Table of Contents
– BNL’s dCache 1.8 storage and hardware configuration
– BNL’s dCache 1.8 configuration
– Deployment problems
– Second re-deployment dCache 1.8 patch level 1
– General commentaries
– Future plans

BNL’s dCache storage and hardware configuration
BNL end point: srm://dct00.usatlas.bnl.gov:8443/srm/managerv2?SFN=//pnfs/usatlas.bnl.gov/data/dteam
Storage class: REPLICA-ONLINE (Tape0Disk1)
BNL’s end point hardware specification
– CPU speed: 3400 MHz
– Memory KB
– The server is located outside BNL’s firewall
– No tape storage configured
– Disk space for storage: 60 GB
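
The end point above packs several pieces of information into one URL: the host and port of the SRM door, the web-service path, and the site file name (SFN) passed as a query parameter. The following is a minimal sketch, using only Python's standard library, of how a client-side script might split it apart. The helper name `split_srm_endpoint` is an illustration and not part of dCache or any SRM client.

```python
from urllib.parse import urlsplit, parse_qs


def split_srm_endpoint(url):
    """Split an SRM URL into host, port, service path and SFN (illustrative helper)."""
    parts = urlsplit(url)
    # The SFN travels as a query parameter; take the first value if present.
    sfn = parse_qs(parts.query).get("SFN", [None])[0]
    return {
        "host": parts.hostname,
        "port": parts.port,
        "service_path": parts.path,
        "sfn": sfn,
    }


endpoint = (
    "srm://dct00.usatlas.bnl.gov:8443/srm/managerv2"
    "?SFN=//pnfs/usatlas.bnl.gov/data/dteam"
)
info = split_srm_endpoint(endpoint)
```

Here `managerv2` in the service path is what identifies the SRM V2.2 interface of the door.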

BNL’s dCache 1.8 configuration
All dCache 1.8 components are installed on the same server.
Components installed:
– SRM V2.2 door
– dcap door
– GridFTP door
– GSIdcap door
– gPlazma cell
– 3 write pools of 20 GB each
– PNFS

BNL’s dCache 1.8 configuration (cont.)
Space Manager configuration
– Standard configuration based on dCache’s PoolManager.conf default parameters (1 write link, 1 linkGroup)
VO access control configuration
– Group accounts: dteam001, ops001, usatlas5, usatlas4, usatlas3, usatlas2, usatlas1
– VO: ATLAS
– Roles: dteamRole=*, /atlas/Role=production, /atlas/soft-valid/Role=production

Deployment problems dCache 1.8 patch level 1
Stability

System State   Test Name                         Observation
State1         -ReleaseFiles                     RETURN:SRM_REQUEST_QUEUED
               -Move                             After reboot of the dCache system
State2         -06_StatusOfBringOnlineReques     After working around 12 hours
               -09_ReleaseFiles                  RETURN:SRM_REQUEST_QUEUED
               -02_StatusOfPutRequest
               -04_PutDone
               -05_PrepareToGet
               -05_StatusOfGetRequest
State3         -06_BringOnlin                    The test reported globus-url-copy failed
               -06_StatusOfBringOnlineReque
               -09_ReleaseFiles
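
SRM_REQUEST_QUEUED is the status an SRM V2.2 server returns for an asynchronous request it has accepted but not yet started; a client is expected to poll the matching status call until the request leaves the queued and in-progress states. The sketch below shows one way a test harness could tell a slow request from one that is stuck. It assumes a hypothetical `get_status` callable wrapping whichever SRM client is in use; the timeout and interval values are placeholders, not values from these tests.

```python
import time

# Status codes the SRM V2.2 specification uses for requests that are not finished yet.
PENDING = {"SRM_REQUEST_QUEUED", "SRM_REQUEST_INPROGRESS"}


def wait_for_request(get_status, timeout_s=60.0, interval_s=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns a non-pending status or the timeout expires.

    get_status is a hypothetical zero-argument callable returning an SRM status string.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status not in PENDING:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"request still {status} after {timeout_s} s")
        sleep(interval_s)


# Example with a canned sequence of statuses instead of a real SRM server.
canned = iter(["SRM_REQUEST_QUEUED", "SRM_REQUEST_INPROGRESS", "SRM_SUCCESS"])
final_status = wait_for_request(lambda: next(canned), sleep=lambda s: None)
```

A request that never leaves SRM_REQUEST_QUEUED, as observed in State1 and State2, would surface here as a `TimeoutError` rather than as an endless wait.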

Deployment problems dCache 1.8 patch level 1
Tracing a file from a specific test that reported SRM_REQUEST_QUEUED
The file appeared on the write pool, but PnfsManager returned no location for it:
    (PnfsManager) admin > cacheinfoof F48
    cacheinfoof F48
    No pool was returned
However, the file does exist on the pool:
    data]# pwd
    /data/data5/dcache_pool_5/pool/data
    data]# ls -l F48
    -rw-r--r-- 1 root root 2691 May 24 15: F48
Then looking in the admin interface on the pool:
    (dct00_5) admin > rep ls -l F48
    rep ls -l F F si={myStore:STRING}
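
The manual trace above compares two views of the same file: what is physically present in the pool's data directory and what the namespace reports as its location. A minimal sketch of automating that comparison follows. The `locate` callable is a hypothetical stand-in for a `cacheinfoof` lookup and is not a dCache API; the demo uses a temporary directory and made-up file names in place of a real pool.

```python
import os
import tempfile


def find_unregistered_replicas(pool_data_dir, locate):
    """Return names of files in pool_data_dir for which locate() reports no pool.

    locate is a hypothetical callable: file name -> list of pool names.
    """
    orphans = []
    for name in sorted(os.listdir(pool_data_dir)):
        if not os.path.isfile(os.path.join(pool_data_dir, name)):
            continue  # skip sub-directories
        if not locate(name):
            orphans.append(name)
    return orphans


# Demo: a throwaway directory stands in for a pool's data directory.
with tempfile.TemporaryDirectory() as pool_dir:
    for name in ("FILE_A", "FILE_B"):
        with open(os.path.join(pool_dir, name), "w") as handle:
            handle.write("x")
    # FILE_B is on disk but has no location recorded (made-up data).
    known_locations = {"FILE_A": ["dct00_5"]}
    orphans = find_unregistered_replicas(
        pool_dir, lambda n: known_locations.get(n, [])
    )
```

In the demo, `orphans` contains only `FILE_B`, the file that is on disk without a recorded location.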

Deployment problems dCache 1.8 patch level 1 (cont.)
Authentication
– gPlazma: the user’s DN is not fully reconstructed, so it differs from the DN stored in the kpwd file and authentication fails.
– Workaround: included the user’s DN without the CN=xxxxx number.
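
The workaround amounts to comparing DNs with the numeric CN dropped. The sketch below illustrates that normalization, assuming the number appears as one or more trailing `/CN=<digits>` components of the DN. That layout, the helper names, and the example DN are assumptions for illustration; this is not gPlazma code.

```python
import re

# Assumption: the number is carried in trailing "/CN=<digits>" components.
_TRAILING_NUMERIC_CN = re.compile(r"(?:/CN=\d+)+$")


def strip_numeric_cn(dn):
    """Return the DN without trailing purely numeric CN components."""
    return _TRAILING_NUMERIC_CN.sub("", dn)


def same_user(presented_dn, stored_dn):
    """Compare two DNs while ignoring trailing numeric CN components."""
    return strip_numeric_cn(presented_dn) == strip_numeric_cn(stored_dn)


# Made-up DN used only to exercise the helpers.
presented = "/DC=org/DC=example/OU=People/CN=Jane Doe/CN=12345"
stored = "/DC=org/DC=example/OU=People/CN=Jane Doe"
match = same_user(presented, stored)
```

Only components that consist entirely of digits are removed, so a CN holding the user's name is left untouched.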

Second re-deployment dCache 1.8 patch level 1
dCache 1.8 and PNFS were re-installed.
dCache 1.8 parameter/component, before and after reinstall:
– Number of write pools / size: 5 / 20GB before, 3 / 30GB after
– Increased the pool timeout
– srmCopyReqThreadPoolSize: 2512
– remoteGsiftpMaxTransfers (gridftp transfers per pool assumed): 54
– Mover queue: 18/pool before, 30/pool after
– dcachesrm-gplazma.policy: kpwd before, vo-role-mapping after

General commentaries
Tuning
– Tried several values for the previous parameters until a stable configuration was found.
Learning curve
– It would be useful to have examples of using different SRM parameters for tuning purposes.

General commentaries (cont.)
Validation
– Looking forward to installing dCache 1.8 patch level 6 and verifying whether the previous strategy can be used to keep passing the tests.

Future plans
– Update dCache 1.8 from patch level 1 to patch level 6.
– Set up a second test point with access to tape resources.