CCRC'08 experience at PIC storage POV


1 CCRC'08 experience at PIC storage POV
Francisco Martínez, Gerard Bernabeu, Esther Acción

2 Outline: Preparation; Results; Issues (with proposals); Conclusions

3 Preparation: Little preparation was needed. Most operations were carried over from the Feb run. Only the definitions of the tokens and namespace needed to be modified. The experiments handed these to us on the last day, which is at least an improvement over the Feb run :) LHCb did not provide us with their requirements :(

4 Preparation: dCache version was p6 (ready since the Feb run). We decided not to upgrade to the latest dCache version, the one recommended for CCRC08 run2, because it arrived late. It turned out to be an excellent choice. Enstore version was 1.0.1. No special versions for CCRC08; Enstore seems to be pretty LHC-agnostic.

5 Results: network [Plots: pools, WAN, WN]

6 Results: Enstore. Only the IBM robot was used. No special load or behavior was seen; from Enstore's side it was not apparent that CCRC08 was running.

7 Issues: Data on pools was poorly balanced; Interferences; PinManager bug; Mover queue dimensioning; Enstore movers

8 Issues: Data on pools was poorly balanced. Some pools were almost full. Some pools were freshly added and had no data on them. This led to multiple pool crashes due to overload (dc046).

9 Issues: Proposal for better balancing. Deep analysis of the dCache cost assignment policy: so far no work at all has been done in this area, and we run the naive default policy. Implementation of Brian Bockelman's scripts for load balancing of files and non-automatic file replication. Replica Manager that-works®. Dcache-pfm: Physical File Manager.
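The balancing problem above (some pools almost full, others empty) can be illustrated with a small planning sketch. This is not Brian Bockelman's script and not the dCache cost module; it is a hypothetical greedy planner, and the `Pool` class, `plan_rebalance` function, pool names and `tolerance` parameter are all assumptions made for illustration. It only computes how much data would have to move between pools to bring each one close to the overall fill fraction.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    # Hypothetical description of a disk pool; not a dCache data structure.
    name: str
    capacity_gb: float
    used_gb: float


def plan_rebalance(pools, tolerance=0.05):
    """Return (source, destination, gigabytes) moves.

    Pools holding more than their fair share by over `tolerance` of their
    capacity donate data to pools that are below the overall fill fraction.
    """
    target = sum(p.used_gb for p in pools) / sum(p.capacity_gb for p in pools)
    cap = {p.name: p.capacity_gb for p in pools}
    # Positive surplus: more data than the fair share; negative: free room.
    surplus = {p.name: p.used_gb - target * p.capacity_gb for p in pools}

    donors = [n for n in surplus if surplus[n] > tolerance * cap[n]]
    takers = [n for n in surplus if surplus[n] < 0]
    donors.sort(key=lambda n: surplus[n], reverse=True)  # fullest first
    takers.sort(key=lambda n: surplus[n])                # emptiest first

    moves = []
    for d in donors:
        for t in takers:
            if surplus[d] <= tolerance * cap[d]:
                break  # this donor is now close enough to the target
            room = -surplus[t]
            if room <= 0:
                continue  # this taker has already been filled to the target
            amount = min(surplus[d], room)
            moves.append((d, t, amount))
            surplus[d] -= amount
            surplus[t] += amount
    return moves
```

A real tool would also have to pick which files to move and respect space tokens; the sketch only sizes the transfers.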

10 Issues: Interferences. IFCA was trying to read files that we had marked as not accessible (on the STK robot). Those accesses generated unnecessary mover queuing. Thanks to Brian Bockelman for sorting this out! It would be nice if the experiments could enforce the offline marking of files in some way; this cannot be done at the dCache level.

11 Issues: PinManager bug. The dCache version we run, p6, has a bug in the PinManager. Some files become inaccessible because lcg-gt hangs. A cron job restarting the PinManager solved this. Current dCache release p5 has already solved this.
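The cron workaround above can be sketched as a small watchdog meant to be run periodically. The slide does not say whether the restart was unconditional or which commands were used, so everything here is an assumption: `probe_cmd` stands for some test request (for example an lcg-gt call against a known file) and `restart_cmd` for whatever restarts the PinManager at the site; both are supplied by the caller and are hypothetical. The sketch restarts only when the probe fails to return within a timeout.

```python
import subprocess


def probe_hangs(probe_cmd, timeout_s):
    """Return True if the probe command does not finish within timeout_s."""
    try:
        subprocess.run(
            probe_cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return True
    return False


def watchdog(probe_cmd, restart_cmd, timeout_s=60.0):
    """Run the restart command if the probe hangs; return True if restarted."""
    if probe_hangs(probe_cmd, timeout_s):
        subprocess.run(restart_cmd, check=True)
        return True
    return False
```

Both commands are passed as argument lists, so no shell quoting is involved; a cron entry would simply call a script that invokes `watchdog` with the site's real commands.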

12 Issues: Mover queue dimensioning. It has to be done according to the type of access the jobs make. Low-throughput, long-duration profile, with locking of files: these jobs need a high number of movers. Maybe a deadlock detection mechanism at the application level would help! Throughput-intensive profile: these jobs suggest a low number of movers.
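The two access profiles above pull the mover count in opposite directions, which a rough back-of-the-envelope sketch can make concrete. This is not a dCache formula: the first function applies Little's law (average number of busy movers = arrival rate × time each mover is held), and the second simply divides the pool bandwidth by the rate one transfer wants. The function names, parameters and the `headroom` factor are assumptions for illustration.

```python
import math


def movers_for_long_jobs(jobs_per_hour, mean_hold_hours, headroom=1.25):
    """Movers needed when jobs hold a file open for a long time.

    Little's law gives the average number of movers in use; `headroom`
    is an assumed safety factor on top of that average.
    """
    return math.ceil(jobs_per_hour * mean_hold_hours * headroom)


def movers_for_streaming(pool_mb_s, per_transfer_mb_s):
    """Mover cap when each transfer tries to saturate the pool.

    Keeping the count low lets every active transfer run near its own rate.
    """
    return max(1, math.floor(pool_mb_s / per_transfer_mb_s))
```

For example, 100 jobs per hour that each hold a mover for 4 hours would call for 500 movers with the assumed headroom, while a pool delivering 200 MB/s to transfers that each want 50 MB/s would be capped at 4.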

13 Issues: Proposal for mover queues. Jon Bakken, head dCache admin at FNAL, is coming to visit PIC. They have implemented drastic solutions to this problem. We will need to see whether we can use those configurations (queues with 1800 movers).

14 Issues: Enstore movers. We had an issue with a memory leak that made the movers go offline. It was solved before run2 started. Enstore accounting: we have the information, but it is difficult to parse or analyze. It is advisable to implement some way to visualize that data.

15 Conclusions: The storage component at PIC seems to be roughly ready to start data-taking, but there are still pending issues with dCache. Load balancing is especially critical. Kudos to Enstore: no issues. We get good feedback from the experiments thanks to the liaisons, but the experiments as a whole are often not very communicative.

