
Efficient access to files on Castor / Grid. CERN Tutorial. Max Baak, CERN, 30 October 2008.


1 Efficient access to files on Castor / Grid
CERN Tutorial
Max Baak, CERN, 30 October 2008

2 Intelligent FileStager
• cmt co -r FileStager-00-00-37 Database/FileStager
  https://twiki.cern.ch/twiki/bin/view/AtlasProtected/FileStager
  Works in ROOT and Athena.
• The intelligent file stager copies files one by one to local disk while the job runs over the previous file(s). Files are pre-staged to improve wall-time performance.
• Actual processing runs over local files in cache = fast! The only time lost is in staging the first file. A minimum number of network connections is kept open. The network load of accessing data is spread over the length of the job.
• Simple: 'feels' just like running over a list of local files. Run semi-interactive analysis over nearby files, e.g. on Castor.
• In many cases: staging is as fast as running over local files!

3 Idea behind File Stager
[Timeline diagram, along the time axis:
  TCopyChain: open file #0
  analysis looping over events of file #0, while TStageManager: copy over file #1
  TCopyChain: open file #1 (wait till file #1 is staged, if needed; order staging of next file)
  etc. …]
• File Stager: for doing local analysis (i.e. using a Tier-3 farm) on (d)AODs stored at grid storage elements (or Castor).
• Idea: "Nearby data to job"
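The overlap of staging and analysis in the diagram can be sketched in a few lines of Python. This is only an illustration of the idea, not the FileStager code: `stage_file` and `staged_files` are hypothetical names, and `shutil.copy` stands in for whatever copy tool actually fetches a file from Castor or a grid storage element. It also assumes all file base names are distinct.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor


def stage_file(remote_path, cache_dir):
    """Copy one file into the local cache and return the local path.

    shutil.copy is a stand-in (assumption): a real stager would call a
    Castor or grid copy tool here instead.
    """
    local_path = os.path.join(cache_dir, os.path.basename(remote_path))
    shutil.copy(remote_path, local_path)
    return local_path


def staged_files(remote_paths, cache_dir):
    """Yield local copies one at a time, staging the next file in the background."""
    if not remote_paths:
        return
    # A single worker thread: at most one copy is in flight at any time.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(stage_file, remote_paths[0], cache_dir)
        for i in range(len(remote_paths)):
            local_path = pending.result()  # wait till file #i is staged, if needed
            if i + 1 < len(remote_paths):
                # order staging of the next file before the analysis starts on file #i
                pending = pool.submit(stage_file, remote_paths[i + 1], cache_dir)
            try:
                yield local_path  # the caller loops over events of file #i here
            finally:
                os.remove(local_path)  # free the cache once the caller is done


if __name__ == "__main__":
    src = tempfile.mkdtemp()    # plays the role of remote storage
    cache = tempfile.mkdtemp()  # plays the role of the local cache
    names = []
    for i in range(3):
        path = os.path.join(src, "file%d.txt" % i)
        with open(path, "w") as handle:
            handle.write("events of file #%d" % i)
        names.append(path)
    for local in staged_files(names, cache):
        with open(local) as handle:
            print(local, "->", handle.read())
```

Only the first file has to be waited for in full. Every later copy runs while the previous file is being analysed, so the job waits only when staging is slower than processing.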

4
• The FileStager algorithm turns out to work very well compared with other network algorithms.
• Q: Why?

5 Large Scale Tests (single jobs)
[Plots: time in seconds [s]; data transfer in bytes]
• Example analysis: reading 35% of the AOD file content, with more algorithmic work inside the analysis.
• Timing: overall comparable timing, as the algorithmic part becomes dominant. File Stager is faster than local access, as files are still in cache when loaded by Athena. (Unbuffered xrootd is 20% faster than RFIO.)
• Data transfer: very inefficient for RFIO and xrootd in this setup.
• (Fixed now for xrootd.)
Matthias Schott, MB

6 Typical AOD read access pattern
• (Provided by Andreas Peters, CERN IT)
• An irregular access pattern is a serious complication for network transfer.
• Better to copy the entire AOD over to local disk in one go.
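A toy cost model gives one way to see why a single bulk copy can beat many scattered remote reads. It is a deliberate simplification, not a measurement: it assumes each remote request pays one fixed round-trip latency plus streaming time at a fixed bandwidth, and it ignores read-ahead and caching. The function names and all parameters are hypothetical.

```python
def transfer_time(n_requests, total_bytes, latency_s, bandwidth_bytes_per_s):
    """Toy model: every request pays one round-trip latency, plus streaming time."""
    return n_requests * latency_s + total_bytes / bandwidth_bytes_per_s


def whole_copy_wins(file_bytes, fraction_read, n_reads, latency_s, bandwidth_bytes_per_s):
    """Compare many scattered remote reads with one sequential copy of the whole file."""
    scattered = transfer_time(
        n_reads, file_bytes * fraction_read, latency_s, bandwidth_bytes_per_s
    )
    whole = transfer_time(1, file_bytes, latency_s, bandwidth_bytes_per_s)
    return whole < scattered
```

In this model the whole-file copy wins as soon as the latency summed over the scattered reads outweighs the cost of streaming the bytes the analysis never touches.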

7 [Figure only] D. van der Ster, STEP09

8 [Figure only] D. van der Ster, STEP09

9 [Figure only] D. van der Ster, STEP09

10 Lessons for/from FileStager tutorial
• https://twiki.cern.ch/twiki/bin/view/AtlasProtected/FileStager
• Do not use RFIO at CERN (slow, very inefficient)!
• A set of scripts makes it easy to define lists of files to run over.
• Can use FileStager for running over (d)AODs, (d)ESDs and ntuples on a Tier-3 farm, from collections on Castor or nearby Grid clouds (e.g. Geneva, Annecy, Lyon).
• Don't use it for files on the other end of the world ;-) E.g. need to use the stager for any files not accessible by xrootd.
• Works in ROOT/Python and Athena.

