Data management at T3s
Hironori Ito
Brookhaven National Laboratory
Types
T3GS
– Just like T2s
  Pros:
  – Can use all available production software
  – Sites are monitored 24/7
  Cons:
  – Big overhead
  – Must be reliable
T3G
– Not like other ATLAS sites
  Pros:
  – Minimal overhead
  – Reliability is not required
  Cons:
  – T3s are (sometimes) on their own when problems occur.
US T3GS
GS sites are in Tiers of ATLAS (ToA).
– They are operated by the BNL T3 DQ2 site services (SS)
Requirements
  SRM with space tokens
  Must accept the ATLAS production proxy as it is.
  – No special, manual registration at a site.
  Must pass a few tests
  – SAM tests
    » lcg-cr, lcg-cp and lcg-del
  – ATLAS DDM functional tests
  Must register at OSG OIM
  Must publish to OSG BDII and CERN BDII via the OSG Inter-op BDII
  – BNL publishes all SE-only US T3s to OSG BDII.
    » A T3 must send a request to BNL (via the DDM queue in RT), providing its SE information
  Must be able to respond to any ATLAS ticket within a reasonable time.
T3GS Management
Use regular DQ2 tools
– Subscription
    DaTRI
– Deletion
    Central deletion
    dq2-delete-replicas
– Consistency
    The LFC for all T3s is located at BNL
    The LFC content for a specific T3 is delivered to the corresponding T3 DDM site via DDM.
    – SQLite format
      » Provides fast search
    – Central catalog information
      » The LFC has no dataset information.
T3GS Management (continued, I)
Use regular DQ2 tools
– Consistency
    storageManagement.py
    – Works with the per-site LFC content files (SQLite)
    – Scans local storage
    – Finds SE and LFC dark files
      » SE dark files: exist in the SE but not in the LFC
          SELECT * FROM files WHERE pfn_se IS NOT NULL AND pfn_lfc IS NULL
      » LFC dark files: exist in the LFC but not in the SE
          SELECT * FROM files WHERE pfn_se IS NULL AND pfn_lfc IS NOT NULL
    – Deletes dark files
    – Creates logs
      » Logs are always created automatically.
      » All actions are recorded in the log.
    – Obtain by
      » Svn checkout
      » Download via browser at
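The two dark-file queries can be tried against a small SQLite database. The following is a minimal sketch, not the code of storageManagement.py: only the table name `files` and the columns `pfn_se` and `pfn_lfc` come from the slide, while the function names, the in-memory database and the sample paths are assumptions made for illustration.

```python
import sqlite3


def find_dark_files(conn):
    """Return (se_dark, lfc_dark) lists of PFNs using the two slide queries."""
    # SE dark files: present on the storage element, missing from the LFC.
    se_dark = [row[0] for row in conn.execute(
        "SELECT pfn_se FROM files "
        "WHERE pfn_se IS NOT NULL AND pfn_lfc IS NULL ORDER BY pfn_se")]
    # LFC dark files: registered in the LFC, missing from the storage element.
    lfc_dark = [row[0] for row in conn.execute(
        "SELECT pfn_lfc FROM files "
        "WHERE pfn_se IS NULL AND pfn_lfc IS NOT NULL ORDER BY pfn_lfc")]
    return se_dark, lfc_dark


def build_demo_db():
    """Hypothetical stand-in for the per-site dump; the real schema may differ."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (pfn_se TEXT, pfn_lfc TEXT)")
    conn.executemany("INSERT INTO files VALUES (?, ?)", [
        ("/data/a.root", "/data/a.root"),  # consistent entry
        ("/data/b.root", None),            # SE dark file
        (None, "/data/c.root"),            # LFC dark file
    ])
    return conn


if __name__ == "__main__":
    se_dark, lfc_dark = find_dark_files(build_demo_db())
    print("SE dark files: ", se_dark)
    print("LFC dark files:", lfc_dark)
```

Keeping the scan results in one table with a nullable column per source means each class of dark file is a single indexed query, which is presumably why the SQLite format was chosen for fast search.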
US T3G
Not in Tiers of ATLAS.
– Can’t use DQ2 SS
Requirements
– Grid-enabled SE
    SRM or plain GridFTP server
– Must still register with OSG OIM
Differences from GS
– No need to accept the ATLAS production proxy
– No tests to pass
Data Tools in US T3G
Use existing tools as much as possible.
– Extend them for future use
dq2-get and dq2-ls
– dq2-get
    Plugins to use transfer tools other than lcg-cp
    – FTS plugin
      » Allows third-party transfers between two remote SEs
          Supports SRM as well as GridFTP
      » Allows queuing
      » Avoids chaotic lcg-cp transfers
      » The new dq2-client package will include this plugin by default
          The newest version is available at
          svn checkout
          Browser download at pname=dq2plugin
Data Tools in US T3G (continued, I)
dq2-get and dq2-ls
– dq2-get
    Global name space
    – The DQ2 client developers are currently working on the change.
    – Store files under the global name space (LFC name space)
      » Same as the LFC name space used in ATLAS production
      » Example
          DSN: data10_7TeV physics_JetTauEtmiss.merge.NTUP_JETMET.f293_p209_tid172219_00
          LFN: NTUP_JETMET _ root.1
          LFC LFN in the global name space:
          /grid/atlas/dq2/data10_7TeV/NTUP_JETMET/f293_p209/data10_7TeV physics_JetTauEtmiss.merge.NTUP_JETMET.f293_p209_tid172219_00/NTUP_JETMET _ root.1
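The mapping from a DSN and LFN to a global-name-space path can be sketched as below. The rule is inferred from the single example on the slide (project, data type, and version with the `_tid…` suffix removed, followed by the full DSN and the LFN) and may not match the actual DQ2 client logic. The function name, the six-field DSN assumption, and the run number and LFN used in the usage lines are hypothetical placeholders.

```python
import posixpath
import re

LFC_BASE = "/grid/atlas/dq2"  # prefix taken from the example on the slide


def global_namespace_path(dsn, lfn, base=LFC_BASE):
    """Build <base>/<project>/<datatype>/<version>/<DSN>/<LFN>.

    Assumes a six-field DSN of the form
    project.run.stream.prodstep.datatype.version (an assumption, inferred
    from the slide example).
    """
    fields = dsn.split(".")
    if len(fields) != 6:
        raise ValueError("expected a six-field DSN, got %d fields" % len(fields))
    project = fields[0]
    datatype = fields[4]
    # Drop a trailing task suffix such as "_tid172219_00" from the version tag.
    version = re.sub(r"_tid\d+(_\d+)?$", "", fields[5])
    return posixpath.join(base, project, datatype, version, dsn, lfn)


if __name__ == "__main__":
    # Placeholder run number and LFN; not real ATLAS identifiers.
    dsn = ("data10_7TeV.00000000.physics_JetTauEtmiss.merge."
           "NTUP_JETMET.f293_p209_tid172219_00")
    print(global_namespace_path(dsn, "example_file.root.1"))
```

Because the path is a pure function of the DSN and LFN, any client can compute where a file should live without consulting a catalog.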
Data Tools in US T3G (continued, II)
dq2-get and dq2-ls
– dq2-get
    Global name space
    – Use of the SE as a file catalog
      » T3G has no LFC
      » Easy extension to other file transfer mechanisms
          XRootD FRM
            Find/transfer files from remote FRMs automatically
          http(s)
            Many SEs (dCache/BeStMan/DPM) support http/https now or will in the future
            Make a new http plugin
Data Tools in US T3G (continued, III)
dq2-get and dq2-ls
– dq2-ls
    Global name space
    – dq2-ls currently requires an LFC to find physical files
    – T3G has no LFC
    – dq2-ls will find physical files on the local (remote?) SE according to the global name space.
      » The DQ2 developers are currently working on the change
Data Management at T3G
T3 space must be managed by T3 administrators with minimal help from T2s/T1
No central replica catalog
No semi-central LFC
No need to synchronize
– Just delete files from SEs as needed.
– All files in a given dataset are stored in one particular directory according to the global name space.
    DQ2 operation              Local equivalent
    delete-replica DSN         rm -rf /A/B/…/DSN
    List-datasets-site SITE    ls -R /base-data-directory
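Because each dataset occupies one directory, the two operations above reduce to a directory walk and a recursive delete. The following is a minimal sketch, assuming the SE name space is visible as a POSIX-mounted file system and that a dataset directory is any directory that directly contains files. The function names and that detection rule are assumptions, not part of any DQ2 tool.

```python
import os
import shutil


def list_datasets(base_dir):
    """Return sorted names of directories under base_dir that hold files."""
    datasets = []
    for dirpath, _dirnames, filenames in os.walk(base_dir):
        if filenames:
            datasets.append(os.path.basename(dirpath))
    return sorted(datasets)


def delete_dataset(base_dir, dsn):
    """Remove every directory named dsn under base_dir; return removed paths."""
    removed = []
    for dirpath, dirnames, _filenames in os.walk(base_dir):
        if dsn in dirnames:
            target = os.path.join(dirpath, dsn)
            shutil.rmtree(target)
            dirnames.remove(dsn)  # do not descend into the removed directory
            removed.append(target)
    return removed
```

With no catalog to update, deleting the directory is the whole operation; nothing has to be synchronized afterwards.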
Thoughts on Global Name Space
Great way to avoid a local catalog
– Cons: possible performance issue on the SE when listing files?
Expands the methods available to access files
– XRootD FRM
– http/https:
    http/https is much easier and has wide standard support
    Everyone knows how to use a browser
    Many clients:
    – wget works everywhere.
    – Aria2 (
      » Segmented download
          Stop and restart a transfer in the middle
          Use of multiple source sites for a single file
          Use of multiple streams from a single source host per file
          Use of multiple downloads.
      » Casual test:
          wget at 4 MB/s
          aria2 at 60 MB/s
    – LFC+dCache+http demos at
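Segmented download over http/https relies on HTTP range requests: the client asks for disjoint byte ranges of one file, possibly from several hosts, and reassembles them. The sketch below shows only the range arithmetic, under the assumption that the file size is known in advance. It is not aria2's algorithm, and the function names are hypothetical.

```python
def byte_ranges(size, segments):
    """Split a file of `size` bytes into up to `segments` inclusive ranges."""
    if size <= 0 or segments <= 0:
        raise ValueError("size and segments must be positive")
    segments = min(segments, size)  # never create empty segments
    base, extra = divmod(size, segments)
    ranges = []
    start = 0
    for i in range(segments):
        # The first `extra` segments carry one additional byte each.
        length = base + (1 if i < extra else 0)
        end = start + length - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_header(start, end):
    """HTTP Range header for one segment (both byte offsets are inclusive)."""
    return {"Range": "bytes=%d-%d" % (start, end)}


if __name__ == "__main__":
    for start, end in byte_ranges(10, 3):
        print(range_header(start, end))
```

Each range can be fetched on its own connection, which is one plausible reason a multi-stream client outperforms a single-stream one in a casual test like the one quoted above.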