BNL Wide Area Data Transfer for RHIC & ATLAS: Experience and Plans
Bruce G. Gibbard
CHEP 2006, Mumbai, India
Introduction
- The scale of computing required by modern High Energy and Nuclear Physics experiments cannot be met by single institutions, funding agencies, or even countries
- Grid computing, integrating widely distributed resources into a seamless facility, is the solution of choice
- A critical aspect of such Grid computing is the ability to move massive data sets over great distances in near real time
  - High-bandwidth wide area transfer rates
  - Long-term sustained operations
Specific Needs at Brookhaven
- HENP computing at BNL
  - Tier 0 center for the Relativistic Heavy Ion Collider: the RHIC Computing Facility (RCF)
  - US Tier 1 center for the ATLAS experiment at the CERN LHC: the ATLAS Computing Facility (ACF)
- RCF requires data transfers to collaborating facilities
  - Such as the RIKEN center in Japan
- ACF requires data transfers from CERN and onward to ATLAS Tier 2 centers (universities)
  - Such as Boston, Chicago, Indiana, Texas/Arlington
BNL Staff Involved
- Those involved in this work at BNL were members of the RHIC and ATLAS Computing Facility and of the PHENIX and ATLAS experiments:
  M. Chiu, W. Deng, B. Gibbard, Z. Liu, S. Misawa, D. Morrison, R. Popescu, M. Purschke, O. Rind, J. Smith, Y. Wu, D. Yu
- Though not named here, there were of course similar contributing teams at the far ends of these transfers: CERN, RIKEN, Chicago, Boston, Indiana, Texas/Arlington
PHENIX Transfer of Polarized Proton Data to the RIKEN Computing Facility in Japan
- Near real time
  - In particular, the data are not staged to tape first, so no added tape retrieval is required
  - The transfer should end very shortly after the end of the RHIC run
- Carried out for part of the RHIC run in 2005 (~270 TB); a rough rate estimate is sketched below
- Planned again for the RHIC run in 2006
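To give a feel for the sustained rate that moving ~270 TB in near real time implies, here is a minimal back-of-the-envelope sketch; the ~12-week run length is an assumed figure used only for illustration, not taken from the slides.

```python
# Rough sustained-rate estimate for the 2005 PHENIX transfer to RIKEN CCJ.
# The ~270 TB volume is from the slide; the 12-week window is assumed.
TOTAL_TB = 270
WEEKS = 12                                  # assumed run length (illustrative)
seconds = WEEKS * 7 * 24 * 3600

rate_mb_s = TOTAL_TB * 1e12 / seconds / 1e6
print(f"~{rate_mb_s:.0f} MB/s sustained")   # roughly 35-40 MB/s for these assumptions
```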
Typical Network Activity During PHENIX Data Transfer
(figure only; no further text on this slide)
For ATLAS: (W)LCG Exercises
- Service Challenge 3
  - Throughput phase (WLCG and the computing sites develop, tune, and demonstrate data transfer capacities): July '05, rerun in Jan '06
- Service Challenge 4
  - To begin in April 2006
BNL ATLAS dCache/HPSS Based SE
(architecture diagram showing: read and write pools; DCap doors, an SRM door, and GridFTP doors; PnfsManager and PoolManager; HPSS as the tape back end; separate control and data channels; and DCap, GridFTP, and SRM clients plus the Oak Ridge batch system; a client-side sketch follows)
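As an illustration of how a client might pull a single file through the GridFTP door of such a storage element, here is a minimal sketch; the door host name and the paths are hypothetical placeholders, and any site-specific options are omitted.

```python
# Minimal sketch: fetch one file from a dCache storage element through its
# GridFTP door by shelling out to globus-url-copy. The door host name and the
# paths below are hypothetical, not the actual BNL endpoints.
import subprocess

SRC = "gsiftp://dcache-door.example.bnl.gov/pnfs/example/atlas/sc3/file0001.dat"
DST = "file:///data/scratch/file0001.dat"

def fetch(src: str, dst: str) -> None:
    """Run a single GridFTP copy; raises CalledProcessError on failure."""
    subprocess.run(["globus-url-copy", src, dst], check=True)

if __name__ == "__main__":
    fetch(SRC, DST)
```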
Disk to Disk Phase of SC3
- Transfer rates of up to 150 MB/sec were achieved during early standalone operations
- This was achieved even though FTS (the transfer manager) failed to properly support dCache SRMCP, degrading the performance of the BNL Tier 1 dCache-based storage element
Overall CERN Operations During the Disk to Disk Phase
- Saturation of the network connection at CERN required throttling of individual site performances
Disk to Tape Phase
(figure only; no further text on this slide)
dCache Activity During the Disk to Tape Phase
- Tape writing phase (see figure)
  - Green indicates incoming data
  - Blue indicates data being migrated out to HPSS, the tape storage system
- Rates of 60-80 MBytes/sec were sustained
SC3 T1 - T2 Exercises
- Transfers to 4 Tier 2 sites (Boston, Chicago, Indiana, Texas/Arlington) yielded aggregate rates of up to 40 MB/sec, but typically ~15 MB/sec, and were quite inconsistent
- The Tier 2 sites only supported GridFTP on classic storage elements and were not prepared to support sustained operations
Potential Network Contention
- BNL has been operating with an OC48 ESnet WAN connection and 2 x 1 Gb/sec connectivity over to the ATLAS/RHIC network fabric
- Competing flows (see figure, and the rough capacity check below): the sustained PHENIX transfer to the RIKEN CCJ and the ATLAS Service Challenge tests
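As a rough check of how much of that internal path the two flows would occupy, here is a minimal sketch; the 150 MB/s figure is the SC3 disk-to-disk rate quoted earlier, while the PHENIX figure is the assumed ~37 MB/s estimate from the earlier sketch, not a measured rate.

```python
# Rough contention check on the 2 x 1 Gb/s path to the RHIC/ATLAS network fabric.
# The SC3 disk-to-disk rate (150 MB/s) is from an earlier slide; the PHENIX rate
# is the assumed estimate from the earlier sketch, not a measurement.
link_mb_s = 2 * 1e9 / 8 / 1e6      # 2 x 1 Gb/s expressed in MB/s (~250 MB/s)
sc3_mb_s = 150                     # ATLAS Service Challenge disk-to-disk rate
phenix_mb_s = 37                   # assumed sustained PHENIX rate

load = sc3_mb_s + phenix_mb_s
print(f"capacity ~{link_mb_s:.0f} MB/s, combined load ~{load} MB/s "
      f"({100 * load / link_mb_s:.0f}% of capacity)")
```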
Network Upgrade
- ESnet OC48 WAN connectivity is being upgraded to 2 x
- BNL site connectivity from the border router to the RHIC/ATLAS facility is being upgraded to redundant 20 Gb/sec paths
- Internally, in place of the previous channel bonding:
  - ATLAS switches are being redundantly connected at 20 Gb/sec
  - RHIC switches are being redundantly connected at 10 Gb/sec
- All of this will be complete by the end of this month
RHIC/PHENIX Plans '06
- RHIC will run again this year with polarized protons, so the data will again be transferred to the RIKEN center in Japan
- Data taking rates will be somewhat higher, with a somewhat better duty factor, so the transfer may have to support rates as much as a factor of two higher
- Such running is likely to begin in early March
- Expect to use SRM for the transfer, rather than just GridFTP, for additional robustness (a sketch of such an SRM-managed copy follows)
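As an illustration of the SRM-managed style of copy this implies, here is a minimal sketch using the dCache srmcp client to request a copy between two SRM endpoints; the host names and paths are hypothetical placeholders, not the actual PHENIX or CCJ services.

```python
# Minimal sketch of an SRM-to-SRM copy of the kind srmcp performs; the SRM
# layer negotiates the underlying transfer protocol (e.g. GridFTP).
# Host names and paths are hypothetical placeholders.
import subprocess

SRC = "srm://se.example.bnl.gov:8443/pnfs/example/phenix/run6/seg0001.dat"
DST = "srm://se.example.riken.jp:8443/pnfs/example/phenix/run6/seg0001.dat"

def srm_copy(src: str, dst: str) -> None:
    """Ask srmcp to carry out the copy; raises CalledProcessError on failure."""
    subprocess.run(["srmcp", src, dst], check=True)

if __name__ == "__main__":
    srm_copy(SRC, DST)
```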
WLCG Service Challenge 4
- Service challenge transfer goals are the nominal real transfer rates required by ATLAS to the US Tier 1 in the first years of LHC operation
  - 200 MB/sec (disk at CERN to tape at BNL); the implied volume is worked out below
  - Disk to disk to begin in April, with disk to tape to follow as soon as possible
  - The BNL Tier 1 expects to be ready with its new tape system in April to do disk to tape
  - BNL is planning on being able to use dCache SRMCP in these transfers
- Tier 2 exercises at a much more serious level are anticipated, using dCache/SRM on the storage elements
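For scale, a minimal sketch of what sustaining the 200 MB/sec disk-to-tape target amounts to over time (decimal units; the rate is the only input taken from the slide):

```python
# Data volume implied by sustaining the 200 MB/s SC4 disk-to-tape target.
rate_mb_s = 200
per_day_tb = rate_mb_s * 1e6 * 86400 / 1e12   # ~17 TB per day
per_week_tb = per_day_tb * 7                  # ~121 TB per week
print(f"{per_day_tb:.1f} TB/day, {per_week_tb:.0f} TB/week sustained")
```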
Conclusions
- Good success to date in both the ATLAS exercises and RHIC real operations
- A new round with significantly higher demands begins within the next 1-2 months
- Upgrades of the network, storage elements, tape systems, and storage element interfacing should make it possible to satisfy these demands