Presentation is loading. Please wait.

Presentation is loading. Please wait.

Scalla As a Full-Fledged LHC Grid SE Wei Yang, SLAC Andrew Hanushevsky, SLAC Alex Sims, LBNL Fabrizio Furano, CERN SLAC National Accelerator Laboratory.

Similar presentations


Presentation on theme: "Scalla As a Full-Fledged LHC Grid SE Wei Yang, SLAC Andrew Hanushevsky, SLAC Alex Sims, LBNL Fabrizio Furano, CERN SLAC National Accelerator Laboratory."— Presentation transcript:

1 Scalla As a Full-Fledged LHC Grid SE Wei Yang, SLAC Andrew Hanushevsky, SLAC Alex Sims, LBNL Fabrizio Furano, CERN SLAC National Accelerator Laboratory Stanford University 24-March-09 CHEP

2 CHEP 24-March-20092 Outline The canonical Storage Element Scallaxrootd Scalla/xrootd integration with SE components GridFTP GridFTP Cluster I/O BeStMan BeStMan SRM Name space issues Static Space Tokens Conclusions Future Directions Acknowledgements

3 CHEP 24-March-20093 The Canonical Storage Element S PaPaPaPa PgPgPgPg S PaPaPaPa PgPgPgPg dcap fs-I/O nfs rfio xroot p Application protocol for random I/O PaPaPaPa Grid protocol for sequential bulk I/O PgPgPgPg S PaPaPaPa PgPgPgPg File Servers SRMmanagesGrid-SEtransferssrmGridFTPftpd Data Storage Bottom Up View

4 CHEP 24-March-20094 Distinguishing SE Components SRM SRM (Storage Resource Manager v2+) Only two independent * version available StoRMStoRM Storage Resource Manager (StoRM) http://storm.forge.cnaf.infn.it/ BeStManBeStMan Berkeley Storage Manager (BeStMan) http://datagrid.lbl.gov/bestman/ Both are Java based and implement SRM v2.2GridFTP Only one de facto version available Globus GridFTP http://www.globus.org/grid_software/data/gridftp.php *Castor, dCache, DPM, Jasmine, L-Store, LBNL/DRM/HRM, and SRB SRM’s are tightly integrated with the underlying system.

5 CHEP 24-March-20095 Which SRM? BeStMan We went with BeStMan LBNL developers practically next door Needed integration assistance Address file get/put performance issues BeStMan-Gateway LBNL team developed BeStMan-Gateway Implementation of WLCG token specification Stripped down SRM for increased throughput Sustained performance ~ 7 gets/sec & ~ 5.6 puts/sec BeStMan Original BeStMan 1 ~ 1.5 gets/sec & 0.5 ~ 1 puts/sec Perhaps the fastest SRM available today

6 CHEP 24-March-20096 xrootdcmsd xrootd/cmsd The Integration Task GridFTP SRM You might mistakenly think this is simple! The SE StorageSystem

7 CHEP 24-March-20097 Integration Issues (why it’s not simple) Scalla xrootd Scalla/ xrootd is not inherently SRM friendly SRM relies on a true file system view of the cluster Scalla xrootd Scalla/ xrootd was not designed to be a file system! Architecture and meta-data is highly distributed Performance & scalability trump full file system semantics The Issues... GridFTP GridFTP I/O access to the cluster SRM’s view of the cluster’s name space WLCG Static Space Tokens

8 CHEP 24-March-20098 xrootdcmsd xrootd/cmsd ADAPTER GridFTP Integration Phase I ( GridFTP ) GridFTP SRM xrootd access Source Adapter: POSIX Preload Library for xrootd access Provides full high-speed cluster access via POSIX calls GridFTP GridFTP positioning can be more secure! Fire Wall We still have an SRM problem! Source adapters generally won’t work with Java.

9 CHEP 24-March-20099 xrootdcmsd xrootd/cmsd ADAPTER BeStMan SRM Integration Phase II ( BeStMan SRM ) GridFTP Target Adapter: File System in User Space (FUSE) xrootd Full POSIX file system based on XrdClient called xrootdFS BeStMan Interoperates with BeStMan and probably StoRM FUSE SRM We still need to solve the srmLS problem and accept space tokens! Fire Wall

10 CHEP 24-March-200910 The srmLS Problem & Solution The SRM needs full view of the complete name space SRM simply assumes a central name space exists Scallaxrootd Scalla/xrootd distributes the name space across all servers There is no central name space whatsoever! Solution: create a “central” shadow name space Shadow name space ≡ ∑ cluster name space xrootdcnsd Uses existing xrootd mechanisms + cnsd daemons (i.e., no database) This satisfies srmLS requirements Easily accessed via FUSE

11 CHEP 24-March-200911 cnsd The Composite Name Space ( cnsd ) Redirector xrootd@myhost:1094 Name Space xrootd@myhost:2094 DataServers Manager cnsd ofs.notify closew, create, mkdir, mv, rm, rmdir |/opt/xrootd/bin/cnsd cnsd rexecutes open/trunc mkdir mv rm rmdir opendir() refers to the directory structure maintained by xrootd:2094 (full cluster name space) Clientmyhost

12 CHEP 24-March-200912 Composite Name Space Actions xrootd All name space actions sent to designated xrootd’s Local xrootdcnsd Local xrootd’s use an external local process named cnsd cnsd cnsd name space operations done in the background Neither penalizes nor serializes the data server xrootd Designated xrootd’s maintain composite name space Typically, these run on the redirector nodes Distributed name space can now be concentrated No external database needed Small disk footprint Well known locations for find complete name space

13 CHEP 24-March-200913 xrootdcmsd xrootd/cmsd The 10,000 Meter View FUSE ADAPTER FUSE GridFTP Fire Wall SRM cnsd A cnsd runs on each data server node communicating xrootd to an extra xrootd server running on the redirector node cnsd /cnsd

14 CHEP 24-March-200914 SRM Static Space Tokens Encapsulate fixed space characteristics Type of space E.g., Permanence, performance, etc. Implies a specific quota Using an arbitrary pre-defined name E.g., atlasdatadisk, atlasmcdisk, atlasuserdisk, etc. Typically used to create new files Think of it as a space profile Space tokens required by “some” LHC experiments E.g. Atlas

15 CHEP 24-March-200915 Static Space Token (SST) Paradigm Static space tokens map well to disk partitions A set of partitions define a set of space attributes Performance, quota, etc. Since an SST defines a set of space attributes Then partitions and SST’s are interchangeable Why do we care? xrootd Because partitions are natively supported by xrootd

16 CHEP 24-March-200916 Supporting Static Space Tokens xrootd We leverage xrootd’s built-in partition manager Just map space tokens on a set of named partitions xrootd xrootd supports real and virtual partitions Automatically tracks usage by named partition Allows for quota management (real → hard & virtual → soft quota) Since Partitions  SRM Space Tokens Usage is also automatically tracked by space token getxattr() returns token & usage information Available through FUSE and POSIX Preload Library See Linux & MacOS man pages

17 CHEP 24-March-200917 Integration Recap GridFTP Using POSIX preload library (source adapter) BeStMan SRM (BeStMan) Cluster access using FUSE (target adapter) srmLS support cnsdxrootd Using distributed cnsd’s + central xrootd processes Static space token support xrootd Using the built-in xrootd partition manager

18 CHEP 24-March-200918 xrootdcmsdcnsd xrootd/cmsd/cnsd The Scalla/xrootd SE FUSE ADAPTER FUSE GridFTP Fire Wall SRM But wait! Can’t we replace the source adapter with the target adapter Why not use FUSE for the complete suite?

19 CHEP 24-March-200919 Because Simpler May Be Slower Currently, FUSE I/O performance is limited Always enforces a 4k transfer block size Solutions? Wait until corrected in a post 2.6 Linux kernel xrootd Use the next SLAC xrootdFS release Improved I/O via smart read-ahead and buffering xrootd Use Andreas Peters’, CERN xrootdFS Fixes applied to significantly increase transfer speed GridFTP Just use the Posix Preload Library with GridFTP You will get the best possible performance

20 CHEP 24-March-200920 Conclusions Scallaxrootd Scalla/xrootd is a solid base for an SE Works well; is easy to install and configure Successfully deployed at many sites Optimal for most Tier 2 and Tier 3 installations Distributed as part of the OSG VDT FUSE provides a solution to many problems But, performance limits constrain its use

21 CHEP 24-March-200921 Future Directions More simplicity! cnsdcmsd Integrating the cnsd into cmsd Reduces configuration issues Pre-linking the extended open file system (ofs) Less configuration options Tutorial-like guides! Apparent need as we deploy at smaller sites

22 CHEP 24-March-200922 Acknowledgements Software Contributors Alice: Derek Feichtinger CERN: Fabrizio Furano, Andreas Peters Fermi: Tony Johnson (Java) Root: Gerri Ganis, Beterand Bellenet, Fons Rademakers STAR/BNL: Pavel Jackl SLAC: Jacek Becla, Tofigh Azemoon, Wilko Kroeger BeStMan LBNL: Alex Sim, Junmin Gu, Vijaya Natarajan (BeStMan team) Operational Collaborators BNL, FZK, IN2P3, RAL, UVIC, UTA Partial Funding US Department of Energy Contract DE-AC02-76SF00515 with Stanford University


Download ppt "Scalla As a Full-Fledged LHC Grid SE Wei Yang, SLAC Andrew Hanushevsky, SLAC Alex Sims, LBNL Fabrizio Furano, CERN SLAC National Accelerator Laboratory."

Similar presentations


Ads by Google