Scalla As a Full-Fledged LHC Grid SE
Wei Yang, SLAC
Andrew Hanushevsky, SLAC
Alex Sim, LBNL
Fabrizio Furano, CERN
SLAC National Accelerator Laboratory, Stanford University
CHEP, 24-March-09

Outline
- The canonical Storage Element
- Scalla/xrootd integration with SE components
  - GridFTP cluster I/O
  - BeStMan SRM
  - Name space issues
  - Static space tokens
- Conclusions
- Future directions
- Acknowledgements

The Canonical Storage Element (bottom-up view)
[Diagram] File servers (S) sit on top of the data storage. Each offers an application protocol Pa for random I/O (dcap, fs-I/O, nfs, rfio, xroot) and a grid protocol Pg for sequential bulk I/O (GridFTP/ftpd). The SRM manages grid-SE transfers.

Distinguishing SE Components
- SRM (Storage Resource Manager v2+)
  - Only two independent* versions available:
    - StoRM (Storage Resource Manager)
    - BeStMan (Berkeley Storage Manager)
  - Both are Java based and implement SRM v2.2
- GridFTP
  - Only one de facto version available: Globus GridFTP
*The Castor, dCache, DPM, Jasmine, L-Store, LBNL/DRM/HRM, and SRB SRMs are tightly integrated with their underlying systems.

Which SRM?
- We went with BeStMan
  - LBNL developers practically next door
  - Needed integration assistance
  - Needed to address file get/put performance issues
- The LBNL team developed BeStMan-Gateway
  - Implements the WLCG space token specification
  - A stripped-down SRM for increased throughput
  - Sustained performance: ~7 gets/sec and ~5.6 puts/sec
    (original BeStMan: 1~1.5 gets/sec and 0.5~1 puts/sec)
  - Perhaps the fastest SRM available today

The Integration Task
[Diagram] The SE storage system: SRM and GridFTP layered on top of an xrootd/cmsd cluster. You might mistakenly think this is simple!

Integration Issues (why it's not simple)
- Scalla/xrootd is not inherently SRM friendly
  - SRM relies on a true file system view of the cluster
  - Scalla/xrootd was not designed to be a file system!
    - Architecture and meta-data are highly distributed
    - Performance and scalability trump full file system semantics
- The issues:
  - GridFTP I/O access to the cluster
  - SRM's view of the cluster's name space
  - WLCG static space tokens

Integration Phase I (GridFTP)
- Source adapter: the POSIX preload library for xrootd access
  - Provides full high-speed cluster access via ordinary POSIX calls (see the sketch below)
  - GridFTP positioning can be more secure: it can sit at the firewall, in front of the xrootd/cmsd cluster
- We still have an SRM problem!
  - Source adapters generally won't work with Java.
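As an illustration (not from the talk), here is a minimal C sketch of the source-adapter idea: the program uses only standard POSIX calls, and when run with the xrootd POSIX preload library loaded (e.g. LD_PRELOAD=libXrdPosixPreload.so), those calls are routed to the cluster. The redirector host, port, and file path below are hypothetical.

    /* posix_preload_demo.c -- plain POSIX I/O; with the preload library
       loaded, the open()/read() below go to the xrootd cluster. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[4096];

        /* Hypothetical redirector URL; the preload library intercepts
           open() and resolves the path against the cluster. */
        int fd = open("root://redirector.example.org:1094//atlas/data/file.root",
                      O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        ssize_t n = read(fd, buf, sizeof(buf));  /* ordinary POSIX read */
        if (n >= 0) printf("read %zd bytes\n", n);

        close(fd);
        return 0;
    }

This is the sense in which a GridFTP server gains cluster access without any xrootd-specific code of its own.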

Integration Phase II (BeStMan SRM)
- Target adapter: Filesystem in Userspace (FUSE)
  - A full POSIX file system based on XrdClient, called xrootdFS
  - Interoperates with BeStMan and probably StoRM
- We still need to solve the srmLS problem and accept space tokens!
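For illustration, a hypothetical xrootdFS mount invocation is sketched below; the mount point, redirector URL, and option names are assumptions and should be checked against the xrootdFS documentation for the release in use.

    xrootdfs -o rdr=root://redirector.example.org:1094//atlas,uid=daemon /mnt/xrootdfs

Once mounted, BeStMan can be pointed at /mnt/xrootdfs and treat the cluster as an ordinary POSIX file system.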

The srmLS Problem & Solution
- The SRM needs a full view of the complete name space
  - SRM simply assumes a central name space exists
  - Scalla/xrootd distributes the name space across all servers
    - There is no central name space whatsoever!
- Solution: create a "central" shadow name space
  - Shadow name space ≡ ∑ server name spaces across the cluster
  - Uses existing xrootd mechanisms + cnsd daemons (i.e., no database)
  - This satisfies srmLS requirements
  - Easily accessed via FUSE

The Composite Name Space (cnsd)
[Diagram] A client (myhost) contacts the redirector (manager); each data server runs a local cnsd. The data servers are configured with
  ofs.notify closew, create, mkdir, mv, rm, rmdir | /opt/xrootd/bin/cnsd
and the cnsd re-executes open/trunc, mkdir, mv, rm, and rmdir against a name-space xrootd (port 2094) on the redirector node. opendir() refers to the directory structure maintained by that xrootd (the full cluster name space).

Composite Name Space Actions
- All name space actions are sent to designated xrootd's
  - Local xrootd's use an external local process named cnsd
    - cnsd name space operations are done in the background
    - Neither penalizes nor serializes the data server
- Designated xrootd's maintain the composite name space
  - Typically, these run on the redirector nodes
- The distributed name space can now be concentrated
  - No external database needed
  - Small disk footprint
  - Well-known locations to find the complete name space
(A configuration sketch follows.)
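A minimal configuration sketch of the pieces described above, with hypothetical paths and hostnames:

    # On each data server: have xrootd hand name-space events to the
    # local cnsd daemon (directive as shown on the previous slide).
    ofs.notify closew, create, mkdir, mv, rm, rmdir | /opt/xrootd/bin/cnsd

    # On the redirector node: run an extra xrootd instance (port 2094 in
    # the diagram) that holds the composite name space replayed by the
    # cnsd's. How each cnsd is pointed at this instance is
    # installation-specific; consult the cnsd documentation.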

The 10,000 Meter View
[Diagram] SRM (via FUSE) and GridFTP (via the adapter) sit at the firewall in front of the xrootd/cmsd cluster. A cnsd runs on each data server node, communicating with an extra xrootd server running on the redirector node.

SRM Static Space Tokens
- Encapsulate fixed space characteristics
  - Type of space (e.g., permanence, performance, etc.)
  - Implies a specific quota
  - Carries an arbitrary pre-defined name (e.g., atlasdatadisk, atlasmcdisk, atlasuserdisk, etc.)
  - Typically used to create new files
  - Think of it as a space profile
- Space tokens are required by "some" LHC experiments (e.g., ATLAS)

Static Space Token (SST) Paradigm
- Static space tokens map well to disk partitions
  - A set of partitions defines a set of space attributes (performance, quota, etc.)
- Since an SST also defines a set of space attributes,
  partitions and SSTs are interchangeable
- Why do we care?
  - Because partitions are natively supported by xrootd (see the sketch below)
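A minimal sketch of the token-to-partition mapping, with hypothetical mount points; the directive name should be checked against the xrootd release in use (newer releases call it oss.space, older ones oss.cache):

    # Map each WLCG space token onto one or more named partitions.
    oss.space atlasdatadisk /store1/atlasdatadisk
    oss.space atlasdatadisk /store2/atlasdatadisk   # a token may span partitions
    oss.space atlasmcdisk   /store1/atlasmcdisk
    oss.space atlasuserdisk /store1/atlasuserdisk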

Supporting Static Space Tokens
- We leverage xrootd's built-in partition manager
  - Just map space tokens onto a set of named partitions
- xrootd supports real and virtual partitions
  - Automatically tracks usage by named partition
  - Allows for quota management (real → hard quota, virtual → soft quota)
- Since partitions ≡ SRM space tokens,
  usage is also automatically tracked by space token
- getxattr() returns token & usage information (see the example below)
  - Available through FUSE and the POSIX preload library
  - See the Linux & MacOS man pages
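For illustration, a minimal Linux C sketch of querying space information through an xrootdFS mount; the mount path and the attribute name "xroot.space" are assumptions, so consult the xrootdFS documentation for the actual attribute names.

    /* getxattr_demo.c -- query space token/usage info via extended
       attributes on a FUSE-mounted xrootdFS path (Linux signature). */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/xattr.h>

    int main(void)
    {
        char val[256];

        /* Hypothetical mount path and attribute name. */
        ssize_t n = getxattr("/mnt/xrootdfs/atlas", "xroot.space",
                             val, sizeof(val) - 1);
        if (n < 0) { perror("getxattr"); return 1; }

        val[n] = '\0';
        printf("space info: %s\n", val);
        return 0;
    }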

Integration Recap
- GridFTP: using the POSIX preload library (source adapter)
- SRM (BeStMan): cluster access using FUSE (target adapter)
- srmLS support: using distributed cnsd's + central xrootd processes
- Static space token support: using the built-in xrootd partition manager

The Scalla/xrootd SE
[Diagram] SRM (via FUSE) and GridFTP (via the adapter) at the firewall in front of the xrootd/cmsd/cnsd cluster. But wait! Can't we replace the source adapter with the target adapter? Why not use FUSE for the complete suite?

Because Simpler May Be Slower
- Currently, FUSE I/O performance is limited
  - It always enforces a 4K transfer block size
- Solutions?
  - Wait until this is corrected in a post-2.6 Linux kernel
  - Use the next SLAC xrootdFS release (improved I/O via smart read-ahead and buffering)
  - Use Andreas Peters' (CERN) xrootdFS (fixes applied to significantly increase transfer speed)
  - Just use the POSIX preload library with GridFTP; you will get the best possible performance

Conclusions
- Scalla/xrootd is a solid base for an SE
  - Works well; is easy to install and configure
  - Successfully deployed at many sites
  - Optimal for most Tier 2 and Tier 3 installations
  - Distributed as part of the OSG VDT
- FUSE provides a solution to many problems
  - But performance limits constrain its use

Future Directions
- More simplicity!
  - Integrating the cnsd into the cmsd (reduces configuration issues)
  - Pre-linking the extended open file system (ofs) (fewer configuration options)
- Tutorial-like guides!
  - An apparent need as we deploy at smaller sites

Acknowledgements
- Software contributors
  - ALICE: Derek Feichtinger
  - CERN: Fabrizio Furano, Andreas Peters
  - Fermi: Tony Johnson (Java)
  - ROOT: Gerri Ganis, Bertrand Bellenet, Fons Rademakers
  - STAR/BNL: Pavel Jakl
  - SLAC: Jacek Becla, Tofigh Azemoon, Wilko Kroeger
  - LBNL: Alex Sim, Junmin Gu, Vijaya Natarajan (BeStMan team)
- Operational collaborators: BNL, FZK, IN2P3, RAL, UVIC, UTA
- Partial funding: US Department of Energy Contract DE-AC02-76SF00515 with Stanford University