
Xrootd, XrootdFS and BeStMan
Wei Yang
US ATLAS Tier 3 meeting, ANL, 2009-10-30

- Xrootd storage components
- How Xrootd works
- What is XrootdFS
- How to access Xrootd storage
  - Interactively
  - From ATLAS jobs
- BeStMan

Storage Architecture (diagram)
- Redirector: xrootd/cmsd
- Name space: a separate xrootd instance (CNS)
- Data servers: xrootd/cmsd plus XrdCnsd, on the internal network
- Border machines: BeStMan-G with XrootdFS; GridFTP with the Xrootd DSI module or XrootdFS
- End users access the storage through XrootdFS; batch nodes use TXNetFile or "cp"
- SURLs are handled by BeStMan, TURLs by GridFTP; remote sites and the outside world connect through the border machines

Storage Components
- BeStMan Gateway
  - For T2/T3g
- XrootdFS
  - For users and a minimal T3g; usage is like NFS
  - Based on the Xrootd POSIX library and FUSE
  - BeStMan, dq2 clients, and Unix tools need it
- GridFTP for Xrootd
  - In use at WT2 for a while
  - Globus GridFTP + Data Storage Interface (DSI) module for Xrootd/POSIX
- Xrootd core
  - Redirector, data servers, xrdcp
  - All BaBar needed was this layer; the layers above add more functionality

How Xrootd works
- A redirector glues the file servers together
- Users only need to know the XROOT path: root://redirector:port//path/file
- Simple, low overhead
  - No complex features such as locking
  - Good for read-dominated environments, e.g. HEP data analysis
- Lookup protocol, as illustrated in the slide's diagram (U = user, R = redirector, D = data node; files /xrootd/file1 ... /xrootd/fileN are spread over data nodes 1..N):
  - U: I want /xrootd/file2. R: Who has /xrootd/file2? D: I have /xrootd/file2. R: Please talk to data node 2.
  - U: I want /xrootd/file0. R: Who has /xrootd/file0? The redirector waits for a reply from the data nodes, times out, and answers: Nobody has /xrootd/file0.
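From the client side all of this is a single command; a minimal sketch, assuming the redirector is named xrootd-redirector and listens on the default port 1094:

$ xrdcp root://xrootd-redirector:1094//xrootd/file2 /tmp/file2
  # xrdcp asks the redirector for /xrootd/file2 and is redirected to the
  # data server that actually holds the file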

Xrootd Export Path, Disk Cache and Space Token ($VDT_LOCATION/xrootd/etc/xrootd.cfg)
- The Xrootd export path is what users use to access files
  all.export /xrootd  =>  root://host:port//xrootd/file
- Xrootd disk caches are hard-disk partitions storing the data files
  Filesystem  Size  Used  Avail  Use%  Mounted on
  /dev/sdb    12G   6.0G  5.0G   55%   /xrdcache01
  oss.cache public /xrdcache01
  The export path contains directories and symlinks pointing to the data files in the OSS cache
- To support WLCG static space tokens, add more cache groups
  oss.cache public /xrdcache01 xa   # "xa": extended attributes
  oss.cache tokenA /xrdcache01 xa
  Users create a file in a token using root://host:port//xrootd/file?oss.cgroup=tokenA
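For example, writing a file into the tokenA cache group only requires the oss.cgroup tag on the URL; the host name, port and target path below are illustrative:

$ xrdcp /tmp/localfile "root://host:1094//xrootd/atlas/localfile?oss.cgroup=tokenA"
  # the file lands in the tokenA cache group; without ?oss.cgroup=... it
  # would go to the default "public" group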

Composite Name Space (CNS)
- A standalone Xrootd instance, not part of the main Xrootd cluster
- The real files (/xrootd/file1 ... /xrootd/file7 in the slide's diagram) are distributed over several Xrootd data nodes behind the redirector
- The CNS holds the complete name space as empty files (with the "right size"), all on one standalone Xrootd node; they are there for directory browsing
- By default the CNS runs on the redirector

User interface to Xrootd
- TXNetFile class (C++ and ROOT CINT)
  - Fault tolerance
  - High performance through intelligent logic in TXNetFile and the server
- Command line tools
  - xrdcp: simple, native, lightweight, high performance
  - Xrootd POSIX preload library
    export LD_PRELOAD=/…/libXrdPosixPreload.so
    ls/cat/cp/file root://redirector:port//path/file
    A subset of the UNIX I/O commands works with the POSIX preload library, on files but not on directories
    Some overhead; I/O performance isn't as good as xrdcp
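A minimal interactive session with the preload library might look like this; the library path follows the /opt/xrootd/lib location used later in these slides, adjust it for your installation:

$ export LD_PRELOAD=/opt/xrootd/lib/libXrdPosixPreload.so
$ cat root://xrootd-redirector:1094//xrootd/file1           # file access works
$ cp  root://xrootd-redirector:1094//xrootd/file1 /tmp/     # file access works
$ ls  root://xrootd-redirector:1094//xrootd/                # directory operations are not supported
$ unset LD_PRELOAD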

User interface to Xrootd, cont'd
- XrootdFS, a client of Xrootd
  - Easy to use: NFS-like access to the data in Xrootd
  - Relatively expensive compared to accessing the Xrootd storage cluster directly
- (Diagram: on the client host, a user program goes through the kernel's file system layer to the XrootdFS daemon running in user space)

XrootdFS, cont'd
- File system interface for Xrootd
  - Mounts the Xrootd cluster on the client host's local file system tree
  - Provides a standard POSIX I/O interface to the Xrootd cluster
    - open(), close(), read(), write(), lseek(), unlink(), rename()
    - opendir(), closedir(), readdir(), mkdir()
- Works with most UNIX commands/tools (see the example after this list)
  - cd, ls, cp, rm, mkdir, cat, grep, find
  - ssh/sftp server, GridFTP server, SRM, xrootd server
  - scp/sftp, GridFTP clients, SRM clients, ATLAS dq2 clients
- Be aware:
  - No file locking, no ownership/protection
  - File creation delay
  - Some UNIX commands are not scalable, e.g. find, ls
  - cp used to be slow (due to small I/O block size); this is no longer true
  - Reduces the number of network connections to the Xrootd data servers
  - More overhead on I/O performance
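Once the cluster is mounted (see the start script two slides below), it behaves like a local directory; a short illustrative session, assuming the /xrootd mount point and an atlas subdirectory:

$ ls /xrootd/atlas
$ cp /xrootd/atlas/data/file1 /tmp/
$ mkdir /xrootd/atlas/user/mydir
$ rm /xrootd/atlas/user/mydir/oldfile
  # remember: no locking or ownership enforcement, and recursive tools
  # such as find can be slow on a large name space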

XrootdFS, cont'd
- (Diagram: the user talks to XrootdFS, which talks to both the Xrootd redirector and the CNS xrootd)
- Each Xrootd data server contains part of the directory tree
- The CNS has the complete directory tree, with shadow files
- XrdCnsd forwards create(), closew() and mkdir() to the CNS
- The redirector broadcasts unlink(), rename(), rmdir() and truncate()

XrootdFS Configuration ($VDT_LOCATION/xrootdfs/bin/start.sh)
XrootdFS is an Xrootd client. The following script starts XrootdFS:

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/vdt/xrootd/lib:/opt/fuse/lib
export XROOTDFS_OFSFWD=0
# export XROOTDFS_USER='daemon'
export XROOTDFS_FASTLS="RDR"
insmod /lib/modules/`uname -r`/kernel/fs/fuse/fuse.ko 2> /dev/null
export XROOTDFS_RDRURL="root://xrootd-redirector:1094//xrootd"
export XROOTDFS_CNSURL="root://CNS:2094//xrootd"   # optional for non-interactive machines
MOUNT_POINT="/xrootd"
xrootdfsd $MOUNT_POINT -o allow_other,fsname=xrootdfs,max_write=

$ df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
xrootdfs    55T   34T   22T    62%   /xrootd

Use "umount /xrootd" to stop XrootdFS.

Accessing Xrootd data from ATLAS jobs
- Copy input data from Xrootd to local disk on the worker node
  - A wrapper script using xrdcp, or cp + the Xrootd POSIX preload library (see the sketch after this slide)
  - Panda production jobs at SLACXRD work this way
- Read ROOT files directly from Xrootd storage
  - Identify ROOT files using the Unix 'file' command (with the POSIX preload library)
  - Copy non-ROOT files to local disk on the worker node
  - Put the ROOT file's xroot URL (root://…) in PoolFileCatalog.xml; Athena uses the TXNetFile class to read the ROOT file
  - ANALY_SLAC and ANALY_SWT2_CPB use this mixed access mode
  - Both modes need a set of tools for copying, deleting, file IDs and checksums
- Mount XrootdFS on all batch nodes
  - All files appear under the local file system tree; none of the above is needed
  - Untested: XrootdFS came out after the SLAC sites were established
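A minimal sketch of such a copy-in wrapper; the redirector name, port, library path and fallback logic are illustrative, not the SLACXRD production script:

#!/bin/sh
# usage: copy-in.sh /xrootd/path/to/file /local/scratch/dir
SRC="root://xrootd-redirector:1094/$1"
DST="$2/`basename "$1"`"
# try the native client first
xrdcp "$SRC" "$DST" && exit 0
# fall back to plain cp through the POSIX preload library
export LD_PRELOAD=/opt/xrootd/lib/libXrdPosixPreload.so
exec cp "$SRC" "$DST"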

BeStMan Full mode and BeStMan Gateway mode
- Full mode
  - Full implementation of SRM v2.2
  - Support for dynamic space reservation
  - Support for request queue management and space management
  - Plug-in support for mass storage systems
  - Follows the SRM v2.2 specification
- Gateway mode
  - Support for an essential subset of SRM v2.2
  - Support for pre-defined static space tokens
  - Faster performance, without queue and space management
  - Follows the SRM functionalities needed by ATLAS and CMS

BeStMan-Gateway for Xrootd Storage ($VDT_LOCATION/bestman/conf/bestman.rc)
- Stable! We tuned a few parameters
  - Java heap size (1300 MB on a 2 GB machine)
  - Recently increased the number of container threads from 5 to 25
- Make sure BeStMan-G's external dependencies are working
  - When the Xrootd servers are under stress:
    - Xrootd stat() calls take too long, resulting in HTTP or CONNECT timeouts
    - The redirector can't locate a file, resulting in "file not found"
  - Panda jobs (which do not go through the SRM interface) will also suffer
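The BeStMan SRM client tools can be used for a quick health check of the gateway; a sketch, assuming the srm-ping and srm-ls clients are installed and the endpoint is the usual BeStMan service URL (host, port and path are illustrative):

$ srm-ping srm://se.example.org:8443/srm/v2/server
$ srm-ls   "srm://se.example.org:8443/srm/v2/server?SFN=/xrootd/atlas"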

GridFTP configuration
- Globus GridFTP on XrootdFS
  - No additional configuration
  - May have a performance penalty
- Data Storage Interface (DSI) module for Xrootd/POSIX
  - Used along with the Xrootd POSIX preload library

$ cat $VDT_LOCATION/vdt/services/vdt-run-gsiftp.sh
#!/bin/sh
. $VDT_LOCATION/setup.sh
export LD_PRELOAD=/opt/xrootd/lib/libXrdPosixPreload.so
export XROOTD_VMP="xrootd-redirector:port:/xrootd=/xrootd"
# Make sure "libglobus_gridftp_server_posix_gcc32dbg.so" is in LD_LIBRARY_PATH
exec $VDT_LOCATION/globus/sbin/globus-gridftp-server -dsi posix

How to access: root://xrootd-redirector:port//xrootd maps to gsiftp://gridftpserver/xrootd
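With either setup, remote clients reach the same files through gsiftp URLs; for example with globus-url-copy (host names and paths are illustrative):

$ globus-url-copy file:///tmp/localfile gsiftp://gridftpserver.example.org/xrootd/atlas/localfile
$ globus-url-copy gsiftp://gridftpserver.example.org/xrootd/atlas/localfile file:///tmp/copy_back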