Presentation is loading. Please wait.

Presentation is loading. Please wait.

A. Sim, CRD, L B N L 1 Data Management Foundations Workshop, Mar. 3, 2009 Storage in OSG and BeStMan Alex Sim Scientific Data Management Research Group.

Similar presentations


Presentation on theme: "A. Sim, CRD, L B N L 1 Data Management Foundations Workshop, Mar. 3, 2009 Storage in OSG and BeStMan Alex Sim Scientific Data Management Research Group."— Presentation transcript:

1 A. Sim, CRD, L B N L 1 Data Management Foundations Workshop, Mar. 3, 2009 Storage in OSG and BeStMan Alex Sim Scientific Data Management Research Group Computational Research Division Lawrence Berkeley National Laboratory

2 A. Sim, CRD, L B N L 2 Data Management Foundations Workshop, Mar. 3, 2009 Storages in OSG OSG supports users!OSG supports users! Storage Group in OSG Packaging storage software for VDT dCache, BeStMan, BeStMan-gateway/Xrootd Develop/maintain/package accounting and monitoring tools Support/test/package community tools Simplify configuration/installation for OSG Develop and run validation tests Perform troubleshooting and debugging Provide users and admins support Help VOs to use storage on OSG sites OSG liaison to storage developer groups Educate OSG community about storage, provide documentation Participate in grid schools organized by OSG Two commonly used, very different storage services in OSGTwo commonly used, very different storage services in OSG dCache (FNAL, DESY) BeStMan (LBNL) 2 These are SRMs! And end-users do not see the differences in accessing these!

3 A. Sim, CRD, L B N L 3 Data Management Foundations Workshop, Mar. 3, 2009 What is SRM? SRM : Storage Resource ManagerSRM : Storage Resource Manager Well-defined storage management interface specification based on standard Different implementations for underlying storage systems are based on the same SRM specification Provides dynamic space allocation and file management on shared storage components on the Grid Over 300 deployments of different SRM servers in the world Managing more than 10 PB

4 A. Sim, CRD, L B N L 4 Data Management Foundations Workshop, Mar. 3, 2009 Why do you need SRM? Suppose you want to run a job on your local machineSuppose you want to run a job on your local machine Need to allocate space, and bring all input files Need to ensure correctness of files transferred Need to monitor and recover from errors What if files don’t fit space? Need to manage file “streaming” Need to remove files to make space for more files Need to remove files after the job is done for more jobs Now, suppose that the machine and storage space is a shared resourceNow, suppose that the machine and storage space is a shared resource Need to do the above for many users, Need to enforce quotas Need to ensure fairness of space allocation and scheduling Now, suppose you want to do that on a GridNow, suppose you want to do that on a Grid Need to access a variety of storage systems mostly remote systems, need to have access privileges Need to have special software to access mass storage systems Now, suppose you want to run distributed jobs on the GridNow, suppose you want to run distributed jobs on the Grid Need to allocate remote spaces Need to copy (or stream) files to remote sites Need to manage file outputs and their transfer to destination site(s)

5 A. Sim, CRD, L B N L 5 Data Management Foundations Workshop, Mar. 3, 2009 What does SRM do? Support space management for files with lifetimeSupport space management for files with lifetime Allocation of space, garbage collection Support dynamic space reservation – opportunistic storageSupport dynamic space reservation – opportunistic storage Support for multiple file transfer protocolsSupport for multiple file transfer protocols Support for transfer protocol negotiation Support for multiple file transfer servers Incoming and outgoing file transfer queue management and transfer monitoring Support for asynchronous multi-file requestsSupport for asynchronous multi-file requests Directory management and ACLsDirectory management and ACLs Support file sharing and file streamingSupport file sharing and file streaming Gives compatibility and interoperability based on standardGives compatibility and interoperability based on standard

6 A. Sim, CRD, L B N L 6 Data Management Foundations Workshop, Mar. 3, 2009 What do users do with SRM? File System / Storage Gridftp server...... SRM Client srmPrepareToGet/Put TURL GridFTP file transfers srmReleaseFiles/srmPutDone PUT/GET File system / Storage Gridftp server...... SRM Client srmLs/srmRm/srmMkdir/srmRmdir Ls/Rm/Mkdir/Rmdir

7 A. Sim, CRD, L B N L 7 Data Management Foundations Workshop, Mar. 3, 2009 SRMs Facilitate Analysis Jobs SRM Disk Cache DISK CACHE Client Job Gate Node Worker Nodes Disk Client Job Client Job submission Fermilab SRM/iRODS dCache SC2008 Demo: Analysis jobs at NERSC/LBNL with 6 SRM implementations at 12 participating sites

8 A. Sim, CRD, L B N L 8 Data Management Foundations Workshop, Mar. 3, 2009 Interoperability in SRMs Clients CASTOR dCache Fermilab Disk BeStMan StoRM DPM SRM/iRODS xrootd

9 A. Sim, CRD, L B N L 9 Data Management Foundations Workshop, Mar. 3, 2009 Berkeley Storage Manager (BeStMan) Light-weight implementation of SRM v2.2Light-weight implementation of SRM v2.2 Works on existing disk storages with posix compliant file systems E.g. NFS, GPFS, GFS, NGFS, PNFS, HFS+, PVFS, Lustre, Xrootd, Hadoop, Ibrix Supports multiple partitions Adaptable to other file systems and storages Supports customized plug-in for file system access Supports customized plug-in for MSS to stage/archive such as HPSS Easy adaptability and integration to special project environments Supports multiple transfer protocolsSupports multiple transfer protocols Supports load balancing for multiple transfer servers Scales well with some file systems and storagesScales well with some file systems and storages Xrootd, Hadoop Works with grid-mapfile or GUMS serverWorks with grid-mapfile or GUMS server Simple installation and easy maintenanceSimple installation and easy maintenance Packaged in VDT using PacmanPackaged in VDT using Pacman Who would benefit from BeStMan?Who would benefit from BeStMan? Sites with limited resources and/or limited admin effort

10 A. Sim, CRD, L B N L 10 Data Management Foundations Workshop, Mar. 3, 2009 What can BeStMan do? In addition to what SRMs do….In addition to what SRMs do…. Dynamic installation, configuration and runningDynamic installation, configuration and running If the target host does not have an SRM, BeStMan can be installed, configured, and started with a few commands by the user. BeStMan can restrict all user access to certain directory paths through configurationBeStMan can restrict all user access to certain directory paths through configuration BeStMan (full mode) can be configured to restrict user access to files by owners/creators only without external service componentBeStMan (full mode) can be configured to restrict user access to files by owners/creators only without external service component A site can customize the load-balancing mechanism for transfer servers through plug-inA site can customize the load-balancing mechanism for transfer servers through plug-in

11 A. Sim, CRD, L B N L 11 Data Management Foundations Workshop, Mar. 3, 2009 Some BeStMan Use Cases CMSCMS BeStMan Gateway as an SRM frontend for Hadoop at UNL Passed all the automated CMS tests through EGEE SAM ATLASATLAS BeStMan on NFS BeStMan Gateway on Xrootd/FS, GPFS, Ibrix STARSTAR Data replication between BNL and NERSC/LBNL HPSS access at BNL and NERSC SRMs in production for over 4 years Part of analysis scenario to move job-generated data files from PDSF/NERSC to remote BNL storage Earth System GridEarth System Grid Serving about 6000 users Over a million files and 170TB of climate data from 5 storage sites (LANL, LLNL, NCAR, NERSC, ORNL) Uses an adapted BeStMan for NCAR’s own MSS

12 A. Sim, CRD, L B N L 12 Data Management Foundations Workshop, Mar. 3, 2009Summary SRM is an essential part of Grid and OSGSRM is an essential part of Grid and OSG Users have uniform access to Grid storages OSG supports users! BeStMan is an implementation of SRM v2.2BeStMan is an implementation of SRM v2.2 Great for disk-based storage and file systems BeStMan Gateway mode gives scalable performance on some file systems and storages Easy installation and maintenance through VDT or tar file Works with other SRM v2.2 implementations Servers: CASTOR, dCache, DPM, StoRM, SRM/SRB, … Clients: PhEDEx, FTS, glite-url-copy, lcg-cp, srm-copy, srmcp, … In OSG, WLCG/EGEE, ESG, …

13 A. Sim, CRD, L B N L 13 Data Management Foundations Workshop, Mar. 3, 2009 Documents and Support OSG Storage documentationOSG Storage documentation https://twiki.grid.iu.edu/twiki/bin/view/Documentation/WebHome BeStManBeStMan http://datagrid.lbl.gov/bestman https://twiki.grid.iu.edu/bin/view/Documentation/BestmanGateway https://twiki.grid.iu.edu/bin/view/Documentation/BestmanGateway-Xrootd Xrootd and XrootdFSXrootd and XrootdFS http://xrootd.slac.stanford.edu http://wt2.slac.stanford.edu/xrootdfs/xrootdfs.html SRM Collaboration and SRM SpecificationsSRM Collaboration and SRM Specifications http://sdm.lbl.gov/srm-wg Contact and supportContact and support osg-storage@opensciencegrid.orgosg-storage@opensciencegrid.org SRM@LBL.GOVSRM@LBL.GOV


Download ppt "A. Sim, CRD, L B N L 1 Data Management Foundations Workshop, Mar. 3, 2009 Storage in OSG and BeStMan Alex Sim Scientific Data Management Research Group."

Similar presentations


Ads by Google