– n° 1 StoRM latest performance test results Alberto Forti Otranto, Jun 8 2006.


– n° 2 StoRM: Storage Resource Manager
- Description: StoRM is a disk-based storage resource manager. It implements the SRM v2.1 standard interface. StoRM is designed to support guaranteed space reservation and direct access (native POSIX I/O calls), as well as other I/O libraries such as RFIO. Security is based on user identity (VOMS certificates).
- StoRM is designed to take advantage of high-performance distributed file systems such as GPFS. Standard POSIX file systems like ext3 and XFS are also supported, and new plug-ins can be developed easily.

– n° 3 StoRM: Functionalities
- SRM interface v2.1. Dynamic management of (disk) storage resources (files and space):
  - Introduces the concepts of file lifetime (volatile with a fixed amount of lifetime, durable, or permanent), file pinning (to ensure a file is not deleted while in use), and space pre-allocation (to ensure the requested disk space is available for the whole life of the application from the beginning).
  - Files are no longer permanent entities on the storage, but dynamic ones that can appear and disappear according to the user's specifications (when the lifetime expires, the file becomes available to a garbage collector for deletion without further notice).
- Most relevant functionalities (already implemented and tested in StoRM):
  - Data transfer (a sketch of a typical put sequence follows below):
    - srmPrepareToPut(): creates a file and, on request, allocates disk space.
    - srmPrepareToGet(): pins a file and forbids its deletion by the SRM.
    - srmCopy(): asynchronously creates a file on one side, pins it on the other side, and executes the transfer.
    - srm*Status(): checks the status of submitted asynchronous operations.
    - srmPutDone(): tells the SRM that the file has been written and can then be deleted if, e.g., its lifetime expires.
  - Space management: srmReserveSpace(), srmGetSpaceMetaData().
    - Used to allocate large chunks of disk space to be subsequently used by independent files (similar to allocating space for a single file, but done at once).
  - Directory functions: srmLs() with recursive option, srmRm(), srmRmdir(), srmMkdir().
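To make the data-transfer sequence concrete, here is a minimal sketch of the put workflow named above (srmPrepareToPut, transfer to the returned TURL, srmPutDone). The srm_prepare_to_put and srm_put_done helpers are hypothetical placeholders, not a real client API; only the ordering of the calls reflects the slides.

```python
# Hedged sketch of the SRM v2.1 put sequence described above.
# The srm_* helpers are placeholders standing in for a real SRM client;
# only the call ordering reflects the StoRM/SRM workflow.
import subprocess

def srm_prepare_to_put(surl: str, size_bytes: int) -> str:
    """Placeholder: ask the SRM to create the file and allocate space,
    returning the transfer URL (TURL) where the data should be written."""
    raise NotImplementedError("replace with a real SRM client call")

def srm_put_done(surl: str) -> None:
    """Placeholder: tell the SRM the file has been completely written."""
    raise NotImplementedError("replace with a real SRM client call")

def put_file(local_path: str, surl: str, size_bytes: int) -> None:
    # 1. srmPrepareToPut(): create the file and (on request) allocate space.
    turl = srm_prepare_to_put(surl, size_bytes)
    # 2. Transfer the data to the returned TURL, e.g. with globus-url-copy
    #    (local_path must be absolute to form a valid file:// URL).
    subprocess.run(["globus-url-copy", f"file://{local_path}", turl], check=True)
    # 3. srmPutDone(): mark the file as written; normal lifetime rules apply from here.
    srm_put_done(surl)
```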

– n° 4 StoRM – Grid scenario

– n° 5 Simple-minded use case (I)
- Analysis job without StoRM (i.e. without SRM v2.1); a minimal sketch of this workflow follows below:
  - The job locates the physical path of the input data on the SE (e.g. via file catalogues).
  - The job copies the input datasets from the SE onto the local disk of the WN (e.g. via GridFTP if the SE is remote, or RFIO if it is local).
  - The job processes the dataset and writes the output data onto the local disk.
  - The job copies the locally produced data (e.g. ntuples) back to the SE.
- DRAWBACKS! If the local disk fills up during the job lifetime (e.g. other jobs running on the same WN exhaust the available space), the job will fail. If the SE fills up (or the quota is exceeded) during job execution, the data cannot be copied back and the job will fail. No dynamic usage of the SE disk space is possible: files stay permanently resident on the disk and must be cleaned up at some point by a central administrator or by the owners themselves.
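For contrast with the next slide, a minimal sketch of this without-SRM workflow: stage the input to the WN local disk, process it, copy the output back. The SE host, paths and the ./analysis executable are purely illustrative.

```python
# Illustrative without-SRM workflow (not the experiments' actual scripts):
# stage in to the WN local disk, process, stage the output back.
# If the local disk or the SE fills up, these commands fail and so does the job.
import subprocess

SE = "gsiftp://se.example.infn.it/storage"  # illustrative remote SE endpoint

def run_job(dataset: str) -> None:
    # 1. Copy the input dataset from the SE to the WN local disk.
    subprocess.run(["globus-url-copy", f"{SE}/{dataset}",
                    f"file:///tmp/{dataset}"], check=True)
    # 2. Process the dataset locally, writing the output to the local disk.
    subprocess.run(["./analysis", f"/tmp/{dataset}", "/tmp/output.ntuple"], check=True)
    # 3. Copy the locally produced output (e.g. ntuples) back to the SE.
    subprocess.run(["globus-url-copy", "file:///tmp/output.ntuple",
                    f"{SE}/output/output.ntuple"], check=True)
```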

– n° 6 Simple-minded use case (II)
- Analysis with StoRM (i.e. with SRM v2.1):
  - The job locates the physical path of the input data on the SE (e.g. via file catalogues).
  - If the file is already on the local SE: the job pins the file on the SE through an SRM prepare-to-get call, to ensure it is not deleted while the job is running.
  - If the file is available from a remote SE: the job executes an srmCopy v2 call to transfer the data from the remote SE to the local SE (assuming that a local disk-based cache storage is always available), assigning to the local copy a fixed lifetime matching the estimated lifetime of the job itself.
  - The job opens the input files on the local SE using UNIX-like, i.e. POSIX, system calls (no need for additional protocols embedded in the application).
  - The job creates the output files for writing on the local SE (SRM prepare-to-put call), pre-allocating the estimated disk space for the job output, and opens them.
  - The job processes the input datasets and writes the output data files from/to the local SE.
  - The job unpins the input data files and the output data files (the releaseFile and putDone SRM v2 calls). It does not need to rely on the availability of the local disk, and no further copies are needed. (A sketch of the read side of this sequence follows below.)
- NB: more advanced scenarios include higher layers such as FTS for data transfer; SRM v2 is not a replacement for FTS!
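As an illustration of the direct POSIX access described above, the fragment below sketches the read side of the workflow: pin the file, open the returned file:// path with an ordinary open(), then release the pin. The srm_* helpers are hypothetical placeholders for a real SRM v2.1 client, and the file:// form of the TURL is an assumption.

```python
# Hedged sketch of the read side of the StoRM use case above:
# pin the input file, read it with plain POSIX I/O, then release the pin.
# The srm_* helpers are placeholders for a real SRM v2.1 client.

def srm_prepare_to_get(surl: str) -> str:
    """Placeholder: pin the file and return a TURL (assumed here to be a
    file:// path on the shared GPFS/POSIX file system)."""
    raise NotImplementedError("replace with a real SRM client call")

def srm_release_files(surl: str) -> None:
    """Placeholder: unpin the file so its normal lifetime applies again."""
    raise NotImplementedError("replace with a real SRM client call")

def process_input(surl: str) -> bytes:
    turl = srm_prepare_to_get(surl)            # pin: file cannot be deleted while in use
    local_path = turl.removeprefix("file://")  # direct access, no extra transfer protocol
    try:
        with open(local_path, "rb") as f:      # ordinary POSIX open/read on GPFS or ext3/XFS
            return f.read()
    finally:
        srm_release_files(surl)                # unpin when done
```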

– n° 7 Early StoRM testbed at CNAF T1 (pre-Mumbai)
- Framework: the disk storage consisted of roughly 40 TB, provided by 20 logical partitions of one dedicated StorageTek FlexLine FLX680 disk array, aggregated by GPFS.
- Write test: srmPrepareToPut() with implicit reserveSpace of 1 GB files; globus-url-copy from a local source to the returned TURL; 80 simultaneous client processes.
- Read test: srmPrepareToGet() followed by globus-url-copy from the returned TURL to a local file (1 GB files); 80 simultaneous client processes.
- Results: sustained read and write throughputs of 4 Gb/s and 3 Gb/s respectively.
- The two tests are meant to validate the functionality and robustness of the srmPrepareToPut() and srmPrepareToGet() functions provided by StoRM, as well as to measure the read and write throughputs of the underlying GPFS file system. (A sketch of one write-test client is shown below.)
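A rough sketch of what one write-test client might look like, under the same assumptions as the earlier put sketch; the 80 simultaneous clients are approximated here with a process pool, and the endpoint, file names and source path are illustrative, not the actual test configuration.

```python
# Illustrative write-test driver (not the actual test scripts): each worker
# repeatedly does srmPrepareToPut -> globus-url-copy -> srmPutDone on 1 GB
# files; 80 such clients ran simultaneously in the real test.
import subprocess
from multiprocessing import Pool

GB = 1024 ** 3
SRM_BASE = "srm://storm.example.cnaf.infn.it:8444/testdir"  # illustrative endpoint

def srm_prepare_to_put(surl: str, size_bytes: int) -> str:
    raise NotImplementedError("placeholder for a real SRM client call")

def srm_put_done(surl: str) -> None:
    raise NotImplementedError("placeholder for a real SRM client call")

def write_one(i: int) -> None:
    surl = f"{SRM_BASE}/file_{i:04d}"
    turl = srm_prepare_to_put(surl, GB)       # implicit space reservation of 1 GB
    subprocess.run(["globus-url-copy", "file:///data/source_1GB", turl], check=True)
    srm_put_done(surl)                        # file becomes a normal SRM-managed entry

if __name__ == "__main__":
    with Pool(processes=80) as pool:          # 80 simultaneous client processes
        pool.map(write_one, range(2000))
```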

– n° 8 Latest functionalities stress test
- The early testbed is no longer available due to the storage needs of T1 operations.
- Current (small) testbed: 2 StoRM instances.
  - Server1: dual PIII 1 GHz, 512 MB RAM, Fast Ethernet (100 Mb/s).
  - Server2: dual Xeon 2.4 GHz, 2 GB RAM, Fast Ethernet (100 Mb/s).
  - Each machine mounts a small GPFS volume and runs the StoRM server(s), MySQL, GridFTP and other services.
- Description: stress tests for each functionality provided. Large numbers of requests were submitted at different rates (in requests per second). For each test we evaluated (see the driver sketch below):
  - the execution rate;
  - the number of failures.
- NB: the performance figures reported here are strictly tied to the underlying systems used (CPU, disks, memory) and would scale up significantly on more capable hardware.
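A generic sketch of the kind of rate-limited driver such a test needs: submit requests at a target rate, then report the achieved execution rate and the failure count. The submit callable stands for any SRM operation (mkdir, rm, ls, ...); the pacing and bookkeeping shown here are an assumption about the methodology, not the actual test harness.

```python
# Hedged sketch of a rate-limited stress-test driver: pace submissions at a
# target rate, wait for completion, and report execution rate and failures.
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def stress(submit: Callable[[int], None], n_requests: int, rate_hz: float) -> None:
    failures = 0
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=64) as pool:
        futures = []
        for i in range(n_requests):
            futures.append(pool.submit(submit, i))
            time.sleep(1.0 / rate_hz)           # pace submissions at ~rate_hz
        for f in futures:
            if f.exception() is not None:       # count failed requests
                failures += 1
    elapsed = time.monotonic() - start
    print(f"submitted {n_requests} at ~{rate_hz} Hz, "
          f"executed at ~{n_requests / elapsed:.1f} Hz, {failures} failures")
```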

– n° 9 Functionalities stress test: results
- Synchronous functions:
  - Mkdir: 2000 requests with a submission rate of ~30 Hz; executed at ~10 Hz (the rate depends on the underlying GPFS, in this case a toy setup on local disks) without failures.
  - Rmdir: 2000 requests with a submission rate of ~30 Hz; executed at ~30 Hz without failures.
  - Rm: 1000 requests on 1 MB files with a submission rate of ~30 Hz; executed at ~30 Hz without failures.
  - srmLs:
    - 1000 requests on single files submitted at ~60 Hz, executed at ~20 Hz.
    - 1 srmLs on a directory with 1000 files: 6 s.
    - 3 simultaneous srmLs on a directory with 1000 files: 10 s.
    - 12 simultaneous srmLs on a directory with 1000 files: 30 s.
    - …
  - No failures.

– n° 10 Functionalities stress test: results
- Asynchronous functions:
  - PrepareToPut: requests submitted at a 10 Hz rate (10 requests per second); executed at 10 Hz by StoRM.
  - PrepareToGet: 1000 requests with a submission rate of 30 Hz, executed at 20 Hz.
  - srmCopy: requests for 10 MB files submitted at a rate of 10 Hz, with 50 GridFTP transfers at once (the transfer pool threads in StoRM were set to 50, but this can be tuned in the StoRM option files). Poor data throughput (only Fast Ethernet connectivity) but perfect enqueuing of the copies, all executed, slowly, at 1 Hz according to the available bandwidth (100 Mb/s). (A polling sketch for asynchronous requests follows below.)
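To illustrate how a client tracks asynchronous operations such as srmCopy, the sketch below submits a request and then polls its status until it reaches a terminal state. The srm_* helpers and the state strings are hypothetical placeholders; only the submit-then-poll pattern reflects the srm*Status() behaviour described in the slides.

```python
# Hedged sketch of asynchronous request tracking: submit srmCopy, then poll
# the corresponding srm*Status() call until the request finishes.
# The srm_* helpers and state names are placeholders.
import time

def srm_copy(src_surl: str, dst_surl: str) -> str:
    """Placeholder: submit an asynchronous copy and return a request token."""
    raise NotImplementedError("replace with a real SRM client call")

def srm_status_of_copy_request(token: str) -> str:
    """Placeholder: return the current request state, e.g. 'QUEUED',
    'IN_PROGRESS', 'DONE' or 'FAILED'."""
    raise NotImplementedError("replace with a real SRM client call")

def copy_and_wait(src_surl: str, dst_surl: str, poll_s: float = 5.0) -> None:
    token = srm_copy(src_surl, dst_surl)           # enqueued in StoRM's transfer pool
    while True:
        state = srm_status_of_copy_request(token)  # srm*Status(): poll the async request
        if state in ("DONE", "FAILED"):
            break
        time.sleep(poll_s)                         # transfer proceeds at the available bandwidth
    if state == "FAILED":
        raise RuntimeError(f"srmCopy request {token} failed")
```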

– n° 11 Tests with the official srm-client v2.1.x
- To check interoperability, we performed tests with the SRM clients distributed by the SRM working group.
- Results:
  - The clients are able to communicate with our server.
  - Most functionalities are executed successfully (e.g. srmRm, srmMkdir, srmReserveSpace, srmPrepareToPut, srmPrepareToGet, srmGetRequestSummary, srmCopy).
  - We still have some problems with the interpretation of a few specific parameters (e.g. the date/time in srmLs, the recursive flag in srmRmdir, …).
  - We believe that with some work and interaction with the client developers StoRM will reach full interoperability.

– n° 12 StoRM: Conclusions
- The results obtained, in terms of rates and efficiencies, are very interesting, even though cheap and old hardware was used for the latest tests.
- The system can be scaled up to N StoRM servers working in parallel on the same (or different) GPFS volume(s), with a centralized persistent database (MySQL at the moment, but support for other vendors, e.g. Oracle, can easily be added in future releases), similarly to CASTOR-SRM or dCache.
- After a fruitful debugging phase, which will now be extended over the next 2-3 weeks, StoRM will be ready for deployment in production environments. The production release candidate will be ready by the end of June.
- The StoRM project is a collaboration between INFN/CNAF (3 developers) and EGRID/ICTP (3 developers).