Distributed cluster dynamic storage: a comparison of the dCache, xrootd and slashgrid storage systems running on batch nodes. Alessandra Forti, CHEP07, Victoria.

Outline
– Hardware
– Storage setup
– Comparison table
– Some test results
– Conclusions

Manchester Nodes
900 nodes, each with 2x2.8 GHz CPUs, 2x2 GB RAM and 2x250 GB disks
– Total space available for data: ~350 TB
– Disk transfer speed ~470 Mb/s, from specs and benchmarks
– WN disks are NOT RAIDed: disk 1: OS + scratch + data; disk 2: data
No tape storage
– nor other more permanent storage such as RAIDed disk servers
Nodes divided into two (almost!) identical, independent clusters
– Head nodes have the same specs as the worker nodes

Cluster Network
50 racks
– Each rack: 20 nodes and 1 switch
– Each node: 2x1 Gb/s to the rack switch
– Each rack switch: 2x1 Gb/s to the central switch
Central switch
– 10 Gb/s to the regional network

dCache communication
[Diagram of dCache component communication, taken from the OSG twiki]

dCache setup
[Diagram: head node running srm, pnfs server, admin, gridftp, gsidcap and pool(s) with pnfs and local disk; batch nodes in Rack1..RackN each running pool(s), a pnfs mount and user processes on the local disk]
– dCache runs on all the nodes
– At least 2 permanent open connections from each node to the head node = ~900 connections per head node
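As an illustration of how a job on a batch node can pull its input out of this setup, here is a minimal sketch using dccp's basic "dccp <source> <destination>" form (not taken from the talk; the door hostname, port and pnfs path are made-up placeholders):

#!/usr/bin/env python
# Hypothetical sketch: stage a file out of dCache with dccp before running over it.
# The dcap door host/port and the pnfs path are placeholders, not Manchester's real ones.
import subprocess
import sys

DCAP_DOOR = "dcap://dcache-head.example.ac.uk:22125"    # assumed dcap/gsidcap door on the head node
PNFS_PATH = "/pnfs/example.ac.uk/data/babar/ntuple.root"
LOCAL_COPY = "/scratch/ntuple.root"                     # disk 1: OS + scratch + data

def stage_in(pnfs_path, local_path):
    """Copy one file from dCache to local scratch with 'dccp <src> <dst>'."""
    result = subprocess.call(["dccp", DCAP_DOOR + pnfs_path, local_path])
    if result != 0:
        sys.exit("dccp failed with exit code %d" % result)

if __name__ == "__main__":
    stage_in(PNFS_PATH, LOCAL_COPY)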

Xrootd communication
[Diagram by A. Hanushevsky: the client asks the redirector (head node) to "open file X"; the redirector asks data servers A, B, C "Who has file X?"; C answers "I have" and the client is told "go to C"; a supervisor (sub-redirector) repeats the same exchange for data servers D, E, F ("go to F")]
– The client sees all servers as xrootd data servers

Xrootd setup
[Diagram: head node running the olbd manager; batch nodes in Rack1..RackN running xrootd and an olbd server (one node acting as olbd supervisor), with user processes reading from the local disk]
– xrootd deployed on 100 nodes only
– There are no permanent connections between the nodes
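On the client side of this architecture a job only needs the redirector's address; a minimal sketch (not from the talk; the hostname, file path and tree name are hypothetical, and PyROOT built with xrootd support is assumed):

#!/usr/bin/env python
# Hypothetical sketch: open a file through the xrootd redirector from a batch node.
# The client never needs to know which data server actually holds the file.
import ROOT  # assumes PyROOT with xrootd (root://) support

REDIRECTOR = "root://xrootd-head.example.ac.uk/"   # head node running the olbd manager
FILE_PATH = "/data/babar/ntuple.root"              # placeholder path

f = ROOT.TFile.Open(REDIRECTOR + FILE_PATH)
if not f or f.IsZombie():
    raise RuntimeError("could not open the file via the redirector")

# Loop over a tree as a user analysis job would; the tree name is made up.
tree = f.Get("events")
if tree:
    print("entries: %d" % tree.GetEntries())
f.Close()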

Slashgrid and http/gridsite
slashgrid is a shared file system based on http
– ls /grid/<hostname>/dir
Main target applications are
– input sandbox
– final analysis of small ntuples
It was developed as a lightweight alternative to AFS
– It is still in a testing phase and is installed on only 2 nodes in Manchester
For more information see the poster: http://indico.cern.ch/contributionDisplay.py?contribId=103&sessionId=21&confId=3580
Although there is the possibility of an xrootd-like architecture, in the tests it was used in the simplest way
– the client contacts the data server directly, without any type of redirection
Transfer tests didn't involve /grid but used htcp over http/gridsite
User analysis tests were done reading from /grid

Slashgrid setup
[Diagram: batch nodes in Rack1 each running apache + gridsite and a slashgrid daemon over the local disk; user processes read through the slashgrid daemon, which can also talk to a remote apache server]
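Because slashgrid exposes the remote gridsite/apache servers as an ordinary file system, a user job reads through it with plain POSIX calls; a minimal sketch (the hostname and directory are placeholders, not the real Manchester layout):

#!/usr/bin/env python
# Hypothetical sketch: read ntuples through the slashgrid /grid mount as if they were local files.
# slashgrid translates these POSIX reads into HTTP(S) requests to apache/gridsite underneath.
import os

GRID_DIR = "/grid/se.example.ac.uk/dteam/ntuples"   # placeholder /grid/<hostname>/<dir>

for name in sorted(os.listdir(GRID_DIR)):
    path = os.path.join(GRID_DIR, name)
    with open(path, "rb") as fh:         # ordinary open(); no special client library needed
        data = fh.read(1024 * 1024)      # read the first MB, as a stand-in for real analysis I/O
    print("%s: read %d bytes" % (name, len(data)))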

Comparison table

                                    dCache                               xrootd              slashgrid http/gridsite
rpms                                2                                    1                   2+4
config files                        36                                   1                   1
log files                           20                                   1                   1
databases to manage                 2                                    0                   0
srm                                 yes                                  no
resilience                          yes                                  no
load balancing                      yes
gsi authentication and VOMS compat  yes                                  no                  yes
configuration tools                 yaim (EGEE sites), VDT (OSG sites)   not really needed
name space                          /pnfs                                none                /grid
number of protocols supported       5                                    1                   3

Some test results
Basic tests
– Transfers: dccp, srmcp, xrdcp, htcp
– User analysis job over a set of ntuples: dCache, xrootd, slashgrid, root/http, AFS
htcp in different conditions
srmcp strange behaviour
BaBar skim production jobs
– a real-life application
AFS vs slashgrid
– real time against user time

Transfer tools
[Plot: the same file copied many times over with each tool]
– htcp: a job kicked in on the serving node
– xrootd: server busy with jobs
– gridftp and dccp have access to replicas
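This kind of comparison can be reproduced with a small timing harness along the following lines (a sketch, not the script used for the talk; the URLs and paths are placeholders and only each tool's basic "tool <source> <destination>" form is assumed):

#!/usr/bin/env python
# Hypothetical sketch of the loop behind a transfer-tool comparison:
# copy the same file many times with each tool and record the elapsed wall time.
import subprocess
import time

DEST = "/scratch/testcopy"
TOOLS = {
    # tool name -> (command, source URL); all sources are made-up placeholders
    "dccp":  (["dccp"],  "dcap://dcache-head.example.ac.uk:22125/pnfs/example.ac.uk/data/test1G"),
    "xrdcp": (["xrdcp"], "root://xrootd-head.example.ac.uk//data/test1G"),
    "htcp":  (["htcp"],  "https://se.example.ac.uk/dteam/test1G"),
}

def time_copy(cmd, src, dst, repeats=10):
    """Return the wall-clock times of running 'cmd src dst' `repeats` times."""
    times = []
    for _ in range(repeats):
        start = time.time()
        subprocess.call(cmd + [src, dst])
        times.append(time.time() - start)
    return times

for name, (cmd, src) in TOOLS.items():
    results = time_copy(cmd, src, DEST)
    print("%-6s mean %.1f s over %d copies" % (name, sum(results) / len(results), len(results)))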

User analysis
– Same set of small files copied many times: 29 files, ~79 MB each

htcp tests
– 1 GB files
[Plot annotations: memory speed; a parallel process kicks in; writing on the same disk; parallel process writing on a different disk]

Dcap API
– same as htcp transferring data from memory

srmcp tests
1 GB files: a list of 5 1 GB files
– files on different pools
– files replicated
1st test: copy each file 10 times, repeated 5 times
2nd test: copy each file once, repeated 5 times
3rd test: same as the first one
4th test: same as the first one, in a different sequence
Drop in efficiency after the first loop in each case
Cluster empty: no concurrent processes to explain the drop
– needs more investigation
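A sketch of the structure of these srmcp tests (the loop counts follow the slide; the SRM endpoint and file names are placeholders, and only srmcp's basic "srmcp <source> <destination>" form is assumed):

#!/usr/bin/env python
# Hypothetical reconstruction of the srmcp test loops described above.
import subprocess

SRM = "srm://dcache-head.example.ac.uk:8443/pnfs/example.ac.uk/data"
FILES = ["test1G_%d" % i for i in range(5)]     # the list of 5 x 1 GB files (placeholder names)

def srm_copy(name, attempt):
    src = "%s/%s" % (SRM, name)
    dst = "file:////scratch/%s.%d" % (name, attempt)
    return subprocess.call(["srmcp", src, dst]) == 0

def run_test(copies_per_file, loops):
    """E.g. 1st test: each of the 5 files copied 10 times, the whole thing repeated 5 times."""
    for loop in range(loops):
        ok = sum(srm_copy(f, loop * copies_per_file + c)
                 for f in FILES for c in range(copies_per_file))
        print("loop %d: %d/%d successful copies" % (loop, ok, len(FILES) * copies_per_file))

run_test(copies_per_file=10, loops=5)   # 1st test (3rd and 4th repeat it, with a different sequence)
run_test(copies_per_file=1,  loops=5)   # 2nd test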

BaBar skim production
average reading rate 10 Mb/s
two files in input
reading is non-sequential
– subset of events
– jumps from one file to the other
average job efficiency (cpu time/elapsed) is 98.6%
Even increasing the read speed by a factor of 10 wouldn't change the job efficiency!
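The last claim follows directly from the efficiency definition on the slide; a quick back-of-the-envelope check (only the 98.6% figure comes from the slide, the absolute job length is an arbitrary placeholder):

#!/usr/bin/env python
# Why a 10x faster read would barely change the job efficiency (cpu time / elapsed time).
elapsed = 10000.0                 # arbitrary job length in seconds (placeholder)
efficiency = 0.986                # average efficiency quoted on the slide
cpu = efficiency * elapsed        # time actually spent computing
io = elapsed - cpu                # ~1.4% of the wall time is spent waiting for data

faster_io = io / 10.0             # pretend the storage were 10x faster
new_efficiency = cpu / (cpu + faster_io)

print("I/O wait now:            %5.1f s (%.1f%% of the job)" % (io, 100 * io / elapsed))
print("efficiency with 10x I/O: %.3f (was %.3f)" % (new_efficiency, efficiency))
# -> efficiency goes from 0.986 to about 0.999: not worth optimising for this workload.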

AFS vs slashgrid
The Manchester department uses AFS
– local experiment software installation
– user shared data
AFS measurements taken for comparison with slashgrid
– with a cache smaller than the amount of data read, the job was reading from disk after the first time all the same
– slashgrid is not designed to do this
[Plot annotation: AFS copying from cache]
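The "real time against user time" comparison can be made with a small wrapper like the following (a sketch, not the measurement script from the talk; the AFS and /grid paths are placeholders):

#!/usr/bin/env python
# Hypothetical sketch: compare wall-clock (real) time with user/system CPU time while reading
# a file, e.g. once from an AFS path and once from the slashgrid /grid mount.
import os
import time

def timed_read(path, block=1024 * 1024):
    """Read a file end to end and return (real, user, sys) seconds for this process."""
    t0_wall = time.time()
    t0 = os.times()                     # (user, system, children user, children sys, elapsed)
    with open(path, "rb") as fh:
        while fh.read(block):
            pass
    t1 = os.times()
    return time.time() - t0_wall, t1[0] - t0[0], t1[1] - t0[1]

for label, path in [("AFS", "/afs/example.ac.uk/data/ntuple.root"),          # placeholder paths
                    ("slashgrid", "/grid/se.example.ac.uk/data/ntuple.root")]:
    real, user, sys_t = timed_read(path)
    print("%-9s real %.1f s  user %.1f s  sys %.1f s" % (label, real, user, sys_t))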

Conclusions
dCache is more complicated and difficult to manage, but has 3 features difficult to beat:
– resilience
– an srm front end
– 5 different protocols
xrootd is elegant and simple, but all the data management is in the users'/admins' hands, and the lack of an SRM front end makes it unappealing for the grid community.
slashgrid could be useful for software distribution
– for data reading it is still too slow
– over AFS it has the advantage of being easy to install and maintain
Speed within the cluster is comparable among the different protocols
– Applications like BaBar skim production demonstrate that very high speed is not required.
– However, more measurements are needed to better understand the different behaviours, especially when cluster features enter the equation.