T2 storage issues – M. Biasotto (INFN Legnaro). T1 + T2 cloud workshop, Bologna, November 21, 2006.


Slide 1 – T2 storage issues. M. Biasotto – INFN Legnaro.

Slide 2 – T2 issues
- Storage management is the main issue for a T2 site.
- CPU and network management are easier, and we have reached some stability:
  - years of experience
  - stable tools (batch systems, installation, etc.)
  - the total number of machines for an average T2 is small: O(100)
- Storage raises several different issues:
  - hardware: which kind of architecture and technology?
  - hardware configuration and optimization
  - the storage <-> CPU network
  - storage resource managers

Slide 3 – Hardware
- Which kind of hardware for T2 storage?
  - DAS (Direct Attached Storage) servers: cheap, with good performance
  - SAN based on SATA/FC disk arrays and controllers: flexibility and reliability
  - others? (iSCSI, ATA over Ethernet, ...)
  - There are already working groups dedicated to this (technology tracking, tests, etc.), but the information is a bit dispersed.
- SAN with SATA/FC disks is preferred by most sites, but there are economic concerns: will funding be enough for this kind of solution?
- Important, but not really critical? Once you have bought some kind of hardware you are stuck with it for years, but it is usually possible to mix different types.

Slide 4 – Storage configuration
- Optimal storage configuration is not easy; there are a lot of factors to take into consideration:
  - how many TB per server?
  - which RAID configuration?
  - fine tuning of parameters in disk arrays, controllers and servers (cache, block sizes, buffer sizes, kernel parameters, ... a long list; a sketch of a few typical Linux-side knobs follows below)
- Storage pool architecture: is one large pool enough, or is it necessary to split?
  - buffer pools (WAN transfer buffer, local WN buffer)?
  - different pools for different activities (production pool, analysis pool)?
- Network configuration: avoid bottlenecks between servers and CPUs.
- The optimal configuration depends strongly on the application:
  - two main (very different) types of access: remote I/O from the WN, or local copy to/from the WN; currently CMS uses remote I/O and Atlas uses local copies
  - production and analysis activities have different access patterns
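The "kernel parameters" item above is vague on its own; as a purely illustrative aid, here is a minimal Python sketch (assuming a Linux disk server) that inspects a few standard sysfs/procfs knobs often looked at in this context. The device name sda and the candidate values are assumptions for illustration, not recommendations from the talk.

```python
#!/usr/bin/env python3
"""Minimal sketch: inspect (and optionally raise) a few Linux I/O tuning knobs
on a disk server. The paths are standard Linux sysfs/procfs entries; the
candidate values are purely illustrative, not tuning recommendations."""

from pathlib import Path

KNOBS = {
    # block-device readahead (KB) -- often raised for large sequential reads
    "/sys/block/sda/queue/read_ahead_kb": "4096",
    # fraction of RAM allowed to fill with dirty pages before writeback throttles
    "/proc/sys/vm/dirty_ratio": "20",
    # maximum socket receive buffer (bytes) -- relevant for WAN gridftp streams
    "/proc/sys/net/core/rmem_max": "4194304",
}

def show_and_set(apply_changes: bool = False) -> None:
    for path, candidate in KNOBS.items():
        p = Path(path)
        if not p.exists():                       # e.g. different device name
            print(f"{path}: not present on this host")
            continue
        current = p.read_text().strip()
        print(f"{path}: current={current} candidate={candidate}")
        if apply_changes:
            p.write_text(candidate)              # requires root

if __name__ == "__main__":
    show_and_set(apply_changes=False)            # dry run by default
```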

Slide 5 – Storage configuration
- The optimal configuration varies depending on many factors: there is no single simple solution.
  - Every site will have to fine-tune its own storage, and the tuning will vary over time.
- But having some guidelines would be useful: exploit current experience (mostly at the T1).
- Configuration can have a huge effect on performance, but it is not so critical if you have enough flexibility:
  - many parameters can be easily changed and adjusted
  - a SAN hardware architecture is much more flexible than DAS (rearrange the storage, increase or reduce the number of servers, ...)

Slide 6 – Storage Resource Manager
- Which Storage Resource Manager for a T2?
  - DPM and dCache are already in use at most LCG sites
  - StoRM is the INFN-developed tool
  - the xrootd protocol required by Alice is currently being implemented in dCache and DPM
- The choice of an SRM is a more critical issue: it is much more difficult to change.
  - Adopting one and learning how to use it is a large investment: know-how in deployment, configuration, optimization, problem finding and solving, ...
  - There are obvious practical problems if a site already has a lot of data stored: data migration, catalogue updates (often outside the control of the site).
- Is the first half of 2007 the last chance for a final decision?
  - Of course nothing is ever 'final', but after that a transition would be much more problematic.

Slide 7 – Current status of INFN sites

  Site      Hardware    Storage Manager   TeraBytes
  Bari      DAS         dCache            10
  Catania   SATA/FC     DPM               19
  Frascati  SATA/FC     DPM               6
  Legnaro   SATA/FC     DPM               17
  Milano    SATA/FC     DPM               3
  Napoli    SATA/SCSI   DPM               5
  Pisa      SATA/FC     dCache (*)        8
  Roma      -           DPM (**)          -
  Torino    SATA/FC     -                 -

  (*) Pisa recently switched from DPM to dCache.
  (**) Roma has 2 DPM installations (CMS and Atlas).

Slide 8 – Requirements
- Performance & scalability: how much is needed for a T2?
    WAN bandwidth        O(100) MB/s
    LAN bandwidth        O(400) MB/s
    Disk                 O(500) TB
    Concurrent accesses  O(400)
  (A rough sizing sketch based on these figures follows below.)
- Reliability & stability.
- Advanced features: data management (replication, recovery), monitoring, configuration tuning, ...
- Cost (in terms of human and hardware resources): does the tool architecture address the scalability issue?
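To make the requirement figures concrete, here is a back-of-the-envelope sizing sketch. The per-server assumptions (GbE-class NICs, ~20 TB of usable disk per server) are illustrative choices, not numbers from the talk.

```python
# Back-of-the-envelope T2 storage sizing from the requirement figures above.
# The per-server numbers (GbE NIC payload, TB per server) are assumptions
# for illustration only.

LAN_TARGET_MBPS = 400        # O(400) MB/s aggregate LAN throughput
WAN_TARGET_MBPS = 100        # O(100) MB/s WAN throughput
DISK_TARGET_TB = 500         # O(500) TB of disk
CONCURRENT_ACCESSES = 400    # O(400) concurrent accesses

NIC_MBPS_PER_SERVER = 110    # ~1 GbE per disk server, realistic payload rate
TB_PER_SERVER = 20           # assumed usable capacity per disk server

servers_for_lan = -(-LAN_TARGET_MBPS // NIC_MBPS_PER_SERVER)   # ceiling division
servers_for_disk = -(-DISK_TARGET_TB // TB_PER_SERVER)
servers = max(servers_for_lan, servers_for_disk)

print(f"servers needed for LAN throughput : {servers_for_lan}")
print(f"servers needed for disk capacity  : {servers_for_disk}")
print(f"servers (max of the two)          : {servers}")
print(f"concurrent accesses per server    : {CONCURRENT_ACCESSES / servers:.0f}")
print(f"WAN share per server              : {WAN_TARGET_MBPS / servers:.1f} MB/s")
```

With these assumptions the capacity target, not the LAN throughput, drives the server count: roughly 25 servers, each seeing about 16 concurrent accesses.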

Slide 9 – dCache
- dCache is currently the most mature product:
  - used in production for a few years
  - deployed at several large sites: T1 FNAL, T1 FZK, T1 IN2P3, all US-CMS T2s, T2 DESY, ...
- There is no doubt it will satisfy the performance and scalability needs of a T2.
- Two key features guarantee performance and scalability:
  - services can be split among different nodes: all the 'access doors' (gridftp, srm, dcap) can be replicated, and the 'central services' (which usually all run on the admin node) can also be distributed

Slide 10 – dCache (continued)
- 'Access queues' to manage a high number of concurrent accesses:
  - storage access requests are queued and can be distributed, prioritized and limited based on protocol type or access type (read/write)
  - this acts as a buffer for temporary high load and avoids server overloading (a toy illustration of the idea is sketched below)
- Provides a lot of advanced features:
  - data replication (for 'hot' datasets)
  - dynamic and highly configurable pool match-making
  - pool draining for scheduled maintenance operations
  - grouping and partitioning of pools
  - internal monitoring and statistics tool
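The "access queue" idea can be illustrated with a toy sketch. This is not dCache code and does not reflect its internals; it only shows how per-protocol limits plus a queue keep a burst of requests from hitting the disk servers all at once. Protocol names and limits are invented.

```python
# Conceptual sketch (not dCache code): per-protocol limits on concurrently
# served requests, with the excess held in a queue instead of overloading the
# disk servers. Limits and protocol names are arbitrary examples.

import collections

MAX_ACTIVE = {"gridftp": 10, "dcap-read": 40, "dcap-write": 20}

active = collections.defaultdict(int)
queued = collections.defaultdict(collections.deque)

def request_access(protocol: str, request_id: str) -> str:
    """Admit the request if below the protocol's limit, otherwise queue it."""
    if active[protocol] < MAX_ACTIVE[protocol]:
        active[protocol] += 1
        return f"{request_id}: started ({active[protocol]}/{MAX_ACTIVE[protocol]} {protocol})"
    queued[protocol].append(request_id)
    return f"{request_id}: queued (position {len(queued[protocol])} for {protocol})"

def finish_access(protocol: str) -> None:
    """On completion, promote the oldest queued request for that protocol."""
    active[protocol] -= 1
    if queued[protocol]:
        nxt = queued[protocol].popleft()
        active[protocol] += 1
        print(f"{nxt}: promoted from queue")

# A burst of 50 analysis reads: 40 start, 10 wait instead of overloading pools.
for i in range(50):
    print(request_access("dcap-read", f"job-{i:02d}"))
```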

Slide 11 – dCache issues
- Services are heavy and not very efficient:
  - written in Java, they require a lot of RAM and CPU
  - the central services can be split, but do they need to be split, even at a T2 site? Having to manage several dCache admin nodes could be a problem.
- Still missing VOMS support and SRM v2, but both should be available soon.
- More expensive in terms of human resources:
  - more difficult to configure and maintain
  - steeper learning curve; the documentation needs to be improved
- It is more complex, with more advanced features, and this obviously comes at a cost: does a T2 need the added complexity and features, and can they be afforded?

Slide 12 – INFN dCache experience
- Performance test at CNAF + Bari (end 2005):
  - demonstrated the performance needed for a T2
  - setup: an admin node and 4 pool nodes
  [Embedded slide with the test results, by G. Donvito (Bari)]

Slide 13 – INFN dCache experience (continued)
- Used in production at Bari since May 2005, building up a lot of experience and know-how:
  - heavily used in the SC4 CMS LoadTest, with good stability and performance
- Pisa experience: SC4 started with DPM, then switched to dCache during the summer:
  - DPM problems in CMS, data loss after a hardware failure, and the overall impression that a more mature solution was really needed
  - participated in the CSA06 challenge with dCache; they are pleased with the change
  [Figures: Bari WAN yearly graph (SC4 activity), Pisa WAN yearly graph]

Slide 14 – StoRM
- Developed in collaboration between INFN-CNAF and ICTP-EGRID (Trieste).
- Designed for disk-based storage: it implements an SRM v2 interface on top of an underlying parallel or cluster file system (GPFS, Lustre, etc.). (A conceptual sketch of this layering is given below.)
- StoRM takes advantage of the aggregation functionalities of the underlying file system to provide performance, scalability, load balancing, fault tolerance, ...
- Not bound to a specific file system: the separation between the SRM interface and the data management functionalities is an interesting feature of StoRM; in principle it allows one to exploit the very active research and development in the cluster file system field.
- Support of SRM v2 functionalities (space reservation, lifetime, file pinning, pre-allocation, ...) and ACLs.
- Full VOMS support.
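To illustrate the layering described above, here is a toy sketch (not StoRM's actual code) of a disk-based SRM that only translates the SURL namespace into a path on the mounted cluster file system and hands back a transfer URL, leaving the actual I/O, aggregation and load balancing to GPFS or Lustre. Hostnames and paths are invented.

```python
# Toy illustration (not StoRM code) of a disk-based SRM layered on a cluster
# file system: the SRM service maps the SURL namespace onto a mounted
# GPFS/Lustre path and returns a TURL; the file system does the actual I/O.
# Hostnames and paths below are invented.

GPFS_MOUNT = "/storage/gpfs"              # assumed cluster-fs mount point
GRIDFTP_HOST = "gridftp.example.infn.it"  # assumed transfer endpoint

def surl_to_physical(surl: str) -> str:
    """Map an SRM SURL onto a path inside the shared file system."""
    # e.g. srm://se.example.infn.it/dteam/file1 -> /storage/gpfs/dteam/file1
    path = surl.split("/", 3)[3]          # strip the 'srm://<host>/' prefix
    return f"{GPFS_MOUNT}/{path}"

def prepare_to_get(surl: str) -> str:
    """Return a TURL the client can use with an ordinary transfer protocol."""
    return f"gsiftp://{GRIDFTP_HOST}{surl_to_physical(surl)}"

print(prepare_to_get("srm://se.example.infn.it/dteam/generated/file1"))
```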

Slide 15 – StoRM (continued)
- Central services scalability:
  - the StoRM service has 3 components: Front-End (web service exposing the SRM interface), Back-End (core) and database
  - FE and BE can be replicated and distributed over multiple nodes
  - centralized database: currently MySQL; others (Oracle) possible in future releases
- Advanced features provided by the underlying file system, e.g. GPFS: data replication, pool vacation.

Slide 16 – StoRM issues
- Not used by any LCG site so far (not compatible with SRM v1), and only a few 'test installations' at external sites.
- It is likely that a first 'field test' would turn up a lot of small issues and problems (this should not be a concern in the longer term).
- Installation and configuration procedures are not yet reliable enough:
  - the recent integration with YAIM and more external deployments should quickly bring improvements in this area
- No access queue for concurrent access management.
- No internal monitoring (or only partial, provided by the file system).
- There could be compatibility issues between the underlying cluster file system and some VO applications: some file systems have specific requirements on the kernel version.

Slide 17 – INFN StoRM experience
- Obviously CNAF has all the necessary know-how on StoRM.
- There is also GPFS experience within INFN, mostly at CNAF but not only (Catania, Trieste, Genova, ...): overall good in terms of performance, scalability and reliability.
- StoRM installations:
  - CNAF (interoperability tests within the SRM-WG group)
  - CNAF (pre-production)
  - Padova (EGRID project)
  - Legnaro and Trieste (GridCC project)
- Performance test at CNAF at the beginning of 2006, on a StoRM + GPFS v2.3 testbed (see next slide).

Slide 18 – Performance test at CNAF (slide by the StoRM team)
- Framework: the disk storage consisted of roughly 40 TB, provided by 20 logical partitions of one dedicated StorageTek FlexLine FLX680 disk array, aggregated by GPFS.
- The two tests are meant to validate the functionality and robustness of the srmPrepareToPut() and srmPrepareToGet() functions provided by StoRM, as well as to measure the read and write throughput of the underlying GPFS file system.
- Write test: srmPrepareToPut() with implicit reserveSpace for 1 GB files, then globus-url-copy from a local source to the returned TURL; 80 simultaneous client processes.
- Read test: srmPrepareToGet() followed by globus-url-copy from the returned TURL to a local file (1 GB files); 80 simultaneous client processes.
- Results: sustained read and write throughputs of 4 Gb/s and 3 Gb/s respectively.
(A sketch of how such a test harness might look is given below.)
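For readers unfamiliar with how such throughput tests are driven, here is a sketch of an 80-client harness along the lines described on the slide. globus-url-copy is the real GridFTP client; prepare_to_put() is a placeholder for whatever SRM client the StoRM team actually used, and the endpoint, paths and file names are invented.

```python
# Sketch of a concurrency test like the one described above: N clients each
# obtain a TURL from the SRM and transfer a 1 GB file with globus-url-copy.
# prepare_to_put() is a placeholder for the real SRM client call (not
# specified on the slide); endpoint and paths are invented.

import subprocess, time
from concurrent.futures import ThreadPoolExecutor

N_CLIENTS = 80
FILE_SIZE_GB = 1
LOCAL_FILE = "file:///tmp/testfile_1GB"          # pre-created 1 GB source file

def prepare_to_put(client_id: int) -> str:
    """Placeholder: call srmPrepareToPut() via an SRM client, return the TURL."""
    return f"gsiftp://storm-se.example.infn.it/gpfs/test/file_{client_id}"

def one_client(client_id: int) -> float:
    turl = prepare_to_put(client_id)
    start = time.time()
    subprocess.run(["globus-url-copy", LOCAL_FILE, turl], check=True)
    return time.time() - start

if __name__ == "__main__":
    start = time.time()
    with ThreadPoolExecutor(max_workers=N_CLIENTS) as pool:
        durations = list(pool.map(one_client, range(N_CLIENTS)))
    elapsed = time.time() - start
    total_gb = N_CLIENTS * FILE_SIZE_GB
    print(f"{total_gb} GB in {elapsed:.0f} s "
          f"-> {total_gb * 8 / elapsed:.2f} Gb/s aggregate write throughput")
```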

Slide 19 – DPM
- DPM is the SRM system suggested by LCG and distributed with the LCG middleware:
  - good YAIM integration: easy to install and configure (a small setup sketch follows below)
  - possible migration from an old classic SE
  - it is the natural choice for an LCG site that quickly needs to set up an SRM
- As a result of all this, there are a lot of DPM installations around.
- VOMS support (including ACLs).
- SRM v2 implementation (but still with limited functionality).
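As a small illustration of the "easy to install and configure" point, the sketch below simply drives the standard DPM admin commands (dpm-addpool, dpm-addfs, dpm-qryconf) from Python to create a pool and attach filesystems, as one would after YAIM has installed the services. Pool, server and mount-point names are invented; the exact options should be checked against the admin guide of the DPM release in use.

```python
# Minimal sketch (assumptions marked): once YAIM has installed and configured
# the DPM services, a pool and its filesystems are added with the standard DPM
# admin commands. Pool, server and mount-point names below are invented;
# verify the options against the admin guide of your DPM release.

import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

POOL = "Permanent"                      # example pool name
FILESYSTEMS = [
    ("disk01.example.infn.it", "/data01"),
    ("disk01.example.infn.it", "/data02"),
    ("disk02.example.infn.it", "/data01"),
]

run(["dpm-addpool", "--poolname", POOL])          # create the pool
for server, fs in FILESYSTEMS:                    # attach each filesystem
    run(["dpm-addfs", "--poolname", POOL, "--server", server, "--fs", fs])
run(["dpm-qryconf"])                              # show the resulting layout
```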

Slide 20 – DPM issues
- Still lacking many functionalities (some of them important):
  - load balancing is very simple (round robin among the file systems in a pool) and not configurable
  - data replication is still buggy in the current release, and there is no pool draining for server maintenance or decommissioning
  - no pool selection based on path
  - no internal monitoring
- Scalability limits?
  - no problem for the rfio and gridftp services: easily distributed over the pool servers
  - but what about the 'central services' on the head node? In principle the 'dpm', 'dpns' and MySQL services can be split, but this has not been tested yet (will it be necessary? will it be enough?)
  - no 'access queue' as in dCache to manage concurrent accesses and avoid server overloading (but DPM is faster in serving requests)

Slide 21 – DPM issues (continued)
- DPM-Castor rfio compatibility:
  - DPM uses the same rfio protocol as Castor, but with a GSI security layer added: two different shared libraries for the same protocol => a big mess
  - and two different versions of the rfio commands (rfcp, rfmkdir, ...): the old Castor ones in /usr/bin/ and the new DPM ones in /opt/lcg/bin/ (a small check is sketched below)
  - this is a problem for CMS, where applications do remote I/O: the CMS software is distributed with the Castor rfio library (used at CERN), so DPM sites need to manually hack the software installation to make it work
- It has been (and still is!) a serious issue for CMS T2s:
  - the problem was discovered during SC3 at Legnaro (the only DPM site): nobody cared until it spread more widely in SC4, and it is still not solved after more than one year
  - along with the fact that dCache is the solution adopted by the main CMS sites, many CMS T2s have avoided DPM, and the others are considering a transition
  - just an example of a more general issue: stay in the same boat as the others, otherwise when a problem arises you are out of the game
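A trivial diagnostic related to the rfio mess above: check which flavour of the rfio commands a job would pick up first on a node, using only the two install locations quoted on the slide.

```python
# Trivial diagnostic sketch: which rfio flavour (Castor in /usr/bin, DPM in
# /opt/lcg/bin, as quoted on the slide) comes first on this node's PATH?

import os, shutil

for cmd in ("rfcp", "rfmkdir"):
    found = shutil.which(cmd)
    if found is None:
        print(f"{cmd:8s}: not found on PATH")
    elif found.startswith("/opt/lcg/"):
        print(f"{cmd:8s}: {found}  (DPM flavour)")
    else:
        print(f"{cmd:8s}: {found}  (probably the Castor flavour)")

print("PATH =", os.environ.get("PATH", ""))
```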

Slide 22 – INFN DPM experience
- Used in production at many INFN sites: Catania, Frascati, Legnaro, Milano, Napoli, Roma.
  - no major issues (some problems migrating data from the old SE)
  - main complaint: lack of some advanced features
  - good overall stability, but still at very small numbers (in size and throughput rate)
- Catania and Legnaro are currently the largest installations:
  - Catania has an interesting DPM + GPFS configuration: one large GPFS pool mounted on DPM
  - Legnaro has adopted a more complex configuration: 8 DPM pools, 2 pools for CMS and 1 for each other VO
    - allows a sort of VO quota management and keeps data of different VOs on different servers (reduced activity interference, different hardware performance)
    - with proper data management functionalities, it could be done (better?) in a single pool

Slide 23 – INFN DPM experience (continued)
- So far no evidence of problems or limitations, but the performance values reached are still low (even in CSA06 the system was not stressed enough).
  - Local access example: CMS MC production 'merge' jobs (high I/O activity), ~100 concurrent rfio streams on a single partition: ~120 MB/s (read+write).
- Stability and reliability: CMS LoadTest (not a throughput test):
  - ~3 months of continuous transfers, ~200 TB transferred in import (10-50 MB/s), ~60 TB transferred in export (5-20 MB/s)
  - (a quick consistency check of these averages is sketched below)
  [Figure: Legnaro WAN yearly graph, SC4 activity]
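A quick sanity check that the quoted LoadTest averages hang together, assuming ~90 days for "3 months" and decimal TB (both assumptions):

```python
# Quick consistency check of the LoadTest numbers quoted above: does ~200 TB
# imported over ~3 months really fall inside the 10-50 MB/s range?

DAYS = 90                           # assumed length of "~3 months"
SECONDS = DAYS * 24 * 3600

for label, terabytes in (("import", 200), ("export", 60)):
    avg_mb_s = terabytes * 1e6 / SECONDS      # 1 TB = 1e6 MB (decimal units)
    print(f"{label}: {terabytes} TB over {DAYS} days -> {avg_mb_s:.1f} MB/s average")

# import: ~25.7 MB/s, export: ~7.7 MB/s -- both inside the quoted ranges.
```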

Slide 24 – SRM summary
- dCache:
  - mature product, meets all performance and scalability requirements
  - more expensive in terms of hardware and human resources
- DPM:
  - important features still missing, but this is not a concern in the longer term (no reason why they should not be added)
  - required performance and scalability not proven yet: are there any intrinsic limits?
- StoRM:
  - potentially interesting, but not used by any LCG site yet
  - required performance and scalability not proven yet: are there any intrinsic limits?

Slide 25 – Conclusions
- The storage system is the most challenging part of a T2 site, with several open issues.
- The choice of a Storage Resource Manager is one of the most critical:
  - the current tools are at different levels of maturity, but all are in active development
- A common solution for all INFN sites? There would be obvious benefits, but:
  - not all sites will have the same size: multi-VO T2s, single-VO T2s, T3 sites
  - different VOs have different requirements, storage access models and compatibility issues
  - a common solution within each VO?
- We are running out of time, and of course all T2 sites have already started making choices.

Slide 26 – Acknowledgments
- Thanks to all the people who have provided info and comments:
  T. Boccali (Pisa), S. Bagnasco (Torino), G. Donvito (Bari), A. Doria (Napoli), S. Fantinel (Legnaro), L. Magnoni (CNAF, StoRM), G. Platania (Catania), M. Serra (Roma), L. Vaccarossa (Milano), E. Vilucchi (Frascati)