1 Storage@CNAF: status, usage and perspectives
Luca dell’Agnello – Bologna, March 1st 2006

2 HW Resources…
Disk: FC, IDE, SCSI, NAS technologies
  470 TB raw (~450 FC-SATA)
  2005 tender: 200 TB raw (~… Euro/TB net + VAT)
  Additional 20% of the last tender acquisition requested
  Tender for 400 TB (not before Fall 2006)
Tape libraries:
  STK L180: … TB (only used for backups)
  STK 5500
    6 LTO-2 drives with 1200 tapes → 240 TB
    4 9940B drives (+3 to be installed in the next weeks) with … tapes → 130 TB (260)
    (1.5 KEuro/TB initial cost → 0.35 KEuro/TB pure tape cost)

3 ….. Human Resources
Luca dell’Agnello – permanent
Pier Paolo Ricci (100%) – permanent
Giuseppe Lore (50%) – temporary
Barbara Martelli (100%) – temporary
Vladimir Sapunenko (100%) – temporary
Giuseppe Lo Presti – CERN Fellowship (Tier0–Tier1 agreement for supporting the Tier1 CASTOR installation)

4 Storage & Database group tasks
Disk and SAN management, HW/SW installation and maintenance
  Pier Paolo Ricci
  Vladimir Sapunenko
Remote (grid SE) and local (rfiod/NFS/GPFS) access service management
  Giuseppe Lore
CASTOR HSM management
  Giuseppe Lo Presti (CERN)
gridftp and SRM access service
DB (Oracle for CASTOR & RLS tests, Tier1 "global" hardware DB)
  Barbara Martelli
SC infrastructure (gridftp servers, SRM interface, FTS)
Tests (in synergy with the farming group + external staff)
  Clustered/parallel filesystem tests: GPFS, Lustre, dCache
  SRM tests (StoRM)
  LCG3D

5 Storage status
Main storage (IBM FAStT900, STK FLX680) organized in one fabric Storage Area Network (3 Brocade switches, star topology)
Level-1 disk servers connected via FC
  Usually in a GPFS cluster
    Ease of administration (see the admin sketch after this slide)
    Load balancing and redundancy
  Lustre also under evaluation
Some level-2 disk servers connected to the storage only via GPFS (over IP)
  LCG and FC dependencies on the OS are decoupled
WNs are not members of the GPFS cluster (but scalability to a large number of WNs is currently under investigation)
Supported protocols: rfio, gridftp, xrootd (BaBar), NFS, AFS
  NFS used mainly for accessing experiment software – strongly discouraged for data access
  AFS used only by CDF for accessing experiment software
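The "ease of administration" point above can be illustrated with a short scripted check. This is a minimal sketch, assuming the standard GPFS administration commands mmgetstate and mmdf and a hypothetical filesystem device name; it is not an actual CNAF tool.

```python
#!/usr/bin/env python
"""Illustrative GPFS health check (sketch only).
The device name 'gpfs_tier1' is a hypothetical placeholder; mmgetstate
and mmdf are standard GPFS admin commands and must be run on a cluster
node with the appropriate privileges."""
import subprocess

FSNAME = "gpfs_tier1"   # hypothetical GPFS device name

# GPFS daemon state of every cluster node (active, down, ...)
out = subprocess.Popen(["mmgetstate", "-a"],
                       stdout=subprocess.PIPE).communicate()[0].decode()
print(out)
problems = [line for line in out.splitlines() if "down" in line.lower()]
if problems:
    print("Nodes not active:")
    for line in problems:
        print(line)

# Per-disk and total capacity / free space of the filesystem
subprocess.call(["mmdf", FSNAME])
```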

6 Storage Hardware (1)
[Diagram of the Tier1 storage layout; the components shown are:]
Tape / HSM (400 TB): STK L5500 robot (5500 slots) with 6 IBM LTO-2 and 4 STK 9940B drives, served by the CASTOR HSM servers; STK180 with 100 LTO-1 tapes (10 TB native) attached to a W2003 server with LEGATO Networker for backups
SAN 1 (400 TB raw) and SAN 2 (40 TB): IBM FastT900 (DS4500, 4 FC interfaces), STK FlexLine 600 (4 FC interfaces), STK BladeStore (4 FC interfaces), Infortrend A16F-R1A2-M1 (4 x 3200 GB SATA), Infortrend A16F-R1211-M2 + JBOD (5 x 6400 GB SATA), AXUS Browie (~2200 GB, 2 FC interfaces)
Fabric: Brocade fabric with 2 Silkworm FC switches, plus a Gadzoox Slingshot FC switch
NAS (20 TB): NAS1/NAS4 (3ware IDE), two PROCOM 3600 FC NAS units (including NAS2, 7000 GB)
Access: disk servers with Qlogic FC HBA 2340 exporting data via NFS, RFIO and GridFTP to Linux SL 3.0 client nodes over the Tier1 LAN or the WAN

7 Storage Hardware (2)
16 disk servers with dual Qlogic FC HBA 2340
  Sun Fire V20Z, dual Opteron 2.6 GHz, 4 x 1 GB DDR 400 MHz RAM, 2 x 73 GB 10K SCSI U320 disks
  10 TB served by each disk server
Brocade Director FC switch (fully licensed) with 64 ports (out of 128)
  Central fabric for the other switches
  4 x 2 Gb/s redundant connections to the switch
4 FlexLine 600 with 200 TB raw (150 TB net), RAID5 8+1
  Performance on a single volume is not high (45 MB/s write, 35 MB/s read) but the aggregate MB/s is good; parallel I/O is needed for optimization
  Total theoretical maximum bandwidth: … MB/s to the 200 TB
  Management and monitoring software included

8 CASTOR issues
At present: STK library with 6 x LTO-2 and 4 x 9940B drives
  2000 x 200 GB LTO-2 tape slot capability → 400 TB
  3500 x 200 GB 9940B tape slot capability → 700 TB
  Current availability: 1200 x 200 GB LTO-2 (240 TB, 40% used) and 680 x 200 GB 9940B (130 TB, 70% used)
In general CASTOR performance (as with other HSM software) increases with clever pre-staging of files (ideally ~90%)
  Not present in CASTOR-1 (done manually at CNAF with scripts; a sketch follows this slide) but implemented in CASTOR-2 (to be tested)
LTO-2 drives are not usable in a real production environment with the present CASTOR release
  Used only for archiving copies of disk data, or with a large staging (disk buffer) area that reduces tape access to almost zero
  HW problems solved by using 9940B technology drives
CASTOR-2 has been fully installed and is currently under test
  CASTOR-1 (production) and CASTOR-2 (test) can share the same resources and are currently "living together" in our installation
  SC will run in the CASTOR-2 environment, but production activities will remain on the CASTOR-1 services
  After SC, CASTOR-2 will be used in production and the CASTOR-1 services will be dismissed
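The manual pre-staging mentioned above was done at CNAF with scripts; below is a minimal sketch of the idea, assuming the CASTOR stager command-line tools (stager_qry, stager_get) and a hypothetical file list. It is illustrative only, not the actual CNAF script.

```python
#!/usr/bin/env python
"""Minimal pre-staging sketch (illustrative only).
Assumes the CASTOR stager tools stager_qry / stager_get are in the PATH
and that 'filelist.txt' contains one /castor/... path per line; both
names are hypothetical placeholders."""
import subprocess

def is_staged(castor_path):
    # stager_qry reports the file status (STAGED, STAGEIN, ...);
    # here we simply look for STAGED in its output.
    out = subprocess.Popen(["stager_qry", "-M", castor_path],
                           stdout=subprocess.PIPE).communicate()[0].decode()
    return "STAGED" in out

def prestage(castor_path):
    # Asynchronous recall request: the stager schedules a tape mount
    # and copies the file onto the disk buffer.
    subprocess.call(["stager_get", "-M", castor_path])

if __name__ == "__main__":
    for line in open("filelist.txt"):
        path = line.strip()
        if path and not is_staged(path):
            prestage(path)
```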

9 Database service (1)
Database services in production:
  Service Challenge databases: LFC, FTS (Oracle 10g)
  CASTOR database (Oracle 9.2)
  PostgreSQL Tier1 database (HW resources and configuration of our computing centre); migration to the new server planned
  Various MySQL databases for experiment applications (CMS PubDB, Pamela DB, Argo DB)
Oracle DB backup:
  Hot backups and archive-log backups with RMAN (on tape with Legato)
    Daily incremental
    Weekly full
  Daily exports on a backup server (disk); we use the export utility as a possible workaround in case of RMAN restore failure. Old exports are deleted weekly, only Sunday exports are retained (a retention sketch follows this slide).
Daily hot full backups for PostgreSQL databases (i.e. with pg_dump)
Daily cold full backups of MySQL databases
  InnoDB hot backup utility under evaluation
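As an illustration of this backup scheme, the sketch below combines the daily pg_dump of the PostgreSQL database with a "keep only Sunday copies" retention rule like the one applied to the Oracle exports. Database name, backup directory and retention window are hypothetical, not the real CNAF configuration.

```python
#!/usr/bin/env python
"""Illustrative daily dump + weekly retention sketch.
Paths, database name and retention window are hypothetical."""
import datetime
import glob
import os
import subprocess

BACKUP_DIR = "/backup/postgres"   # hypothetical backup area
DBNAME = "tier1db"                # hypothetical database name

def dump_database():
    today = datetime.date.today()
    target = os.path.join(BACKUP_DIR, "%s_%s.dump" % (DBNAME, today.isoformat()))
    # pg_dump -Fc writes a compressed custom-format dump usable by pg_restore
    subprocess.call(["pg_dump", "-Fc", "-f", target, DBNAME])
    return target

def prune_old_dumps(keep_days=7):
    cutoff = datetime.date.today() - datetime.timedelta(days=keep_days)
    for path in glob.glob(os.path.join(BACKUP_DIR, "%s_*.dump" % DBNAME)):
        datestr = os.path.basename(path).split("_")[1].split(".")[0]
        when = datetime.datetime.strptime(datestr, "%Y-%m-%d").date()
        # older than the window: keep it only if it was taken on a Sunday
        if when < cutoff and when.weekday() != 6:
            os.remove(path)

if __name__ == "__main__":
    dump_database()
    prune_old_dumps()
```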

10 Database service (2)
1 HP ProLiant DL380 G4 (dual Xeon 3.6 GHz, 4 GB RAM, 6 x 300 GB SCSI disks in RAID 1+0)
  FTS db (…)
  Castor2 db (…)
2 IBM xSeries 330 (dual Xeon 3 GHz, 4 GB RAM, 2 x 150 GB disks)
  Castor db
  Recovery Catalog for RMAN backups
1 dual Xeon 2.4 GHz, 2 GB RAM, two 60 GB disks for 3D replication tests; 1 TB raw SAN storage for 3D tests
6 dual Xeon (4 x 3 GHz, 2 x 2.4 GHz), 4 GB RAM, 2 x 80 GB disks for Real Application Cluster tests; 1 TB raw storage (from IBM FastT900)
2 Pentium 4 3.2 GHz, 2 GB RAM, 80 GB hard disk for the Service Challenge LFC and FTS instances (to be migrated)

11 LCG3D
LCG 3D milestones:
  Tier-1 services start – milestone for the early-production Tier1 sites
  Full LCG database service in place – milestone for all Tier1 sites
Oracle services will run only on the Tier0/Tier1 layers
Replication will run from Tier0 (read/write) to Tier1 (read-only replicas)
Cross-vendor copy from Oracle to open-source databases (MySQL, PostgreSQL) towards Tier2 or higher tiers (a sketch of the idea follows this slide)
– Dirk Duellmann
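The cross-vendor copy towards Tier2 can be pictured with the minimal sketch below, which reads rows from an Oracle read-only replica and inserts them into a MySQL database using the cx_Oracle and MySQLdb client libraries. Connection parameters, the table and its columns are hypothetical; the real 3D deployment relies on its own replication tools.

```python
#!/usr/bin/env python
"""Cross-vendor copy sketch (Oracle -> MySQL), illustrative only.
Hosts, credentials, table and columns are hypothetical placeholders."""
import cx_Oracle
import MySQLdb

SELECT_SQL = "SELECT run_number, payload FROM conditions"
INSERT_SQL = "INSERT INTO conditions (run_number, payload) VALUES (%s, %s)"

ora = cx_Oracle.connect("reader", "secret", "tier1-oracle-dsn")   # Tier1 read-only replica
my = MySQLdb.connect(host="tier2-mysql", user="writer",
                     passwd="secret", db="conditions_db")         # Tier2 copy

src = ora.cursor()
dst = my.cursor()
src.execute(SELECT_SQL)
while True:
    rows = src.fetchmany(1000)   # copy in batches to bound memory use
    if not rows:
        break
    dst.executemany(INSERT_SQL, rows)
my.commit()
my.close()
ora.close()
```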

12 Test Activities
Parallel filesystems (GPFS, Lustre) with the StoRM SRM 2.1-compliant interface over the SAN and a large number of clients (1000-WN testbed, with the help of the Tier1 farming group)
dCache over SAN (collaboration with CMS Bari for manpower)
Possibly other disk pools (DPM, …) – manpower needed!
Studies on different filesystems (Red Hat GFS, XFS, …)
High availability of services (rfio, NFS and others such as dCache pools) using cluster failover technologies (Red Hat Cluster, …)
Oracle RAC scalability and reliability for different purposes (LCG3D)

13 Parallel Filesystem Testbed
Roughly 40 TB in 20 partitions
The GPFS disk servers are 4 high-performance SunFire V20Z servers with dual Opteron 2.6 GHz, connected to the SAN
[Plots: read throughput and write throughput]
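The throughput plots come from parallel I/O tests on this testbed; the sketch below shows the kind of single-stream sequential measurement such tests are built from. Mount point and file size are hypothetical, and the real tests ran many clients in parallel against the 4 disk servers.

```python
#!/usr/bin/env python
"""Tiny sequential I/O timing sketch (illustrative only).
The GPFS mount point and file size are hypothetical placeholders."""
import os
import time

MOUNT = "/gpfs/test"              # hypothetical GPFS mount point
SIZE_MB = 1024                    # 1 GB test file
BLOCK = b"x" * (1024 * 1024)      # 1 MB write block

def write_test(path):
    start = time.time()
    f = open(path, "wb")
    for _ in range(SIZE_MB):
        f.write(BLOCK)
    f.flush()
    os.fsync(f.fileno())          # make sure the data really reached the servers
    f.close()
    return SIZE_MB / (time.time() - start)

def read_test(path):
    # Note: reading right after writing mostly measures the client cache;
    # in a real test the file is read back from different nodes.
    start = time.time()
    f = open(path, "rb")
    while f.read(1024 * 1024):
        pass
    f.close()
    return SIZE_MB / (time.time() - start)

if __name__ == "__main__":
    target = os.path.join(MOUNT, "throughput_probe.dat")
    print("write: %.1f MB/s" % write_test(target))
    print("read:  %.1f MB/s" % read_test(target))
```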

14 Short term task summary
HSM software: CASTOR-2 migration, administration and future debugging/evolution
Disk pool in production: dCache OR StoRM over a parallel filesystem
Grid services (SE, FTS, SRM, …) for the storage resources granted in agreement with the Tier1 SLA
LCG3D Tier1 and database services
Tender and installation of the new 400 TB over SAN, integrated with the current storage (dual-fabric SAN for high availability)

15 Conclusions
The storage management manpower for administering the current resources is presently undersized.
The storage resources will grow substantially every year through yearly tenders (arriving with a 6–9 month delay); the required I/O throughput, the disk/tape reliability requirements and the complexity of the hardware/software management will grow at the same rate.
We need at least to double the number of people in the Storage Group to guarantee the minimum required services at the start of LHC activity (5 more staff FTE). An even larger number is required for the crucial test activities, and at least 2 technicians are needed for hardware maintenance.

16 Manpower Assignment
The requested staff numbers are 7 tecnologo FTE and 2 technician FTE, in 4 major groups:
Hardware support: 1 tecnologo and 2 technicians. Hardware support includes the supervision of ALL the installed hardware (tape library, SAN, disk arrays and disk servers) in terms of administration, direct intervention, calls to and supervision of external support and support contracts, covering the full working week.
Database support: 1.5 tecnologo. DBAs for all the databases provided by the CNAF Tier1. This may include specific databases requested by the experiments, the future LCG3D activity and the INFN replica of the Central Administration database.
CASTOR support and development: 2 tecnologo. Installation and administration of the CASTOR-2 (or other HSM) software, and its customization in case of requirements differing from the other Tier1 users. Specific modifications of CASTOR-2 could also be requested in case of hardware differences from the other Tiers or of a specific type of usage. Perhaps the backup service could also be included here.
Front-end services administration and tests: 2.5 tecnologo. This includes the installation and administration of the different services and disk servers used for accessing both the disk and the tape resources, plus testing and maintenance of different disk pool systems, filesystems, etc.

17 Accounting storage

18

19 CASTOR HSM system (1)
STK 5500 library
  6 x LTO-2 drives, 4 x 9940B drives
  1300 LTO-2 (200 GB) tapes, … 9940B (200 GB) tapes
Access
  The CASTOR filesystem hides the tape level
  Native access protocol: rfio
  SRM interface for the grid fabric available (rfio/gridftp)
Disk staging area
  Data migrated to tape as soon as possible and deleted from the staging area when it fills up
CASTOR-2 currently under test
  Migration before SC4 (April ’06)
[Plot: CASTOR disk space]
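As an example of the native rfio access named above, the sketch below copies one file out of the CASTOR namespace with rfcp; if the file is only on tape, the stager recalls it to the disk staging area transparently. The /castor path and the local destination are hypothetical.

```python
#!/usr/bin/env python
"""rfio access sketch (illustrative only).
The CASTOR path and the local destination are hypothetical examples."""
import subprocess
import sys

CASTOR_FILE = "/castor/cnaf.infn.it/user/example/run1234.root"   # hypothetical path
LOCAL_COPY = "/tmp/run1234.root"

# rfcp copies through the rfio daemon on the disk server; the namespace
# path is the same whether the data sits on disk or still on tape.
rc = subprocess.call(["rfcp", CASTOR_FILE, LOCAL_COPY])
if rc != 0:
    sys.exit("rfcp failed with exit code %d" % rc)
print("copied %s to %s" % (CASTOR_FILE, LOCAL_COPY))
```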

