Slide 1: CASTOR @ CERN
CASTOR external operation meeting, November 2006
Olof Bärring, CERN / IT
Slide 2: Outline
- Recent 'achievement'
- CASTOR2 operation team
- CASTOR2 deployment and operation
- CASTOR2 instances
- LHC experiments' migration to CASTOR2
- Plans for non-LHC experiments
- Main problems and workarounds during 2006
- Tape service
- SRM service
- Conclusions
Slide 3: The 100 millionth CASTOR file

  nsls -i /castor/cern.ch/cms/MTCC/data/00004438/A/mtcc.00004438.A.testStorageManager_0.22.root
  100000000 /castor/cern.ch/cms/MTCC/data/00004438/A/mtcc.00004438.A.testStorageManager_0.22.root

Note: only ~50M files actually exist in the name space.
Slide 4: CASTOR2 operation team
Jan, Veronique, Miguel, Ignacio, Olof
Slide 5: CASTOR2 instances
Instances: c2alice, c2atlas, c2cms, c2lhcb, c2public, c2test, c2itdc (dev)
Shared services: the tape service and the SRM endpoints srm-v1.cern.ch and srm-v2.cern.ch (dev)
Slide 6: The c2 layout
Head nodes:
- c2srv01 (scheduler): LSF master
- c2srv02 (rtcpcld): rtcpclientd, MigHunter
- c2srv03 (stager): stager, cleaning
- c2srv04 (rh): rhserver
- c2srv05 (dlf): dlfserver
- c2srv06 (rmmaster/expert): rmmaster, expertd
Oracle DB servers: c2stgdb (stager DB) and c2dlfdb (DLF DB); normal disk servers today, next year Oracle-certified h/w (NAS/Gbit solution).
Disk pools: all servers run rfiod, rootd and gridftp. Currently all SLC3 + XFS with h/w RAID-5; soon SLC4/64-bit + XFS.
Slide 7: LHC experiments' CASTOR2 migration
[Timeline, Jan'06-Dec'06: migration windows for ALICE, ATLAS, CMS and LHCb, overlaid with the major challenges: SC3 rerun, SC4, ATLAS T0-2006/1, CMS CSA06, ATLAS T0-2006/2, ALICE MDC7]
Slide 8: ALICE
Migration: smooth; STAGE_HOST switched in the group environment.
Usage: simple; rfcp from WNs, or via the xrootd servers (4 disk servers, ~20TB, used as an xrootd cache).
Challenges: ALICE MDC7 running now: 1.2GB/s from the ALICE pit through CASTOR2 to tape.
Special requirements: xrootd support as internal protocol.
Pools: default 16TB, wan 73TB, alimdc 112TB
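As an illustration of the 'simple usage' above, a minimal sketch of an rfcp transfer from a worker node; the stager alias and the CASTOR path are assumptions, only the pool name comes from the slide:

  # Point the CASTOR client at the c2alice instance (alias hypothetical)
  export STAGE_HOST=castoralice
  # Select the target disk pool via the service class (pool name from the slide)
  export STAGE_SVCCLASS=default
  # Copy a local file into the CASTOR name space (paths illustrative)
  rfcp /tmp/raw_chunk_001.root /castor/cern.ch/alice/mdc7/raw_chunk_001.root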
Slide 9: ALICE MDC7
22 disk servers, 25 tape drives (12 IBM 3592, 13 T10K)
Slide 10: ATLAS
Migration: smooth; STAGE_HOST switched in the group environment.
Usage: production mostly rfcp; users: long-lived RFIO and ROOT streams.
Challenges: ATLAS T0-2006/1 (July) and T0-2006/2 (October): CDR + reconstruction + data export.
Special requirements: 'durable' disk pools; special SRM v1.1 endpoint: srm-durable-atlas.cern.ch
Pools: default 38TB, wan 23TB (being removed), atldata 15TB (no GC), analysis 5TB, t0merge 15TB, t0export 130TB, atlprod 30TB (being created)
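A hedged sketch of how the dedicated 'durable' endpoint would be used with an SRM v1.1 client such as srmcp; the port, SFN path and local file are assumptions, only the endpoint host comes from the slide:

  # Copy a file into durable ATLAS space through the dedicated SRM v1.1 endpoint
  srmcp file:////tmp/aod.root \
    "srm://srm-durable-atlas.cern.ch:8443/srm/managerv1?SFN=/castor/cern.ch/atlas/aod.root"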
Slide 11: CMS
Migration: users quite smooth, STAGE_HOST switched in the group environment; CDR: still some testbeam activity on stagepublic (CASTOR1).
Usage: production: rfcp and direct RFIO access; users: long-lived RFIO streams.
Challenges: CMS CSA'06 (October).
Special requirements: low request latency (new LSF plugin).
Pools: default 89TB, wan 22TB, cmsprod 22TB, t0input 64TB (no GC), t0export 155TB
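For the latency-sensitive T0 workflow, a minimal pre-staging sketch with the standard CASTOR2 stager commands; the stager alias and file path are assumptions, only the pool name comes from the slide:

  # Point the client at the c2cms instance and the no-GC t0input pool
  export STAGE_HOST=castorcms          # alias hypothetical
  export STAGE_SVCCLASS=t0input        # pool name from the slide
  # Trigger a disk copy (recall from tape if needed), then poll its status
  stager_get -M /castor/cern.ch/cms/t0/run012345/streamA.dat
  stager_qry -M /castor/cern.ch/cms/t0/run012345/streamA.dat   # wait for STAGED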
Slide 12: CSA06
[Figure-only slide; no text content survives]
Slide 13: LHCb
Migration: production smooth, mostly done already in March; users: some difficulties:
- dependency on old ROOT3 data delayed the migration
- flipping STAGE_HOST was not sufficient: grid jobs have no CERN-specific environment, so most WN access still went to stagepublic (CASTOR1)
Usage: RFIO and ROOT access.
Challenges: lots of tape writing at CERN in early summer; data export in July-August.
Special requirements: 'durable' disk pools; special SRM endpoint: srm-durable-lhcb.cern.ch
Pools: default 28TB, wan 51TB, lhcbdata 5TB (no GC), lhcblog 5TB (no GC)
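The sketch below illustrates the STAGE_HOST problem described above: a grid job that never sources the CERN group environment falls back to the client's site default stager; the alias is hypothetical:

  # Grid job: no CERN-specific environment, STAGE_HOST is unset,
  # so the client falls back to its site default (stagepublic, CASTOR1)
  rfcp /castor/cern.ch/lhcb/prod/dst/file.dst .
  # Fix: set the CASTOR2 stager explicitly inside the job itself
  export STAGE_HOST=castorlhcb
  rfcp /castor/cern.ch/lhcb/prod/dst/file.dst .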
Slide 14: Plans for non-LHC migration
The plan is to migrate all non-LHC groups to a shared CASTOR2 instance, castorpublic:
- dedicated pools for large groups; small groups will share 'default'
- also used for the dteam background transfers and by the 'repack' service
NA48 first out: the plan is to switch off stagena48 at the end of January 2007; COMPASS to follow.
Complications:
- the engineering community may require a Windows client
- how to migrate small groups without computing coordinators?
Slide 15: Main problems and workarounds
- prepareForMigration: deadlocks resulted in the CASTOR name server not being updated; the file remains 0 size while the tape segment is >0. Tedious cleanup.
- GC: long period of instabilities during the summer; OK since 2.1.0-6.
- stager_qry: "now you see your file, now you don't" behaviour confuses users. Also used by the operational procedure for draining a disk server: a manual and tedious workaround for the INVALID status bug (see the sketch after this slide).
- LSF plugin:
  - Meltdown. Workaround: limit PENDing jobs to 1000, but this may result in an rmmaster meltdown instead; problematic with 'durable' pools, which are not properly managed.
  - Recent lsb_postjobmsg 'Bad file descriptor' problem: the plugin cannot recover; workaround is to restart LSF.
- Missing cleanups: accumulation of stageRm subrequests, diskcopies in FAILED, ...
- Looping migrators: NBTAPECOPIESINFS inconsistency. A workaround in an early-September hotfix reduced the impact on tape mounting, but manual cleanup is still required.
- Looping recallers: due to zero-size files (see above) and to a stageRm bug (insufficient cleanup).
- Client/server (in)compatibility matrix: request mixing...!
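A minimal sketch of the stager_qry call at the heart of the two issues above; the file path is illustrative and the output format is approximate, not taken from the slide:

  # Query the disk-resident status of a file (the call whose answer was
  # sometimes inconsistent: "now you see your file, now you don't")
  stager_qry -M /castor/cern.ch/lhcb/prod/dst/file.dst
  # Typical answer (format approximate); a copy stuck in INVALID on the
  # server being drained has to be cleaned up by hand
  #   /castor/cern.ch/lhcb/prod/dst/file.dst  STAGED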
Slide 16: Tape service (TSI section)
Both T10K and 3592 used in production during 2006; no preference, buy both.
Current drive park: 40 SUN T10K, 40 IBM 3592, 6 LTO-3, 44 9940B.
Current robot park: 1 SUN SL8500, 6 SUN Powderhorns (another 6 recently dismounted), 1 IBM 3584.
Buying for next year:
- 10 more drives each of T10K and 3592 (50 of each in total)
- 1 more SUN SL8500
- enough media to fill the new robotics: ~18k pieces of media: 12k T10K, 6k 3592 (700GB)
Slide 17: Tape / Robots
IBM 3584 Tape Library (monolithic solution):
- 40 x IBM 3592 E05 tape drives
- ~6000 tape slots
- 2 accessors
- ~38 m2 of floor space
SUN/STK SL8500 Tape Library (modular solution):
- 40 x SUN T10K tape drives
- 21 x LTO-3 tape drives
- 10 x 9940B tape drives
- ~8000 tape slots
- 2 x 4 handbots, pass-through mechanism
- ~19 m2 of floor space
Slide 18: Repack of 22k 9940B tapes
Leave 4 Powderhorn silos for the 9940B tapes to be repacked to new media.
Some tapes have a huge number of small files. Record: 165k files on a single 200GB 9940B tape; repacking that tape alone will take ~1 month.
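A back-of-the-envelope check of the one-month figure, assuming a per-file tape overhead (positioning, file marks) of order 15 s; the overhead value is an assumption, not from the slide:

  165,000 files x 15 s/file ≈ 2.5 x 10^6 s ≈ 29 days

With an average file size of only ~1.2 MB (200 GB / 165k files), per-file overhead rather than data volume dominates the repack time.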
Slide 19: SRM service
SRM v1.1:
- shared facility accessed through a single endpoint: srm.cern.ch; 9 CPU servers, DNS load-balanced, plus 1 CPU server for the request repository
- a dirty workaround for 'durable' space required setting up extra endpoints (srm-durable-xyz.cern.ch); see the sketch after this slide
- all transfers initiated through srm.cern.ch (== castorsrm.cern.ch) are redirected to the disk servers; the old castorgrid gateway is only used by non-LHC experiments for non-SRM access (e.g. NA48 and COMPASS)
- all CASTOR2 disk servers are on the LCG network (also visible to the Tier-2 sites through the HTAR)
SRM v2.2:
- test facility up and running (srm-v2.cern.ch)
- no need for additional endpoints for 'durable' storage: durable space is addressed through SRM space tokens
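A sketch of the difference, using the dCache srmcp client of that era; the flags, port, SFN paths and space-token name are assumptions, only the endpoint hosts come from the slides:

  # SRM v1.1: durability selected by endpoint (one extra endpoint per case)
  srmcp file:////tmp/f.dat \
    "srm://srm-durable-lhcb.cern.ch:8443/srm/managerv1?SFN=/castor/cern.ch/lhcb/f.dat"
  # SRM v2.2: one endpoint, durability selected by a space token (token hypothetical)
  srmcp -2 -space_token=LHCb_DURABLE file:////tmp/f.dat \
    "srm://srm-v2.cern.ch:8443/srm/managerv2?SFN=/castor/cern.ch/lhcb/f.dat"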
Slide 20: Conclusions
- 4 LHC experiments successfully migrated to CASTOR2; all major SC4 milestones completed successfully
- non-LHC migration has 'started'
- new tape equipment running in production without any major problem
Our next challenges:
- dare to remove dirty workarounds when bugs get fixed
- SRM v2.2 operation and support
- repack 22k 9940B tapes to new media