Tape-dev update
Castor F2F meeting, 14/10/09
Nicola Bessone, German Cancio, Steven Murray, Giulia Taurelli
CERN IT Department

Slide 2 – Reminder from last F2F: Current Architecture
1. Stager requests a drive from the drive scheduler
2. Drive is allocated
3. Data is transferred to/from disk/tape based on a file list given by the stager
1 data file = 1 tape file
(Diagram: stager host, drive scheduler, tape servers and disk servers; legend distinguishes data from control messages)

Slide 3 – Reminder from last F2F: New Architecture
- The tape gateway will replace RTCPClientD
- The tape gateway will be stateless
- The tape aggregator will wrap RTCPD
n data files = 1 tape file
(Diagram: stager host with tape gateway server process(es), tape server with tape aggregator, disk servers; legend distinguishes data to be stored from control messages)
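The slides do not define the future block-based tape format. As a rough, hypothetical sketch of what "n data files = 1 tape file" could look like, the routine below streams several data files into one tape file, prefixing each block with metadata; all struct and field names are invented for illustration and are not CASTOR's actual format.

```cpp
// Hypothetical aggregation sketch: n data files written as one tape file,
// each data block prefixed with block-based metadata. Illustration only;
// the real CASTOR tape format had not been defined at the time of this talk.
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Per-block metadata written in front of each chunk of payload data.
struct BlockHeader {
    uint64_t fileId;      // id of the source data file
    uint64_t blockOffset; // offset of this block within the source file
    uint32_t blockSize;   // number of payload bytes that follow
};

// Append several data files into a single aggregate "tape file" stream.
void aggregate(const std::vector<std::string>& paths, std::ostream& tape) {
    const std::size_t kBlockSize = 256 * 1024; // illustrative block size
    std::vector<char> buf(kBlockSize);
    uint64_t fileId = 0;
    for (const std::string& path : paths) {
        std::ifstream in(path, std::ios::binary);
        uint64_t offset = 0;
        while (in.read(buf.data(), buf.size()) || in.gcount() > 0) {
            BlockHeader hdr{fileId, offset,
                            static_cast<uint32_t>(in.gcount())};
            tape.write(reinterpret_cast<const char*>(&hdr), sizeof hdr);
            tape.write(buf.data(), hdr.blockSize);
            offset += hdr.blockSize;
        }
        ++fileId;
    }
}
```

The point of such a layout is that many small files can be streamed as one tape file, while the per-block metadata still allows each data file to be located and reconstructed on recall.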

Slide 4 – New software
Goals: code refresh (the old code is unmaintained/unknown), component reuse (Castor C++ / DB framework), improved (DB) consistency, enhanced stability -> performance, ground work for a future new tape format (block-based metadata).
Two new daemons developed:
- tapegatewayd (on the stager) -> replaces rtcpclientd / recaller / migrator
- aggregatord (on the tape server) -> acts as a proxy or bridge between rtcpd and tapegatewayd (no new tape format yet)
Rewritten migHunter:
- Transactional handling (at stager DB level) of new migration candidates; a minimal sketch of this pattern follows below
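The slides only state that the new migHunter handles new migration candidates transactionally at the stager DB level. Below is a minimal sketch of that pattern (claim candidates and commit atomically), assuming Oracle OCCI; the table and column names (migration_candidate, status, stream_id) are invented and are not the real stager schema.

```cpp
// Minimal sketch of transactional handling of migration candidates,
// assuming Oracle OCCI. Table/column names are hypothetical.
#include <occi.h>

// Atomically claim all new candidates for one migration stream: either
// every matching row is updated and committed, or none are, so a crashed
// migHunter never leaves half-claimed candidates behind.
unsigned int claimCandidates(oracle::occi::Connection* conn, int streamId) {
    oracle::occi::Statement* stmt = conn->createStatement(
        "UPDATE migration_candidate "
        "SET status = 'SELECTED', stream_id = :1 "
        "WHERE status = 'NEW'");
    stmt->setInt(1, streamId);
    const unsigned int claimed = stmt->executeUpdate();
    conn->terminateStatement(stmt);
    conn->commit(); // candidates become visible to the stream atomically
    return claimed;
}
```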

Slide 5 – Status
- The software was installed on CERN's stress test instance (ITDC) ~4 weeks ago; end-to-end tests and stress tests have started (~20 tape servers, ~25 disk servers)
- So far, significant improvements in terms of stability (no software-related tape unmounts during migrations and recalls)
- However: testing is not completed yet, and many issues unveiled by the new software were found along the way (see next slides)
- New migHunter to be released ASAP (if tests with rtcpclientd are ok)
- Tape gateway + aggregator to be released in 2.1.9-x as an optional component: they are not part of the default deployment, and the rest of the CASTOR software has no dependencies on them

Slide 6 – Test findings (1)
Performance degradations during migrations:
- Already observed in production, but difficult to trace down as long-lived migration streams rarely occur there (cf. savannah)
- Found to be a misconfiguration in the rtcpd / syslog config, causing log messages to be generated in O(n^2), with n = number of migrated files (see the illustration below)
Another problem to be understood is the stager DB time for disk server / file system selection, which tends to grow during the lifetime of a migration. We are currently not limited by this, but it could become a bottleneck.
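The exact rtcpd / syslog misconfiguration is not detailed in the slides, but the O(n^2) behaviour is easy to picture: if completing each file causes the state of every file migrated so far to be logged again, a stream of n files emits 1 + 2 + ... + n = n(n+1)/2 messages. A self-contained illustration:

```cpp
// Illustration of O(n^2) logging: re-logging the whole migration history
// on every file completion. Not the actual rtcpd code.
#include <cstdio>
#include <string>
#include <vector>

int main() {
    const int n = 1000;                  // files in the migration stream
    std::vector<std::string> migrated;
    long messages = 0;
    for (int i = 0; i < n; ++i) {
        migrated.push_back("file" + std::to_string(i));
        // Misconfigured: log every previously migrated file again.
        for (const std::string& f : migrated) {
            (void)f;
            ++messages;                  // stands in for one syslog() call
        }
    }
    std::printf("%d files -> %ld log messages, i.e. O(n^2)\n", n, messages);
    return 0;
}
```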

Slide 7 – Test findings (2)
Migration slowdown on IBM drives: Castor at fault? Towards end of tape? End of mount?

Slide 8 – (Plots: tape servers Tpsrv150, Tpsrv151, Tpsrv001, Tpsrv235 and Tpsrv204 on 23/9/09; Tpsrv203 and Tpsrv204 on 24/9/09)

Slide 9 – Test findings (2), continued
Migration slowdown on IBM drives: Castor at fault? Towards end of tape? End of mount?
- There is a correlation between where on the tape data is being written and the write performance. Confirmed by writing Castor-like AUL files with a Castor-independent test
- Traced down to an IBM hardware-specific issue. After analysis, TapeOps confirmed this to be part of an optimisation on IBM drives called "virtual back hitch" (non-volatile caching, NVC). This optimisation allows small files to be written at higher speeds by reserving a special cache area on tape, as long as the tape is not getting full
- NVC can be switched off, but performance then drops to ~15 MB/s
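For intuition, with numbers assumed purely for illustration (they are not from the slides): a drive streaming at 100 MB/s that has to stop and reposition ("backhitch") for about 3 s after every 100 MB file spends 1 s writing plus 3 s repositioning per file, averaging 100 MB / 4 s = 25 MB/s. This is the order of magnitude observed with NVC switched off (~15 MB/s); virtual back hitch buffers small files in the on-tape cache area precisely to avoid such stop/reposition cycles.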

Slide 10 – Test findings (3)
- Under (yet) unknown circumstances, IBM tapes hit end-of-tape at 10-30% less than their nominal capacity. Read performance on these tapes is also suboptimal
- Seems to be related to a suboptimal working of NVC / virtual back hitch: it does not occur when NVC is switched off
- To be reported to IBM
(Plot: reading a tape containing 8222 AUL files of 100 MB each, urandom-generated, to /dev/null using dd; X axis: seconds, Y axis: throughput in MB/s)
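The measurement above was done with dd; as a rough equivalent, the sketch below reads a file sequentially and reports average throughput in MB/s. The device path is hypothetical, and this is only a reconstruction of the kind of test described, not the actual test script.

```cpp
// Sketch of a sequential read-throughput measurement (the real test used
// dd to /dev/null). The device path below is hypothetical.
#include <chrono>
#include <cstdio>
#include <fstream>
#include <vector>

int main() {
    std::ifstream in("/dev/nst0", std::ios::binary); // hypothetical device
    std::vector<char> buf(1 << 20);                  // 1 MiB read buffer
    const auto start = std::chrono::steady_clock::now();
    double readMB = 0.0, lastReport = 0.0;
    while (in.read(buf.data(), buf.size()) || in.gcount() > 0) {
        readMB += in.gcount() / (1024.0 * 1024.0);
        if (readMB - lastReport >= 100.0) {          // report every ~100 MB
            const double secs = std::chrono::duration<double>(
                std::chrono::steady_clock::now() - start).count();
            std::printf("%.0f s elapsed, avg %.1f MB/s\n",
                        secs, readMB / secs);
            lastReport = readMB;
        }
    }
    return 0;
}
```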

Slide 11 – Test findings (4)
Suboptimal file placement strategy on recalls? It apparently causes interference.
(Plots:
- Recall using the default Castor file placement: 3 tape servers recalling onto 7 disk servers, all files distributed over all disk servers / file systems
- The same recall using 2 dedicated disk servers per tape server: 3 tape and 6 disk servers (all file systems), same as above, yields ~ MB/s)

Slide 12 – Test findings (5)
Recall performance limited by a central element (gateway / stager / ...?): a central limitation prevents performance from going higher than a threshold, even if distinct disk pools are being used.
(Plot: c2itdc total throughput, c2itdc pool 1 and c2itdc pool 2. Shortly after 21:30 the tape recall on pool 1 finished; the recall performance of the second pool goes up from then on, while the total recall performance over both disk pools stays at ~255 MB/s. No DB / network contention.)

Slide 13 – Test findings (7)
Performance degradation on recalls on new tape server hardware:
- Observed that new-generation tape servers (Dell 4-core) are capable of reading data from tape at a higher rate than rtcpd is capable of processing it. This eventually causes the attached drives to stall. It happens equally whether an IBM or an STK drive is attached
- The stalling problem does not happen on the older servers (Elonex 2-core, Clustervision), as there the drives read out at lower speeds
- Traced down (yesterday...) to too verbose logging by the tape positioning executable (posovl) when using the new syslog-based DLF

Slide 14 – "tape" bug fixes
"tape" = repack, VDQM, VMGR, rtcpclientd, rtcpd, taped, and the new components
Release plans:
- orReleasePlan (planned)
- orReleasePlan X
- orTapeReleasePlan219X