
Increasing Tape Efficiency
Original slides from HEPiX Fall 2008, Taipei; RAL f2f meeting, 19/2/09
Nicola Bessone, German Cancio, Steven Murray, Giulia Taurelli
CERN IT Department, CH-1211 Genève 23, Switzerland (www.cern.ch/it)

Contents (Slide 2)
- Tape efficiency project
- Problem areas
- What has been done
- What is under development
- Roadmap and current status

Tape Efficiency Project (Slide 3)
Covers all functionality dealing directly with storage on, and management of, tapes:
- Volume database
- Migrations/recalls
- Tape drive scheduling
- Low-level tape positioning and read/write
The team is from IT/DM, with contributions from IT/FIO.

Problem Areas (Slide 4)
- Write more data per tape mount (work done)
- Use a more efficient tape format; the current tape format does not deal efficiently with small files (current work)
- Improve read efficiency; this requires modifications on the tape->disk path (more ideas)

What Has Been Done (Slide 5)

Read/Write More Per Mount (Slide 6)
Recall/migration policies:
- "Freight train" approach
- Hold back requests based on the amount of data and the elapsed time
Production managers rule:
- Production managers plan relatively large workloads for CASTOR
- Access control lists give production managers a relatively larger percentage of resources
- User- and group-based priorities encourage users to work with their production managers
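The hold-back policy above can be sketched as a small decision rule: queue recall requests per tape and only trigger a mount once either the queued data volume or the oldest request's waiting time crosses a threshold. This is a minimal illustration; the class name and the threshold defaults are invented here, not CASTOR's actual configuration.

```python
import time

class RecallPolicy:
    """Hold back tape recall requests until enough work has accumulated
    (the "freight train" approach) or requests have waited too long.
    Thresholds are hypothetical defaults for the sketch."""

    def __init__(self, min_bytes=200 * 1024**3, max_wait_seconds=4 * 3600):
        self.min_bytes = min_bytes
        self.max_wait_seconds = max_wait_seconds
        self.pending = []  # list of (request_bytes, arrival_time)

    def add_request(self, nbytes, now=None):
        self.pending.append((nbytes, now if now is not None else time.time()))

    def should_mount(self, now=None):
        """Mount only when the queued volume or the oldest request's
        waiting time crosses its threshold."""
        if not self.pending:
            return False
        now = now if now is not None else time.time()
        queued = sum(nbytes for nbytes, _ in self.pending)
        oldest_wait = now - min(t for _, t in self.pending)
        return queued >= self.min_bytes or oldest_wait >= self.max_wait_seconds
```

The elapsed-time limit bounds the latency a single small request can suffer, while the volume limit ensures a mount always has a full "train" of data behind it.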

Repack: Measurements (Slide 7)
- 4 drives reading, 7 drives writing, 400 MB/s
- Writing the current ANSI AUL format is approximately twice as slow as reading it
- Repack uses the cache to support this asymmetric read/write drive allocation
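The 4-reader/7-writer split follows from the read/write asymmetry: the disk cache decouples the two sides, so the writers together only need to drain what the readers produce. A sketch of the balance, assuming illustrative per-drive rates (100 MB/s read, 60 MB/s write; these are assumed figures, not measured CASTOR numbers):

```python
import math

def write_drives_needed(read_drives, read_rate_mb_s, write_rate_mb_s):
    """Writers needed to drain the cache as fast as the readers fill it,
    given per-drive rates; a back-of-envelope sketch."""
    aggregate_read = read_drives * read_rate_mb_s
    return math.ceil(aggregate_read / write_rate_mb_s)
```

With 4 readers at 100 MB/s (the slide's 400 MB/s aggregate) and writes at roughly 60 MB/s, this yields 7 write drives, matching the measurement above.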

What Is Under Development (Slide 8)

Writing Small Files is Slow (Slide 9)
- Users were encouraged to store large files in CASTOR
- Unfortunately, CASTOR contains many small files

Why Small Files Are Slow (Slide 10)
The ANSI AUL format:
- 3 tape marks per file
- 2 to 3 seconds per tape mark
- roughly 9 seconds of overhead per data file, independent of its size
On-tape layout of 1 data file: hdr1 hdr2 uh1 [tm] | data file | [tm] eof1 eof2 utl1 [tm] (header labels, data, and trailer labels, separated by tape marks)
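The ~9 s fixed cost dominates for small files. A quick model of the effective write rate, assuming a nominal 100 MB/s drive (the drive rate is an assumption for illustration; the tape-mark figures come from the slide):

```python
def effective_rate_mb_s(file_size_mb, drive_rate_mb_s=100.0,
                        tape_marks_per_file=3, seconds_per_mark=3.0):
    """Effective per-file write rate once the fixed tape-mark overhead
    (3 marks at 2-3 s each in the AUL format) is included."""
    transfer_s = file_size_mb / drive_rate_mb_s
    overhead_s = tape_marks_per_file * seconds_per_mark
    return file_size_mb / (transfer_s + overhead_s)
```

Under these assumptions a 10 MB file is written at about 1.1 MB/s (the drive spends almost all its time on tape marks), while a 10 GB file achieves about 92 MB/s, close to the nominal drive rate.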

New Tape Format (Slide 11)
- Multi-file block format within the ANSI AUL format
- A header per block for "self-description"
- 3 tape marks per n files, where n takes into account:
   a configurable maximum number of files
   a configurable maximum size
   a configurable maximum amount of time to wait
On-tape layout of n data files: hdr1 hdr2 uh1 [tm] | data file 1 ... data file n | [tm] eof1 eof2 utl1 [tm]
Each 256 KB data file block written to tape includes a 1 KB header.
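The gain can be made concrete by counting tape marks in the two layouts. A sketch that builds the record sequences using the label names from the slide's diagrams:

```python
def aul_layout(n_files, aggregated=True):
    """Record sequence on tape. Classic AUL wraps every data file in its
    own header/trailer labels (3 tape marks per file); the new format keeps
    the same AUL envelope but packs n data files inside it (3 tape marks
    per n files)."""
    header = ["hdr1", "hdr2", "uh1", "TM"]
    trailer = ["TM", "eof1", "eof2", "utl1", "TM"]
    if aggregated:
        return header + [f"data file {i}" for i in range(1, n_files + 1)] + trailer
    layout = []
    for i in range(1, n_files + 1):
        layout += header + [f"data file {i}"] + trailer
    return layout

def tape_marks(layout):
    """Count the tape marks in a layout."""
    return layout.count("TM")
```

For 1000 small files, the classic layout costs 3000 tape marks (about 2.5 hours at 3 s each) versus 3 tape marks for one aggregation.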

Block Header Format (Slide 12)

 #  Meta-data name             Explanation                                                        Example               Bytes
 1  VERSION_NUMBER             The version of the block format
 2  HEADER_SIZE                Header size in bytes
 3  CHECKSUM_ALGORITHM         Name of the checksum algorithm                                     Adler
 4  HEADER_CHECKSUM            Adler-32 checksum
 5  TAPE_MARK_COUNT            Sequential number addressing the migration files on the tape
 6  BLOCK_SIZE                 Block size in bytes, inclusive of the header
 7  BLOCK_COUNT                Block offset from the beginning of the tape; tape marks and labels are included in the count
 8  BLOCK_TIME_STAMP           Time since the Epoch (00:00:00 UTC, January 1, 1970), in seconds
 9  STAGER_VERSION             The version of the stager software
10  STAGER_HOST                The DNS name of the stager host, including the domain              c2cms2stager.cern.ch  30
11  DRIVE_NAME                 Provided by a local configuration file
12  DRIVE_SERIAL               Provided by a local configuration file
13  DRIVE_FIRMWARE             Provided by a local configuration file                             D3I0_C
14  DRIVE_HOST                 The DNS name of the host, including the domain                     tpsrv250.cern.ch      30
15  VOL_DENSITY                The storage capacity of the tape                                   700.00GB              10
16  VOL_ID                     Site-specific numbering system (the sticker on a tape)             T
17  VOL_SERIAL                 Volume serial number                                               T
18  DEVICE_GROUP_NAME          The device group name that linked the tape to the drive            3592B1                10
19  FILE_SIZE                  The size of the data file in bytes
20  FILE_CHECKSUM              Adler-32 checksum
21  FILE_NS_HOST               The DNS name of the host, including the domain                     castorns.cern.ch      30
22  FILE_NS_ID                 The name server ID of the data file
23  FILE_PROGRESSIVE_CHECKSUM  Adler-32; progressive checksum of all the blocks written to tape so far for the current data file
24  FILE_BLOCK_COUNT           Block offset from the beginning of the data file
    (Header size before FILE_NAME: 375 bytes)
25  FILE_NAME                  Last "x" bytes of the filename from the name server; this field acts as padding to the nearest KiB    649
    (Total header size: 1024 bytes)
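The fixed 1 KiB header with FILE_NAME as trailing padding can be sketched as a serialiser. This is a simplified illustration, not the real CASTOR layout: only a handful of the 25 fields appear, the field widths other than the 30-byte host fields are invented, and the example path is hypothetical. The progressive Adler-32, however, is the standard running-checksum usage of `zlib.adler32`.

```python
import zlib

# Hypothetical subset of the fields and their widths (assumptions).
FIELDS = [
    ("VERSION_NUMBER", 5),
    ("HEADER_SIZE", 5),
    ("CHECKSUM_ALGORITHM", 10),
    ("HEADER_CHECKSUM", 10),
    ("STAGER_HOST", 30),
    ("FILE_SIZE", 15),
]
HEADER_SIZE = 1024

def pack_header(values):
    """Serialise field values into one fixed 1 KiB block header; FILE_NAME
    keeps only its last bytes and pads the header to the nearest KiB."""
    fixed = b"".join(
        str(values.get(name, "")).encode("ascii").ljust(width)
        for name, width in FIELDS)
    pad = HEADER_SIZE - len(fixed)
    file_name = values.get("FILE_NAME", "").encode("ascii")
    return fixed + file_name[-pad:].ljust(pad)

def progressive_checksum(blocks):
    """FILE_PROGRESSIVE_CHECKSUM-style running Adler-32 over the blocks
    written so far for the current data file."""
    running = zlib.adler32(b"")
    for block in blocks:
        running = zlib.adler32(block, running)
    return running
```

Because Adler-32 accepts the previous value as a starting point, the running checksum over successive 256 KB data blocks equals the checksum of their concatenation, which is what lets a reader verify a file incrementally.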

Predicted Tape Format Performance (Slide 13)
- Repacking the archive would take less than 1 year with the new tape format, but greater than 4 years with the AUL tape format
- Hardware manufacturers increasing tape density imposes a 2-year repack cycle
- The future of repack may be to run continuously, to verify checksums and ensure data integrity
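A back-of-envelope model shows why amortising the tape-mark overhead moves repack from years toward months. All input figures in the example (archive size, file count, per-drive rate, fleet size) are assumptions chosen for illustration, not the numbers behind the slide's estimates:

```python
def repack_years(total_bytes, n_files, drive_rate_mb_s, n_drives,
                 seconds_overhead_per_file):
    """Repack duration in years: pure transfer time plus per-file
    tape-mark overhead, spread over the drive fleet."""
    transfer_s = total_bytes / (drive_rate_mb_s * 1e6)
    overhead_s = n_files * seconds_overhead_per_file
    return (transfer_s + overhead_s) / n_drives / (365 * 24 * 3600)
```

With, say, 15 PB in 150 million files on 20 drives at 100 MB/s, 9 s of overhead per file (AUL) gives a multi-year repack, while 9 s per 100-file aggregation (the new format) brings it down to a few months; the transfer time itself is unchanged.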

The Architecture Needs to Change (Slide 14)
- The new tape format is only half of the story
- An aggregator needs to be inserted into the disk ↔ tape data streams
- Anything old that is replaced is an opportunity for code re-use and increased maintainability via the CASTOR framework

Current Architecture (Slide 15): 1 data file = 1 tape file
1. The stager requests a drive from the drive scheduler
2. A drive is allocated
3. Data is transferred to/from disk/tape, based on the file list given by the stager
Components: stager, drive scheduler, tape servers, disk servers (the diagram distinguishes data flows from control messages)

New Architecture (Slide 16): n data files = 1 tape file
- The tape gateway will replace RTCPClientD; the tape gateway will be stateless
- The tape aggregator will wrap RTCPD
Components: stager, drive scheduler, tape gateway, tape aggregator, tape server, disk servers (the diagram distinguishes data to be stored from control messages)
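The "n data files = 1 tape file" mapping is the gateway's core bookkeeping: given the stager's file list, derive which data files share a tape file. A deliberately minimal sketch that groups only by a maximum file count (the real configurables also include size and wait-time limits, and the function name is invented here):

```python
def plan_tape_files(data_files, max_files_per_tape_file=4):
    """Group the stager's ordered file list into aggregations, each of
    which becomes one tape file. Statelessness is possible because the
    grouping is derived from the input list alone."""
    for i in range(0, len(data_files), max_files_per_tape_file):
        yield data_files[i:i + max_files_per_tape_file]
```

Because the plan is a pure function of the request list, a restarted gateway can recompute it without recovering any session state, which is the point of making the gateway stateless.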

Roadmap (Slide 17)

Date               Actions
Beginning Q4 2008  Put repack into full production with at least 20 drives; expecting 700 MB/s. Conclude the new tape format architecture.
End Q1 2009        Release the first functional prototype of the new tape format.
End Q2 2009        Write the new tape format with repack only. Read the new tape format everywhere.
End Q3 2009        Read and write the new format everywhere.
Beginning Q1 2010  Replace RTCPD with the tape aggregator.

Current status (Feb 2009):
- New tape format specified
- Classes for reading/writing the new format coded
- Communication between rtcpclientd/rtcpd/VDQM reverse-engineered and understood
- Tape aggregator and gateway prototypes underway; the first goal is to fully support non-aggregated AUL reads/writes, with aggregation to come later