
1  Lecture 18: Mass Storage Devices: RAID, Data Library, MSS

2  Review: Improving Bandwidth of Secondary Storage
Processor performance growth has been phenomenal; what about I/O?
"I/O certainly has been lagging in the last decade" - Seymour Cray, Public Lecture (1976)
"Also, I/O needs a lot of work" - David Kuck, Keynote Address (1988)

3  Network Attached Storage
High performance storage service on a high speed network
Decreasing disk diameters: 14" » 10" » 8" » 5.25" » 3.5" » 2.5" » 1.8" » 1.3" » ...
  high bandwidth disk systems based on arrays of disks
Increasing network bandwidth: 3 Mb/s » 10 Mb/s » 50 Mb/s » 100 Mb/s » 1 Gb/s » 10 Gb/s
  networks capable of sustaining high bandwidth transfers
Network provides well-defined physical and logical interfaces: separate CPU and storage system!
Network file services: OS structures supporting remote file access

4  RAID

5  Manufacturing Advantages of Disk Arrays
Disk product families:
  Conventional: 4 disk designs (14", 10", 5.25", 3.5"), spanning low end to high end
  Disk array: 1 disk design (3.5")

6  Replace Small Number of Large Disks with Large Number of Small Disks

                  IBM 3390 (K)   IBM 3.5" 0061   x70
  Data Capacity   20 GBytes      320 MBytes      23 GBytes
  Volume          97 cu. ft.     0.1 cu. ft.     11 cu. ft.
  Power           3 KW           11 W            1 KW
  Data Rate       15 MB/s        1.5 MB/s        120 MB/s
  I/O Rate        600 I/Os/s     55 I/Os/s       3900 I/Os/s
  MTTF            250 KHrs       50 KHrs         ??? Hrs
  Cost            $250K          $2K             $150K

Disk arrays have potential for large data and I/O rates, high MB per cu. ft., and high MB per KW, but awful reliability. (A quick scaling check follows.)
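The "x70" column is essentially the small-disk figures scaled by 70 drives; the slide's numbers differ slightly (rounding, packaging, controllers). A minimal sketch of that scaling, using the per-drive values above:

```python
# Hypothetical check of the "x70" column: scale one IBM 0061 drive by 70.
# The slide's figures differ slightly (packaging, controllers, rounding).
drive = {"capacity_MB": 320, "power_W": 11, "data_rate_MBps": 1.5, "io_per_s": 55}
n = 70
array = {k: v * n for k, v in drive.items()}
print(array)   # ~22.4 GB, ~770 W, ~105 MB/s, ~3850 I/Os/s
```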

7  Redundant Arrays of Disks
Files are "striped" across multiple spindles to gain throughput, but increasing the number of disks reduces reliability.
Redundancy yields high data availability: disks will fail, and contents are reconstructed from data redundantly stored in the array
  - capacity penalty to store the redundant data
  - bandwidth penalty to update it
Techniques:
  Mirroring/Shadowing (high capacity cost)
  Horizontal Hamming codes (overkill)
  Parity and Reed-Solomon codes
  Failure prediction (no capacity overhead!) - VaxSimPlus; technique is controversial

8  Array Reliability
Reliability of N disks = reliability of 1 disk / N
  50,000 hours / 70 disks = ~700 hours
Disk system MTTF drops from 6 years to 1 month!
Arrays without redundancy are too unreliable to be useful.
Hot spares support reconstruction in parallel with access: very high media availability can be achieved. (A sketch of the MTTF arithmetic follows.)
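A minimal sketch of the MTTF arithmetic on this slide, assuming independent failures with exponentially distributed lifetimes, so the non-redundant array fails as soon as any one disk fails:

```python
# Sketch: MTTF of an N-disk array with independent, exponential failures.
# Any single disk failure takes the (non-redundant) array down.
def array_mttf_hours(single_disk_mttf_hours: float, n_disks: int) -> float:
    return single_disk_mttf_hours / n_disks

mttf = array_mttf_hours(50_000, 70)
print(f"{mttf:.0f} hours (~{mttf / (24 * 30):.1f} months)")  # ~714 hours, about 1 month
```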

9  Redundant Arrays of Disks (RAID)
Disk mirroring/shadowing:
  Each disk is fully duplicated onto its "shadow"
  Logical write = two physical writes
  100% capacity overhead
Parity data bandwidth array:
  Parity computed horizontally across the disks
  Used for recovery rather than fault detection
  Logically a single high data bandwidth disk
High I/O rate parity array:
  Interleaved parity blocks
  Independent reads and writes
  Logical write = 2 reads + 2 writes
  Parity + Reed-Solomon codes

10  Problems of Disk Arrays: Small Writes
RAID-5 small write algorithm: 1 logical write = 2 physical reads + 2 physical writes
  1. Read old data D0
  2. Read old parity P
  3. Write new data D0'
  4. Write new parity P' = P xor D0 xor D0'
(See the sketch below.)
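A minimal sketch of the RAID-5 small-write parity update described above, computed bytewise over equal-length blocks:

```python
# Sketch: RAID-5 small write. 1 logical write costs 2 reads (old data, old
# parity) plus 2 writes (new data, new parity). Parity is updated incrementally:
# P' = P xor D_old xor D_new.
def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> tuple[bytes, bytes]:
    new_parity = bytes(p ^ od ^ nd for p, od, nd in zip(old_parity, old_data, new_data))
    return new_data, new_parity   # both are then written back to their disks

old_d, old_p, new_d = b"\x12\x34", b"\xaa\xbb", b"\xff\x00"
print(raid5_small_write(old_d, old_p, new_d))
```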

11  Redundant Arrays of Disks - RAID 1: Disk Mirroring/Shadowing
Each disk is fully duplicated onto its "shadow" (a recovery group of two disks)
Very high availability can be achieved
Bandwidth sacrifice on write: logical write = two physical writes
Reads may be optimized
Most expensive solution: 100% capacity overhead
Targeted for high I/O rate, high availability environments

12  Redundant Arrays of Disks - RAID 3: Parity Disk
A logical record is striped into physical records across the data disks; a parity disk holds the parity computed across the recovery group to protect against hard disk failures
33% capacity cost for parity in this configuration; wider arrays reduce capacity costs but decrease expected availability and increase reconstruction time
Arms logically synchronized, spindles rotationally synchronized: logically a single high capacity, high transfer rate disk
Targeted for high bandwidth applications: scientific computing, image processing
(A parity sketch follows.)
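A minimal sketch of the group parity idea: the parity strip is the bytewise XOR of the data strips, and a lost strip can be rebuilt by XOR-ing the survivors with the parity.

```python
# Sketch: RAID-3 style parity over a recovery group of equal-length strips.
from functools import reduce

def compute_parity(strips: list[bytes]) -> bytes:
    # Bytewise XOR across all strips in the group.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))

def rebuild_lost_strip(surviving_strips: list[bytes], parity: bytes) -> bytes:
    # XOR of the survivors with the parity reproduces the missing strip.
    return compute_parity(surviving_strips + [parity])

data = [b"\x10\x01", b"\x02\x20", b"\x04\x40"]
parity = compute_parity(data)
assert rebuild_lost_strip([data[0], data[2]], parity) == data[1]
```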

13  Redundant Arrays of Disks - RAID 5+: High I/O Rate Parity
Independent accesses occur in parallel
A logical write is 4 physical I/Os: 2 reads and 2 writes
Independent writes (1 data and 1 parity) are possible because parity is interleaved in stripe units across the disk columns rather than kept on a dedicated parity disk
Reed-Solomon codes ("Q") for protection during reconstruction
Targeted for mixed applications
(A sketch of the rotated-parity placement follows.)
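A minimal sketch of one possible rotated-parity placement (an assumed left-symmetric layout; the slide's diagram may rotate parity differently):

```python
# Sketch: map (stripe, strip) to a disk in an assumed left-symmetric RAID-5
# layout. Parity rotates across the disks so no single disk becomes a hot spot.
def raid5_placement(stripe: int, strip: int, n_disks: int) -> tuple[int, int]:
    parity_disk = (n_disks - 1 - stripe) % n_disks
    data_disk = (parity_disk + 1 + strip) % n_disks
    return data_disk, parity_disk

for stripe in range(4):                       # 5 disks, strips 0..3 per stripe
    print(stripe, [raid5_placement(stripe, s, 5) for s in range(4)])
```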

14  Subsystem Organization
host -> host adapter -> array controller (control, buffering, parity logic; manages interface to host, DMA) -> single-board disk controllers -> disks
Physical device control is often piggy-backed in small formfactor devices
Striping software is off-loaded from the host to the array controller
  No application modifications
  No reduction of host performance

15  System Availability: Orthogonal RAIDs
Redundant support components: fans, power supplies, controller, cables
End-to-end data integrity: internal parity protected data paths
Data recovery group: unit of data redundancy

16  System-Level Availability
Goal: no single points of failure - a fully dual redundant path from the host through I/O controllers and array controllers to each recovery group
With duplicated paths, higher performance can be obtained when there are no failures

17  Magnetic Tapes

18  Memory Hierarchies
General purpose computing environment memory hierarchy: file cache -> hard disk -> tapes, with access time and capacity increasing, and cost per bit decreasing, down the hierarchy

19  Memory Hierarchies
Memory hierarchy, 1980: file cache -> hard disk (on-line) -> tapes (off-line storage)
Memory hierarchy, 1995: file cache -> SSD -> high I/O rate disks -> high data rate disks / disk arrays (on-line, low $/actuator) -> optical jukebox / automated tape libraries (near-line, low $/MB) -> remote archive (off-line)

20  Storage Trends: Distributed Storage
Storage hierarchy, 1980 (centralized): file cache -> magnetic disk -> magnetic tape, with $/MByte declining and access time and capacity increasing down the hierarchy
Storage hierarchy, 1990 (distributed): client workstation (file cache, local magnetic disk) -> local area network -> file server (server cache, remote magnetic disk, magnetic tape)

21  Storage Trends: Wide-Area Storage
Typical storage hierarchy, 1995:
  Conventional disks replaced by disk arrays
  Near-line storage emerges between disk and tape
  Client cache -> local area network -> server cache -> disk array (on-line storage) -> optical disk jukebox / magnetic or optical tape library (near-line storage) -> shelved magnetic or optical tape (off-line storage), with a wide area network / Internet connecting sites

22  What's All This About Tape?
Tape is used for:
  Backup storage for hard disk data - written once, very infrequently (hopefully never!) read
  Software distribution - written once, read once
  Data interchange - written once, read once
  File retrieval - written/rewritten, files occasionally read
    Near-line archive
    Electronic image management - a relatively new application for tape

23  Alternative Data Storage Technologies

  Technology            Cap (MB)   BPI     TPI     BPI*TPI (Million)   Data Xfer (KByte/s)   Access Time
  Conventional tape:
    Reel-to-Reel (.5")    140       6250     18       0.11               549                  minutes
    Cartridge (.25")      150      12000    104       1.25                92                  minutes
  Helical scan tape:
    VHS (.5")            2500      17435    650      11.33               120                  minutes
    Video (8mm)*         2300      43200    819      35.28               246                  minutes
    DAT (4mm)**          1300      61000   1870     114.07               183                  20 seconds
  Disk:
    Hard disk (5.25")     760      30552   1667      50.94              1373                  20 ms
    Floppy disk (3.5")      2      17434    135       2.35                92                  1 second
    CD ROM (3.5")         540      27600  15875     438.15               183                  1 second

* Second generation 8mm: 5000 MB, 500 KB/s
** Second generation 4mm: 10000 GB

24  R-DAT Technology
Two competing standards:
  DDS (HP, Sony)
    22 frames/group
    1870 tpi
    Optimized for serial writes
  DataDAT (Hitachi, Matsushita, Sharp)
    Two modes: streaming (like DDS) and update-in-place
    Update-in-place sacrifices transfer rate and capacity: spare data groups, inter-group gaps, preformatted tapes

25  R-DAT Technology
Advantages:
  Small formfactor, easy handling/loading
  200X speed search on index fields (40 sec. max, 20 sec. avg.)
  1000X physical positioning (8 sec. max, 4 sec. avg.)
  Inexpensive media ($10/GByte)
  Volumetric efficiency: 1 GB in 2.5 cu. in.; 1 TB in 1 cu. ft.
Disadvantages:
  Two incompatible standards (DDS, DataDAT)
  Slow transfer rate
  Lower capacity vs. 8mm tape
  Small bit size (13 x 0.4 sq. micron) affects archive stability

26  R-DAT Technical Challenges
Tape capacity: data compression is key
Tape bandwidth: data compression and striped tape

27  MSS Tape: No Perfect Tape Drive
A tape drive gives you at best 2 out of 3 among cost, size, and speed: e.g., expensive drives (fast & big) versus cheap drives (slow & big).

28  Data Compression Issues
Peripheral manufacturer approach: compression done in the embedded controller of the transport, behind the SCSI HBA (roughly 2-3:1)
System approach: data-specific compression in the host (video, audio, image, text, ...), with hints from the host passed to the embedded controller (up to 20:1)

29  Striped Tape
Data is striped across multiple transports, each with an embedded controller and speed-matching buffers, to/from the host (roughly 180 KB/s per transport)
Challenges:
  Difficult to logically synchronize tape drives
  Unpredictable write times: read-after-write verify, error correction schemes, N-group writing, etc.

30  Automated Media Handling
Tape carousels with gravity feed: a 3.5" formfactor tape reader; a 19" carousel feeding a 4mm tape reader

31  Automated Media Handling
Front and side views: tape readers and tape cassettes; the tape pack is the unit of archive

32  MSS: Automated Tape Library
116 x 5 GB 8mm tapes = 0.6 TBytes (1991)
4 tape readers in 1991; 8 half-height readers now
4 x 0.5 MByte/s = 2 MBytes/s
$40,000 OEM price
Prediction - 1995: 3 TBytes; 2000: 9 TBytes

33  Open Research Issues
Hardware/software attack on very large storage systems
  File system extensions to handle terabyte-sized file systems
  Storage controllers able to meet bandwidth and capacity demands
Compression/decompression between secondary and tertiary storage
  Hardware assist for on-the-fly compression
  Application hints for data-specific compression
  More effective compression over large buffered data
  DB indices over compressed data
Striped tape: is a large buffer enough?
Applications: where are the terabytes going to come from?
  Image storage systems
  Personal communications network multimedia file server

34  MSS: Applications of Technology - Robo-Line Library
Books/Bancroft x pages/book x bytes/page: 372,910 x 400 x 4000 = 0.54 TB
Full-text Bancroft near-line = ~0.5 TB; page images = ~20 TB
Predict: "RLB" (Robo-Line Bancroft) = $250,000
Bancroft costs: catalogue a book: $20/book; reshelve a book: $1/book; % of new books purchased per year never checked out: 20%
(A quick check of the sizing arithmetic follows.)
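A quick check of the full-text sizing arithmetic on this slide, using the book, page, and byte counts it gives:

```python
# Sketch: full-text size of the Bancroft collection from the slide's figures.
books, pages_per_book, bytes_per_page = 372_910, 400, 4000
total_bytes = books * pages_per_book * bytes_per_page
print(f"{total_bytes / 2**40:.2f} TB")   # ~0.54 TB, matching the slide
```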

35  MSS: Summary
[Chart: access time in ms (0.0001 to 100,000) vs. $/MB ($0.00 to $100.00) for DRAM, magnetic disk, and Robo-Line tape, showing Access Gap #1 and Access Gap #2.]

