1
Storage Systems
CS 352 : Computer Organization and Design
University of Wisconsin-Eau Claire
Dan Ernst
2
Why Worry About Storage Systems?
Response time = CPU time + I/O time
Suppose 95% of the work is done in the CPU and 5% is I/O.
–If the CPU is improved by a factor of 100, what is the speedup?
–Sp = 1 / ((1 - 0.95) + (0.95 / 100)) = 16.8
–Only 16.8% of the ideal 100x speedup is achieved; 83.2% of the improvement is squandered due to slow I/O
Future performance gains must consider faster I/O.
Amdahl's Law: all parts of a system need to be improved somewhat equally.
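The speedup calculation above can be sketched directly from Amdahl's Law:

```python
def amdahl_speedup(fraction_enhanced, factor):
    """Overall speedup when `fraction_enhanced` of the work
    is sped up by `factor` (Amdahl's Law)."""
    return 1 / ((1 - fraction_enhanced) + fraction_enhanced / factor)

# 95% of the work (CPU) sped up 100x: the 5% of I/O limits the gain.
print(round(amdahl_speedup(0.95, 100), 1))   # 16.8
```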
3
Magnetic Disk
Physically consists of:
–Platters
–Tracks
–Sectors
–Heads
–Cylinders
Performance characteristics are:
–Rotational latency (delay) – time for the required sector to rotate under the head (depends on RPM)
–Seek time – time to move the head to the required track
–Transfer rate – rate at which data moves on or off the disk
4
Magnetic Disk Layout
[Figure: a drive's stacked platters, with the tracks and sectors on one platter surface. © 1998 Morgan Kaufmann Publishers, Inc.]
5
Zone Bit Recording
Originally, all tracks had the same number of sectors
–Each sector = 512 bytes
–Inefficient! Capacity is limited by the bit density of the innermost tracks
Outer tracks can hold more sectors (store more data) than inner tracks
–This is called Zone Bit Recording (ZBR)
The sequence of information recorded on each sector is:
–sector number
–gap
–data, including error correction code bits
–gap
6
Disk Performance
What is the average time to read / write a 512-byte sector, given:
–Avg seek time = 9 ms
–Rotation speed = 7200 RPM
–Transfer rate = 150 MB/sec (e.g. Serial ATA)
Avg access time = seek time + rotational delay + transfer time
= 9 ms + (1/2 rev) / 7200 RPM + 512 bytes / (150 MB/sec)
= 0.009 + (0.5 * 60) / 7200 + 512 / (150 * 2^20)
= 0.009 + 0.004167 + 0.000003
= 0.01317 sec = 13.17 ms
Suppose the rotation rate were 5400 RPM?
Avg access time = 0.009 + (0.5 * 60) / 5400 + 0.000003
= 9 ms + 5.556 ms + 0.003 ms = 14.56 ms
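The access-time model above is easy to parameterize and re-run for different drives:

```python
def avg_access_time_ms(seek_ms, rpm, sector_bytes, transfer_mb_s):
    """Average disk access time: seek + half a rotation + transfer."""
    rotational_ms = 0.5 * 60_000 / rpm                    # half a revolution
    transfer_ms = sector_bytes / (transfer_mb_s * 2**20) * 1000
    return seek_ms + rotational_ms + transfer_ms

print(round(avg_access_time_ms(9, 7200, 512, 150), 2))    # 13.17
print(round(avg_access_time_ms(9, 5400, 512, 150), 2))    # 14.56
```

Note that the transfer term is negligible for a single sector; seek and rotation dominate.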
7
Disk Reliability vs Availability
In processing, our main concern is performance (and cost)
In I/O, our main concern is reliability (and cost)
–Reliability: is anything broken?
–Availability: is the system still available to the user, even if something is broken?
RAID technology is designed to provide increased availability from potentially unreliable devices
RAID – Redundant Array of Inexpensive Disks
–Patterson / Gibson / Katz – 1987
8
RAID
Provide a set of physical disks that appears to be a single logical drive
Distribute the data across the drives in the array
Allow levels of redundancy to permit recovery from failed disks
7 (basic) levels of RAID (RAID0 – RAID6)
RAID is NOT a backup system!
–It's made to maintain uptime through failures
9
RAID0
No redundancy (therefore not really RAID at all)
Allocate sectors (or larger units) from a single logical drive in stripes across multiple physical drives:
–Physical Drive 0: sectors 0, 4, 8, ...
–Physical Drive 1: sectors 1, 5, 9, ...
–Physical Drive 2: sectors 2, 6, 10, ...
–Physical Drive 3: sectors 3, 7, 11, ...
Consecutive sectors of the logical drive (0, 1, 2, 3, ...) form stripes across the array.
10
RAID0 Summary
Called striping the data
–Each stripe is 1 sector
Advantage:
–Access to large sections of contiguous data can be done in parallel over all disks
Disadvantage:
–No redundancy
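The striping layout above amounts to a simple round-robin address mapping (a toy model; real controllers use configurable stripe sizes):

```python
def raid0_map(logical_sector, num_drives):
    """Map a logical sector number to (physical drive, sector offset
    on that drive) under round-robin striping."""
    return logical_sector % num_drives, logical_sector // num_drives

# Logical sectors 0..11 over 4 drives, matching the layout above:
for lba in range(12):
    drive, offset = raid0_map(lba, 4)
    print(f"logical {lba:2d} -> drive {drive}, offset {offset}")
```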
11
RAID1 Mirrored Data
Make a complete mirror (i.e. duplicate) of all data:
–Data: Physical Drive 0: 0, 4, 8 / Drive 1: 1, 5, 9 / Drive 2: 2, 6, 10 / Drive 3: 3, 7, 11
–Mirrors: Physical Drive 4: 0, 4, 8 / Drive 5: 1, 5, 9 / Drive 6: 2, 6, 10 / Drive 7: 3, 7, 11
12
RAID1 Summary
Uses stripes (of sectors) across drives, just like RAID0
Replicates each disk with an exact copy
Advantages:
–Allows parallel access (just like RAID0)
–Failure of a drive does not impact availability, as long as its mirror survives (up to one drive per mirrored pair can fail)
Disadvantages:
–Writes must occur to both “arrays”
–Very expensive (2x the disks)
13
RAID2
Redundancy through Hamming codes
Store enough extra data to detect and correct errors, and so provide availability when a drive fails
Stripes are small:
–1 bit (!) per stripe originally, later bytes / words
All disk heads are synchronized – does not permit parallel access as in RAID0 and RAID1
Requires about log2(# data disks) extra check disks to implement
14
Hamming Codes
A technique for enabling error detection & correction (and therefore redundancy as well)
Example (simple even-parity check):
–Store data 1110 with an even parity bit of 1, so the total number of 1s is even: 1110 1
–Suppose an error changes the data to 1100; the stored word 1100 1 now has an odd number of 1s
–The parity check fails: the error is detected
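A single parity bit only detects an error; a full Hamming code can also locate it, and therefore correct it. A minimal Hamming(7,4) sketch illustrating the idea (the exact codes used in RAID2 implementations varied):

```python
def hamming74_encode(d1, d2, d3, d4):
    """Encode 4 data bits as the 7-bit Hamming codeword
    [p1, p2, d1, p3, d2, d3, d4] (even parity)."""
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(cw):
    """Return a corrected copy of a 7-bit codeword; the syndrome
    gives the 1-based position of a single-bit error (0 = no error)."""
    s1 = cw[0] ^ cw[2] ^ cw[4] ^ cw[6]   # checks positions 1,3,5,7
    s2 = cw[1] ^ cw[2] ^ cw[5] ^ cw[6]   # checks positions 2,3,6,7
    s3 = cw[3] ^ cw[4] ^ cw[5] ^ cw[6]   # checks positions 4,5,6,7
    pos = s1 + 2 * s2 + 4 * s3
    fixed = list(cw)
    if pos:
        fixed[pos - 1] ^= 1              # flip the erroneous bit
    return fixed

cw = hamming74_encode(1, 1, 1, 0)
bad = list(cw)
bad[4] ^= 1                              # corrupt one bit
assert hamming74_correct(bad) == cw      # error located and repaired
```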
15
RAID2 Disk Layout
Stripes of bytes or words:
–Data: Physical Drives 0–3 (e.g. 1xxx xxxx / 1xxx xxxx / 1xxx xxxx / 0xxx xxxx)
–Parity bits: Physical Drives 4–6 (e.g. 1xxx xxxx / 0xxx xxxx / 0xxx xxxx)
16
RAID2 Summary
Error correction is done across disks
Advantage:
–Useful in a high-failure environment
Disadvantages:
–Expensive
–Modern disks do not exhibit high failure rates
Not used in practice
17
RAID3 Bit-Interleaved Parity
Like RAID2, but uses only 1 parity drive:
–Data: Physical Drives 0–3 (stripes of bytes or words)
–Parity drive: Physical Drive 4
18
RAID3 Summary
Small stripes (byte or word) interleaved across disks
Parity = XOR of the corresponding bits on each data disk
Advantages:
–All disks participate in all data accesses, giving a high transfer rate
–If a single drive fails, each lost bit (stripe) can be reconstructed easily from the parity
Disadvantages:
–Only 1 access can be serviced at a time
–Can detect / correct only single failures
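The XOR reconstruction can be sketched directly; stripe units are shown here as small integers, but any fixed-size unit behaves the same way:

```python
from functools import reduce

def parity(units):
    """XOR parity of the corresponding stripe units on each data disk."""
    return reduce(lambda a, b: a ^ b, units)

data = [0b1011, 0b0110, 0b1110, 0b0001]   # one stripe unit per data disk
p = parity(data)                           # stored on the parity drive

# Disk 2 fails: XOR-ing the survivors with the parity rebuilds its data.
survivors = data[:2] + data[3:]
rebuilt = parity(survivors + [p])
assert rebuilt == data[2]
```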
19
RAID4 Block-Level Parity
Same parity scheme as RAID3, but uses large blocks per stripe:
–Physical Drive 0: Block0, Block4, Block8
–Physical Drive 1: Block1, Block5, Block9
–Physical Drive 2: Block2, Block6, Block10
–Physical Drive 3: Block3, Block7, Block11
–Parity drive (Physical Drive 4): P(0-3), P(4-7), P(8-11)
20
RAID4 Summary
Interleaved blocks across disks
Advantages:
–Allows independent access due to large stripes, so it can support multiple independent reads
–Error detection / correction supported for single failures
Disadvantage:
–A penalty is incurred on parallel writes, since all writes require access to the same parity disk to update the parity
Not used in practice
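The write penalty comes from the read-modify-write parity update: every small write, on any data disk, must also touch the one parity disk. A sketch of the update:

```python
def small_write_parity(old_parity, old_data, new_data):
    """New parity after rewriting one block:
    XOR out the old data, XOR in the new."""
    return old_parity ^ old_data ^ new_data

# A stripe of 4 data blocks plus its parity:
blocks = [0b1010, 0b0101, 0b1111, 0b0000]
p = blocks[0] ^ blocks[1] ^ blocks[2] ^ blocks[3]

# Rewrite block 1 without reading the rest of the stripe:
new_block1 = 0b0011
p = small_write_parity(p, blocks[1], new_block1)
blocks[1] = new_block1
assert p == blocks[0] ^ blocks[1] ^ blocks[2] ^ blocks[3]
```

Only two disks are read and two written per small write, but one of them is always the parity disk, which serializes concurrent writes in RAID4 (and motivates RAID5's distributed parity).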
21
RAID5 Block-Level Distributed Parity
Same scheme as RAID4, but the parity is interleaved with the data:
–Physical Drive 0: Block0, Block4, Block8
–Physical Drive 1: Block1, Block5, Block9
–Physical Drive 2: Block2, Block6, P(8-11)
–Physical Drive 3: Block3, P(4-7), Block10
–Physical Drive 4: P(0-3), Block7, Block11
22
RAID5 Summary
Interleaved blocks across disks, with parity interleaved with the data
Advantages:
–Can support multiple independent reads, as long as the accesses are to independent disks
–Can support multiple independent writes, as long as the data and parity disks involved are independent
–Error detection / correction supported for single failures
–Reduces the parity-disk I/O bottleneck
Most common approach used in practice
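The rotating parity placement (parity on drive 4 for stripe 0, drive 3 for stripe 1, drive 2 for stripe 2, ...) can be sketched as a simple mapping. This is a toy model of one rotation scheme; real controllers use several variants:

```python
def raid5_parity_drive(stripe, num_drives):
    """Drive holding the parity for a stripe, rotating
    right-to-left across the array."""
    return (num_drives - 1 - stripe) % num_drives

def raid5_block_drive(block, num_drives):
    """Drive holding a data block, skipping over the stripe's
    parity drive."""
    stripe, i = divmod(block, num_drives - 1)
    p = raid5_parity_drive(stripe, num_drives)
    return i if i < p else i + 1

# Matches the 5-drive layout above, e.g. Block7 lands on drive 4:
print(raid5_block_drive(7, 5))    # 4
print(raid5_block_drive(10, 5))   # 3
```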
23
RAID6 Summary
AKA “P + Q Redundancy”
Same as RAID5 (interleaved blocks and interleaved parity)
–But has a second, independent set of parity bits
Advantages:
–Same as RAID5
–Plus the ability to tolerate a second failure (whether disk or operator)
Disadvantage:
–Takes one more drive's worth of space (more expensive)
In case you want to be extra careful
24
RAID Systems – Failure and Recovery
RAID1 and RAID5 are the most common in practice
–RAID1: read performance / redundancy
–RAID5: cost / redundancy (performance not bad either)
If a RAID system loses a disk, how do you fix it?
–When a disk fails, we've entered a “danger zone” where one more error causes data loss; repairs need to be made quickly to minimize this time
–Hot-swappable drives are doable, but expensive / difficult
–Some systems use standby spares, which are automatically used as a replacement when an error occurs
Once a new disk is provided, the RAID system reconstructs the data from the redundancy or parity information
–All done during uptime
25
What RAID Can and Cannot Do
RAID can protect uptime
–Downtime = $$$
RAID can improve performance on certain applications
–Any time you're accessing / moving around large files, striping gives you lots of access bandwidth
RAID cannot protect the data on the array!
–One file system = a single point of failure
–Corruption, viruses, fire, flood, etc. – not an alternative to a separate backup!
RAID cannot improve performance on all applications