CS 61C: Great Ideas in Computer Architecture. Dependability: Parity, RAID, ECC. Instructor: Justin Hsia. Summer 2013, Lecture #27, 8/08/2013.

Presentation transcript:

Review of Last Lecture
MapReduce Data Level Parallelism
  – Framework to divide up data to be processed in parallel
  – Mapper outputs intermediate (key, value) pairs
  – Optional Combiner in between for better load balancing
  – Reducer “combines” intermediate values with same key
  – Handles worker failure and does redundant execution

Agenda: Dependability, Administrivia, RAID, Error Correcting Codes

Six Great Ideas in Computer Architecture
1. Layers of Representation/Interpretation
2. Technology Trends
3. Principle of Locality/Memory Hierarchy
4. Parallelism
5. Performance Measurement & Improvement
6. Dependability via Redundancy

Great Idea #6: Dependability via Redundancy
Redundancy so that a failing piece doesn’t make the whole system fail
[Figure: three redundant units each compute 1+1. Two answer 2 and one answers 1 (FAIL!), so 2 of 3 agree.]

Great Idea #6: Dependability via Redundancy
Applies to everything from datacenters to memory
  – Redundant datacenters so that can lose 1 datacenter but Internet service stays online
  – Redundant routes so can lose nodes but Internet doesn’t fail
  – Redundant disks so that can lose 1 disk but not lose data (Redundant Arrays of Independent Disks/RAID)
  – Redundant memory bits so that can lose 1 bit but no data (Error Correcting Code/ECC Memory)

Dependability
Fault: failure of a component
  – May or may not lead to system failure
  – Applies to any part of the system
[Figure: two service states. Service accomplishment (service delivered as specified) moves to Service interruption (deviation from specified service) on Failure, and back on Restoration.]

Dependability Measures
Reliability: Mean Time To Failure (MTTF)
Service interruption: Mean Time To Repair (MTTR)
Mean Time Between Failures (MTBF)
  – MTBF = MTTR + MTTF
Availability = MTTF / (MTTF + MTTR) = MTTF / MTBF
Improving Availability
  – Increase MTTF: more reliable HW/SW + fault tolerance
  – Reduce MTTR: improved tools and processes for diagnosis and repair

Reliability Measures
1) MTTF, MTBF measured in hours/failure
  – e.g. average MTTF is 100,000 hr/failure
2) Annualized Failure Rate (AFR)
  – Average rate of failures per year (%)
  – AFR = (8760 hr/yr) / MTTF; for a population of disks, total disk failures/yr = (number of disks) × AFR

Availability Measures
Availability = MTTF / (MTTF + MTTR), usually written as a percentage (%)
Want high availability, so categorize by “number of 9s of availability per year”
  – 1 nine: 90% => 36 days of repair/year
  – 2 nines: 99% => 3.6 days of repair/year
  – 3 nines: 99.9% => 526 min of repair/year
  – 4 nines: 99.99% => 53 min of repair/year
  – 5 nines: 99.999% => 5 min of repair/year
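The repair-time column follows directly from the availability percentage. A minimal sketch, assuming a 365-day year; the function name is made up for illustration:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(nines: int) -> float:
    """Expected repair (down) time per year for a given number of nines."""
    unavailability = 10 ** (-nines)  # e.g. 3 nines -> 0.001
    return unavailability * MINUTES_PER_YEAR


for n in range(1, 6):
    minutes = downtime_minutes_per_year(n)
    print(f"{n} nine(s): {minutes:10.1f} min/yr = {minutes / (24 * 60):7.3f} days/yr")
```

Rounding these values gives the figures on the slide (36.5 days for 1 nine is shown there as 36).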

Dependability Example
1000 disks with MTTF = 100,000 hr and MTTR = 100 hr
  – MTBF = MTTR + MTTF = 100,100 hr
  – Availability = MTTF/MTBF = 100,000/100,100 = 99.9% (3 nines of availability!)
  – AFR = 8760/MTTF = 8760/100,000 = 8.76%
Faster repair to get 4 nines of availability?
  – 0.0001 × MTTF = 0.9999 × MTTR
  – MTTR = 10.001 hr
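The arithmetic in this example can be checked with a few lines of Python. This is a sketch using the formulas from the Dependability Measures slide; the function names are illustrative and not part of the lecture:

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hr


def availability(mttf: float, mttr: float) -> float:
    """Availability = MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)


def afr(mttf: float) -> float:
    """Annualized failure rate as a fraction: hours per year / MTTF."""
    return HOURS_PER_YEAR / mttf


def mttr_for_availability(mttf: float, target: float) -> float:
    """Solve target = MTTF / (MTTF + MTTR) for MTTR."""
    return mttf * (1 - target) / target


mttf, mttr = 100_000, 100
print(f"MTBF         = {mttf + mttr} hr")
print(f"Availability = {availability(mttf, mttr):.4%}")
print(f"AFR          = {afr(mttf):.2%}")
print(f"Failures/yr across 1000 disks = {1000 * afr(mttf):.1f}")
print(f"MTTR needed for 4 nines = {mttr_for_availability(mttf, 0.9999):.3f} hr")
```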

Dependability Design Principle
No single points of failure
  – “Chain is only as strong as its weakest link”
Dependability Corollary of Amdahl’s Law
  – Doesn’t matter how dependable you make one portion of system
  – Dependability limited by part you do not improve

Question: There’s a hardware glitch in our system that makes the Mean Time To Failure (MTTF) decrease. Are the following statements TRUE or FALSE?
1) Our system’s Availability will increase.
2) Our system’s Annualized Failure Rate (AFR) will increase.
Answer choices for (1, 2): (A) F, F  (B) F, T  (C) T, F  (D) T, T

Administrivia
Project 3 (individual) due Sun 8/11
Final Review: Tue 8/13, 7-10pm in 10 Evans
Final: Fri 8/16, 9am-12pm, 155 Dwinelle
  – 2nd half material + self-modifying MIPS
  – MIPS Green Sheet provided again
  – Two two-sided handwritten cheat sheets (can re-use your midterm cheat sheet!)
“Dead Week”: WTh 8/14-15
Optional Lab 13 (EC) due anytime next week

Arrays of Small Disks
Can smaller disks be used to close the gap in performance between disks and CPUs?
[Figure: disk form factors (…, 10”, 5.25”, 3.5”) spanning low end to high end. Conventional: 4 disk types. Disk Array: 1 disk type.]

Replace Large Disks with Large Number of Small Disks!
(Data from 1988 disks)

             IBM 3390K     IBM 3.5"      x72            Array advantage
  Capacity   20 GBytes     … MBytes      23 GBytes
  Volume     97 cu. ft.    0.1 cu. ft.   11 cu. ft.     9X
  Power      3 KW          11 W          1 KW           3X
  Data Rate  15 MB/s       1.5 MB/s      120 MB/s       8X
  I/O Rate   600 I/Os/s    55 I/Os/s     3900 I/Os/s    6X
  MTTF       250 KHrs      50 KHrs       ??? Hrs (~700 Hrs)
  Cost       $250K         $2K           $150K

Disk Arrays have potential for large data and I/O rates, high MB/ft^3, high MB/KW, but what about reliability?

RAID: Redundant Arrays of Inexpensive Disks
Files are “striped” across multiple disks
  – Concurrent disk accesses improve throughput
Redundancy yields high data availability
  – Service still provided to user, even if some components (disks) fail
Contents reconstructed from data redundantly stored in the array
  – Capacity penalty to store redundant info
  – Bandwidth penalty to update redundant info

RAID 0: Data Striping
“Stripe” data across all disks
  – Generally faster accesses (access disks in parallel)
  – No redundancy (really “AID”)
  – Bit-striping shown here, can do in larger chunks
[Figure: a logical record striped into physical records, one per disk]

RAID 1: Disk Mirroring
Each disk is fully duplicated onto its “mirror”
  – Very high availability can be achieved
Bandwidth sacrifice on write:
  – Logical write = two physical writes
  – Logical read = one physical read
Most expensive solution: 100% capacity overhead
[Figure: a disk and its mirror form a recovery group]

Parity Bit
Describes whether a group of bits contains an even or odd number of 1’s
  – Define 1 = odd and 0 = even
  – Can use XOR to compute parity bit!
Adding the parity bit to a group will always result in an even number of 1’s (“even parity”)
  – 100 Parity: 1, 101 Parity: 0
If we know number of 1’s must be even, can we figure out what a single missing bit should be?
  – 10?11 → missing bit is 1
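The XOR observation on this slide can be tried directly. A small sketch (the helper names are made up here) that computes an even-parity bit and recovers a single missing bit, using the slide's own bit patterns:

```python
from functools import reduce
from operator import xor


def parity_bit(bits):
    """XOR of all bits: 1 if the number of 1s is odd, 0 if it is even."""
    return reduce(xor, bits, 0)


def recover_missing(bits_with_gap):
    """Return the one unknown bit (marked None) in an even-parity group."""
    known = [b for b in bits_with_gap if b is not None]
    # The missing bit must make the total number of 1s even.
    return parity_bit(known)


print(parity_bit([1, 0, 0]))                  # parity of 100
print(parity_bit([1, 0, 1]))                  # parity of 101
print(recover_missing([1, 0, None, 1, 1]))    # the "10?11" case
```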

RAID 3: Parity Disk
[Figure: data disks X, Y, Z hold data blocks D0 through D8; parity disk P holds parity blocks P0-2, P3-5, P6-8, each covering the corresponding blocks on the data disks]
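With one parity disk, the contents of any single failed disk are the XOR of the surviving disks and the parity. The sketch below illustrates that idea only; the block contents and the helper name `xor_blocks` are invented for the example:

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Hypothetical contents of one stripe on data disks X, Y, Z.
x, y, z = b"\x12\x34", b"\xab\xcd", b"\x0f\xf0"
p = xor_blocks(x, y, z)            # what the parity disk stores

# Disk Y fails: rebuild its block from the survivors and the parity.
rebuilt_y = xor_blocks(x, z, p)
print(rebuilt_y == y)
```

The same XOR both creates the parity and reconstructs a lost block, which is why a single parity disk tolerates exactly one disk failure.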

Updating the Parity Data
Examine small write in RAID 3 (1 byte)
  – 1 logical write = 2 physical reads + 2 physical writes
  – Same concept applies for later RAIDs, too
[Figure: writing new data D0’ into a stripe D0 D1 D2 D3 P]
  1. Read old data D0 and XOR it with new data D0’ (result is 1 only where a bit changed)
  2. Read old parity P and XOR it with that result (flips parity where a bit changed), giving P’
  3. Write D0’
  4. Write P’
What if writing halfword (2 B)? Word (4 B)?
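A sketch of the small-write update, assuming one byte per disk and invented values. It checks that updating parity from only the old data and old parity matches recomputing parity over the whole stripe:

```python
def small_write_parity(old_data: int, new_data: int, old_parity: int) -> int:
    """New parity after overwriting old_data with new_data (2 reads, 2 writes)."""
    changed = old_data ^ new_data      # 1 only in bit positions that changed
    return old_parity ^ changed        # flip parity exactly where data changed


# Hypothetical stripe: one byte on each of four data disks.
stripe = [0x11, 0x22, 0x33, 0x44]
old_parity = stripe[0] ^ stripe[1] ^ stripe[2] ^ stripe[3]

new_d0 = 0x5A
new_parity = small_write_parity(stripe[0], new_d0, old_parity)

# Recomputing from scratch would need to read D1, D2, D3 as well.
full_recompute = new_d0 ^ stripe[1] ^ stripe[2] ^ stripe[3]
print(new_parity == full_recompute)
```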

RAID 4: Higher I/O Rate
Logical data is now block-striped across disks
Parity disk P contains all parity blocks of other disks
Because blocks are large, can handle small reads in parallel
  – Must be blocks in different disks
Still must update Parity data on EVERY write
  – Logical write = min 2 to max N physical reads and writes
  – Performs poorly on small writes

RAID 4: Higher I/O Rate
[Figure: insides of 5 disks. Each column is a disk, each row is a stripe, and logical disk addresses increase down the rows.]

  D0    D1    D2    D3    P
  D4    D5    D6    D7    P
  D8    D9    D10   D11   P
  D12   D13   D14   D15   P
  D16   D17   D18   D19   P
  D20   D21   D22   D23   P

Example: small read D0 & D5, large write D12-D15

Inspiration for RAID 5
When writing to a disk, need to update Parity
Small writes are bottlenecked by Parity Disk: writes to D0 and D5 both also write to P disk

  D0    D1    D2    D3    P
  D4    D5    D6    D7    P

RAID 5: Interleaved Parity
Independent writes possible because of interleaved parity
[Figure: each column is a disk, and logical disk addresses increase down the rows]

  D0    D1    D2    D3    P
  D4    D5    D6    P     D7
  D8    D9    P     D10   D11
  D12   P     D13   D14   D15
  P     D16   D17   D18   D19
  D20   D21   D22   D23   P

Example: write to D0, D5 uses disks 0, 1, 3, 4
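The rotation in this figure can be written as a small mapping from logical block number to disk. This is a sketch of the layout drawn on the slide for 5 disks; real RAID 5 implementations use several different rotation schemes, and the function name is illustrative:

```python
NUM_DISKS = 5


def raid5_location(block: int, num_disks: int = NUM_DISKS):
    """Return (stripe, data_disk, parity_disk) for logical data block D<block>."""
    data_per_stripe = num_disks - 1
    stripe, offset = divmod(block, data_per_stripe)
    # Parity starts on the last disk and moves one disk left per stripe.
    parity_disk = (num_disks - 1) - (stripe % num_disks)
    # Data blocks fill the remaining disks left to right, skipping parity.
    data_disk = offset if offset < parity_disk else offset + 1
    return stripe, data_disk, parity_disk


for block in (0, 5):
    stripe, data_disk, parity_disk = raid5_location(block)
    print(f"D{block}: stripe {stripe}, data on disk {data_disk}, parity on disk {parity_disk}")
```

Writes to D0 and D5 touch disks {0, 4} and {1, 3}, which do not overlap, so they can proceed independently. With a dedicated parity disk as in RAID 4, both writes would need the same parity disk.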

Error Detection/Correction Codes
Memory systems generate errors (accidentally flipped bits)
  – DRAMs store very little charge per bit
  – “Soft” errors occur occasionally when cells are struck by alpha particles or other environmental upsets
  – “Hard” errors occur when chips permanently fail
  – Problem gets worse as memories get denser and larger

Error Detection/Correction Codes
Protect against errors with EDC/ECC
Extra bits are added to each M-bit data chunk to produce an N-bit “code word”
  – Extra bits are a function of the data
  – Each data word value is mapped to a valid code word
  – Certain errors change valid code words to invalid ones (i.e. can tell something is wrong)

Detecting/Correcting Code Concept
Space of all possible bit patterns: 2^N patterns, but only 2^M are valid code words
Error changes bit pattern to an invalid code word
Detection: fails code word validity check
Correction: can map to nearest valid code word

Hamming Distance
Hamming distance = # of bit changes to get from one code word to another
  – p = …, q = …, Hdist(p,q) = 2
  – p = …, q = …, Hdist(p,q) = ? (3)
If all code words are valid, then min Hdist between valid code words is 1
  – Change one bit, at another valid code word
[Photo: Richard Hamming, Turing Award winner]

3-Bit Visualization Aid
Want to be able to see Hamming distances
  – Show code words as nodes, Hdist of 1 as edges
For 3 bits, show each bit in a different dimension
[Figure: cube with axes Bit 0, Bit 1, Bit 2]

Minimum Hamming Distance 2
Let 000 be valid
If 1-bit error, is code word still valid?
  – No! So can detect
If 1-bit error, know which code word we came from?
  – No! Equidistant, so cannot correct
Half the available code words are valid

Minimum Hamming Distance 3
Let 000 be valid
How many bit errors can we detect?
  – Two! Takes 3 errors to reach another valid code word
If 1-bit error, know which code word we came from?
  – Yes!
Only a quarter of the available code words are valid
[Figure: patterns with one 1 are nearest 000; patterns with one 0 are nearest 111]
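Hamming distance and the minimum distance of a code are easy to compute by XOR and bit counting. A sketch; the 6-bit values for `p` and `q` are arbitrary illustrative choices, not the ones from the slide, and the code-word sets are the 3-bit cases discussed above:

```python
from itertools import combinations


def hamming_distance(p: int, q: int) -> int:
    """Number of bit positions in which p and q differ."""
    return bin(p ^ q).count("1")


def min_distance(code_words) -> int:
    """Smallest Hamming distance between any two distinct valid code words."""
    return min(hamming_distance(a, b) for a, b in combinations(code_words, 2))


p, q = 0b011011, 0b001111          # illustrative values only
print(hamming_distance(p, q))

all_3bit = list(range(8))                          # every pattern valid
even_parity = [0b000, 0b011, 0b101, 0b110]         # half the patterns valid
repetition = [0b000, 0b111]                        # a quarter of the patterns valid
for name, code in [("all", all_3bit), ("even parity", even_parity), ("repetition", repetition)]:
    print(name, min_distance(code))
```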

Parity: Simple Error Detection Coding
Add parity bit when writing block of data: p = XOR of b7 b6 b5 b4 b3 b2 b1 b0
Check parity on block read: XOR of b7 b6 b5 b4 b3 b2 b1 b0 and p
  – Error if odd number of 1s
  – Valid otherwise
Minimum Hamming distance of parity code is 2
Parity of code word = 1 indicates an error occurred:
  – 2-bit errors not detected (nor any even # of errors)
  – Detects an odd # of errors

Parity Examples
1) Data has 4 ones, even parity now
  – Write to memory with parity bit 0 to keep parity even
2) Data has 5 ones, odd parity now
  – Write to memory with parity bit 1 to make parity even
3) Read from memory: 4 ones → even parity, so no error
4) Read from memory: 5 ones → odd parity, so error
What if error in parity bit?
  – Can detect!
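These four cases can be replayed in code. A sketch under stated assumptions: the data bytes are chosen here for illustration (the slide's bit patterns did not survive), and the parity bit is stored as the least significant bit of a 9-bit word:

```python
def parity(value: int) -> int:
    """1 if value has an odd number of 1 bits, else 0."""
    return bin(value).count("1") % 2


def write_with_parity(data: int) -> int:
    """Append an even-parity bit as the least significant bit."""
    return (data << 1) | parity(data)


def read_ok(stored: int) -> bool:
    """A stored word is valid only if its total number of 1s is even."""
    return parity(stored) == 0


a = write_with_parity(0b01010101)   # 4 ones -> parity bit 0
b = write_with_parity(0b01010111)   # 5 ones -> parity bit 1
print(read_ok(a), read_ok(b))

print(read_ok(a ^ 0b100000000))     # one data bit flipped: detected
print(read_ok(a ^ 0b000000001))     # parity bit itself flipped: detected
print(read_ok(a ^ 0b000000110))     # two bits flipped: not detected
```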

Get To Know Your Instructor

Agenda: Dependability, Administrivia, RAID, Error Correcting Codes (Cont.)

How to Correct 1-bit Error?
Recall: Minimum distance for correction?
  – Three
Richard Hamming came up with a mapping to allow Error Correction at min distance of 3
  – Called Hamming ECC for Error Correction Code

Hamming ECC (1/2)
Use extra parity bits to allow the position identification of a single error
  – Interleave parity bits within bits of data to form code word
  – Note: Number bits starting at 1 from the left
1) Use all bit positions in the code word that are powers of 2 for parity bits (1, 2, 4, 8, 16, …)
2) All other bit positions are for the data bits (3, 5, 6, 7, 9, 10, …)

Hamming ECC (2/2)
3) Set each parity bit to create even parity for a group of the bits in the code word
  – The position of each parity bit determines the group of bits that it checks
  – Parity bit p checks every bit whose position number in binary has a 1 in the bit position corresponding to p
      Bit 1 (0001₂) checks bits 1, 3, 5, 7, … (XXX1₂)
      Bit 2 (0010₂) checks bits 2, 3, 6, 7, … (XX1X₂)
      Bit 4 (0100₂) checks bits 4-7, 12-15, … (X1XX₂)
      Bit 8 (1000₂) checks bits 8-15, 24-31, … (1XXX₂)
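Steps 1) to 3) translate into a short encoder. This is a sketch, not the lecture's own code: `hamming_encode` is a made-up name, and the data byte in the usage example is an illustrative choice.

```python
def hamming_encode(data_bits):
    """Return the code word as a list of bits; list index 0 holds position 1."""
    d = len(data_bits)
    p = 0
    while 2 ** p < p + d + 1:       # enough parity bits to name every position
        p += 1
    n = d + p
    code = [0] * (n + 1)            # index 0 unused so indices equal positions

    data = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):         # nonzero: not a power of 2, so a data position
            code[pos] = next(data)

    for i in range(p):
        parity_pos = 1 << i         # positions 1, 2, 4, 8, ...
        group = [pos for pos in range(1, n + 1) if pos & parity_pos]
        # The parity position itself is still 0 here, so this sets even parity.
        code[parity_pos] = sum(code[pos] for pos in group) % 2
    return code[1:]


byte = [1, 0, 0, 1, 1, 0, 1, 0]     # illustrative data, leftmost bit first
print("".join(map(str, hamming_encode(byte))))
```

For 8 data bits this yields a 12-bit code word with parity bits at positions 1, 2, 4, and 8.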

Hamming ECC Example (1/3)
A byte of data: …
Create the code word, leaving spaces for the parity bits: _ 1 _ _ _

Hamming ECC Example (2/3)
Calculate the parity bits:
  – Parity bit 1 group (1, 3, 5, 7, 9, 11): ? _ 1 _ _ →
  – Parity bit 2 group (2, 3, 6, 7, 10, 11): 0 ? 1 _ _ →
  – Parity bit 4 group (4, 5, 6, 7, 12): ? _ →
  – Parity bit 8 group (8, 9, 10, 11, 12): ? →

Hamming ECC Example (3/3)
Valid code word: …
Recover original data: …
Suppose we see … instead: fix the error!
Check each parity group
  – Parity bits 2 and 8 are incorrect
  – As 2+8=10, bit position 10 is the bad bit, so flip it!
Corrected value: …
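The check-and-fix procedure can be sketched as follows. The 12-bit code word below is an illustrative choice (the slide's own digits did not survive), picked so that corrupting position 10 makes parity groups 2 and 8 fail, as in the example; the function names are invented here.

```python
def syndrome(code):
    """Sum of the parity positions whose group fails the even-parity check."""
    n = len(code)
    total = 0
    parity_pos = 1
    while parity_pos <= n:
        ones = sum(code[pos - 1] for pos in range(1, n + 1) if pos & parity_pos)
        if ones % 2:
            total += parity_pos
        parity_pos <<= 1
    return total


def correct(code):
    """Return (corrected code word, bad position); position 0 means no error."""
    fixed = list(code)
    bad = syndrome(fixed)
    if 0 < bad <= len(fixed):
        fixed[bad - 1] ^= 1         # flip the bit the failing groups point at
    return fixed, bad


def extract_data(code):
    """Keep only the bits at non-power-of-2 positions."""
    return [bit for pos, bit in enumerate(code, start=1) if pos & (pos - 1)]


valid = [0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0]    # illustrative valid code word
received = list(valid)
received[10 - 1] ^= 1                            # corrupt bit position 10

fixed, bad = correct(received)
print(bad, fixed == valid, extract_data(fixed))
```

This handles a single flipped bit only; two flipped bits would produce a nonzero syndrome pointing at the wrong position.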

Hamming ECC “Cost”
Space overhead in single error correction code
  – Form p + d bit code word, where p = # parity bits and d = # data bits
Want the p parity bits to indicate either “no error” or 1-bit error in one of the p + d places
  – Need 2^p ≥ p + d + 1, thus p ≥ log2(p + d + 1)
  – For large d, p approaches log2(d)
Example: d = 8 → p = ⌈log2(p + 8 + 1)⌉ → p = 4
  – d = 16 → p = 5; d = 32 → p = 6; d = 64 → p = 7
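Because p appears on both sides of the inequality, the smallest p is easiest to find by search. A minimal sketch (the function name is illustrative):

```python
def parity_bits_needed(d: int) -> int:
    """Smallest p with 2**p >= p + d + 1 (single error correction)."""
    p = 0
    while 2 ** p < p + d + 1:
        p += 1
    return p


for d in (8, 16, 32, 64):
    p = parity_bits_needed(d)
    print(f"d = {d:2d} -> p = {p}, overhead = {p / d:.1%}")
```

The relative overhead shrinks as the data word grows, which is one reason ECC is applied to wide words.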

Hamming Single Error Correction, Double Error Detection (SEC/DED)
Adding extra parity bit covering the entire SEC code word provides double error detection as well!
  p1 p2 d1 p3 d2 d3 d4 p4
Let H be the position of the incorrect bit we would find from checking p1, p2, and p3 (0 means no error) and let P be parity of complete code word
  – H=0, P=0: no error
  – H≠0, P=1: correctable single error (p4=1 → odd # errors)
  – H≠0, P=0: double error detected (p4=0 → even # errors)
  – H=0, P=1: an error occurred in p4 bit, not in rest of word
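The four cases can be exercised on the 8-bit layout p1 p2 d1 p3 d2 d3 d4 p4 shown above. A sketch with invented function names and arbitrary data bits:

```python
def encode_secded(d1, d2, d3, d4):
    """Build [p1, p2, d1, p3, d2, d3, d4, p4] with even parity everywhere."""
    p1 = d1 ^ d2 ^ d4               # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4               # checks positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4               # sits at position 4, checks 4, 5, 6, 7
    word = [p1, p2, d1, p3, d2, d3, d4]
    p4 = sum(word) % 2              # overall parity over the SEC code word
    return word + [p4]


def check_secded(word):
    """Classify a received 8-bit word using H and P as defined on the slide."""
    h = 0
    for parity_pos in (1, 2, 4):
        ones = sum(word[pos - 1] for pos in range(1, 8) if pos & parity_pos)
        if ones % 2:
            h += parity_pos
    p = sum(word) % 2               # parity of the complete code word
    if h == 0 and p == 0:
        return "no error"
    if h != 0 and p == 1:
        return f"single error at position {h} (correctable)"
    if h != 0 and p == 0:
        return "double error detected (not correctable)"
    return "error in p4 only"


def flip(word, *positions):
    """Copy of word with the given 1-based positions inverted."""
    out = list(word)
    for pos in positions:
        out[pos - 1] ^= 1
    return out


w = encode_secded(1, 0, 1, 1)
for received in (w, flip(w, 6), flip(w, 3, 6), flip(w, 8)):
    print(received, "->", check_secded(received))
```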

SEC/DED: Hamming Distance 4
[Figure: distance-4 code]
  – 1-bit error (one 1): nearest valid code word is all 0’s
  – 1-bit error (one 0): nearest valid code word is all 1’s
  – 2-bit error (two 0’s, two 1’s): halfway between

Modern Use of RAID and ECC (1/3)
Typical modern code words in DRAM memory systems:
  – 64-bit data blocks (8 B) with 72-bit codes (9 B)
  – d = 64 → p = 7, +1 for DED
What happened to RAID 2?
  – Bit-striping with extra disks just for ECC parity bits
  – Very expensive computationally and in terms of physical writes
  – ECC implemented in all current HDDs, so RAID 2 became obsolete (redundant, ironically)

Modern Use of RAID and ECC (2/3)
RAID 6: Recovering from two disk failures!
  – RAID 5 with an extra disk’s amount of parity blocks (also interleaved)
  – Extra parity computation more complicated than Double Error Detection (not covered here)
When useful?
  – Operator replaces wrong disk during a failure
  – Disk bandwidth is growing more slowly than disk capacity, so the MTTR of a disk in a RAID system is increasing (increases the chances of a 2nd failure during repair)

Modern Use of RAID and ECC (3/3)
Common failure mode is bursts of bit errors, not just 1 or 2
  – Network transmissions, disks, distributed storage
  – Contiguous sequence of bits in which first, last, or any number of intermediate bits are in error
  – Caused by impulse noise or by fading signal strength; effect is greater at higher data rates
Other tools: cyclic redundancy check, Reed-Solomon, other linear codes

Summary
Great Idea: Dependability via Redundancy
  – Reliability: MTTF & Annual Failure Rate
  – Availability: % uptime = MTTF/MTBF
RAID: Redundant Arrays of Inexpensive Disks
  – Improve I/O rate while ensuring dependability
Memory Errors:
  – Hamming distance 2: Parity for Single Error Detect
  – Hamming distance 3: Single Error Correction Code + encode bit position of error
  – Hamming distance 4: SEC/Double Error Detection