CS717 Algorithm-Based Fault Tolerance Matrix Multiplication Greg Bronevetsky.

Presentation transcript:

CS717 Algorithm-Based Fault Tolerance Matrix Multiplication Greg Bronevetsky

CS717 Problem at Hand
Have matrices A and B; want to compute their product AB
Ask a matrix-matrix-multiply (MMM) implementation to compute the product
Answer: C
Question: Is C the correct answer? How could we know for sure?

CS717 Algorithm-Based Fault Tolerance
Encode the input matrices via an error-correcting code
Run the regular MMM algorithm on the encoded matrices
  - The encoding is invariant under MMM
Naturally outputs encoded matrices
Encoding guarantees:
  - If there are up to t errors in the output, the error will be detected
  - If there are up to c < t errors in the output, the correct output matrix can be decoded

CS717 Outline
Linear Error Correcting Codes
Algorithm-Based Fault Tolerance
ABFT = Linear Encoding of Matrices

CS717 Error Correcting Codes
Map $f: \Sigma^k \to \Sigma^n$
  - k-long data words $\to$ n-long codewords
  - We use $\Sigma = \{0, 1\}$
A code of length n is a “sparse” subset of $\Sigma^n$
  - Very few possible words are valid codewords
Rate of code: $k/n$
  - Amount of information communicated by each codeword

CS717 Minimum Distance
Minimum distance: $d_{\min} = \min_{x, y \in C,\ x \ne y} d(x, y)$, where $d(\cdot, \cdot)$ = Hamming distance
Hamming distance: number of spots where two words differ
Measures the difficulty of decoding/correcting corrupted codewords
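The two definitions above can be sketched in a few lines of Python. This is only an illustration: the function names and the repetition code used as input are assumptions, not part of the slides.

```python
from itertools import combinations

def hamming_distance(x, y):
    """Number of positions where two equal-length words differ."""
    return sum(a != b for a, b in zip(x, y))

def minimum_distance(code):
    """Smallest Hamming distance between any two distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

# Hypothetical example code: the length-3 repetition code {000, 111}.
repetition_code = [(0, 0, 0), (1, 1, 1)]
d_min = minimum_distance(repetition_code)
```

The brute-force search compares every pair of codewords, so it is only practical for very small codes.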

CS717 Detection and Correction
Code may detect errors in $< d_{\min}$ spots
  - No such error can morph one codeword into another
May correct errors in $\le (d_{\min}-1)/2$ spots
  - Can still find the “closest” codeword
More details later…
Each codeword defines a circle around itself of radius $d_{\min}/2$
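A minimal sketch of "closest codeword" decoding, assuming a hypothetical length-5 repetition code with minimum distance 5, so up to 2 errors should be correctable. The names `nearest_codeword`, `code`, and `received` are illustrative only.

```python
def hamming_distance(x, y):
    """Number of positions where two equal-length words differ."""
    return sum(a != b for a, b in zip(x, y))

def nearest_codeword(word, code):
    """Return the codeword closest to `word` in Hamming distance."""
    return min(code, key=lambda c: hamming_distance(word, c))

# Hypothetical code with minimum distance 5.
code = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]

# The codeword 11111 with two flipped bits.
received = (1, 0, 1, 1, 0)

error_detected = received not in code
decoded = nearest_codeword(received, code)
```

With two flips the received word is still closer to 11111 (distance 2) than to 00000 (distance 3), which is the circle picture from the slide.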

CS717 Linear Codes
Codewords form a linear subspace inside $\Sigma^n$
Codewords lie in the row space of a generator matrix G
Example G: an (n=7, k=3) code
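A small sketch of how a generator matrix produces codewords. The matrix `G` below is a hypothetical (n=7, k=3) example chosen for illustration; it is not the matrix shown on the slide, which did not survive in the transcript.

```python
from itertools import product

# Hypothetical generator matrix of a binary (n=7, k=3) code.
G = [
    (1, 0, 0, 1, 1, 0, 1),
    (0, 1, 0, 1, 0, 1, 1),
    (0, 0, 1, 0, 1, 1, 1),
]

def encode(message, generator):
    """Return message * generator over GF(2): the selected rows summed mod 2."""
    n = len(generator[0])
    return tuple(
        sum(m * row[j] for m, row in zip(message, generator)) % 2
        for j in range(n)
    )

# Every k-bit message gives one codeword, so the code has 2**k = 8 codewords.
codewords = {encode(m, G) for m in product((0, 1), repeat=len(G))}
```

Only 8 of the 128 possible 7-bit words are codewords, which is the "sparse subset" idea in concrete form.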

CS717 Property 1
A linear combination of any codewords is also a codeword:
  - For any $x, y \in C$, $(x+y) \in C$
A codeword times a constant is a codeword:
  - For any $z \in C$ and any constant $a$, $a \cdot z \in C$
Proof: basic properties of linear spaces

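CS717 Property 2
Minimum distance of a linear code = minimum weight of a nonzero codeword: $d_{\min} = \min_{c \in C,\ c \ne 0} \mathrm{weight}(c)$
Where $\mathrm{weight}(c)$ = number of nonzero entries of c
Proof: $d(x, y) = \mathrm{weight}(x - y)$, and $x - y$ is itself a codeword by Property 1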
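Properties 1 and 2 can be checked by brute force on a tiny code. The sketch below assumes the standard statement of Property 2 (minimum distance equals the minimum weight of a nonzero codeword) and uses a hypothetical (n=5, k=2) generator matrix; over GF(2), subtraction and addition coincide.

```python
from itertools import combinations, product

# Hypothetical generator matrix of a small binary (n=5, k=2) code.
G = [(1, 0, 1, 1, 0), (0, 1, 0, 1, 1)]

def add(x, y):
    """Componentwise sum over GF(2)."""
    return tuple((a + b) % 2 for a, b in zip(x, y))

def weight(x):
    """Number of nonzero entries."""
    return sum(x)

def span(generator):
    """All GF(2) linear combinations of the generator rows."""
    zero = (0,) * len(generator[0])
    words = set()
    for coeffs in product((0, 1), repeat=len(generator)):
        word = zero
        for c, row in zip(coeffs, generator):
            if c:
                word = add(word, row)
        words.add(word)
    return words

code = span(G)
# Property 1: the sum of any two codewords is a codeword.
closed = all(add(x, y) in code for x in code for y in code)
# Property 2: minimum distance equals minimum nonzero weight.
min_weight = min(weight(c) for c in code if any(c))
min_distance = min(weight(add(x, y)) for x, y in combinations(sorted(code), 2))
```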

CS717 Parity Check Matrix
H: dual matrix to G
  - Contains a basis of the space orthogonal to G’s row space
  - This is an (n-k)-dimensional space
H is $(n-k) \times n$
Code space defined as: $C = \{x \in \Sigma^n : Hx = 0\}$
Note: H also defines a linear code

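CS717 Property 3
$d_{\min}$ = minimum number of columns of H that can sum to 0
Proof: $Hc = 0$ says that the columns of H selected by the 1’s of c sum to 0, so the smallest such set of columns has the size of the minimum-weight nonzero codeword, which is $d_{\min}$ by Property 2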
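Property 3 can be checked numerically on a known code. The sketch below uses the parity-check matrix of the binary Hamming (7,4) code as an example; that choice and the helper names are assumptions made for illustration, not taken from the slides.

```python
from itertools import combinations, product

# Parity-check matrix of the binary Hamming (7,4) code:
# column j holds the binary digits of j+1.
H = [
    (0, 0, 0, 1, 1, 1, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (1, 0, 1, 0, 1, 0, 1),
]

def smallest_zero_sum_column_set(H):
    """Size of the smallest nonempty set of columns of H that sums to 0 mod 2."""
    n = len(H[0])
    columns = [tuple(row[j] for row in H) for j in range(n)]
    for size in range(1, n + 1):
        for subset in combinations(columns, size):
            if all(sum(bits) % 2 == 0 for bits in zip(*subset)):
                return size
    return None

def minimum_weight(H):
    """Minimum weight over all nonzero x with Hx = 0 (mod 2)."""
    n = len(H[0])
    return min(
        sum(x)
        for x in product((0, 1), repeat=n)
        if any(x) and all(sum(h * v for h, v in zip(row, x)) % 2 == 0 for row in H)
    )
```

Both searches return 3 for this matrix: no column is zero, no two columns are equal, and columns 1, 2, and 3 sum to zero.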

CS717 Property 4
Minimum distance of a linear code $\le n-k+1$
Proof:
  - Total of n dimensions (since codewords are n-vectors)
  - Rank of G’s row space = k
  - Thus, rank of H’s column space = n-k
  - Thus, any n-k+1 columns of H are linearly dependent: some of them add up to 0
  - By Property 3, this number is $\ge d_{\min}$

CS717 Outline
Linear Error Correcting Codes
Algorithm-Based Fault Tolerance
ABFT = Linear Encoding of Matrices

CS717 Encoding a Matrix
Algorithm-Based Fault Tolerance was introduced by Huang and Abraham in 1984
Encode each row of the matrix via an extra column
Column entries = sums of the matrix rows

CS717 Encoding a Matrix
Encode each column of the matrix via an extra row
Row entries = sums of the matrix columns
Full encoding: both the extra column and the extra row
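The three checksum encodings can be sketched with plain Python lists. The function names are illustrative, and integer entries are assumed so that the sums are exact.

```python
def row_encode(A):
    """Append to each row an extra entry holding that row's sum."""
    return [list(row) + [sum(row)] for row in A]

def column_encode(A):
    """Append an extra row holding the sum of each column."""
    return [list(row) for row in A] + [[sum(col) for col in zip(*A)]]

def full_encode(A):
    """Apply both encodings; the corner entry is the sum of all entries."""
    return column_encode(row_encode(A))

A = [[1, 2],
     [3, 4]]
full = full_encode(A)
# full == [[1, 2, 3],
#          [3, 4, 7],
#          [4, 6, 10]]
```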

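CS717 Detecting Errors
Suppose matrix A is corrupted to matrix $\hat{A}$
  - Entry $\hat{a}_{i,j}$ is wrong
Can detect the error’s exact position: the checksum of row i and the checksum of column j both fail, and they intersect at (i, j)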

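CS717 Correcting Errors
Can correct the error using the row or column checksum: the correct entry is the checksum minus the sum of the other entries in that row (or column)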
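A minimal sketch of locating and correcting a single wrong entry in a fully encoded matrix. It assumes integer entries, an error in a data entry rather than in a checksum entry, and a layout where the last column holds row sums and the last row holds column sums; the function names are hypothetical.

```python
def locate_error(F):
    """Return (i, j) of a single corrupted data entry, or None if all checks pass."""
    rows, cols = len(F) - 1, len(F[0]) - 1
    bad_rows = [i for i in range(rows) if sum(F[i][:cols]) != F[i][cols]]
    bad_cols = [j for j in range(cols)
                if sum(F[i][j] for i in range(rows)) != F[rows][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("more than a single data-entry error")

def correct_error(F, i, j):
    """Rebuild entry (i, j) from the row checksum and the other entries of row i."""
    cols = len(F[0]) - 1
    F[i][j] = F[i][cols] - sum(F[i][k] for k in range(cols) if k != j)

# Fully encoded [[1, 2], [3, 4]] with entry (1, 1) corrupted from 4 to 9.
F = [[1, 2, 3],
     [3, 9, 7],
     [4, 6, 10]]
position = locate_error(F)
correct_error(F, *position)
```

The failing row and failing column intersect at the corrupted entry, and the row checksum alone is enough to restore it.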

CS717 Big Trick: Preservation of Encoding
Column-encoded matrix * Row-encoded matrix = Fully-encoded matrix
Can check an MMM computation by checking the encoding of the output
If the product matrix has an erroneous entry:
  - Can detect
  - Can correct
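The preservation property can be checked on a small example. The sketch below uses a naive triple-loop style multiply and hypothetical helper names; integer entries are assumed.

```python
def matmul(X, Y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def row_encode(A):
    """Append to each row an extra entry holding that row's sum."""
    return [list(row) + [sum(row)] for row in A]

def column_encode(A):
    """Append an extra row holding the sum of each column."""
    return [list(row) for row in A] + [[sum(col) for col in zip(*A)]]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# Multiply the column-encoded A by the row-encoded B ...
encoded_product = matmul(column_encode(A), row_encode(B))
# ... and compare with the full encoding of the plain product AB.
expected = column_encode(row_encode(matmul(A, B)))
```

Here `encoded_product` is `[[19, 22, 41], [43, 50, 93], [62, 72, 134]]`, which is exactly AB with its row sums and column sums attached, so the multiply itself needs no changes.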

CS717 Applications
Matrix Multiplication
  - Given encoded A and B,
  - Check whether the MMM result C (?= AB) has a valid encoding
Matrix Factorization
  - Given a factorization A = WZ
  - Verify correctness by verifying the encodings of the factors
  - Factors are row- OR column-encoded
  - Can only detect, not correct, errors

CS717 Weighted ABFT
Oftentimes need to check row- or column-encoded matrices
  - Ex: factorization, data integrity check
Can only detect errors in such matrices
Can we also correct?
Yes, by generalizing to weighted checking rows/columns

CS717 Weighting
Suppose we have d n-vectors $w_1, \ldots, w_d$
Can column-encode matrix A:
Let’s try out:

CS717 Weighted Error Detection

CS717 Weighted Error Correction
Weighted encoding detects and corrects single errors
  - Even without full encoding
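A sketch of single-error correction with a column-only weighted encoding. The weight vectors used here, (1, 1, 1) and (1, 2, 3), are an assumption for illustration; the transcript does not preserve the weights used on the slides. Integer entries and an error in a data entry are also assumed.

```python
def weighted_column_encode(A, weights):
    """Append one check row per weight vector w, holding sum_i w[i] * A[i][j]."""
    n, m = len(A), len(A[0])
    checks = [[sum(w[i] * A[i][j] for i in range(n)) for j in range(m)]
              for w in weights]
    return [list(row) for row in A] + checks

def correct_single_error(E, weights):
    """Locate and fix one corrupted data entry; return its (i, j) or None."""
    w1, w2 = weights
    n = len(E) - 2
    for j in range(len(E[0])):
        # Differences between recomputed and stored checksums for column j.
        s1 = sum(w1[i] * E[i][j] for i in range(n)) - E[n][j]
        s2 = sum(w2[i] * E[i][j] for i in range(n)) - E[n + 1][j]
        if s1 != 0 or s2 != 0:
            # An error of size delta in row i gives s1 = w1[i]*delta, s2 = w2[i]*delta,
            # so the ratio of the two differences identifies the row.
            i = next(r for r in range(n) if w1[r] * s2 == w2[r] * s1)
            E[i][j] -= s1 // w1[i]
            return i, j
    return None

weights = [(1, 1, 1), (1, 2, 3)]          # hypothetical weight vectors
E = weighted_column_encode([[1, 2], [3, 4], [5, 6]], weights)
E[2][1] = 10                              # corrupt entry (2, 1), originally 6
position = correct_single_error(E, weights)
```

The row lookup works because the ratios `w2[i] / w1[i]` are all distinct for these weights; with plain sums alone the row could not be identified.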

CS717 Outline
Linear Error Correcting Codes
Algorithm-Based Fault Tolerance
ABFT = Linear Encoding of Matrices

CS717 “Surprise”
But this is all just a linear code!
Generator matrix for the above scheme:

CS717 Generating Encodings
Given m= as message word (or matrix row/column)

CS717 Surprise??
Not too surprising really
Why else would MMM preserve the encoding?
Another possibility:
  - Efficient: can be implemented via bit shifts
Room open for using any linear code!

CS717 Error Detection/Correction in General
To show for linear codes:
  - Can detect $< d_{\min}$ errors
  - Can correct $\le (d_{\min}-1)/2$ errors
Let $c$ be the original codeword
Let $\hat{c} = c + e$ be the corrupted codeword
  - e: error vector

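CS717 Error Detection in General
$s = H\hat{c} = H(c + e) = Hc + He = He$
  - s is called the “syndrome vector”
  - Independent of the original codeword
Note: $\mathrm{weight}(e) < d_{\min}$ since there are $< d_{\min}$ errors
Thus: $He \ne 0$ for any nonzero e (by Property 3)
Detection: if $s \ne 0$, then ERROR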
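Syndrome-based detection can be sketched as follows. The example assumes the binary Hamming (7,4) parity-check matrix and a hand-picked codeword; neither comes from the slides.

```python
# Parity-check matrix of the binary Hamming (7,4) code (an assumed example).
H = [
    (0, 0, 0, 1, 1, 1, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (1, 0, 1, 0, 1, 0, 1),
]

def syndrome(H, word):
    """Compute H * word over GF(2)."""
    return tuple(sum(h * v for h, v in zip(row, word)) % 2 for row in H)

codeword = (1, 1, 1, 0, 0, 0, 0)   # columns 1, 2, 3 of H sum to zero
error = (0, 0, 0, 0, 1, 0, 0)      # flip the fifth bit
received = tuple((c + e) % 2 for c, e in zip(codeword, error))

s = syndrome(H, received)
error_detected = any(s)
```

The syndrome of the received word equals the syndrome of the error vector alone, which illustrates that it does not depend on which codeword was sent.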

CS717 Error Correction in General
Clearly e is the correction vector
  - It corrects the error in $\hat{c}$
Sufficient to prove: $\mathrm{weight}(e) \le (d_{\min}-1)/2 \Rightarrow$ H is an isomorphism: correction vectors $\leftrightarrow$ syndrome vectors
  - i.e. for each correction vector (which we want to know) $\exists$ a unique syndrome vector
Thus, it is possible to correct any such error
  - May not be efficient

CS717 H is Onto
$\mathrm{weight}(e) \le (d_{\min}-1)/2 < d_{\min}$
$\mathrm{rank}(H) = n-k \ge (d_{\min}-1)/2$ (by Property 4)
Thus, $\mathrm{rank}(H) \ge \mathrm{weight}(e)$ and $He \ne 0$
  - Not enough 1’s in e to sum H’s columns to 0
H maps onto its range
Thus,

CS717 H is 1-1
Let $e_1$ and $e_2$ be correction vectors, $e_1 \ne e_2$
Suppose that:
  - $\mathrm{weight}(e_1) \le (d_{\min}-1)/2$ and $\mathrm{weight}(e_2) \le (d_{\min}-1)/2$
  - $He_1 = He_2 = s$
$He_1 - He_2 = H(e_1 - e_2) = s - s = 0$
And so, $(e_1 - e_2)$ is a nonzero codeword
Thus, $\mathrm{weight}(e_1 - e_2) \ge d_{\min}$
But both weights are $\le (d_{\min}-1)/2$, and so $\mathrm{weight}(e_1 - e_2) \le d_{\min}-1$
Contradiction! So $e_1 = e_2$
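The one-to-one correspondence between low-weight error vectors and syndromes is what makes table-based correction possible. The sketch below builds such a table by brute force, assuming the binary Hamming (7,4) parity-check matrix (minimum distance 3, so errors of weight at most 1); the helper names are hypothetical, and as the slides note, this approach is not efficient in general.

```python
from itertools import combinations

# Parity-check matrix of the binary Hamming (7,4) code (an assumed example).
H = [
    (0, 0, 0, 1, 1, 1, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (1, 0, 1, 0, 1, 0, 1),
]

def syndrome(H, word):
    """Compute H * word over GF(2)."""
    return tuple(sum(h * v for h, v in zip(row, word)) % 2 for row in H)

def build_syndrome_table(H, max_weight):
    """Map each syndrome to the unique error vector of weight <= max_weight."""
    n = len(H[0])
    table = {}
    for w in range(max_weight + 1):
        for positions in combinations(range(n), w):
            e = tuple(1 if i in positions else 0 for i in range(n))
            s = syndrome(H, e)
            if s in table:
                raise ValueError("two error vectors share a syndrome")
            table[s] = e
    return table

def correct(H, table, received):
    """Add the looked-up error vector back to the received word (mod 2)."""
    e = table[syndrome(H, received)]
    return tuple((r + b) % 2 for r, b in zip(received, e))

table = build_syndrome_table(H, max_weight=1)
corrected = correct(H, table, (1, 1, 1, 0, 1, 0, 0))
```

The table construction would raise if two low-weight error vectors collided on a syndrome, which is the situation the proof above rules out.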

CS717 Other Encoding Schemes
Linear codes are preserved by matrix multiplication
Presumably, fancier codes might be preserved by fancier computations
Limit:
  - S. Winograd showed in 1962 that any code s.t. $f(x \circ y) = f(x) \circ f(y)$ has rate (k/n) or minimum weight $\to 0$ as $k \to \infty$
How general can we get?
Do good solutions exist for small k?
  - k=64 bits should be good enough

CS717 Summary
For Matrix Multiplication, can encode the input via linear codes
Solutions exist for more complex codes
  - Ex: Fourier Transforms
On parallel systems must ensure:
  - No processor touches >1 element per row/column
  - Else, if one processor fails, the encoding is overwhelmed with errors
  - To ensure this, must modify the algorithm
Separate check placement theory