Post-Manufacturing ECC Customization Based on Orthogonal Latin Square Codes and Its Application to Ultra-Low Power Caches Rudrajit Datta and Nur A. Touba.


1 Post-Manufacturing ECC Customization Based on Orthogonal Latin Square Codes and Its Application to Ultra-Low Power Caches Rudrajit Datta and Nur A. Touba Computer Engineering Research Center Dept. of Electrical and Computer Engineering University of Texas at Austin

2 Motivation
• For memories with high defect rates
  - Reduce check-bit overhead
  - Increase reliability
• Applicable to low-voltage caches

3 Agenda
• Introduction
• Proposed Approach
• Application
• Related Work
• Orthogonal Latin Square (OLS) Codes
• Customization
• Results
• Conclusion

4 Introduction
• Tolerate high defect rates for memories
  - Occur in memories operating at ultra-low voltages
  - Expected in future nanoscale technologies
    - e.g., nanoscale crossbar architectures
• Conventional method
  - ECC selected based on
    - Expected maximum number of defects per word

5 Introduction
[Block diagram: Data → Check Bit Generator → Memory (Information Bits, c_full Check Bits) → Decoder → Corrected Data]

6 Observations
• A priori information available for location of defects
  - Through post-manufacturing memory tests
    - Obtain a defect map
  - Use information to customize code
    - Reduce check-bit storage in memory/caches

7 Proposed Approach
[Block diagram. Blocks: Data, Check Bit Generator, Switch Network, Memory (Information Bits, c_used Check Bits, Config. Bits), Switch Network, Decoder, Corrected Data. Signal widths are labeled c_full and c_used.]

8 Proposed Approach
• Customize code by disabling rows of the H-matrix
  - Possible if a modular code is used for ECC
  - Current work looks at OLS codes
[Figure label: Configuration Bits]

9 Application - Low-voltage Caches
• Microprocessor voltage lowered while idle
  - Reduces power
• Caches and memories susceptible at lower voltages
  - Unreliable below Vccmin
• Enable reliable cache operation at lower voltages
  - At lower voltages, use part of the cache to store extra check bits

10 Related Work
• Word-disable and Bit-fix [Wilkerson 08]
  - Defect map
    - Identify vulnerable bits
  - Mitigates only persistent errors
  - Uses up half of the cache to store extra check bits
• Two-dimensional ECC [Kim 07]
  - Slow
  - Complicated decoding
• Multi-bit segmented ECC [Chishti 09]
  - Orthogonal Latin Square (OLS) code
    - Single-step decodable
  - High redundancy

11 Key Takeaways
• Have full ECC on chip
  - Can handle all defect maps
• Generate defect map
  - Disable part of the original code
  - Reduces check-bit redundancy
  - Retain capability of original code w.r.t. the defect map

12 One Step Majority Decoding
• t-error correctable: each information bit is available 2t+1 times, each an independent copy
• One copy: the bit itself
• Rest: 2t independent parity equations
[Diagram: XOR gates combining (d_p, c_p), (d_q, c_q), (d_s, c_s), together with d_i, feed a Majority Voter that outputs the corrected d_i]

13 Orthogonal Latin Square Codes
• Latin square
  - m × m array
  - Each row and each column is a permutation of the digits 0, 1, …, m−1
• Orthogonal Latin squares
  - Each ordered pair of elements (r, c, s) appears only once when two squares are superimposed
• m^2 data bits, 2tm check bits, t-error correctable [Hsiao 70]
• Single-step decodable
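To make the construction and the one-step majority vote concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: m is assumed prime so that L_k[r][c] = (k·r + c) mod m yields mutually orthogonal Latin squares, and the names ols_parity_sets, encode, and decode are hypothetical. The parity sets are the rows, the columns, and 2t−2 Latin squares of the m × m data array, giving 2tm check bits.

```python
from itertools import product

def ols_parity_sets(m, t):
    """Return the 2*t*m parity sets (lists of data-bit indices) of an OLS code.

    Assumption: m is prime and 2*t - 2 <= m - 1, so the squares
    L_k[r][c] = (k*r + c) % m for k = 1..2t-2 are mutually orthogonal.
    """
    cells = list(product(range(m), repeat=2))
    sets = [[r * m + c for c in range(m)] for r in range(m)]    # row parities
    sets += [[r * m + c for r in range(m)] for c in range(m)]   # column parities
    for k in range(1, 2 * t - 1):                               # 2t-2 Latin squares
        sets += [[r * m + c for r, c in cells if (k * r + c) % m == s]
                 for s in range(m)]
    return sets

def encode(data, sets):
    """One check bit per parity set: XOR of the data bits in the set."""
    return [sum(data[j] for j in s) % 2 for s in sets]

def decode(data, checks, sets):
    """One-step majority decoding: 2t+1 votes per data bit."""
    corrected = []
    for i in range(len(data)):
        votes = [data[i]]                                       # the bit itself
        for s, c in zip(sets, checks):
            if i in s:                                          # one equation per group
                votes.append((c + sum(data[j] for j in s if j != i)) % 2)
        corrected.append(1 if 2 * sum(votes) > len(votes) else 0)
    return corrected

m, t = 5, 2                                  # 25 data bits, 20 check bits
sets = ols_parity_sets(m, t)
data = [(i * i + 1) % 2 for i in range(m * m)]
checks = encode(data, sets)

bad_data, bad_checks = data[:], checks[:]
bad_data[3] ^= 1                             # one error in a data bit
bad_checks[7] ^= 1                           # one error in a check bit
recovered = decode(bad_data, bad_checks, sets)
print(recovered == data)
```

Because any two data bits share at most one parity set, a single error can spoil at most one of the 2t+1 votes for any bit, which is why up to t errors leave the majority intact.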

14 Proposed Scheme
• Implement full OLS code on chip
• Run memory tests
  - Generate defect map
    - At manufacturing time or at boot time
  - Identify vulnerable bits
• Disable rows in OLS H-matrix
  - On a chip-by-chip basis, based on the defect map
  - Correct all erasures plus ‘e’ random errors in each cache line
  - Reduce redundancy while providing the same reliability

15 Definitions
• “good row” for information bit d_i
  - Row of OLS H-matrix
  - No ‘1’ in any erasure position other than bit d_i
    - Holds true for all lines in the cache
• “bad row” for information bit d_i
  - Row of OLS H-matrix
  - ‘1’ in one or more erasure positions apart from bit d_i
  - Holds for at least one line of the cache

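16 “Good Rows” & “Bad Rows”
Defect map (E = erasure):
          d0  d1  d2  d3  d4  d5  d6  d7
line1     -   E   -   -   -   E   -   -
line2     -   -   -   E   -   -   -   -
H-matrix rows:
          d0  d1  d2  d3  d4  d5  d6  d7
H-row1    1   0   0   0   1   0   1   1
H-row2    0   1   1   0   1   0   0   1
H-row3    1   0   0   1   0   1   1   0
Classification (G = good row, B = bad row for that bit):
          d0  d1  d2  d3  d4  d5  d6  d7
H-row1    G   -   -   -   G   -   G   G
H-row2    -   G   B   -   B   -   -   B
H-row3    B   -   -   B   -   B   B   -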
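The classification on this slide can be reproduced mechanically from the two definitions. The following Python sketch is only an illustration (the function name classify_rows and the string encoding of rows and lines are assumptions made here): a row is good for bit d_i if it covers d_i and, in every cache line, has no '1' in an erasure position other than d_i.

```python
def classify_rows(h_rows, lines):
    """Mark each covered bit of each H-matrix row as good (G) or bad (B).

    h_rows: strings of '0'/'1', one per H-matrix row.
    lines:  strings with 'E' at erasure positions, one per cache line.
    """
    erasures = [{i for i, ch in enumerate(line) if ch == "E"} for line in lines]
    result = []
    for row in h_rows:
        ones = {i for i, bit in enumerate(row) if bit == "1"}
        marks = ""
        for i in range(len(row)):
            if i not in ones:
                marks += "-"                       # row does not cover d_i
            elif all(not (ones & (e - {i})) for e in erasures):
                marks += "G"                       # clean in every cache line
            else:
                marks += "B"                       # hits another erasure in some line
        result.append(marks)
    return result

h_rows = ["10001011", "01101001", "10010110"]      # H-row1..H-row3 from the slide
lines = ["-E---E--", "---E----"]                   # line1, line2 from the slide
print(classify_rows(h_rows, lines))
# → ['G---G-GG', '-GB-B--B', 'B--B-BB-']
```

Note that H-row3 is bad even for the erased bits d3 and d5 themselves: it is clean for d3 in line2 but covers the erasure d5 in line1, and the definition requires the property to hold for all lines.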

17 Necessary and Sufficient Conditions
• Tolerate ‘e’ random errors
  - “good rows” − “bad rows” ≥ 2(e + 1)
• Original code is t-error correcting
  - (Max vulnerable bits in any line) + e ≤ t
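As a worked check of the first condition, the Python sketch below computes the "good rows" minus "bad rows" margin per information bit for the three-row toy example from the "Good Rows" & "Bad Rows" slide. Two assumptions are made here for illustration: the margin is taken per information bit, and all rows are counted as enabled. The names margins and tolerates are hypothetical.

```python
def margins(marks):
    """Per-bit count of good rows minus bad rows.

    marks: one string per H-matrix row, with 'G', 'B', or '-' per bit.
    """
    width = len(marks[0])
    return [sum(row[i] == "G" for row in marks) - sum(row[i] == "B" for row in marks)
            for i in range(width)]

def tolerates(marks, e):
    """Check the slide's condition: good - bad >= 2(e + 1) for every bit."""
    return all(mg >= 2 * (e + 1) for mg in margins(marks))

# Classification of H-row1..H-row3 from the toy example.
marks = ["G---G-GG", "-GB-B--B", "B--B-BB-"]
print(margins(marks))        # → [0, 1, -1, -1, 0, -1, 0, 0]
print(tolerates(marks, 0))   # → False
```

The toy matrix is far too small to meet the condition; it only illustrates the bookkeeping. In the actual scheme the margin is evaluated over the rows kept enabled after row selection.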

18 Row Selection
• Covering problem
  - Select enough good rows for each information bit d_i
  - Until constraint is satisfied
  - NP-complete problem
  - Apply heuristics
          d0  d1  d2  d3  d4  d5  d6  d7
H-row1    G   -   -   -   G   -   G   G
H-row2    -   G   B   -   B   -   -   B
H-row3    B   -   -   B   -   B   B   -
“good rows” − “bad rows” (values as extracted from the slide): 11-100010-11-1-1-1-1-1-1

19 Covering Problem
• Solve for the cache line with maximum erasures first
• Apply the solution to all other cache lines
• If unsatisfactory, add erasures from one of the unsolved lines
• Repeat until the solution fits the entire cache

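20 Implementation
[Circuit diagram: XOR gates combining (d_p, c_p), (d_q, c_q), (d_s, c_s), each gated by an AND with a control bit (ctl), feed an Adjustable Threshold Voter together with d_i to produce the corrected d_i. The diagram also carries the label Majority Voter.]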

21 Experimental Results
Results for Word Size of 256 Bits and Bit-Error Rate of 10^-3
Cache Size   Check bits for     Check bits for     Percentage reduction
(Bytes)      conventional OLS   customized OLS     in Max. Check Bits
             Avg     Max        Avg     Max
16 KB        155     224        117     145        35.27
32 KB        166     256        125     148        42.19
64 KB        175     256        134     156        39.06
128 KB       208     256        163     177        30.86
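The last column follows directly from the two Max columns as (conventional − customized) / conventional. A short Python check of that arithmetic, using only the values reported on the slide (the variable names are illustrative):

```python
# (cache size, conventional max check bits, customized max check bits)
rows = [
    ("16 KB", 224, 145),
    ("32 KB", 256, 148),
    ("64 KB", 256, 156),
    ("128 KB", 256, 177),
]

def reduction_percent(conventional, customized):
    """Percentage reduction in maximum check bits."""
    return 100.0 * (conventional - customized) / conventional

reductions = {size: round(reduction_percent(conv, cust), 2)
              for size, conv, cust in rows}
for size, value in reductions.items():
    print(f"{size}: {value:.2f}%")
```

For a 256-bit word, m = 16, so the 2tm check-bit formula gives 32t; the conventional maxima of 224 and 256 correspond to t = 7 and t = 8.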

22 Experimental Results
Results for Constant Cache Size of 64KB
Word Size   Bit-error   Check bits for     Check bits for
(Bits)      Rate        conventional OLS   customized OLS
                        Avg     Max        Avg     Max
256         10^-3       175     256        138     156
256         10^-4       98      128        84      107
256         10^-5       66      102        64      68
484         10^-3       295     396        198     230
484         10^-4       143     176        117     139
484         10^-5       92      132        89      115

23 Experimental Results
[Figure: 64 KB cache, 484-bit word, 10^-3 bit-error rate]

24 Conclusion
• Post-manufacturing customization
  - Reduces large check-bit overhead
  - Provides requisite reliability
  - Applicable to systems with high defect rates

