Published by Melvin Bown. Modified over 10 years ago.
1
LEVERAGING ACCESS LOCALITY FOR THE EFFICIENT USE OF MULTIBIT ERROR-CORRECTING CODES IN L2 CACHE
By Hongbin Sun, Nanning Zheng, and Tong Zhang
Joseph Schneider
March 23, 2010
2
The Problem
As CMOS technology shrinks, random defects increase
Traditionally, these defects are handled with redundant rows, columns, and words that replace defective ones
As random defects increase, the traditional redundancy strategy may no longer be sufficient
4
The Solution
Extend the role of Error-Correcting Codes (ECC) to compensate for defects
ECC is also used to compensate for transient soft errors
Find a method that allows ECC to be used for both defects and soft errors
5
Multi-bit ECC
Multi-bit ECC: an ECC that can correct multiple errors in one codeword
Suffers larger latency and higher coding redundancy than single error correction
Therefore unusable in L1 cache without major performance penalties
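To make the redundancy gap concrete, the sketch below counts check bits for a 64-bit data word, the subblock size quoted later in the evaluation. It assumes an extended Hamming code for SEC-DED and a binary BCH code plus one overall parity bit for DEC-TED; the function names are illustrative and do not come from the presentation.

```python
def sec_ded_check_bits(k: int) -> int:
    """Check bits for an extended Hamming (SEC-DED) code over k data bits."""
    r = 1
    # Hamming condition: r check bits must distinguish k + r single-bit
    # error positions plus the "no error" case.
    while 2 ** r < k + r + 1:
        r += 1
    return r + 1  # one extra overall parity bit adds double error detection


def dec_ted_check_bits(k: int) -> int:
    """Check bits for a binary BCH double-error-correcting code plus parity."""
    m = 1
    # A t=2 binary BCH code over GF(2^m) has length at most 2^m - 1
    # and uses up to 2*m check bits.
    while 2 ** m - 1 < k + 2 * m:
        m += 1
    return 2 * m + 1  # one extra overall parity bit adds triple error detection


k = 64  # data bits per subblock (assumed from the evaluation slide)
sec_ded = sec_ded_check_bits(k)
dec_ted = dec_ted_check_bits(k)
print(f"SEC-DED: {sec_ded} check bits, DEC-TED: {dec_ted} check bits")
```

Under these assumptions the double-error-correcting code needs nearly twice as many check bits per subblock, which is the coding redundancy cost the slide refers to.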
6
Overall Goal
Implement multi-bit ECC in the L2 cache design to correct L2 cache defects without significant IPC degradation, area overhead, or energy cost
7
Steps to Success
1. Apply multi-bit ECC only to cache blocks that require it
2. Implement buffers to limit repeated use of multi-bit ECC
3. Ensure data integrity against soft errors where ECC alone can no longer compensate for them
8
Limited Multi-bit ECC
Cache blocks with one or more defective cells are identified during memory testing; multi-bit ECC is then applied selectively
A Content-Addressable Memory (CAM) is then used to identify blocks requiring multi-bit ECC (referred to as m-blocks)
ISSUE: A CAM consumes a large amount of energy
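The selective policy can be sketched as a classification over the defect map produced by memory testing. This is a minimal illustration, assuming the thresholds stated elsewhere in the deck (a single defect is left to SEC-DED, two defects use the DEC-TED multi-bit code, more than two are repaired by redundancy). The `defect_map` structure and all names are hypothetical.

```python
from enum import Enum


class Protection(Enum):
    SEC_DED = "SEC-DED only"
    MULTI_BIT_ECC = "multi-bit ECC (m-block)"
    REDUNDANCY = "repaired by redundancy"


def classify(defect_count: int, max_correctable: int = 2) -> Protection:
    """Pick a protection scheme from the number of defective cells found."""
    if defect_count < 0:
        raise ValueError("defect count cannot be negative")
    if defect_count <= 1:
        return Protection.SEC_DED          # baseline code already covers this
    if defect_count <= max_correctable:
        return Protection.MULTI_BIT_ECC    # needs the stronger code
    return Protection.REDUNDANCY           # beyond what the code corrects


def find_m_blocks(defect_map: dict[int, int]) -> set[int]:
    """Addresses that must be registered for multi-bit ECC."""
    return {
        address
        for address, count in defect_map.items()
        if classify(count) is Protection.MULTI_BIT_ECC
    }


# Hypothetical memory-test output: address -> number of defective cells.
defect_map = {0x00: 0, 0x40: 1, 0x80: 2, 0xC0: 3}
m_blocks = find_m_blocks(defect_map)
print(sorted(hex(a) for a in m_blocks))
```

Only the addresses returned by `find_m_blocks` would need an entry in the lookup structure, which keeps the energy-hungry associative search small.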
9
Proposed Architecture
Standard L2 cache core protecting all subblocks with single error correction, double error detection (SEC-DED) codes
Multi-bit ECC core using a fully associative multi-bit ECC cache (M-ECC cache), an ECC encoder/decoder, and two buffers
The M-ECC cache contains location tags and the corresponding check bits
Dirty Replication Cache to ensure soft error tolerance
10
Proposed Architecture
11
Multi-bit ECC Core
On a write, subblock data is encoded and the check bits are stored
On a read, the check bits are fetched and decoded
ISSUE: Constant use of multi-bit ECC increases latency and energy consumption at higher defect densities
Solution: Two additional buffers
12
Multi-bit ECC Core Buffers
Pre-decoding Buffer: small cache that keeps copies of the most recently accessed m-blocks; searched before accessing the M-ECC cache
Employs a least recently used (LRU) replacement policy when full; effective because of the temporal locality of cache accesses
Avoids a large amount of ECC decoding and some M-ECC cache accesses
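A pre-decoding buffer with LRU replacement can be sketched in a few lines. This is an illustrative software model, not the hardware design from the paper; the class name, method names, and capacity are assumptions.

```python
from collections import OrderedDict


class PreDecodingBuffer:
    """Small LRU cache of already-decoded m-block data, keyed by address."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._entries = OrderedDict()  # oldest entry first, newest last

    def lookup(self, address):
        """Return decoded data on a hit, or None on a miss."""
        if address not in self._entries:
            return None
        self._entries.move_to_end(address)  # mark as most recently used
        return self._entries[address]

    def insert(self, address, decoded_data):
        """Store freshly decoded data, evicting the LRU entry when full."""
        if address in self._entries:
            self._entries.move_to_end(address)
        self._entries[address] = decoded_data
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop least recently used


buf = PreDecodingBuffer(capacity=2)
buf.insert(0x10, "data-A")
buf.insert(0x20, "data-B")
buf.lookup(0x10)            # touching 0x10 makes 0x20 the LRU entry
buf.insert(0x30, "data-C")  # evicts 0x20
print(buf.lookup(0x20), buf.lookup(0x10), buf.lookup(0x30))
```

A hit returns data that has already been corrected, so the long decoder latency is paid only on the first access to an m-block within its reuse window.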
13
Multi-bit ECC Core Buffers
FLU buffer: small CAM that keeps addresses of recently accessed cache blocks that are NOT m-blocks
Also employs an LRU policy
Further reduces M-ECC cache accesses
14
M-ECC core Flow Chart
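Since the flow chart image is not reproduced here, the read path can be approximated from the surrounding slides. The sketch below is an assumption-laden model: it checks the pre-decoding buffer first, then the FLU buffer, and only then searches the M-ECC cache. The lookup order, class names, and counters are illustrative, and the real hardware may probe the buffers in parallel.

```python
from collections import OrderedDict


class LRUSet:
    """Fixed-size set of addresses with least-recently-used eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._keys = OrderedDict()

    def hit(self, key) -> bool:
        if key in self._keys:
            self._keys.move_to_end(key)  # refresh recency on a hit
            return True
        return False

    def add(self, key) -> None:
        self._keys[key] = True
        self._keys.move_to_end(key)
        if len(self._keys) > self.capacity:
            self._keys.popitem(last=False)


class MECCCoreModel:
    """Counts M-ECC cache searches and decodes for a stream of L2 reads."""

    def __init__(self, m_blocks, pbuf_size: int, flu_size: int):
        self.m_blocks = set(m_blocks)  # stands in for the M-ECC cache tags
        self.pbuf = LRUSet(pbuf_size)  # recently decoded m-blocks
        self.flu = LRUSet(flu_size)    # recently seen non-m-blocks
        self.mecc_lookups = 0
        self.decodes = 0

    def read(self, address) -> str:
        if self.pbuf.hit(address):
            return "pre-decoding buffer hit"
        if self.flu.hit(address):
            return "FLU hit: not an m-block"
        self.mecc_lookups += 1         # fall back to the M-ECC cache search
        if address in self.m_blocks:
            self.decodes += 1          # slow multi-bit decode
            self.pbuf.add(address)
            return "decoded"
        self.flu.add(address)
        return "not an m-block"


core = MECCCoreModel(m_blocks={0x10}, pbuf_size=2, flu_size=2)
results = [core.read(a) for a in (0x10, 0x20, 0x10, 0x20, 0x30)]
print(results, core.mecc_lookups, core.decodes)
```

In this toy trace, the repeated accesses to `0x10` and `0x20` are absorbed by the two buffers, so only the first touch of each address reaches the M-ECC cache.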
15
Soft Error Tolerance
ISSUE: When ECC is devoted to defect tolerance, a defective subblock is vulnerable to soft errors
Extra protection is only necessary for blocks containing defects (including blocks with single defects protected by SEC-DED rather than multi-bit ECC)
Further, it is only necessary when the cache block is dirty; clean blocks can be re-fetched from memory when a soft error is detected
16
Dirty Replication Cache
Use of a Dirty Replication (DR) cache
When a cache block is made dirty, its data is also kept in this cache
When data leaves this cache, a write is performed to main memory
Ensures a backup is always available
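The backup invariant described above can be sketched as follows. This is a simplified model: main memory is a plain dictionary, and LRU eviction is an assumption, since the slide does not state the replacement policy. All names are hypothetical.

```python
from collections import OrderedDict


class DirtyReplicationCache:
    """Keeps a second copy of dirty data until it is written to memory."""

    def __init__(self, capacity: int, main_memory: dict):
        self.capacity = capacity
        self.main_memory = main_memory
        self._copies = OrderedDict()  # oldest replica first

    def on_block_dirtied(self, address, data) -> None:
        """Called whenever an L2 block becomes dirty or is re-written."""
        if address in self._copies:
            self._copies.move_to_end(address)
        self._copies[address] = data
        if len(self._copies) > self.capacity:
            # A replica leaving the DR cache is written to main memory,
            # so a clean copy still exists somewhere.
            old_address, old_data = self._copies.popitem(last=False)
            self.main_memory[old_address] = old_data

    def backup_for(self, address):
        """Recover data after an uncorrectable soft error in L2."""
        if address in self._copies:
            return self._copies[address]
        return self.main_memory.get(address)


memory = {}
dr = DirtyReplicationCache(capacity=1, main_memory=memory)
dr.on_block_dirtied(0xA0, "v1")
dr.on_block_dirtied(0xB0, "v2")  # pushes 0xA0 out and writes it to memory
print(memory, dr.backup_for(0xA0), dr.backup_for(0xB0))
```

Whichever path a dirty block takes, either the DR cache or main memory holds its latest value, which is what allows the in-L2 ECC to be spent on defects.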
18
Evaluation
Cache defect density set at 0.5%
Multi-bit ECC: BCH-based DEC-TED code (double error correction, triple error detection); subblocks with more than two errors are repaired by redundancy
Cache subblocks contain 64 bits
BCH DEC-TED decoder has a parallelism of 2 and uses the PGZ (Peterson-Gorenstein-Zierler) decoding algorithm, resulting in a latency of 82 cycles
CACTI 5 used to model caches; a Verilog implementation showed the extra logic is 0.2% of the area of the L2 cache core
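To get a feel for how many subblocks fall into each protection class at this defect density, the sketch below evaluates a binomial model. It assumes the 0.5% figure is an independent per-cell defect probability and counts only the 64 data bits of a subblock, ignoring check bits. Both are simplifications that the slides do not confirm.

```python
from math import comb


def defect_count_probability(n_cells: int, p_defect: float, k: int) -> float:
    """Binomial probability of exactly k defective cells out of n_cells."""
    return comb(n_cells, k) * p_defect ** k * (1 - p_defect) ** (n_cells - k)


n_cells = 64      # data bits per subblock (from the slide)
p_defect = 0.005  # assumed per-cell defect probability

p0 = defect_count_probability(n_cells, p_defect, 0)  # defect free
p1 = defect_count_probability(n_cells, p_defect, 1)  # SEC-DED is enough
p2 = defect_count_probability(n_cells, p_defect, 2)  # needs DEC-TED
p_more = 1.0 - (p0 + p1 + p2)                        # left to redundancy

for label, p in (("0", p0), ("1", p1), ("2", p2), (">2", p_more)):
    print(f"{label:>2} defects: {p:.4f}")
```

Under this model most subblocks are defect free and only a small fraction need the multi-bit code, which is consistent with applying it selectively.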
19
Evaluation
Four configurations compared:
Base: defect-free L2 cache with no defect-tolerance functions
M-ECC only: no buffers
M-ECC-pbuf: uses the pre-decoding buffer
M-ECC-pfbuf: uses the pre-decoding and FLU buffers
First, determine the best buffer sizes; then compare IPC and power consumption
20
Size of Pre-decoding Buffer
21
Size of FLU buffer
22
Normalized IPC Comparison
23
Normalized Power Consumption
24
Results
Similar IPC performance
M-ECC core power consumption is 30% of that of the L2 cache core, which itself is about 10% of the entire system cache
25
DR Write-back Hit Rates
L2 cache fixed at 1 MB, 8-way associative; DR cache size varies
26
DR Write-back Hit Rates
DR cache fully associative with 64 blocks; size of the L2 cache (1 MB baseline) varies
27
Conclusions
Goal was to use multi-bit ECC effectively for L2 cache defect tolerance at minimal performance and implementation cost
Multi-bit ECC is applied only where more than one defect is found
Two small buffers are included to reduce the performance impact of multi-bit ECC
Dirty Replication Cache is included to ensure soft error tolerance
28
Conclusions
IPC performance is nearly the same as that of a defect-free cache
M-ECC cache has less than 2.5% area overhead and 36% energy consumption overhead
Dirty Replication Cache has an area overhead of only 0.3% while storing 96.4% of write-back data from the L1 cache