Published by Isaac McDowell. Modified over 6 years ago.
1
TRAMS PROJECT PTC meeting June 23rd 2011 WP3 progress
FP
2
TASK 3.1 TASK 3.2 TASK 3.3
3
MECHANISMS TO IMPROVE RELIABILITY FOR A YIELD OF 90% AND DIFFERENT CELL PPF
YIELD = 90%, PPF = (10^-3 … 1)
NO REDUNDANCY (0%)
DYNAMIC REDUNDANCY: RECONFIGURATION (0-100%)
HARDWARE REDUNDANCY: RMR (200% …)
INFORMATION REDUNDANCY: ECC SEC-DED (37.5%)
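The percentages in parentheses on this slide read as redundancy overheads. A minimal sketch of where two of them could come from, assuming (this is not stated on the slide) that SEC-DED means an extended Hamming code and that RMR overhead counts only the extra module copies, ignoring the voter. Under those assumptions 37.5% matches a 16-bit data word and 200% matches R = 3. The function names are illustrative.

```python
def sec_ded_check_bits(data_bits: int) -> int:
    """Check bits of an extended Hamming SEC-DED code for `data_bits` data bits."""
    r = 1
    # Hamming bound for single-error correction: 2^r >= data_bits + r + 1
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1  # one extra overall-parity bit adds double-error detection


def sec_ded_overhead(data_bits: int) -> float:
    """Check bits as a fraction of data bits."""
    return sec_ded_check_bits(data_bits) / data_bits


def rmr_overhead(r: int) -> float:
    """Extra copies as a fraction of one module (voter cost ignored, an assumption)."""
    return float(r - 1)


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        print(n, sec_ded_check_bits(n), f"{100 * sec_ded_overhead(n):.1f}%")
    print("R=3:", f"{100 * rmr_overhead(3):.0f}%")
```

The overhead of the code falls as the word gets wider, whereas the RMR overhead grows with R, which is consistent with the "200% …" notation on the slide.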
4
MECHANISMS TO IMPROVE RELIABILITY FOR A YIELD OF 90% AND DIFFERENT CELL PPF
YIELD = 90%, PPF = (10^-3 … 1)
NO REDUNDANCY (0%)
DYNAMIC REDUNDANCY: RECONFIGURATION (0-100%)
HARDWARE REDUNDANCY: RMR (200% …)
INFORMATION REDUNDANCY: ECC SEC-DED (37.5%)
Deliverable labels shown on the figure: D3.1, D3.2, D3.4, D3.2
5
Minimum size (D3.6)
MECHANISMS TO IMPROVE RELIABILITY FOR A YIELD OF 90% AND DIFFERENT CELL PPF
YIELD = 90%, PPF = (10^-3 … 1)
NO REDUNDANCY (0%)
DYNAMIC REDUNDANCY: RECONFIGURATION (0-100%)
HARDWARE REDUNDANCY: RMR (200% …)
INFORMATION REDUNDANCY: ECC SEC-DED (37.5%)
Technology nodes marked on the figure: 45nm, 32nm, 22nm, 18,16nm, 13nm
6
Optimized size (D3.6)
MECHANISMS TO IMPROVE RELIABILITY FOR A YIELD OF 90% AND DIFFERENT CELL PPF
YIELD = 90%, PPF = (10^-3 … 1)
NO REDUNDANCY (0%)
DYNAMIC REDUNDANCY: RECONFIGURATION (0-100%)
HARDWARE REDUNDANCY: RMR (200% …)
INFORMATION REDUNDANCY: ECC SEC-DED (37.5%)
Technology nodes marked on the figure: 45nm, 32nm, 22nm, 18,16nm, 13nm
7
Task 3.1 Mitigation mechanisms
D3.1 Report on mitigation mechanisms at layout and circuit level
D3.2 Report on new architectures based on redundancy
8
D3.1 Report on mitigation mechanisms at layout and circuit level
1. V and R mitigating mechanisms at layout level: regularity, proximity effect, dimensions roughness.
2. Mechanisms to mitigate the effect of degradation mechanisms (mainly NBTI and PBTI): temperature control, voltage scaling, relaxation phases, strained channels, starting burning (Esteve Amat).
3. Robustness enhancement techniques considering voltage biasing and device sizing.
TRANSISTOR LEVEL: width and length, same aspect ratio, same area, threshold voltage.
CIRCUIT LEVEL: VDD and VSS, full boost, partial boost VBL.
ARCHITECTURAL LEVEL: word dimension, column dimension.
4. Performance variability mitigation mechanisms considering device sizing.
All the work will concentrate on 6T SRAM cells as the main vehicle, and on the Si-bulk technologies investigated in D3.6. Other cells (like 3T1D) and other technologies (like FinFET) could potentially be included.
9
D3.2 Report on new architectures based on redundancy
Fault-Tolerant Architectures (uncertainties: permanent and transient faults)
Redundancy (1956 Von Neumann)
Static Redundancy (built into the system, masks the fault effects)
Dynamic Redundancy (fault detection, location, containment, recovery)
Redundancy types: Space (Hardware), Time, Information (T3.1, D3.2)
Reconfiguration (T3.2, D3.4)
10
Well-known HR-techniques
R-Fold Modular Redundancy (RMR)
Cascaded R-Fold Modular Redundancy (CRMR)
R-Fold Interwoven Redundancy (RIR)
Multiplexing Techniques
Voters
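As an illustration of the first technique in this list, the following is a minimal sketch of R-fold modular redundancy with a majority voter. It assumes independent module failures and a fault-free voter, neither of which is stated on the slide; the function names are hypothetical.

```python
from collections import Counter
from math import comb


def majority_vote(replica_outputs):
    """Return the value produced by a strict majority of the replicas."""
    value, votes = Counter(replica_outputs).most_common(1)[0]
    if 2 * votes <= len(replica_outputs):
        raise ValueError("no strict majority among replicas")
    return value


def rmr_success_probability(r: int, module_failure_prob: float) -> float:
    """Probability that more than half of r independent modules are correct.

    Assumes a perfect voter and independent failures (illustrative assumptions).
    """
    p = module_failure_prob
    q = 1.0 - p
    needed = r // 2 + 1
    return sum(comb(r, k) * q ** k * p ** (r - k) for k in range(needed, r + 1))


if __name__ == "__main__":
    print(majority_vote([1, 1, 0]))            # one faulty replica is masked
    print(rmr_success_probability(3, 0.1))     # triple modular redundancy
```

With a 10% module failure probability, three replicas already give a higher success probability than a single module, at the cost of the extra copies.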
11
Averaging Design Averaging and Adaptable Thresholding Decision Gates
Inspired by Biological Systems
Numerous cells, autonomous, significant variability, sensitive to external factors, faulty elements
Evolvable in time and space
Redundancy and plasticity for learning and adaptation
Robustness to overcome deficient components and transmission lines
Analog computation
Averaging and thresholding
Perceptron as an Artificial Neural Network (ANN)
12
Averaging Design Adaptive Averaging Cell (AD-AVG)
Adaptation according to optimum weights
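The averaging-and-thresholding idea can be sketched as a weighted average of redundant analog outputs followed by a threshold decision. The slides do not give the AD-AVG weight rule, so the sketch below assumes inverse-variance weights, which are the minimum-variance choice for independent unbiased inputs; the threshold of 0.5 (half of a normalized supply) and all names are also assumptions.

```python
def inverse_variance_weights(variances):
    """Normalized weights proportional to 1/variance (assumed weight rule)."""
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    return [x / total for x in inverse]


def averaging_gate(replica_voltages, weights=None, threshold=0.5):
    """Average the replica outputs and threshold the result to a logic value."""
    if weights is None:
        weights = [1.0] * len(replica_voltages)
    average = sum(w * v for w, v in zip(weights, replica_voltages)) / sum(weights)
    return (1 if average > threshold else 0), average


if __name__ == "__main__":
    # Third replica is very noisy, so it receives a small weight.
    weights = inverse_variance_weights([0.01, 0.01, 1.0])
    print(averaging_gate([0.9, 0.8, 0.1], weights))
```

Down-weighting the unreliable replica keeps the average close to the two consistent inputs, which is the intuition behind adapting the weights.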
13
INFORMATION REDUNDANCY (CODES) (D3.2 too)
Modified Berger adaptive codes for highly efficient K-error correction.
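For background, here is a minimal sketch of the classical Berger code only. The modified adaptive variant with K-error correction named on the slide is not described in this deck and is not reproduced here. The classical code appends the binary count of zeros in the information bits and detects unidirectional errors; the function names are illustrative.

```python
def berger_check(info_bits: str) -> str:
    """Binary count of zeros in `info_bits`, padded to a fixed width."""
    k = len(info_bits)
    width = k.bit_length()  # equals ceil(log2(k + 1)) for k >= 1
    return format(info_bits.count("0"), f"0{width}b")


def berger_encode(info_bits: str) -> str:
    """Append the check symbol to the information bits."""
    return info_bits + berger_check(info_bits)


def berger_is_valid(codeword: str, k: int) -> bool:
    """True if the check part matches the zero count of the first k bits."""
    return berger_check(codeword[:k]) == codeword[k:]


if __name__ == "__main__":
    word = berger_encode("10110010")
    print(word, berger_is_valid(word, 8))
    corrupted = "0" + word[1:]  # a 1 -> 0 error in the information part
    print(corrupted, berger_is_valid(corrupted, 8))
```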
14
D3.3 (M24) Mechanisms to detect latency
15
Idea Behind the Concept
In essence, a system that can discretize chips based on static variations and sense dynamic variations over time.
Prime Requirements
Use gradient sensing rather than absolute sensing.
Minimum area overhead.
Should be hidden from program execution (shadowed).
Provide a platform for cross-layer optimizations.
16
Characteristics of the 3T1D Cells
We place the 3T1D next to a 6T on the same wordline to measure the access time and leakage. The transistors of the 3T1D are sized so that its access time is similar to that of the 6T, which avoids synchronization and control overhead. The access time variation with temperature is similar to that of the 6T, but at high temperatures it performs better. The retention time (a function of leakage) can vary by as much as 7X across chips maintained at the same temperature, because leakage depends strongly on physical parameters such as channel length and threshold voltage.
17
Embedded 3T1D (Shadow Cell)
Due to their close proximity, it is safe to assume that the 3T1D and the 6T suffer the same amount of intra-die and systematic variation. As read access lies on the critical path, measuring it gives a rough estimate of the performance of a given block. As retention time is a strong function of leakage, approximating the retention time makes it possible to estimate the leakage (the dominant source of power).
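The two measurements described above map naturally onto a two-axis classification: access time as a proxy for performance and retention time as a proxy for leakage power. A minimal sketch of such a binning follows; the bin names, the threshold values and the units are hypothetical and do not come from the slides.

```python
def classify_chip(access_time_ps, retention_time_us,
                  fast_ps=100.0, slow_ps=130.0,
                  long_us=10.0, short_us=4.0):
    """Place a chip in a (performance, power) bin from shadow-cell readings.

    All thresholds are illustrative assumptions, not measured values.
    """
    if access_time_ps <= fast_ps:
        performance = "high-performance"
    elif access_time_ps <= slow_ps:
        performance = "mid-performance"
    else:
        performance = "low-performance"

    # Longer retention implies lower leakage, hence lower power.
    if retention_time_us >= long_us:
        power = "low-power"
    elif retention_time_us >= short_us:
        power = "mid-power"
    else:
        power = "max-power"
    return performance, power


if __name__ == "__main__":
    print(classify_chip(90.0, 12.0))
    print(classify_chip(140.0, 2.0))
```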
18
Power/Performance Binning - Result
We tested the proposal on a 32KB cache built with shadow cells and complete memory periphery, using the 45nm PTM. Nearly 40% of chips fall into the high-performance low-power bin, indicating a good yield. Note that the max-power low-performance bin has no chips. This is characteristic of our scheme: because we use a grouped-bin scheme, the bounds of every bin are very loose, so many chips that would fall in that bin are distributed into adjacent bins along the Cartesian coordinates.
19
Conclusions and Future Work
Reliability is restored by making circuits aware of their composition.
Power/performance is improved by providing fine-grain guardbands.
Variation-tolerant 3T1D cells can be used for classification based on power/performance.
The existing scheme can track both high- and low-frequency variations.
Ongoing work:
Understand the impact of random variations and see if there is scope for classification within a memory array.
Determine the total number of cells required for monitoring the entire cache structure.
Extend the scheme as a standalone mechanism for logic.
Provide this information to the layers above (microarchitecture or OS) for cross-layer optimizations.
20
D3.4 (M24) Compensating/reconfiguration mechanism to reduce variability and improve reliability
21
Directions
Resiliency in high-variability scenarios through run-time sensing and dynamic fine-grain body bias to leverage recoverable errors. Use bias to increase speed or reduce power.
Decode-directed fine-grain tuning. Activate each block with the optimal bias/error-detection combination.
Proactive reconfiguration.
22
Expected Outcomes
Lifetime adaptability: Through the use of sensors and feedback from error-correcting codes, the PRMU detects any reduction/increase in the power, performance or error rate of each block. The architecture is enabled to dynamically adapt the circuits to meet the performance and power goals set.
PVT impact reduction and yield enhancement: Due to the dynamic self-tuning adaptability of the architecture.
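The feedback loop described above can be sketched as a per-block controller that moves the body bias in response to the observed error rate. This is only an illustration of the idea: the step size, the bias limits, the target error rate and the function name are assumptions, and the actual PRMU policy is not given in the slides. Positive values denote forward body bias (faster, more leakage) and negative values reverse body bias (slower, less leakage).

```python
def tune_block_bias(bias_mv, error_rate, target_error_rate,
                    step_mv=50, min_mv=-500, max_mv=500):
    """Return the next body-bias setting for one block (hypothetical policy)."""
    if error_rate > target_error_rate:
        bias_mv += step_mv   # too many errors: speed the block up
    else:
        bias_mv -= step_mv   # margin available: back off to save power
    return max(min_mv, min(max_mv, bias_mv))


if __name__ == "__main__":
    bias = 0
    for observed in (0.02, 0.02, 0.005, 0.0):
        bias = tune_block_bias(bias, observed, target_error_rate=0.01)
        print(observed, bias)
```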
23
Potential: DFGBB at the L1 cache level
Experimental framework
32KB cache under process variations at 45nm (PTM), 1KB blocks.
σ for systematic and random variation is 6.4% for Vth and 3.7% for Leff. Inter-die variations are 3%. Vdd = 1V.
500 samples simulated with HSPICE.
24
Potential: DFGBB at the L1 cache level
[Figure: effect on yield of FGBB]
[Figure: FBB voltages and corresponding yield]
25
Conclusions
Extensive use of forward and reverse bias provides inherent resiliency to fabrication and run-time variations. In addition, it offers an “on/off” functionality and extraordinary power savings.
Other reconfiguration techniques are under evaluation.
26
New proactive reconfiguration mechanisms to improve reliability
and extend lifetime.
27
Proactive Recovery/Reconfiguration
Motivation: BTI mechanisms are recoverable.
The redundancy is used to allow non-faulty microarchitecture units to be temporarily deactivated and activated on a rotating basis.
Each unit transitions between two modes: active mode and recovery mode.
Advantages over reactive reconfiguration, especially for recoverable mechanisms like NBTI/PBTI:
Delays the time at which failure happens.
Balances the lifetime among all units.
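The rotation between active and recovery mode can be sketched as a round-robin schedule. This is a minimal sketch that assumes equal-length epochs and a fixed number of units resting per epoch; the slides do not specify the actual policy, and the function name is hypothetical.

```python
def rotation_schedule(num_units: int, num_resting: int, epochs: int):
    """Return, per epoch, (active units, recovering units) in round-robin order."""
    schedule = []
    for epoch in range(epochs):
        start = (epoch * num_resting) % num_units
        recovering = {(start + i) % num_units for i in range(num_resting)}
        active = [u for u in range(num_units) if u not in recovering]
        schedule.append((active, sorted(recovering)))
    return schedule


if __name__ == "__main__":
    # Five units, one spare: every unit rests once every five epochs.
    for active, recovering in rotation_schedule(5, 1, 5):
        print("active:", active, "recovering:", recovering)
```

Because every unit spends the same share of time in recovery, the stress is spread evenly, which is the balancing effect named in the advantages.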
28
Recovery Methods
Recovery modes:
Natural recovery
Power off
Strong recovery
Single pair: 2 transistors
Double pair: 4 transistors
4PR worst / 4PR best
4-transistor recovery circuit
29
Preliminary conclusions about proactive reconfiguration:
In a memory system with spare parts and a reconfiguration mechanism, continuous proactive reconfiguration allows an increase of lifetime.
Drawback: overhead.
Potential advantage: 7X lifetime extension (for a typical case).
30
WP3, Task 3.3: Design flow for timing monitor insertion for runtime monitoring of an ASIC during synthesis.
[Task Leader: IMEC (13 pms). Duration: M13-M24.]
Design flow for the automatic insertion of timing monitors in RTL descriptions for near-failure timing violation detection.
Input: RTL-level system description of the ASIC.
Output: Synthesized description of the block including monitoring circuitry.
D3.5: Report on the method that instantiates the monitor insertion in ASIC descriptions. (IMEC) T0+24
31
Two main types of low-cost timing monitors (Reactive)
Copyright IMEC
[Figure: reactive timing monitor. Labels: Logic, DFF, DFF, !=, clock.]
Notes: papers on reactive and proactive monitors (Razor, Crystal ball, etc.)
T. Austin, D. Blaauw, T. Mudge, K. Flautner, “Making typical silicon matter with Razor”, IEEE Computer Society, Vol. 37, Iss. 3, pp. 57-65, March 2004.
32
Two main types of low-cost timing monitors (Proactive)
Copyright IMEC
[Figure: proactive timing monitor. Labels: Logic, DFF, DFF, Sh, !=, clock.]
Notes: papers on reactive and proactive monitors (Razor, Crystal ball, etc.)
M. Eireiner, et al., “In-Situ Delay Characterization and Local Supply Voltage Adjustment for Compensation of Local Parametric Variations”, IEEE Journal of Solid-State Circuits, Vol. 42, No. 7, July 2007.
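The difference between the reactive monitor of the previous slide and the proactive monitor here can be sketched with a simple behavioral timing model. It assumes that the reactive monitor compares the main flip-flop with a shadow element that samples later (Razor style), and that the proactive monitor compares it with a shadow element that sees the data delayed by a guard window (pre-error detection). Times are in arbitrary integer units and all names are illustrative, not taken from the cited papers.

```python
def sample_with_monitor(old_value, new_value, arrival_time,
                        clock_period, window, proactive):
    """Return (value captured by the main flip-flop, monitor flag)."""
    # The main flip-flop captures the new value only if it arrives in time.
    main = new_value if arrival_time <= clock_period else old_value
    if proactive:
        # Shadow sees the data delayed by `window`, sampled at the same edge.
        shadow = new_value if arrival_time + window <= clock_period else old_value
    else:
        # Shadow samples the same data `window` after the main clock edge.
        shadow = new_value if arrival_time <= clock_period + window else old_value
    return main, main != shadow


if __name__ == "__main__":
    # Reactive: data is late, the main value is wrong and the flag reports it.
    print(sample_with_monitor(0, 1, 1050, 1000, 200, proactive=False))
    # Proactive: data is nearly late, the main value is still right.
    print(sample_with_monitor(0, 1, 900, 1000, 200, proactive=True))
```

In this model the reactive flag is raised after a wrong value has been captured and needs recovery, while the proactive flag is a warning raised while the captured value is still correct.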
33
RT level monitor insertion flow
Single Step RTL2RTL
[Figure: flow diagram. Labels: SoC Design in RTL; Monitor (knob) circuit in RTL; Elaboration script; Selected Statistical Critical Paths; Tool; Properties; Verify After Monitor insertion; RTL-Netlist; Synthesis and Place & Route.]
Automated insert, connect, & route
Single step before synthesis
Builds on top of standard existing tools
Non-invasive: no change in design interface
Transparent to designer
Re-use the original testbenches
Automated extended testing
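One step of such a flow is choosing which path endpoints receive a monitor. The slides only say that statistical critical paths are selected, so the following is a minimal sketch under the assumption that selection is by timing slack with a cap on the number of monitors; the endpoint names, slack values and function name are invented for illustration.

```python
def select_monitor_endpoints(endpoint_slack_ps, slack_threshold_ps, max_monitors):
    """Pick the endpoints with the smallest slack, up to `max_monitors`."""
    critical = sorted(
        (slack, endpoint)
        for endpoint, slack in endpoint_slack_ps.items()
        if slack <= slack_threshold_ps
    )
    return [endpoint for _, endpoint in critical[:max_monitors]]


if __name__ == "__main__":
    slacks = {"alu/out_reg[3]": 12, "ctrl/state_reg[0]": 85,
              "mul/acc_reg[7]": 5, "fifo/wr_ptr_reg[2]": 30}
    print(select_monitor_endpoints(slacks, slack_threshold_ps=40, max_monitors=2))
```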
34
Next steps
Implement the flow.
Apply it to an existing RTL ASIC design (imec internal).
Benchmark the outcome:
Area overhead
Power overhead
Design time overhead