Presentation Title Greg Snider QSR, Hewlett-Packard Laboratories


Nano Architectures II. Greg Snider, QSR, Hewlett-Packard Laboratories

Today’s talk: Living in an imperfect world. Quick recap of Wednesday’s talk; transient faults: history (von Neumann), approaches (coding theory); static defects: background (Teramac), empirical studies; nano / micro interface; DEMO: 4-bit nanoprocessor. November 8, 2018

Configurable Tile

Tile Types

Mosaics

1. n-FET / resistor logic (figure: inputs A, B, C; output AB + C; rails GND and V+)

2. p-FET / resistor logic (figure: inputs A, B, C; output AB + C; rails V+ and GND)

3. n-FET / p-FET logic (figure: rails V+ and ground)

One of the first papers on nanoelectronics: “Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components,” J. von Neumann, 1955!

von Neumann’s Worry (figure: a circuit and its parts). Static defects: initially bad, wear out. Dynamic faults: transient, intermittent.

The Solution (in order of desirability): 1. Fewer parts. 2. Better parts. 3. Redundancy.

Triple Modular Redundancy (von Neumann): three copies of f(x, y) each receive x and y, and a majority vote over the three results produces z. Voter assumed reliable! The voter is small; the redundancy is coarse-grained.
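
A minimal sketch of the triple modular redundancy slide in C. The protected function f (here x AND y) and the fault_mask parameter used to inject faults are illustrative assumptions, not part of the talk; the voter is modeled as ideal, matching the "voter assumed reliable" caveat.

```c
#include <stdbool.h>

/* Example logic function to protect (hypothetical): f(x, y) = x AND y. */
static bool f(bool x, bool y) { return x && y; }

/* Majority vote: true when at least two of the three inputs are true. */
static bool majority3(bool a, bool b, bool c) {
    return (a && b) || (a && c) || (b && c);
}

/* Triple modular redundancy with an ideal voter.
   Bit i of fault_mask set means replica i returns an inverted result. */
bool tmr(bool x, bool y, unsigned fault_mask) {
    bool r[3];
    for (unsigned i = 0; i < 3; i++) {
        bool faulty = (fault_mask >> i) & 1u;
        r[i] = f(x, y) != faulty;   /* XOR: a faulty replica flips its output */
    }
    return majority3(r[0], r[1], r[2]);
}
```

With any single replica faulty the vote still returns f(x, y); two simultaneous faults outvote the good replica.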

What if voters are flaky? Probabilistic approach: each logic signal → “fuzzy” value (0…1). On the scale 0.0 to 1.0, values near 0.0 are mostly false, values near 1.0 are mostly true, and values around 0.5 are a failure!

Parallel Restitution (von Neumann): 1. Replace each wire with a “bundle” (x becomes x1…x4, y becomes y1…y4, z becomes z1…z4). 2. Replace the function f with a redundant version, F(x, y): replicated copies of f(x, y), a random permutation of the signals, and a majority vote for each output z1…z4. Each signal becomes a bundle of N signals. Voters can be flaky! => fine-grained.

Parallel Restitution, localized: replace each wire with a bundle of N wires. Replicate gates, scramble inputs, add voters.

Parallel Restitution: How does it work? Bundle = stochastic variable (0.0 to 1.0); its value is the fraction of wires in the HI state. The scale is divided into false, failure, and true regions (scale marks: 0.0, .7, .93, 1.0). Majority gates act as “stochastic amplifiers,” reducing entropy of computation. (figure: example bundle values .98, .03, .02, .01, .94, .95, .99 alongside ideal values of 1)
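
The "stochastic amplifier" behavior can be sketched numerically. The model below is an assumption made for illustration: each majority gate reads three wires drawn independently from the bundle (the role of the random permutation), and eps is a hypothetical probability that a gate outputs the wrong value.

```c
/* Fraction of HI wires after one majority-of-3 restoring stage.
   p   : fraction of HI wires in the input bundle (0.0 to 1.0)
   eps : assumed probability that a majority gate outputs the wrong value */
double restore(double p, double eps) {
    /* Probability that at least two of three independent inputs are HI. */
    double ideal = 3.0 * p * p - 2.0 * p * p * p;
    /* A gate error flips the ideal output with probability eps. */
    return eps + (1.0 - 2.0 * eps) * ideal;
}

/* Apply the restoring stage several times in sequence. */
double restore_n(double p, double eps, int stages) {
    for (int i = 0; i < stages; i++)
        p = restore(p, eps);
    return p;
}
```

Under this model a bundle above 0.5 is pushed toward 1 and a bundle below 0.5 toward 0, while a bundle at exactly 0.5 carries no information to amplify.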

Parallel Restitution. Practical?

Wires in bundle    Probability of failure
1                  5.0 x 10^-3
1,000              2.7 x 10^-2
2,000              2.6 x 10^-3
5,000              4.0 x 10^-6
10,000             1.6 x 10^-10
25,000             1.2 x 10^-23
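
The multi-wire rows of the table appear consistent with von Neumann's estimate of the failure probability of a bundle of N wires when each gate fails with probability 0.005 (the single-wire row). Reading the table this way is an assumption, since the slide does not say how the values were obtained; the formula reproduces most rows closely, for example the N = 10,000 row:

```latex
\rho(N) \approx \frac{6.4}{\sqrt{N}} \, 10^{-8.6\,N/10^{4}},
\qquad
\rho(10{,}000) \approx \frac{6.4}{100} \times 10^{-8.6} \approx 1.6 \times 10^{-10}
```

The exponential term dominates only for large N, which is why a 1,000-wire bundle is still less reliable than a single wire.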

…so now what? Coding theory to the rescue… Error detecting codes. Error correcting codes.

Error Detecting Codes: with a check bit, the sent word 1 1 0 0 1 0 1 has an even number of 1’s. The noisy channel introduces a bit error and delivers 1 0 0 0 1 0 1, which has an odd number of 1’s: ERROR!
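
A minimal even-parity sketch in C. Placing the check bit in the least significant position is an assumption; the slide does not say which of the seven bits is the check bit. The function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the word holds an odd number of 1 bits. */
static bool odd_ones(uint8_t word) {
    bool odd = false;
    while (word) {
        odd = !odd;
        word &= (uint8_t)(word - 1);   /* clear the lowest set bit */
    }
    return odd;
}

/* Append an even-parity check bit below the data (data must fit in 7 bits). */
uint8_t encode_even_parity(uint8_t data) {
    return (uint8_t)((data << 1) | (odd_ones(data) ? 1u : 0u));
}

/* A received word with an odd number of 1s cannot be a legal codeword. */
bool has_error(uint8_t received) {
    return odd_ones(received);
}
```

With this layout the slide's sent word 1100101 (0x65) passes the check and the received word 1000101 (0x45) is flagged. A single check bit detects any odd number of bit errors but cannot locate them.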

Error Correcting Codes (ECC): check bits 0 1 0 are sent along with the data 1 1 0 0 1 0 1. The noisy channel (or noisy circuit!) introduces a bit error and delivers 0 1 0 1 0 0 0 1 0 1; the correction circuit recovers 1 1 0 0 1 0 1.
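
The slide does not name its code, so the sketch below substitutes a standard Hamming(7,4) code to show how check bits let a circuit locate and flip a single erroneous bit. The bit layout (codeword position k stored in bit k-1, check bits at positions 1, 2, and 4) is the usual textbook convention, chosen here as an assumption.

```c
#include <stdint.h>

/* Bit i of w, as 0 or 1. */
static unsigned bit(unsigned w, unsigned i) { return (w >> i) & 1u; }

/* Encode 4 data bits into a 7-bit Hamming codeword.
   Positions 1, 2, 4 hold check bits; positions 3, 5, 6, 7 hold data. */
uint8_t hamming74_encode(uint8_t data) {
    unsigned d0 = bit(data, 0), d1 = bit(data, 1);
    unsigned d2 = bit(data, 2), d3 = bit(data, 3);
    unsigned p1 = d0 ^ d1 ^ d3;   /* covers positions 3, 5, 7 */
    unsigned p2 = d0 ^ d2 ^ d3;   /* covers positions 3, 6, 7 */
    unsigned p4 = d1 ^ d2 ^ d3;   /* covers positions 5, 6, 7 */
    return (uint8_t)(p1 | (p2 << 1) | (d0 << 2) | (p4 << 3) |
                     (d1 << 4) | (d2 << 5) | (d3 << 6));
}

/* Correct up to one flipped bit, then return the 4 data bits. */
uint8_t hamming74_decode(uint8_t code) {
    unsigned s1 = bit(code, 0) ^ bit(code, 2) ^ bit(code, 4) ^ bit(code, 6);
    unsigned s2 = bit(code, 1) ^ bit(code, 2) ^ bit(code, 5) ^ bit(code, 6);
    unsigned s4 = bit(code, 3) ^ bit(code, 4) ^ bit(code, 5) ^ bit(code, 6);
    unsigned syndrome = s1 | (s2 << 1) | (s4 << 2);   /* position of the error, 0 if none */
    if (syndrome)
        code ^= (uint8_t)(1u << (syndrome - 1));
    return (uint8_t)(bit(code, 2) | (bit(code, 4) << 1) |
                     (bit(code, 5) << 2) | (bit(code, 6) << 3));
}
```

For example, hamming74_decode(hamming74_encode(d) ^ (1u << i)) returns d for every 4-bit d and every bit position i from 0 to 6.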

Self-correcting circuits (figure: in 1 and in 2 are encoded, f, g, and h operate on values in an error correcting code, and the result is decoded to out). This is von Neumann’s approach, BUT his error correcting code (a repetition code) was very inefficient. More efficient codes? Memories: yes. Add, sub, shift: yes. AND, OR: no!

Self-checking circuits: error detection is cheaper than correction. Execute machine cycle. If no errors: latch results, advance state machine. Otherwise: restart current cycle. Dynamic faults only. Non-deterministic execution time. Cheaper than in-circuit error correction.
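
The execute, check, and restart loop on this slide can be sketched as follows. The cycle_fn interface, the max_attempts cutoff, and the flaky_increment demo cycle are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* One machine cycle: writes a candidate next state and returns true if the
   self-checking logic flagged an error (hypothetical interface). */
typedef bool (*cycle_fn)(int state, int *next_state, void *ctx);

/* Execute one step, restarting the cycle until it completes without a
   detected error. Returns the number of attempts used, or 0 if max_attempts
   ran out (retrying cannot clear a static defect). */
int step_with_retry(cycle_fn cycle, void *ctx, int *state, int max_attempts) {
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        int next;
        bool error = cycle(*state, &next, ctx);
        if (!error) {
            *state = next;        /* latch results, advance state machine */
            return attempt;
        }
        /* otherwise: discard the result and restart the current cycle */
    }
    return 0;
}

/* Demo cycle: increments the state; ctx points to a count of transient
   faults that remain to be injected. */
bool flaky_increment(int state, int *next_state, void *ctx) {
    int *faults_left = ctx;
    *next_state = state + 1;
    if (*faults_left > 0) {
        (*faults_left)--;
        return true;
    }
    return false;
}
```

The attempt count varies from step to step, which is the non-deterministic execution time the slide mentions.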

Totally Self-Checking Circuits (figure: inputs Xd, Xc and Yd, Yc; outputs Zd, Zc). No fault => legal codeword output. Fault => illegal codeword output.

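Adder Fault Detection, totally self-checking adder (figure: the data adder computes Zd = Xd + Yd; a mod M adder combines the check symbols Xc and Yc into Zc; the error checker compares the check symbol derived from Zd with Zc and signals an error if they are different).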
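
A sketch of a residue-checked adder in the spirit of this slide. The modulus M = 3, the coded struct, and the data_fault parameter used to inject a fault are assumptions made for illustration; the slide only shows a mod M check path.

```c
#include <stdbool.h>

#define M 3u   /* check modulus (assumed value) */

/* A value together with its check symbol. */
typedef struct {
    unsigned d;   /* data part */
    unsigned c;   /* check part: d mod M */
} coded;

coded residue_encode(unsigned x) {
    coded v = { x, x % M };
    return v;
}

/* Add two coded values. data_fault is XORed into the data sum to emulate a
   fault in the data adder; *error is set when data and check disagree.
   Assumes x.d + y.d does not overflow unsigned. */
coded checked_add(coded x, coded y, unsigned data_fault, bool *error) {
    coded z;
    z.d = (x.d + y.d) ^ data_fault;   /* data path: Zd = Xd + Yd */
    z.c = (x.c + y.c) % M;            /* check path: mod M adder */
    *error = (z.d % M) != z.c;        /* checker: are they different? */
    return z;
}
```

A single flipped bit changes the sum by a power of two, which is never a multiple of 3, so this choice of M catches every single-bit error in the data sum.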

Who Checks the Checker? Totally self-checking checkers, of course! (figure: totally self-checking equality checker with inputs a1, b1, a0, b0 and a 1-out-of-2 output)
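
One common way to build such a checker is the two-rail checker cell, sketched below; whether the slide's circuit is exactly this cell is an assumption. Each input pair is legal when its two rails are complementary, and the output pair is a 1-out-of-2 codeword only when both input pairs are legal. Supplying b in inverted form, so that equal bits form a complementary pair, is also an assumption.

```c
#include <stdbool.h>

typedef struct { bool z0, z1; } rail_pair;

/* Two-rail checker cell for input pairs (x0, y0) and (x1, y1). */
rail_pair two_rail_check(bool x0, bool y0, bool x1, bool y1) {
    rail_pair out;
    out.z0 = (x0 && x1) || (y0 && y1);
    out.z1 = (x0 && y1) || (y0 && x1);
    return out;
}

/* Equality check of two 2-bit values: pair bit i of a with inverted bit i
   of b, so each pair is complementary exactly when the bits match. */
rail_pair equality_check2(unsigned a, unsigned b) {
    bool a0 = a & 1u, a1 = (a >> 1) & 1u;
    bool nb0 = !(b & 1u), nb1 = !((b >> 1) & 1u);
    return two_rail_check(a0, nb0, a1, nb1);
}

/* A 1-out-of-2 codeword has exactly one rail high. */
bool is_codeword(rail_pair p) { return p.z0 != p.z1; }
```

Because the "no error" indication is itself a codeword (01 or 10), an output stuck at 00 or 11 is visible as an illegal word, which is why the output is 1-out-of-2 rather than a single wire.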

Totally Self-Checking Networks (figure: each function unit feeds a totally self-checking checker, which produces the error indication)

Hybrid Approach: self-correcting circuits where critical (nano / micro interface); self-checking otherwise.

Static defects. Defect: permanent structural imperfection that can be discovered by testing.

Defect Tolerance: Case Study. Teramac (1990–1994). Logic simulator: 1,000,000 gates, 1 MHz, 2 hour compile time.

Teramac: 864 FPGAs (“PLASMA” chip, custom design). 145,000 signals between FPGAs, carried on multi chip module traces, printed circuit board traces, and flat ribbon cables. The figure marks each of these four resources as defective. We could not afford perfect parts!

Teramac Defects

Resource      Total        Defective    % defective
Logic cell    221,000      23,000       10.4 %
Xbar line     4,880,000    146,000      3.0 %
Buffer        2,420,000    37,000       1.5 %
Interchip     145,000      13,800       9.5 %
Total         7,670,000    220,000      2.9 %

Teramac Defect Handling: Defects were located with tests. Compiler avoided defective resources.

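Crossbar Compilation

bool function(bool a, bool b, bool c) {
    bool result = (a && b) || c;   /* AB + C */
    return result;
}

(figure: the function compiled onto a crossbar with inputs A, B, C, output AB + C, and rails GND and V+)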

But…Defects!

Defects: broken wires; “stuck open”; “stuck closed”

Defect Avoidance (figure: the same function routed around the defects)

Resource Allocation: an embedding problem (graph monomorphism). (figure: circuit graph with nodes A, B, C, D + defective crossbar = allocation)
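
A deliberately simplified sketch of defect-avoiding allocation. It is not the talk's compiler: it only permutes rows (columns stay fixed), considers stuck-open junctions only, and uses plain backtracking rather than a general graph monomorphism search. The sizes ROWS and COLS and all names are hypothetical.

```c
#include <stdbool.h>

#define ROWS 4   /* physical crossbar rows (hypothetical size) */
#define COLS 4   /* crossbar columns */

/* A logical row fits a physical row if every junction it needs works. */
static bool row_fits(const bool need[COLS], const bool bad[COLS]) {
    for (int c = 0; c < COLS; c++)
        if (need[c] && bad[c])
            return false;
    return true;
}

/* Backtracking: place logical rows l, l+1, ... on unused physical rows. */
static bool place(int l, int n_logical, bool need[][COLS],
                  bool bad[ROWS][COLS], bool used[ROWS], int map[]) {
    if (l == n_logical)
        return true;
    for (int p = 0; p < ROWS; p++) {
        if (used[p] || !row_fits(need[l], bad[p]))
            continue;
        used[p] = true;
        map[l] = p;
        if (place(l + 1, n_logical, need, bad, used, map))
            return true;
        used[p] = false;   /* undo and try the next physical row */
    }
    return false;
}

/* need[l][c]: logical row l needs a working junction at column c.
   bad[p][c] : physical junction (p, c) is stuck open.
   On success map[l] holds the physical row chosen for logical row l. */
bool allocate(int n_logical, bool need[][COLS],
              bool bad[ROWS][COLS], int map[]) {
    bool used[ROWS] = { false };
    return place(0, n_logical, need, bad, used, map);
}
```

Spare rows are what make allocation possible at higher defect rates, which is the area versus defect rate trade-off examined in the empirical slides.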

Questions: How do defect rates affect the ability to allocate resources? What compilation strategies are best for different defect rates?

Application: written in C

/* D, E, G, H, I are constants defined elsewhere in the application (not shown on the slide). */
int game3Response(int moveNumber, int humanMove) {
    int response;
    if (moveNumber == 1)
        response = I;
    else if (moveNumber == 2) {
        if (humanMove == E) response = G;
        else response = E;
    } else if (moveNumber == 3) {
        if (humanMove == D) response = H;
        else response = D;
    }
    return response;
}

Target: diode crossbar

Application compiled onto target

2-level logic (series of plots): probability of successful allocation (20 compiles per point, axis 0 to 1.0) versus defective junctions (stuck open, 0% to 30%), one plot per crossbar size.

Crossbar size    Relative area
28 x 24          1.0
32 x 28          1.3
40 x 38          2.3
48 x 48          3.4
56 x 58          4.8
80 x 78          9.2
96 x 96          13.7

Area = f(defect rate) (plot: relative area, 1 to 10, versus defective junctions (stuck open), 0% to 30%, for 2-level logic)

multi-level logic (series of plots): probability of successful allocation (20 compiles per point, axis 0 to 1.0) versus defective junctions (stuck open, 0% to 30%), one plot per crossbar size.

Crossbar size    Relative area
24 x 32          1.0
28 x 40          1.5
32 x 48          2.0
40 x 56          2.9
48 x 64          4.0
56 x 80          5.8
64 x 96          8.0
80 x 112         12

Area = f(defect rate) (plots: relative area, 1 to 10, versus defective junctions (stuck open), 0% to 30%, for 2-level and multi-level logic)

What about larger circuits?

4-bit Nanoprocessor (plot: relative area, 1.0 to 3.0, versus % defects, 2 to 20, with curves for 4 inputs and 6 inputs)

4-bit Nanoprocessor. More information: “CMOS-like logic in defective, nanoscale crossbars,” Snider, Kuekes, Williams, Nanotechnology 15, 881-891. Can download free for another week at: http://www.iop.org/EJ/abstract/0957-4484/15/8/003

Nano / Micro interface: want to access a large number of nanowires with a small number of microwires.

Demultiplexer Interface (figure)

Demultiplexer: the input In is routed to one of the numbered outputs (labeled 1 through 7 in the figure), selected by the address lines A0, A1, A2. The figure shows two examples: address 0 0 0 and address 1 0 1.
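
A sketch of the demultiplexer as an AND of address lines, the form a crossbar would implement: each output is tied to either the true or the complemented form of every address bit. Treating A0 as the least significant bit and numbering outputs from 0 are assumptions.

```c
#include <stdbool.h>

#define ADDRESS_BITS 3u
#define OUTPUTS (1u << ADDRESS_BITS)

/* Route `in` to the single output whose index matches the address.
   a[k] is address line Ak, with A0 taken as the least significant bit. */
void demux(bool in, const bool a[ADDRESS_BITS], bool out[OUTPUTS]) {
    for (unsigned i = 0; i < OUTPUTS; i++) {
        bool selected = in;
        for (unsigned k = 0; k < ADDRESS_BITS; k++) {
            bool wants_one = (i >> k) & 1u;   /* output i listens to Ak or to its complement */
            selected = selected && (a[k] == wants_one);
        }
        out[i] = selected;
    }
}
```

With the address 1 0 1 from the slide, only out[5] follows the input; three address lines are enough to reach eight outputs, which is the point of the interface.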

Crossbar Demultiplexers: …diodes, FETs, resistors can all be used. Defects! (figure: pairs of address lines A0, A1, A2 and the input In cross output wires 1 through 7)

Defect-tolerant Crossbar Demultiplexers: error correcting codes. (figure: address lines A0, A1, A2 and the input In cross output wires 1 through 7)
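
The slide does not say which code is used. As an illustration under an assumed model, suppose each nanowire is deselected by every address line on which its code differs from the applied address, and a stuck-open junction removes one of those lines. Then the minimum Hamming distance between wire codes sets how many such defects a wire can absorb before it responds to a wrong address. The sketch compares plain binary addresses with addresses extended by one parity line; all names are hypothetical.

```c
#include <stdint.h>

/* Number of 1 bits in v. */
static unsigned popcount8(uint8_t v) {
    unsigned n = 0;
    while (v) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}

/* Extend an address (up to 7 bits) with one even-parity address line. */
uint8_t with_parity(uint8_t address) {
    return (uint8_t)((address << 1) | (popcount8(address) & 1u));
}

/* Smallest number of address lines on which two distinct codes differ. */
unsigned min_distance(const uint8_t *codes, unsigned count) {
    unsigned best = 8;
    for (unsigned i = 0; i < count; i++)
        for (unsigned j = i + 1; j < count; j++) {
            unsigned d = popcount8((uint8_t)(codes[i] ^ codes[j]));
            if (d < best)
                best = d;
        }
    return best;
}
```

For the eight 3-bit addresses the minimum distance is 1, so a single defect can make two wires indistinguishable; one extra parity line raises it to 2, and stronger codes raise it further at the cost of more microwires.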

Key Points. Transient errors: error-correcting circuits are difficult to do efficiently for some computations, necessary in certain cases, and can handle a small number of static defects; error-detecting circuits are general (totally self-checking circuits) and reasonably efficient. Static defects: locate with tests; avoid in compiler.
