Spring 2008 CSE 591 Compilers for Embedded Systems Aviral Shrivastava Department of Computer Science and Engineering Arizona State University.

Presentation transcript:

Spring 2008 CSE 591 Compilers for Embedded Systems Aviral Shrivastava Department of Computer Science and Engineering Arizona State University

Lecture 3: Soft Error Models and Techniques

Outline
□ Soft Errors Recap
□ Process Technology and Packaging Solutions
□ Gate-level and Circuit-level Solutions
□ Microarchitectural Solutions
  □ Single-core
  □ Multi-threaded
□ Software Solutions
□ Multi Bit Upsets (MBUs)
□ Single Event Latchup

Phenomenon of Soft Error
□ Transient faults
□ Random and spontaneous bit changes in the system
□ Can be caused by
  □ Circuit noise
  □ Cross-talk
□ More than 50% are due to radiation strikes

Metrics
□ FIT: Failures in Time
  □ Number of failures in 1 billion hours of operation
□ MTTF: Mean Time To Failure
  □ 1000 FITs => MTTF of 114 years
□ 1 GByte of memory at 500 FIT/Mbit can expect an error every two weeks
□ ECC reduces the failure rate by 2 orders of magnitude
□ A hypothetical Terabyte system would experience a soft error every few minutes
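The FIT and MTTF figures above follow from simple arithmetic. The sketch below reproduces them, assuming a 365-day year and 1 GByte = 8 x 1024 Mbit; the function names are illustrative and not part of the lecture.

```python
HOURS_PER_FIT_WINDOW = 1e9      # FIT counts failures per 10^9 device-hours
HOURS_PER_YEAR = 24 * 365       # assumption: 365-day year


def mttf_hours(fit: float) -> float:
    """Mean time to failure in hours for a given FIT rate."""
    if fit <= 0:
        raise ValueError("FIT rate must be positive")
    return HOURS_PER_FIT_WINDOW / fit


def mttf_years(fit: float) -> float:
    """Mean time to failure in years for a given FIT rate."""
    return mttf_hours(fit) / HOURS_PER_YEAR


def memory_fit(size_mbit: float, fit_per_mbit: float) -> float:
    """Total FIT of a memory, assuming every Mbit fails independently."""
    return size_mbit * fit_per_mbit


if __name__ == "__main__":
    print(f"1000 FIT -> MTTF of {mttf_years(1000):.0f} years")
    one_gbyte_mbit = 8 * 1024   # assumption: binary megabits
    days = mttf_hours(memory_fit(one_gbyte_mbit, 500)) / 24
    print(f"1 GByte at 500 FIT/Mbit -> one error every {days:.1f} days")
```

Under these assumptions the 1 GByte case works out to roughly ten days between errors, the same order as the "every two weeks" quoted on the slide.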

Trends
□ DRAM
  □ System error rate of DRAMs is fairly constant
□ SRAM
  □ Increasing exponentially
□ Logic
  □ Increasing exponentially

Masking Effects
□ Logic masking
  □ Occurs when a particle strikes a portion of combinational logic that is blocked from affecting the output because a subsequent gate's result is completely determined by its other input values
□ Electrical masking
  □ Occurs when the pulse resulting from a particle strike is attenuated by subsequent logic gates and does not affect the result of the circuit
□ Latching-window masking
  □ Occurs when the pulse resulting from a particle strike reaches a latch, but not at the clock transition where the latch captures its input values
□ Microarchitectural masking
  □ Occurs when the incorrect value in the latch is ignored in the evaluation of a program variable
□ Software masking
  □ Occurs when an incorrect value of a variable is ignored by the software while computing the outputs
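Logic masking can be illustrated with a toy fault-injection experiment. The sketch below uses a made-up three-input circuit, `out = (a XOR b) AND c`, flips the internal XOR node for every input combination, and counts how often the output is unchanged. The circuit and function names are assumptions chosen for illustration only.

```python
from itertools import product


def circuit(a: int, b: int, c: int, flip_internal: bool = False) -> int:
    """Toy circuit: out = (a XOR b) AND c.

    When flip_internal is True, the internal XOR node is inverted to
    model a particle strike on that node.
    """
    node = a ^ b
    if flip_internal:
        node ^= 1
    return node & c


def logic_masking_rate() -> float:
    """Fraction of input vectors for which the injected fault is masked."""
    vectors = list(product((0, 1), repeat=3))
    masked = sum(
        1 for a, b, c in vectors
        if circuit(a, b, c) == circuit(a, b, c, flip_internal=True)
    )
    return masked / len(vectors)


if __name__ == "__main__":
    print(f"masked fraction: {logic_masking_rate():.2f}")
```

Whenever `c` is 0, the AND gate's output is fully determined by that input, so the flipped node cannot propagate. In this toy circuit that covers half of the input vectors.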

Faults, Errors, Failures (“Fault Tolerant Computer Systems”, by Pradhan)
□ Fault
  □ Defect in a hardware or software component
  □ For a cosmic ray, the defect is an upset from a high-energy neutron strike
□ Error
  □ Manifestation of a fault, resulting in deviation from accuracy
  □ Faults cause errors (but not vice versa)
  □ A masked fault is not an error!
  □ Vulnerability factor = fraction of faults that cause errors
□ Failure
  □ Non-performance of an expected action
  □ Errors cause failures (but not vice versa)
  □ A corrected error doesn’t cause a failure

Fault Tolerance in Microprocessors
□ Information redundancy
  □ Protect data words with information coding
  □ Parity or Hamming codes
  □ ECC codes mainly in memory arrays
  □ Cost is the extra storage for coding overhead, plus the checking logic
□ Space redundancy
  □ Carry out the same computation on multiple independent hardware units at the same time
  □ Errors are exposed by checking the independent results
  □ Causes large hardware overhead
  □ Good for permanent faults
□ Time redundancy
  □ Execute the same computation on the same hardware at different times
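As a concrete instance of information redundancy, the sketch below protects a 64-bit word with a single even-parity bit. It is a minimal illustration, not a description of any particular processor; the function names and the 64-bit width are assumptions.

```python
def parity_bit(word: int, width: int = 64) -> int:
    """Even-parity bit: 1 if the word has an odd number of set bits."""
    if not 0 <= word < (1 << width):
        raise ValueError(f"word does not fit in {width} bits")
    return bin(word).count("1") & 1


def check(word: int, stored_parity: int, width: int = 64) -> bool:
    """True if the word still matches the parity stored alongside it."""
    return parity_bit(word, width) == stored_parity


def storage_overhead(data_bits: int = 64, check_bits: int = 1) -> float:
    """Extra storage as a fraction of the protected data bits."""
    return check_bits / data_bits


if __name__ == "__main__":
    word = 0xDEADBEEF
    stored = parity_bit(word)
    upset = word ^ (1 << 17)            # single-bit upset
    print(check(word, stored))          # intact word passes
    print(check(upset, stored))         # single-bit flip is detected
    print(f"overhead: {storage_overhead():.2%}")
```

Parity only detects an odd number of flipped bits and cannot correct them. Flipping two bits of the same word leaves the parity unchanged, which is one reason Hamming-style ECC is used in memory arrays.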

The Soft Error Opportunity
□ Key differences from classical fault tolerance
  □ FIT budget is 100x to 1000x larger than for Tandem-style machines
  □ Traditional “big hammer” solutions are too expensive for the volume market and can be overkill
□ Why does architecture play a critical role?
  □ Errors are often defined at the architecture and microarchitecture level
    □ e.g., a strike on a branch predictor doesn’t cause an error
  □ Architectural solutions are often more cost-effective
    □ One bit of parity can protect 64 bits, overhead < 2%
    □ Radiation-hardened cells can have overhead around 20-40%

Processing and Packaging Solutions
□ Reduce the number of particles that strike
  □ Reduce upsets
□ Use highly purified fabrication materials
  □ Remove traces of boron and heavy metals
□ Surround with a metallic frame
  □ Reduces low-energy particles
  □ But neutrons can pass through > 10 ft of concrete
□ Process technology solutions
  □ Partially depleted SOI: no help after 250 nm
  □ Fully depleted SOI: very expensive

Transistor-Level Techniques
□ Normally a CMOS inverter is sized with a 2:1 ratio between p- and n-channel devices
  □ This compensates for the difference between electron and hole mobilities
□ Changing this ratio can increase the tolerance

Gate-Level Techniques
□ Some gates are more vulnerable than others
□ Radiation-hardened designs use NAND gates
  □ When all inputs are low, the drive of the p-stack is low and the leakage of the n-transistors is high → the output rises slowly → functional failure
□ Gate vulnerability may vary by 5X depending on the state
  □ NAND gate
    □ Extremely vulnerable when the inputs are 10
    □ Not vulnerable when the inputs are 00
□ How to synthesize to minimize vulnerability?

Circuit-Level Techniques
□ Adding resistance introduces additional time constants that filter out the very fast SEU-induced transients
□ Poly-silicon resistors have high temperature coefficients
□ The variation of resistance is difficult to control

Architectural Vulnerability Factor
□ AVF: probability that a fault in a particular structure will result in a system failure
  □ AVF of branch predictor = 0%
  □ AVF of PC = 100%
□ ACE bits: “Architectural bits” that must be correct for “Correct Execution”
  □ Count the number of ACE bits in a structure
□ Identifying un-ACE bits
  □ Microarchitectural un-ACE bits: cannot influence correct instruction execution
    □ Idle or invalid state, e.g., inputs to un-chosen paths of a mux
    □ Mis-speculated state, e.g., wrong-path instructions
    □ Predictor structures, e.g., branch predictor
    □ Ex-ACE state, e.g., registers
  □ Architectural un-ACE bits: affect correct-path execution, but do not change the output
    □ NOP instructions
    □ Prefetch instructions
    □ Predicated-false instructions
    □ Dynamically dead instructions, FDD, TDD
□ Computing AVF from a performance model
  □ Gather the number of ACE bits in each cycle
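The last step, gathering ACE bits per cycle, can be turned into an AVF estimate as sketched below. This assumes AVF is taken as the average number of ACE bits resident in the structure per cycle, divided by the structure's total bits. The function name and the example trace are illustrative, not taken from the lecture.

```python
from typing import Sequence


def avf(ace_bits_per_cycle: Sequence[int], total_bits: int) -> float:
    """Estimate AVF as the average fraction of bits that are ACE.

    ace_bits_per_cycle: number of ACE bits observed in the structure
        in each simulated cycle (e.g. from a performance model).
    total_bits: size of the structure in bits.
    """
    cycles = len(ace_bits_per_cycle)
    if cycles == 0 or total_bits <= 0:
        raise ValueError("need at least one cycle and a positive size")
    if any(not 0 <= n <= total_bits for n in ace_bits_per_cycle):
        raise ValueError("ACE bit count outside 0..total_bits")
    return sum(ace_bits_per_cycle) / (total_bits * cycles)


if __name__ == "__main__":
    # Hypothetical 64-bit structure observed for four cycles.
    trace = [32, 16, 0, 16]
    print(f"AVF = {avf(trace, 64):.2%}")
```

With this definition a structure that never holds ACE bits, such as a branch predictor, gets an AVF of 0%, and one whose bits are always ACE, such as the PC, gets 100%, matching the two examples on the slide.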

Vulnerability Contributions
□ DCache: largest contributor to vulnerability
  □ Data + tags
□ ICache: close second
  □ Instructions only
  □ Tags are (almost) not vulnerable
□ Register File, Pipeline
  □ Rate of errors may be higher in the pipeline and RF
□ Compute cache and register file vulnerability

Vulnerability Variations
□ System vulnerability changes with time
□ How can you use this information?

D-Cache: Flushing (Copyright 2005, M. Tahoori)
□ 4x reduction in vulnerability

D-Cache: Write Policy
□ 10x reduction in vulnerability

D-Cache: Refresh
□ 3x reduction in vulnerability using write-through (30x total)

DIVA Microarchitecture
[Figure: DIVA pipeline with BPred, I-$, Dec/Ren, IQ, ALU, D-$, Rename Regs, and Arch Regs]
Example: LR3 + LR7 → LR15
□ Storage check: read LR3 and LR7 from Arch Regs and confirm they equal 4 and 8
□ ALU check: add 4 + 8 and confirm the result equals 12
□ If both checks succeed, write 12 into LR15
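The two checks in the example can be sketched in software as follows. This is a simplified model of a commit-time checker for add instructions only, written under the assumption that the primary core hands the checker the operand values it used and the result it computed. The `CommitEntry` record and `diva_check` function are hypothetical names, not part of the DIVA design.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CommitEntry:
    """What the primary core reports for one committing add instruction."""
    dest: int      # destination logical register
    src1: int      # source logical registers
    src2: int
    val1: int      # operand values the primary core actually used
    val2: int
    result: int    # result the primary core computed


def diva_check(arch_regs: Dict[int, int], entry: CommitEntry) -> bool:
    """Verify one instruction; update architected state only on success."""
    # Storage check: operands must match the architected register file.
    if arch_regs[entry.src1] != entry.val1:
        return False
    if arch_regs[entry.src2] != entry.val2:
        return False
    # ALU check: recompute the operation with the checker's own adder.
    if entry.val1 + entry.val2 != entry.result:
        return False
    arch_regs[entry.dest] = entry.result
    return True


if __name__ == "__main__":
    regs = {3: 4, 7: 8, 15: 0}
    ok = diva_check(regs, CommitEntry(15, 3, 7, 4, 8, 12))
    print(ok, regs[15])
```

A `False` return leaves the architected registers untouched, which is the point at which a real design would trigger recovery.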

Microarchitecture Details
□ Instructions are fed to the checker in order during commit
□ The logic and storage checks detect errors in the ALUs and datapath
□ The checker core is a simple in-order pipeline that is easy to design and verify
□ An error in an earlier stage (LR3 instead of LR2) can be detected by also adding a rename/decode stage to the checker
□ The in-order core has no stalls (it needs a bypass for the register file): no data dependences, cache misses, or branch mispredicts
□ Contention for the register file and data cache can degrade the primary thread

Recovery
□ The architected register file and data cache are ECC protected, so when an error is detected, the checker and architected state are assumed to be correct
□ The primary core is restarted from the faulting instruction
□ A fault in the primary core may result in deadlock, e.g., the instruction that produces R5 is waiting for R5 to be produced (instead of R4)
□ A timeout in the checker signals an error

PPC (Partially Protected Caches)
[Figure: processor pipeline with unprotected main cache and protected mini-cache (PPC), memory controller, and page mapping of FNC and FC data]
□ 2 caches at the same level of the memory hierarchy
  □ Main cache, and the protected mini-cache
□ Mini-cache
  □ Low power, low latency
  □ Timing slack to harden it
□ Compiler maps data to the two caches
  □ Map Failure-Critical (FC) data to the protected mini-cache
  □ Map Not Failure-Critical (FNC) data to the unprotected main cache
□ Intuition is to provide protection to only the FC data
  □ In multimedia applications, the multimedia data is NOT failure critical
  □ An error → loss in Quality of Service
□ How to use PPCs for general applications?
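The mapping step can be pictured with the following sketch. It assumes the compiler has already labelled each page as failure-critical or not; how that labelling is derived is not covered here. The page names and the `map_pages` helper are hypothetical.

```python
from typing import Dict

MINI_CACHE = "protected mini-cache"
MAIN_CACHE = "unprotected main cache"


def map_pages(failure_critical: Dict[str, bool]) -> Dict[str, str]:
    """Assign each page to a cache based on its failure-critical label.

    failure_critical maps a page name to True (FC) or False (FNC).
    """
    return {
        page: MINI_CACHE if is_fc else MAIN_CACHE
        for page, is_fc in failure_critical.items()
    }


if __name__ == "__main__":
    # Hypothetical labelling for a multimedia decoder.
    labels = {
        "stack": True,            # control state: failure critical
        "loop_counters": True,
        "frame_buffer": False,    # pixel data: an error only lowers quality
    }
    for page, cache in map_pages(labels).items():
        print(f"{page:14s} -> {cache}")
```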

Razor
□ Originally proposed to tolerate process variations
□ Shadow latch clocked with a delayed clock
□ If the latched values differ, raise an error
□ How to use it to detect soft errors?
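One way to think about the closing question is to model the main and shadow samples in software. The sketch below treats the data signal as a function of time, samples it at the clock edge and again after a delay, and flags a mismatch. It is a behavioural toy under assumed timing values, not a circuit description; all names are illustrative.

```python
from typing import Callable, Tuple

Signal = Callable[[float], int]


def with_pulse(base: Signal, start: float, width: float) -> Signal:
    """Return a signal that is inverted during [start, start + width)."""
    def signal(t: float) -> int:
        value = base(t)
        return value ^ 1 if start <= t < start + width else value
    return signal


def razor_sample(signal: Signal, clk_edge: float,
                 delay: float) -> Tuple[int, int, bool]:
    """Sample at the clock edge (main) and after a delay (shadow).

    Returns (main value, shadow value, error flag).
    """
    main = signal(clk_edge)
    shadow = signal(clk_edge + delay)
    return main, shadow, main != shadow


if __name__ == "__main__":
    steady_one: Signal = lambda t: 1
    # Transient that overlaps the main clock edge at t = 10.0.
    glitchy = with_pulse(steady_one, start=9.9, width=0.3)
    print(razor_sample(steady_one, clk_edge=10.0, delay=0.5))
    print(razor_sample(glitchy, clk_edge=10.0, delay=0.5))
```

In this model a transient is flagged only when it covers exactly one of the two sampling instants. A pulse wide enough to span both edges would go unnoticed, and a pulse that hits only the shadow sample would raise a false alarm.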