University of Michigan Electrical Engineering and Computer Science 1 Top 5 Reasons Reliability is the Biggest Fallacy in Computer Architecture Research.

Presentation transcript:

Slide 1. Top 5 Reasons Reliability is the Biggest Fallacy in Computer Architecture Research
Scott Mahlke, University of Michigan, Electrical Engineering and Computer Science
Thanks to Jason Blome, Shuguang Feng, and Shantanu Gupta for putting their research on reliable systems on hold to help with this presentation.

Slide 2. Disclaimer
I would like to convince you reliability is a fallacy for mainstream computer systems used in consumer/business electronics*
Still a need for high reliability designs for mission critical systems
  Space shuttle, airplanes, etc.
  Cost is not an issue: use high degrees of redundancy
*The speaker may not agree with this position

Slide 3. Reason 1: It’s the Software, Stupid!
“Mature OS can have an MTTF measured in months, while newer OS may crash every few days.” – Peter Chen: Reliability Hierarchies, HotOS 1999.
Sources: [1] [2] A system-level approach for memory robustness, ICMTD05 [3] Lifetime Reliability: Towards an architectural solution, IEEE Micro 2005 [4]

Slide 4. Hmm… My ATM Does Not Work

Slide 5. Reason 2: Disposable Electronics
“The average working life of a mobile phone is 7 years, but the average consumer changes their mobile every 11 months.”

Slide 6. PCs/Laptops Not Far Behind
“Take-away something.” –

Slide 7. Reason 3: A Transient Fault is About As Likely As …

Slide 8. Reason 4: Does Anyone Care?
Can a human identify errors in video, images, or sound?
Glitches are accepted by the consumer (dropped cell calls)
Natural redundancy and resiliency in software
100% reliable operation of hardware is not important or worth extra cost in many situations
Which is flawed?

Slide 9. Reason 5: This Problem is Better Solved Closer to the Circuit Level
Lower overhead
Many designs benefit
In-situ solutions naturally handle variation
[Figures: intra-die variations in ILD thickness; electromigration in copper; Razor flip-flop circuit (main flip-flop clocked by clk, shadow latch clocked by clk_del, comparator, Error_L and Error signals)]

Slide 10. Some Hope?
The bottom line: what if we assume reliability is a looming problem? Then we need solutions that are:
1. Low overhead, high rate of return solutions
   Joint circuit/architectural techniques
2. Domain specific solutions: know thy customer
3. Reliability features provide other benefits
   It's not just a tax