This project and the research leading to these results has received funding from the European Community's Seventh Framework Programme [FP7/2007-2013] under.

Combining Error Detection and Transactional Memory for Energy-Efficient Computing below Safe Operation Margin
Gulay Yalcin, Anita Sobe, Alexey Voronin, Jons-Tobias Wamhoff, Derin Harmanci, Adrián Cristal, Osman Unsal, Pascal Felber, Christof Fetzer
PDP2014, Turin, Italy, 13 February 2014
This project and the research leading to these results has received funding from the European Community's Seventh Framework Programme [FP7/2007-2013] under grant agreement n°

Dark Silicon Phenomenon
The number of transistors on a chip can be increased.
To stay within the chip's power budget, some of them must remain "dark".
One solution: downscale the voltage.

How about Reliability?
(Figure labels: Power, Reliability, Performance.)
When Vdd is reduced, the error rate increases exponentially [1].
Our goal: investigate how far the voltage can be reduced while the system, including the cost of error recovery, still consumes less energy.
[1] Dan Ernst et al., "Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation," in Proceedings of the 36th Annual IEEE/ACM International Symposium on Microarchitecture, pages 7-18, 2003.

Agenda / Overview
Motivation
Experiment: Scaling Vdd in a Real System
Basics of Reliability
Error Recovery with TM
Error Detection Schemes
Analysis
Conclusion

Reducing Vdd in a Real System
Setup: AMD FX core CPU running a CPU-heavy workload; every 10 seconds, Vdd is reduced by 12.5 mV.
Monitored: incorrect results, system crashes, and Machine Check Architecture (MCA) reports.
The system encounters errors that MCA cannot correct after only a 10% reduction in Vdd.
The errors occur in the instruction cache (37%), the execution unit (61%), and other components (less than 2%).

Basics of Reliability
Error recovery: global checkpointing, coordinated local checkpointing, uncoordinated local checkpointing.
Error detection: replication, assertions/invariants, symptom-based detection, encoded processing.
Transactional Memory can provide lightweight coordinated local checkpointing [2].
[2] Gulay Yalcin et al., "FaulTM: Fault Tolerance Using Hardware Transactional Memory," DATE 2013.

TM provides checkpointing/rollback
(Figure: processors P1, P2, P3, P4, ..., Pn and a checkpoint (log area).)
TM write-sets log the tentative memory updates.
Synchronizing checkpoints: data versioning provides a synchronization mechanism between checkpoints.
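The checkpoint/rollback idea above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware mechanism from the talk: memory is a plain dictionary, and the names `Transaction`, `run_transaction`, and `error_detected` are hypothetical. Tentative writes are buffered in a write-set and reach memory only on commit, so an abort restores the last committed state for free.

```python
class Transaction:
    """Buffers writes in a write-set; memory changes only on commit."""

    def __init__(self, memory):
        self.memory = memory   # shared dict: address -> value (assumed model)
        self.write_set = {}    # tentative updates, playing the role of the log area

    def read(self, addr):
        # A transaction sees its own tentative writes first.
        if addr in self.write_set:
            return self.write_set[addr]
        return self.memory[addr]

    def write(self, addr, value):
        self.write_set[addr] = value

    def commit(self):
        # Publish the tentative updates; this state is the new checkpoint.
        self.memory.update(self.write_set)
        self.write_set = {}

    def abort(self):
        # Roll back: drop the tentative updates, memory was never touched.
        self.write_set = {}


def run_transaction(memory, body, error_detected):
    """Run body(tx); commit if no error was detected, otherwise roll back."""
    tx = Transaction(memory)
    body(tx)
    if error_detected(tx):
        tx.abort()
        return False
    tx.commit()
    return True


def increment_x(tx):
    tx.write("x", tx.read("x") + 1)


memory = {"x": 1}
run_transaction(memory, increment_x, lambda tx: True)   # error: rolled back
run_transaction(memory, increment_x, lambda tx: False)  # no error: committed
```

Any of the detection schemes on the following slides could stand in for `error_detected`; the sketch deliberately leaves out conflict detection between concurrent transactions.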

Error Detection Schemes - Replication
Execute instruction streams multiple times.
Compare the results of the executions.
Fewer comparisons with TM.
Dual/Triple Modular Redundancy
+ High error detection rate
- High energy overhead
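A minimal software sketch of replication follows, assuming the replicated work is a pure function with a hashable result. The helper names `dmr`, `tmr`, and `make_faulty` are hypothetical, and `make_faulty` is only a toy fault injector. Results are compared once per call, loosely mirroring the idea of comparing at a coarse granularity rather than per instruction.

```python
from collections import Counter


def dmr(compute, *args):
    """Dual modular redundancy: run twice and report whether the results agree."""
    first = compute(*args)
    second = compute(*args)
    return first, first == second


def tmr(compute, *args):
    """Triple modular redundancy: run three times and take the majority result."""
    results = [compute(*args) for _ in range(3)]
    value, votes = Counter(results).most_common(1)[0]
    if votes < 2:
        raise RuntimeError("no majority among replicas")
    return value


def make_faulty(fault_on_call):
    """Toy fault injector: returns x*x, but flips the lowest bit on one chosen call."""
    calls = {"n": 0}

    def square(x):
        calls["n"] += 1
        result = x * x
        if calls["n"] == fault_on_call:
            result ^= 1  # mimic a transient single-bit fault
        return result

    return square


value, agree = dmr(make_faulty(2), 4)   # second replica is corrupted
voted = tmr(make_faulty(2), 4)          # majority masks the corrupted replica
```

DMR can only detect the mismatch and would rely on rollback for recovery, whereas TMR masks a single faulty replica at the cost of a third execution.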

Error Detection Schemes - Assertions/Invariants
Assertions: conditions that refer to the current and previous state of the program.
They check the program state.
They can be added manually or automatically.
TM facilitates inserting invariants.
(Example figure not included in the transcript.)
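As a rough illustration of checking an invariant over the previous and current state before an update becomes visible, the sketch below applies an update to a copy and keeps it only if the invariant holds. It is not the example from the slides; the function names and the bank-transfer scenario are assumptions made for illustration.

```python
import copy


def apply_with_invariant(state, update, invariant):
    """Apply update to a copy of state and keep it only if the invariant holds.

    invariant(previous, current) may refer to both states.
    Returns (resulting_state, committed).
    """
    tentative = copy.deepcopy(state)
    update(tentative)
    if invariant(state, tentative):
        return tentative, True
    return state, False


def total_preserved(previous, current):
    # Invariant: a transfer must not create or destroy money.
    return sum(previous.values()) == sum(current.values())


def transfer(accounts):
    accounts["a"] -= 10
    accounts["b"] += 10


def faulty_transfer(accounts):
    accounts["a"] -= 10
    accounts["b"] += (10 ^ 2)  # hypothetical bit flip: 10 becomes 8


accounts = {"a": 100, "b": 0}
after_ok, committed_ok = apply_with_invariant(accounts, transfer, total_preserved)
after_bad, committed_bad = apply_with_invariant(accounts, faulty_transfer, total_preserved)
```

An invariant like this catches only the errors it was written to catch, so its detection rate depends on how well the checks cover the program state.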

Error Detection Schemes - Symptoms
Monitor program execution for symptoms of hardware faults.
Symptoms: mispredictions of high-confidence branches, high OS activity, fatal traps (e.g., undefined instruction code).
Reliability at a low cost.

Error Detection Schemes - Encoded Processing
Apply software coding (ECC-like) techniques.
Redundancy is added by applying arithmetic codes to the values.
Arithmetic codes: AN, ANBDmem, etc.
With TM, the validation of a code word can be deferred until a TX commits.
(Example figure not included in the transcript.)
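A minimal sketch of the simplest arithmetic code, the AN code, may help here. A value x is stored as A*x, so every valid code word is a multiple of A, and addition preserves that property. The constant `A = 61` and the function names are assumptions chosen for illustration; the single check at the end of `encoded_sum` stands in for validating once when a transaction commits.

```python
A = 61  # hypothetical odd constant; real schemes choose A for detection strength


def encode(x):
    return A * x


def is_valid(codeword):
    # Valid code words are exactly the multiples of A.
    return codeword % A == 0


def decode(codeword):
    if not is_valid(codeword):
        raise ValueError("corrupted code word")
    return codeword // A


def encoded_sum(codewords):
    """Add encoded values without intermediate checks."""
    total = 0
    for c in codewords:
        total += c  # A*x + A*y == A*(x + y), so the sum is still a code word
    return total


s = encoded_sum([encode(5), encode(7)])
result = decode(s)            # validated once, at the end
corrupted = s ^ (1 << 3)      # flip one bit to mimic a hardware fault
detected = not is_valid(corrupted)
```

Because A is odd, a single bit flip changes the value by a power of two, which is never a multiple of A, so the corrupted word fails the divisibility check.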

Comparing Error Detection Schemes

Analysis
gem5 full-system simulator
4 in-order cores at 1 GHz
x86 ISA
64 KB L1 data and instruction caches
Unified 2 MB L2 cache
SPLASH2 benchmark suite

Energy Analysis
$E \approx C \times V_{dd}^{2}$
(Figure labels: Vdd, error-free overhead, recovery overhead, fault injection, TX size, error detection rate.)
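To make the quadratic relation concrete, the sketch below computes energy relative to nominal for a scaled Vdd. Treating detection and recovery cost as a single multiplicative `overhead` fraction is a simplifying assumption of this sketch, not the model used in the talk, and both function names are hypothetical.

```python
def relative_energy(vdd_scale, overhead=0.0):
    """Energy relative to nominal under E ~ C * Vdd**2.

    vdd_scale: scaled Vdd divided by nominal Vdd (0.9 means a 10% reduction).
    overhead:  extra work for error detection and recovery, as a fraction of
               the error-free work (simplifying assumption of this sketch).
    """
    return (vdd_scale ** 2) * (1.0 + overhead)


def break_even_overhead(vdd_scale):
    """Largest overhead fraction for which voltage scaling still saves energy."""
    return 1.0 / (vdd_scale ** 2) - 1.0


saving = 1.0 - relative_energy(0.9)   # about 0.19 for a 10% lower Vdd
limit = break_even_overhead(0.9)      # about 0.235
```

Under these assumptions, a 10% reduction in Vdd saves roughly 19% of the energy, and that saving disappears once detection and recovery add more than about 23.5% extra work.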

Energy Reduction

Reliability of the System

Conclusion
The energy consumption of CPUs can be reduced if we have efficient hardware support for Transactional Memory and for error detection.

Future Work: Combining DMR and Symptoms

Thanks!