Voltage Emergency Prediction: Using Signatures to Reduce Operating Margins V.J. Reddi, M.S. Gupta, G. Holloway, G. Wei, M.D. Smith, D. Brooks Presented by: Kelsey Rosenthal and Irving Olmedo

Motivation ● Current fluctuations cause voltage problems ● Conservative voltage margins (~20%) ● Wasted energy ● Unused performance potential

Targeted Problem ● Voltage emergencies ● Overly conservative margins ● Sensor-based throttling

Solution Goals ● Avoid voltage emergencies ● Operate within aggressive margins (~4%) ● Rollback on timing violations

Proposed Solution: Predictor ● Monitor events ● Predict emergencies ● Throttle to avoid ● Recover if mistaken
[Diagram labels: CPU, Predictor, Actuator, Checkpoint-Recovery; Monitor Control Flow and Microarchitectural Events; Throttle On / Off; Emergency Notification]

Runtime Example
[Figure labels: Emergency, Voltage, Current, Flush, ALU, Cache, Issue, Dispatch, Commit]

Voltage Signatures ● Recurring phases ● Locality ● Context of system ● uArch events (on/off) ● Control flow

Predictor Accuracy Tradeoff ● Types of events ● Number of events ● Space constraints ● PC anchor
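The tradeoffs on this slide (which event types to track, how many events to keep, where the PC comes in) can be made concrete with a small sketch. This is an illustration only, not the encoding used by the authors: the class name `SignatureTracker`, the event kinds, and the choice to tag every event with its PC are all assumptions made for the example.

```python
from collections import deque


class SignatureTracker:
    """Hypothetical sketch: keep the last N tracked events as a signature."""

    def __init__(self, num_events=4, tracked_kinds=("branch", "flush", "l2_miss")):
        # Which event types count toward the signature (assumed names).
        self.tracked_kinds = set(tracked_kinds)
        # A bounded window: older events fall off as new ones arrive.
        self.history = deque(maxlen=num_events)

    def record(self, kind, pc):
        """Record an event if its kind is tracked; ignore it otherwise."""
        if kind in self.tracked_kinds:
            self.history.append((kind, pc))

    def signature(self):
        """Return the current window as a hashable tuple."""
        return tuple(self.history)


tracker = SignatureTracker(num_events=2, tracked_kinds=("branch", "flush"))
tracker.record("branch", 0x400)
tracker.record("l1_miss", 0x404)  # untracked kind, so it is dropped
tracker.record("flush", 0x408)
tracker.record("branch", 0x40C)
current = tracker.signature()
```

Tracking more event kinds or a longer window makes a signature more specific, at the cost of more bits per stored entry, which is the space constraint the slide mentions.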

Emergency Avoidance ● Non-zero delay throttling ● Misprediction ● Sensor based learning
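A minimal sketch of how sensor-based learning and prediction might fit together, assuming a signature is simply the last few events and that a voltage sensor reports any emergency the predictor failed to anticipate. The names `EmergencyPredictor`, `observe`, and `on_sensor_emergency`, the event strings, and the rule that every unthrottled flush causes an emergency are invented for the example. Throttling delay and checkpoint recovery are not modeled.

```python
from collections import deque


class EmergencyPredictor:
    """Hypothetical sketch: learn signatures that preceded an emergency."""

    def __init__(self, history_length=4):
        self.history = deque(maxlen=history_length)
        self.learned = set()  # signatures seen just before an emergency

    def observe(self, event):
        """Record an event; return True if throttling should turn on."""
        self.history.append(event)
        return tuple(self.history) in self.learned

    def on_sensor_emergency(self):
        """Called when the sensor reports an emergency that was not predicted."""
        self.learned.add(tuple(self.history))


# Toy trace in which the same three-event pattern occurs twice.
trace = ["branch@0x40", "l2_miss", "flush", "branch@0x40", "l2_miss", "flush"]
predictor = EmergencyPredictor(history_length=3)
decisions = []
for event in trace:
    throttle = predictor.observe(event)
    decisions.append(throttle)
    # Assumption for the example: an unthrottled flush triggers an emergency.
    if event == "flush" and not throttle:
        predictor.on_sensor_emergency()
```

In this toy trace the first occurrence of the pattern goes unpredicted and is learned; the second occurrence matches the stored signature and requests throttling.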

Prediction Table Storage ● CAM ● Bloom filter ● Thresholds
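As an illustration of the Bloom filter option, here is a minimal sketch of a filter that stores signatures in a fixed-size bit array. It is not the hardware design from the paper: the class name `SignatureBloomFilter`, the bit-array size, the number of hash functions, and the use of SHA-256 are arbitrary choices for the example. The point is the tradeoff against a CAM: storage stays fixed, but a lookup can report a false positive (an unnecessary throttle), never a false negative for a signature that was added.

```python
import hashlib


class SignatureBloomFilter:
    """Hypothetical sketch: approximate set membership for signatures."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, signature):
        # Derive num_hashes bit positions by salting one hash function.
        data = repr(signature).encode("utf-8")
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, signature):
        for pos in self._positions(signature):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, signature):
        # True means "possibly stored"; False means "definitely not stored".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(signature)
        )


bloom = SignatureBloomFilter()
bloom.add(("branch@0x40", "l2_miss", "flush"))
hit = bloom.might_contain(("branch@0x40", "l2_miss", "flush"))
```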

Predictor Accuracy

How does it stack up

Scheme                                   Performance Gain (%)
Predictor Throttling
  Oracle                                 14.2
  Voltage Emergency Signature (VES)      13.5
  VES with 8KB Table                     11.1
  Microarchitectural Event               4.1
Ideal Sensor Throttling
  2% Soft Threshold                      2.2
  3% Soft Threshold                      9
Explicit Checkpoint and Recovery         -13
Delayed Commit and Rollback (DeCoR)      13
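One way to read the predictor rows is as a fraction of the oracle's gain. The short calculation below uses only the three figures from the comparison (14.2, 13.5, and 11.1); the helper name `fraction_of_oracle` is made up for the example.

```python
# Performance gains (%) taken from the comparison above.
gains = {"Oracle": 14.2, "VES": 13.5, "VES with 8KB table": 11.1}


def fraction_of_oracle(scheme):
    """Return a scheme's gain as a percentage of the oracle's gain."""
    return round(100 * gains[scheme] / gains["Oracle"], 1)


ves_share = fraction_of_oracle("VES")
ves_8kb_share = fraction_of_oracle("VES with 8KB table")
```

By this measure the signature predictor recovers roughly 95% of the oracle's gain, and the 8KB-table version roughly 78%.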

Conclusion ● uArch events create voltage signatures ● Predict emergencies with >90% confidence ● 11-13% performance improvement ● Aggressive voltage margins (4%)

Discussion ● How does this compare/differ from Razor? ● Is an 8KB CAM reasonable for this improvement? ● How relevant is this to a multithreaded/multicore system?