Q-Learning Based Dynamic Voltage Scaling for Designs with Graceful Degradation
Yu-Guang Chen1,2, Wan-Yu Wen1, Tao Wang2, Yiyu Shi2, and Shih-Chieh Chang1
1 Department of CS, National Tsing Hua University, Hsinchu, Taiwan
2 Department of ECE, Missouri University of Science and Technology, Rolla, MO, USA
Ladies and gentlemen, good morning. The topic of today's talk is "Critical Path Monitor Enabled Dynamic Voltage Scaling for Graceful Degradation in Sub-Threshold Designs." I am Yu-Guang Chen from National Tsing Hua University, Hsinchu, Taiwan. This is joint work with my friends Tao Wang, Kuan-Yu Lai, Wan-Yu Wen, and Prof. Yiyu Shi from Missouri University of Science and Technology, USA, and my advisor Shih-Chieh Chang from National Tsing Hua University, Hsinchu, Taiwan. The motivation of this work is as follows.

Outline
  Introduction and Motivation
  Q-Learning Based DVS Scheme
  Experimental Results
  Conclusions

Introduction and Motivation
Power consumption is a significant problem in modern IC designs.
Dynamic voltage scaling (DVS) can efficiently reduce operating power.
  Dynamically switches the operating voltage and/or operating frequency
  Workload, process, and environment variations

Introduction and Motivation
The key concept of DVS is to decide the optimal operating voltage for different scenarios.
Deterministic DVS schemes
  Construct a state table offline, based on various statistical analyses.
  The optimal voltage is determined from real-time feedback and the state table.

Introduction and Motivation
Constructing the state table offline is hard for two reasons:
  Many uncertainties are non-Gaussian and tightly correlated.
  Much information may not be known a priori.
Reinforcement learning based DVS schemes
  Dynamically adjust the policy at runtime, based on system performance, through various learning procedures.

Introduction and Motivation
Graceful degradation
  Allows timing errors to occur with a low probability
  Significantly reduces operating power
  Timing Error Probability (TEP)
Only a few prior works consider DVS with graceful degradation.

Introduction and Motivation
Critical Path Monitor (CPM)
  Measures critical path delays
  Dynamically reflects the influence of process and temperature variations

Introduction and Motivation
Motivation example
  Deterministic joint probability density function (JPDF) based DVS scheme for graceful degradation
  Calls for learning-based DVS schemes

Problem Formulation
Given: a chip with CPMs placed, the voltage candidates for DVS, a TEP bound, and a timing window length for TEP measurement.
Determine: the optimal operating voltages at runtime, based on the sampled slack from the CPM.
Goal: minimize the operating power.
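
One way to picture the TEP measurement over a timing window is as a sliding-window error rate. The sketch below is only an illustration under stated assumptions: a cycle is counted as a timing error when its sampled slack is non-positive, and the class name `TepWindow` and the window length of 4 are hypothetical, not taken from the talk.

```python
from collections import deque


class TepWindow:
    """Sliding-window estimate of the timing error probability (TEP).

    Assumption: a cycle is a timing error when its sampled slack is <= 0.
    """

    def __init__(self, window_length):
        # Only the most recent `window_length` cycles are kept.
        self.samples = deque(maxlen=window_length)

    def record(self, slack_ns):
        """Store whether this cycle's sampled slack indicates a timing error."""
        self.samples.append(slack_ns <= 0.0)

    def tep(self):
        """Fraction of cycles in the window that had a timing error."""
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)


window = TepWindow(window_length=4)
for slack in [0.2, -0.1, 0.3, 0.1, 0.4]:
    window.record(slack)
print(window.tep())  # one error among the last four cycles
```

The measured value would then be compared against the TEP bound given in the problem formulation.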

Framework
Construct a 2D state table
  Row: a particular operating voltage candidate
  Column: a particular reading from the CPM
  Score: for the corresponding combination of operating voltage and sampled slack from the CPM

Voltage\Slack  0.1ns  0.2ns  0.3ns  …  1.0ns
0.8V           1      2      5         10
0.9V           4      3      8
1.0V
1.1V
1.2V

Framework
Optimal operating voltage decision
  The DVS controller samples the slack from the CPM.
  It identifies the voltage candidate with the highest score in the corresponding column of the state table.
  It changes the operating voltage to that candidate.
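
The table lookup described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' controller: the function name `pick_voltage`, the nearest-column binning rule, and all scores in `table` are assumptions made for the example.

```python
def pick_voltage(table, voltages, slack_bins, sampled_slack):
    """Return the voltage candidate with the highest score in the slack column.

    table[i][k] is the score for voltage candidate i and slack column k.
    """
    # Assumed binning rule: map the sampled slack to the nearest column.
    col = min(range(len(slack_bins)),
              key=lambda k: abs(slack_bins[k] - sampled_slack))
    # Pick the row (voltage candidate) with the highest score in that column.
    row = max(range(len(voltages)), key=lambda i: table[i][col])
    return voltages[row]


voltages = [0.8, 0.9, 1.0]      # volts
slack_bins = [0.1, 0.2, 0.3]    # CPM readings in ns
table = [                       # illustrative scores only
    [1, 2, 5],
    [4, 3, 8],
    [2, 6, 1],
]

print(pick_voltage(table, voltages, slack_bins, 0.12))  # column 0.1ns
print(pick_voltage(table, voltages, slack_bins, 0.19))  # column 0.2ns
```

With these illustrative scores, a reading near 0.1 ns selects 0.9 V and a reading near 0.2 ns selects 1.0 V.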

Q-learning
Applies to Markov decision problems with unknown costs and transition probabilities.
State: a legal status
Action: a legal transition from one state to another
Q-table: stores a Q-value for each state-action pair
  A Q-value is the expected pay-off from choosing the given action from that state.
  Q-values are updated through reward and penalty policies.

Q-learning Based DVS Scheme
State: a combination of an operating voltage and a sampled slack.
Action: a voltage transition under the same sampled slack.
Q-table: stores the Q-values for changing the operating voltage under the same sampled slack.

Q-learning Based DVS Scheme
State $T_{ik} = (V_i, S_k)$: operating voltage $V_i$ and sampled slack $S_k$
Action $A_{ijk} = (T_{ik}, T_{jk})$: voltage scaling from $V_i$ to $V_j$
Entry of Q-table $Q_{ik}$: Q-value for switching from state $T_{ik}$ to state $T_{jk}$ (taking action $A_{ijk}$)
Reward
$$R(A_{ijk}) = \mathrm{Norm}\big(\Delta PR(A_{ijk})\big) = \frac{V_i^2 - V_j^2}{V_{max}^2 - V_{min}^2}$$
$\Delta PR(A_{ijk})$ is the power reduction from action $A_{ijk}$.
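
As a quick numerical check of the reward, the normalized power reduction can be computed directly from the voltages. The sketch below assumes the reward is the difference of squared voltages divided by the squared-voltage range, as stated on the slide; the function name `reward` is hypothetical, and the example voltages are the candidates listed in the experimental setup.

```python
def reward(v_i, v_j, v_min, v_max):
    """Normalized power reduction for scaling the voltage from v_i to v_j.

    Positive when the voltage is lowered, negative when it is raised.
    """
    return (v_i ** 2 - v_j ** 2) / (v_max ** 2 - v_min ** 2)


V_MIN, V_MAX = 0.8, 1.2  # lowest and highest voltage candidates

print(reward(1.2, 0.8, V_MIN, V_MAX))  # largest possible reduction
print(reward(1.0, 0.9, V_MIN, V_MAX))  # a single step down
print(reward(0.8, 1.2, V_MIN, V_MAX))  # raising the voltage costs power
```

Scaling from the highest to the lowest candidate gives a reward of 1, staying at the same voltage gives 0, and raising the voltage gives a negative reward.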

Q-learning Based DVS Scheme
Penalty
  Prevents the TEP ($E_c$) from exceeding the TEP bound ($E_b$).
Abrupt penalty
  A constant and large penalty
Linearly graded penalty
  The penalty increases linearly

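Q-learning Based DVS Scheme
Penalty: $P(A_{ijk})$ is the penalty of $A_{ijk}$
Abrupt penalty
$$P(A_{ijk}) = \mathrm{Norm}\begin{cases} \varepsilon, & \text{if } E_c < E_b - \rho \\ \sigma R(A_{ijk}), & \text{if } E_c \ge E_b - \rho \end{cases}$$
$\varepsilon$ is a small constant.
$\rho$ is a small positive constant set as a margin.
$\sigma$ is a constant.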
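
The abrupt penalty amounts to a two-way branch on the measured TEP. The following sketch is only an illustration: it treats Norm as the identity, and the default values of `eps`, `rho`, and `sigma` are hypothetical, since the talk does not give them.

```python
def abrupt_penalty(e_c, e_b, r, eps=0.01, rho=0.002, sigma=10.0):
    """Abrupt penalty for an action with reward r.

    e_c: current TEP, e_b: TEP bound.
    Assumptions: Norm is the identity; eps, rho, sigma are illustrative.
    """
    if e_c < e_b - rho:
        return eps          # safely below the bound: small constant penalty
    return sigma * r        # at or beyond the margin: large penalty


print(abrupt_penalty(e_c=0.005, e_b=0.01, r=0.5))  # below the margin
print(abrupt_penalty(e_c=0.009, e_b=0.01, r=0.5))  # inside the margin
```

The margin means the large penalty already applies slightly before the TEP reaches the bound itself.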

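Q-learning Based DVS Scheme
Linearly graded penalty
$$P(A_{ijk}) = \mathrm{Norm}\begin{cases} \varepsilon, & \text{if } E_c < \dfrac{\varepsilon - \sigma R(A_{ijk})}{\gamma} + (E_b - \rho) \\ -\gamma\big((E_b - \rho) - E_c\big) + \sigma R(A_{ijk}), & \text{if } \dfrac{\varepsilon - \sigma R(A_{ijk})}{\gamma} + (E_b - \rho) \le E_c < E_b - \rho \\ \sigma R(A_{ijk}), & \text{if } E_c \ge E_b - \rho \end{cases}$$
$\gamma$ is the grading factor.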
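
A small sketch may help make the three regions of the graded penalty concrete. It assumes the penalty ramps linearly with slope gamma from eps up to sigma times the reward as the measured TEP approaches the bound minus the margin rho, so that the three pieces join continuously. Norm is treated as the identity, and all default constants are hypothetical.

```python
def graded_penalty(e_c, e_b, r, eps=0.01, rho=0.002, sigma=10.0, gamma=1000.0):
    """Linearly graded penalty for an action with reward r.

    e_c: current TEP, e_b: TEP bound, gamma: grading factor (ramp slope).
    Assumptions: Norm is the identity; constants are illustrative.
    """
    limit = e_b - rho
    # TEP at which the linear ramp leaves the small constant eps.
    ramp_start = (eps - sigma * r) / gamma + limit
    if e_c >= limit:
        return sigma * r
    if e_c < ramp_start:
        return eps
    return -gamma * (limit - e_c) + sigma * r


for e_c in (0.001, 0.0055, 0.009):
    print(e_c, graded_penalty(e_c, e_b=0.01, r=0.5))
```

Under these assumptions the penalty stays at eps for a low TEP, rises linearly once the TEP passes the ramp start, and saturates at the same value as the abrupt penalty beyond the margin.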

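Q-learning Based DVS Scheme
Q-values update policy
$$Q_{ik} = (1 - \alpha)\, Q_{ik} + \alpha \big( R(A_{ijk}) - P + Q_{jk} \big)$$
$\alpha$ denotes the learning rate.
$P$ is defined as
$$P = \begin{cases} 0, & \text{if } S_{k'} \text{ of } T_{jk'} > 0 \\ P(A_{ijk}), & \text{if } S_{k'} \text{ of } T_{jk'} \le 0 \end{cases}$$
$S_{k'}$ is the sampled slack after voltage scaling.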
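
The update policy can be written as a single function over a Q-table stored as a list of rows. This is a minimal sketch under the slide's definitions: the penalty only takes effect when the slack sampled after scaling is non-positive. The function name `update_q` and the learning rate of 0.5 are assumptions for the example.

```python
def update_q(q, i, j, k, r, penalty, slack_after, alpha=0.5):
    """Update Q[i][k] after scaling from voltage i to voltage j in slack column k.

    r: reward of the action, penalty: penalty of the action,
    slack_after: slack sampled after the voltage change.
    """
    # The penalty applies only if the new slack indicates a timing error.
    p = penalty if slack_after <= 0 else 0.0
    q[i][k] = (1 - alpha) * q[i][k] + alpha * (r - p + q[j][k])
    return q[i][k]


q = [[0.0, 0.0], [0.0, 0.0]]
print(update_q(q, i=1, j=0, k=0, r=0.5, penalty=5.0, slack_after=0.1))
print(update_q(q, i=1, j=0, k=0, r=0.5, penalty=5.0, slack_after=-0.1))
```

The first call sees positive slack and moves the entry toward the reward; the second sees a timing error and pulls the same entry strongly negative.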

Q-learning Based DVS Scheme
Summary
Step 1: When the Q-learning process starts, initialize all the Q-values in the Q-table to 0.
Step 2: Denote the current state as $T_{ik}$. Find the action $A_{ij_0k}$ with the highest $Q_{jk}$ over all eligible $j$. Switch to $V_{j_0}$.
Step 3: Evaluate and update the TEP. Calculate the corresponding reward $R(A_{ijk})$ and penalty $P(A_{ijk})$. Then update $Q_{ik}$.
Step 4: Set the current state to $T_{jk'}$, and go to Step 2 when the next cycle starts.
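
The four steps can be strung together into a small simulation loop. The sketch below is a toy under loudly stated assumptions, not the authors' implementation: `toy_slack` is a made-up stand-in for the CPM, the slack bins and all constants are hypothetical, Norm is treated as the identity, the abrupt penalty is used, and the TEP is estimated over a sliding window of recent cycles.

```python
import random
from collections import deque

VOLTAGES = [0.8, 0.9, 1.0, 1.1, 1.2]      # candidates from the experiments
SLACK_BINS = [-0.1, 0.0, 0.1, 0.2, 0.3]   # hypothetical CPM readings (ns)


def toy_slack(v, rng):
    """Hypothetical CPM model: slack grows with voltage, plus noise."""
    return 0.8 * (v - 0.9) + rng.uniform(-0.05, 0.05)


def to_bin(slack):
    """Map a slack value to the nearest column (assumed binning rule)."""
    return min(range(len(SLACK_BINS)), key=lambda k: abs(SLACK_BINS[k] - slack))


def run(cycles=2000, e_b=0.01, rho=0.002, eps=0.01, sigma=10.0,
        alpha=0.1, window=100, seed=0):
    rng = random.Random(seed)
    v_min, v_max = VOLTAGES[0], VOLTAGES[-1]
    # Step 1: initialize all Q-values to 0.
    q = [[0.0] * len(SLACK_BINS) for _ in VOLTAGES]
    errors = deque(maxlen=window)
    e_c = 0.0
    i = len(VOLTAGES) - 1                  # start at the highest voltage
    k = to_bin(toy_slack(VOLTAGES[i], rng))
    for _ in range(cycles):
        # Step 2: pick the voltage with the highest Q-value in column k.
        j = max(range(len(VOLTAGES)), key=lambda n: q[n][k])
        slack = toy_slack(VOLTAGES[j], rng)
        # Step 3: update the TEP, then compute reward, penalty, and Q[i][k].
        errors.append(slack <= 0)
        e_c = sum(errors) / len(errors)
        r = (VOLTAGES[i] ** 2 - VOLTAGES[j] ** 2) / (v_max ** 2 - v_min ** 2)
        penalty = eps if e_c < e_b - rho else sigma * r
        p = penalty if slack <= 0 else 0.0
        q[i][k] = (1 - alpha) * q[i][k] + alpha * (r - p + q[j][k])
        # Step 4: the new state is the chosen voltage and the new slack column.
        i, k = j, to_bin(slack)
    return q, e_c


q_table, final_tep = run(cycles=500)
print(final_tep)
```

The loop follows the steps literally and always takes the highest-valued action, so it does no exploration; whether and how the original scheme explores is not stated in the talk.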

Experimental Results
Three industrial designs with a 45nm library
Machine: 8-core, 2.40 GHz Intel Xeon E5620 CPU with 32 GB memory, running CentOS release 5.9
Voltage candidates are set to 0.8V, 0.9V, 1.0V, 1.1V, and 1.2V.
Temperature varies from 20°C to 35°C.

Experimental Results
Performance [table: stepping-based and JPDF-based schemes; power in µW]

Experimental Results
Different TEP bounds vs. TEP achieved

Conclusions
We have proposed a Q-learning based DVS scheme dedicated to designs with graceful degradation.
With a TEP bound of 0.01, the proposed Q-learning based scheme achieves up to 83.9% and 29.1% power reduction compared with the stepping-based and JPDF-based schemes, respectively.

Thank You
Q&A
Thanks a lot for your attention. If you are interested in this work, please come to my booth after this session and I can give more details about it.