Run-Time Power-Down Strategies for Real-Time SDRAM Memory Controllers. Karthik Chandrasekar (TU Delft), Benny Akesson and Kees Goossens (TU Eindhoven).

Run-Time Power-Down Strategies for Real-Time SDRAM Memory Controllers
Karthik Chandrasekar (1), Benny Akesson (2), and Kees Goossens (2)
(1) TU Delft and (2) TU Eindhoven, The Netherlands
Karthik Chandrasekar, TU Delft

Goal: save power with no performance impact. Context here: SDRAM memories.

Problem Statement & Proposed Solutions

- SDRAMs contribute significantly to the SoC energy profile, even when idle.
- Powering down impacts performance, due to power-up latencies.
- Existing SDRAM memory controllers provide:
  - Either low power consumption or real-time performance, but not both.
  - Other existing real-time low-power solutions use compile-time information and are not suitable for run-time use in a memory controller.
- We propose:
  - Run-time power optimization solutions for real-time SDRAM controllers.
- We guarantee:
  - Significant energy savings without impacting bandwidth guarantees.
- We support:
  - SDRAM memory controllers using predictable arbiters such as Round-Robin, Time Division Multiplexing (TDM), and priority-based arbiters.

Arbiters, Requests & Guarantees

- Predictable arbiters such as Round-Robin and TDM provide:
  - Maximum latency bounds
  - A minimum bandwidth guarantee
- These performance guarantees are based on:
  - Request sizes and Service Cycle Length (SCL)
  - The smallest SCL (min_SCL) defines the Scheduling Interval (SI) and the idle SCL.
  - The longest SCL (max_SCL) defines the guaranteed net bandwidth.
- Example: Micron 1Gb DDR3-800 using closed-page BC-4, BI-1 for 64B requests.

Deriving Latency-Rate Arbiter Guarantees

- A latency-rate arbiter guarantees a requester:
  - A maximum latency bound
  - A minimum bandwidth guarantee
- Deriving the guarantees for requester R1, when backlogged, under a Round-Robin arbiter (x is the number of interfering requesters):
  - Maximum latency bound: Θ = t_BLOCK + (x + 1) * max_SCL + t_REFRESH
  - Net bandwidth: Net_BW = num(max_SCL) * request size / t_REFI
  - Minimum guaranteed bandwidth: β = ρ * Net_BW
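The three formulas above can be evaluated with a few lines of Python. This is a minimal sketch, not the authors' tool: all timing values are illustrative placeholders in clock cycles rather than datasheet figures, and `num_service_cycles` encodes an assumption that num(max_SCL) counts the whole worst-case service cycles that fit in one refresh interval once the refresh time is subtracted. The slide does not spell out that definition.

```python
def latency_bound(t_block, x, max_scl, t_refresh):
    """Theta = t_BLOCK + (x + 1) * max_SCL + t_REFRESH (all in cycles)."""
    return t_block + (x + 1) * max_scl + t_refresh


def num_service_cycles(t_refi, t_refresh, max_scl):
    """ASSUMPTION: num(max_SCL) is the number of whole max_SCL periods that
    fit in a refresh interval after the refresh itself has been served."""
    return (t_refi - t_refresh) // max_scl


def net_bandwidth(num_max_scl, request_size, t_refi):
    """Net_BW = num(max_SCL) * request size / t_REFI (bytes per cycle)."""
    return num_max_scl * request_size / t_refi


def guaranteed_bandwidth(rho, net_bw):
    """beta = rho * Net_BW, with rho the requester's allocated rate."""
    return rho * net_bw


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the formulas.
    t_block, max_scl, t_refresh, t_refi = 40, 40, 44, 3120
    x, request_size, rho = 3, 64, 0.25

    theta = latency_bound(t_block, x, max_scl, t_refresh)
    n = num_service_cycles(t_refi, t_refresh, max_scl)
    bw = net_bandwidth(n, request_size, t_refi)
    print("Theta (cycles):", theta)
    print("Net_BW (bytes/cycle):", bw)
    print("beta (bytes/cycle):", guaranteed_bandwidth(rho, bw))
```

Multiplying the bytes-per-cycle results by the memory clock frequency converts them to bytes per second.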

Proposed Real-Time Power-Down Strategies

- Conservative Power-Down
  - Always powers up within the Scheduling Interval (SI).
- Aggressive Power-Down
  - Powers up only when required, snooping for pending requests at SI - t_PUP.
  - What if the request arrives after the snooping point? The request misses its slot.
  - Only the latency bound increases; the bandwidth guarantee is not affected.
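The difference between the two strategies comes down to what happens at the snooping point of an idle scheduling interval. The sketch below models one such interval in Python. The names `Timing`, `powers_up_at_snoop`, `powered_down_cycles`, and `extra_latency` are hypothetical, and so are the numeric values. The handling of a late arrival (power-up starts as soon as the request arrives) is an assumption made for illustration; the slide only states that the request misses its slot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Timing:
    si: int     # scheduling interval length in cycles (hypothetical value)
    t_pup: int  # power-up latency in cycles (hypothetical value)

    @property
    def snoop_point(self) -> int:
        # Latest cycle at which power-up can start and still finish by SI.
        return self.si - self.t_pup


def powers_up_at_snoop(policy: str, request_pending: bool) -> bool:
    """Decision taken at the snooping point of an idle interval."""
    if policy == "conservative":
        return True             # always ready at the next scheduling point
    if policy == "aggressive":
        return request_pending  # power up only when required
    raise ValueError(f"unknown policy: {policy}")


def powered_down_cycles(policy: str, t: Timing, arrival: Optional[int]) -> int:
    """Cycles spent powered down in one idle interval that starts powered down.
    `arrival` is the request's cycle offset in [0, si), or None if none arrives."""
    pending = arrival is not None and arrival <= t.snoop_point
    if powers_up_at_snoop(policy, pending):
        return t.snoop_point
    if arrival is None:
        return t.si
    # ASSUMPTION: a late arrival triggers power-up immediately.
    return arrival


def extra_latency(policy: str, t: Timing, arrival: int) -> int:
    """Extra wait past the scheduling point; always below t_pup for arrival < si."""
    if policy == "conservative" or arrival <= t.snoop_point:
        return 0
    return arrival + t.t_pup - t.si


if __name__ == "__main__":
    timing = Timing(si=40, t_pup=10)
    for policy in ("conservative", "aggressive"):
        print(policy, powered_down_cycles(policy, timing, None),
              extra_latency(policy, timing, 35))
```

Under these assumptions the aggressive policy stays powered down for the whole interval when no request shows up, which is where its additional energy saving comes from, at the cost of a bounded latency penalty for late arrivals.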

Impact on Θ and β

- Guarantees under analysis:
  - Latency bound: Θ = t_BLOCK + (x + 1) * max_SCL + t_REFRESH
  - Net bandwidth: Net_BW = num(max_SCL) * request size / t_REFI
  - Bandwidth guarantee: β = ρ * Net_BW
- Conservative Power-Down
  - Θ does not change.
  - max_SCL does not change.
- Aggressive Power-Down
  - Θ increases by t_PUP.
  - max_SCL does not change.
- Speculative Power-Down
  - max_SCL increases.
  - Θ increases depending on the number of interfering requesters (x).
  - Net_BW and β decrease significantly, depending on the increase in max_SCL.
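To see why a longer max_SCL hurts both guarantees while a fixed t_PUP penalty touches only Θ, the formulas can be evaluated side by side. The Python sketch below is illustrative only: every number is a placeholder, `scl_increase` is a hypothetical amount by which speculative power-down lengthens max_SCL, and num(max_SCL) is again assumed to be the count of whole service cycles that fit in a refresh interval after the refresh time. It does not attempt to reproduce the measured results of the presentation.

```python
def theta(t_block, x, max_scl, t_refresh, extra=0):
    """Latency bound, with an optional fixed penalty `extra` (e.g. t_PUP)."""
    return t_block + (x + 1) * max_scl + t_refresh + extra


def net_bw(max_scl, request_size, t_refi, t_refresh):
    """Net bandwidth in bytes/cycle.
    ASSUMPTION: num(max_SCL) = (t_REFI - t_REFRESH) // max_SCL."""
    return ((t_refi - t_refresh) // max_scl) * request_size / t_refi


def pct_change(new, old):
    return 100.0 * (new - old) / old


def compare(t_block, x, max_scl, t_refresh, t_refi, request_size,
            t_pup, scl_increase):
    """Percentage change of Theta and Net_BW relative to no power-down.
    beta = rho * Net_BW scales by the same percentage as Net_BW."""
    base_theta = theta(t_block, x, max_scl, t_refresh)
    base_bw = net_bw(max_scl, request_size, t_refi, t_refresh)
    spec_scl = max_scl + scl_increase  # speculative: max_SCL grows
    return {
        "aggressive": {
            # Theta grows by t_PUP; max_SCL and hence Net_BW are unchanged.
            "theta_pct": pct_change(
                theta(t_block, x, max_scl, t_refresh, extra=t_pup), base_theta),
            "bw_pct": pct_change(base_bw, base_bw),
        },
        "speculative": {
            # The longer max_SCL is multiplied by (x + 1) in Theta.
            "theta_pct": pct_change(
                theta(t_block, x, spec_scl, t_refresh), base_theta),
            "bw_pct": pct_change(
                net_bw(spec_scl, request_size, t_refi, t_refresh), base_bw),
        },
    }


if __name__ == "__main__":
    result = compare(t_block=40, x=3, max_scl=40, t_refresh=44, t_refi=3120,
                     request_size=64, t_pup=6, scl_increase=6)
    for policy, impact in result.items():
        print(policy, impact)
```

Because the max_SCL increase is multiplied by (x + 1) in Θ, the speculative penalty grows with the number of requesters, whereas the aggressive penalty stays at t_PUP.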

Impact on Energy & Performance

Setup: 4 requesters/applications, Round-Robin arbiter, Micron 1Gb DDR3-800, 64B requests.

- Worst-case impact:
  - Θ increase: Aggressive PD 2.4%, Speculative PD 12.3%
  - β decrease: Aggressive PD 0.0%, Speculative PD 12.1%
- Average execution time penalty: Aggressive PD 0.25%, Speculative PD 1.32%
- Energy savings: Conservative PD 42.1%, Aggressive PD 51.3%, Theoretical best PD 51.4%

Summary  Proposed two real-time power-down strategies:  Conservative Latency-Bandwidth-Neutral and Aggressive Bandwidth-Neutral  If memory goes idle, it powers-down (if it is gainful to power-down). Run-time, it checks if the memory can go to or continue to be in power-down.  Evaluated their impact on:  Latency Bounds (Θ)  Bandwidth Guarantee (β)  Compared them against:  Speculative power-down  Theoretical best power-down  Showed impact on:  Real-time performance guarantees  Average-case execution time and energy savings For more details: Please visit my poster! 9