1
An Energy-efficient Task Scheduler for Multi-core Platforms with per-core DVFS Based on Task Characteristics
Ching-Chi Lin, You-Cheng Syu, Chao-Jui Chang, Jan-Jan Wu, Pangfeng Liu, Po-Wen Cheng and Wei-Te Hsu
2
Introduction Green computing is imperative – Growing number of computers – Rising energy costs – Rising carbon emissions
3
Motivation
Main technologies for improving energy efficiency ◦ Hardware level: low-power devices ◦ System level: power-management mechanisms at different levels ◦ Application level: consolidation with virtualization
Power-management mechanisms ◦ Circuit level: clock gating ◦ System level: DPM ◦ Processor level: DVFS/DFS/DVS, C-states
The common goal is to shut down unused components or circuits
4
Task Execution Modes Batch Mode – Batches of jobs Online Mode – Different time constraints – Interactive and non-interactive tasks – e.g. online judging system
5
Contributions
A task scheduling strategy that solves three important issues simultaneously: – assignment of tasks to CPU cores – execution order of tasks – CPU processing rate for the execution of each task
A task model, a CPU processing rate model, an energy consumption model, and a cost function.
Workload Based Greedy (WBG), an algorithm for executing tasks in the batch mode.
Least Marginal Cost (LMC), a heuristic algorithm for executing tasks in the online mode; LMC assigns interactive and non-interactive tasks to cores.
6
MODELS
Task Model – j_k = (L_k, A_k, D_k), where L_k is the number of CPU cycles required to complete j_k, A_k is the arrival time of j_k, and D_k is the deadline of j_k. If j_k has a specific deadline, then D_k > A_k ≥ 0.
Processing Rate – Let P = {p_1, p_2, p_3, ...} be a non-empty set of discrete processing rates a core can use, determined by the hardware, with 0 < p_1 < p_2 < p_3 < ... < p_|P|. – We use p_{j_k} from the set P to denote the processing rate of task j_k.
Energy Consumption – For a task j_k, let e_k be its energy consumption, t_k its execution time, and p_{j_k} its processing rate. – We define E(p) and T(p) as the energy and the time required to execute one cycle at processing rate p on a CPU core.
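A compact restatement of the model above, as a sketch; it assumes a task runs at a single fixed rate for its entire length, which the slide does not state explicitly.

```latex
% Per-task execution time and energy under a fixed processing rate p_{j_k} \in P
t_k = L_k \cdot T(p_{j_k}), \qquad e_k = L_k \cdot E(p_{j_k}),
\qquad P = \{\, p_1 < p_2 < \dots < p_{|P|} \,\}
```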
7
TASK SCHEDULING IN THE BATCH MODE
Tasks with Deadlines / Deadline-SingleCore
Partition problem: let A = {a_1, ..., a_n} be a set of positive integers.
– Theorem: Deadline-SingleCore is NP-complete.
Proof sketch: create n tasks j_1, ..., j_n, where the number of cycles needed for task j_i is L_i = a_i.
S = a_1 + ... + a_n is the total number of cycles needed to finish all n tasks.
Let T(p_l) = 2, T(p_h) = 1, E(p_l) = 1, E(p_h) = 4, so E(p) ∝ 1/T(p)².
Set the deadline to 1.5S and the energy budget to 2.5S.
All n tasks can finish within 1.5S time and 2.5S energy iff the tasks run at the high rate have cycle counts summing to exactly S/2, i.e. iff A can be partitioned into two subsets of equal sum.
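The arithmetic behind the two constraints, spelled out as a worked sketch: let x be the number of cycles executed at the high rate p_h, so the remaining S − x cycles run at p_l.

```latex
\text{time}   = x\,T(p_h) + (S-x)\,T(p_l) = 2S - x \le 1.5S \;\Rightarrow\; x \ge S/2 \\
\text{energy} = x\,E(p_h) + (S-x)\,E(p_l) = S + 3x  \le 2.5S \;\Rightarrow\; x \le S/2
```

Both constraints hold only when x = S/2, which corresponds exactly to a Partition solution.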
8
Tasks without Deadlines on a Single-Core Platform
The cost function must consider both the energy consumption and the execution time.
– Energy cost: C_{k,e} = R_e · L_k · E(p_{j_k})
– Temporal cost
C_k is the cost of task j_k, and C is the total cost over all tasks.
9
Tasks without Deadlines on a Single-Core Platform
Amount of delay that a task causes for other tasks
10
Dominating Position Set/Range
D_p is the "dominating position set" of p
11
Scheduling Tasks without Deadlines on Multi-core Platforms
Scheduling tasks – Homogeneous multi-core systems: all cores share the same energy- and time-consumption functions, so tasks are assigned with a round-robin technique – Heterogeneous multi-core systems: cores have different energy- and time-consumption functions, so tasks are assigned in a greedy manner (see the sketch below)
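A minimal sketch of the two assignment policies just named. Task, Core, and the addedCost function are hypothetical placeholders, not the paper's data structures; the slide only states that homogeneous systems use round-robin assignment and heterogeneous systems assign greedily.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

class Task {
    final long cycles;                       // L_k: CPU cycles the task needs
    Task(long cycles) { this.cycles = cycles; }
}

class Core {
    final List<Task> queue = new ArrayList<>();
    void enqueue(Task t) { queue.add(t); }
}

class AssignmentSketch {
    // Homogeneous cores share the same E(p)/T(p) functions, so round-robin keeps loads balanced.
    static void roundRobin(List<Task> tasks, List<Core> cores) {
        int next = 0;
        for (Task t : tasks) {
            cores.get(next).enqueue(t);
            next = (next + 1) % cores.size();
        }
    }

    // Heterogeneous cores differ, so each task goes to the core where it adds the least cost.
    static void greedy(List<Task> tasks, List<Core> cores,
                       BiFunction<Task, Core, Double> addedCost) {
        for (Task t : tasks) {
            Core best = cores.get(0);
            double bestCost = addedCost.apply(t, best);
            for (Core c : cores) {
                double cost = addedCost.apply(t, c);
                if (cost < bestCost) { best = c; bestCost = cost; }
            }
            best.enqueue(t);
        }
    }
}
```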
12
TASK SCHEDULING IN THE ONLINE MODE
e.g. an online judging system
Interactive tasks and non-interactive tasks
The system can be homogeneous multi-core or heterogeneous multi-core
Interactive tasks have higher priority than non-interactive tasks; tasks are assigned to cores by marginal cost (see below)
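One hedged reading of "marginal cost" as used here (the paper's exact definition is not shown on the slide): the increase in total cost from adding task j_k to the queue Q_c of core c.

```latex
MC(j_k, c) \;=\; C\bigl(Q_c \cup \{j_k\}\bigr) \;-\; C\bigl(Q_c\bigr)
```

Under this reading, LMC would place each arriving task on the core with the smallest MC, with interactive tasks taking priority over queued non-interactive ones.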
13
Dynamic Task Insertion and Deletion
15
COCA: Computation Offload to Clouds using AOP Hsing-Yu Chen, Yue-Hsun Lin, and Chen-Mou Cheng
16
Introduction
Computation Offload – Not mobile cloud
AOP Approach – COCA works at the source level vs. binary-level approaches – In a binary-level approach, offloading can be made transparent to the application programmer – But this benefit becomes less important in cloud computing
17
Background
Aspect-Oriented Programming – Increases modularity by allowing the separation of cross-cutting concerns – Entails breaking program logic down into distinct parts
18
Background
AspectJ – Allows programmers to define "aspects" An aspect provides pointcuts and the corresponding advice for specific functions – The main advice kinds used in COCA: before, after, around (see the example below)
AspectJ for Android – No official support for Android yet – Major change: alter the compilation phase to replace the Android Java compiler with AspectJ
Dynamic Loading of Java Classes – Compiled Java bytecode (.class) can be loaded and run on a JVM dynamically at runtime
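For readers unfamiliar with the terms above, a minimal annotation-style AspectJ example (not code from COCA; the Solver class and method names are hypothetical) showing a pointcut plus an around advice:

```java
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;

@Aspect
public class TimingAspect {
    // Pointcut: selects executions of any public method of the (hypothetical) Solver class.
    @Pointcut("execution(public * com.example.Solver.*(..))")
    void solverMethods() {}

    // Around advice: runs in place of the matched method and decides when to proceed.
    @Around("solverMethods()")
    public Object time(ProceedingJoinPoint pjp) throws Throwable {
        long start = System.nanoTime();
        Object result = pjp.proceed();                  // invoke the original method
        System.out.println(pjp.getSignature() + " took "
                           + (System.nanoTime() - start) + " ns");
        return result;
    }
}
```

Before and after advice look similar but use @Before/@After and, unlike around advice, cannot replace or skip the original call.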
19
Design of COCA
20
Profile Stage
1. Mark all pure functions
2. Evaluate the processing time and required memory footprint of each function – The result of profiling is summarized in a report – Allows evaluation in an emulated environment – Allows automating the selection process by integrating COCA with existing program-partitioning schemes
21
Build Stage
1. Divide the original Java source code into 'to offload' and 'not to offload' parts – The programmer selects the target functions to offload; COCA then selects the dependent classes
2. Translate the code into AspectJ code – The filtered Java classes are compiled to JVM bytecode – Results: a jar file for the cloud server and an apk installation file for Android
22
Register Stage
Assumption – The user already has an account on an existing cloud service (Amazon EC2)
Process – Run the COCA server daemon in the cloud – Upload the compiled bytecode, packaged in jar files, to the cloud – The daemon authenticates the user and loads the classes from the jar file via dynamic loading (a sketch follows)
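A minimal sketch of the dynamic-loading step, using the standard URLClassLoader and reflection. The jar path, class name, and method are hypothetical, and COCA's actual daemon, authentication, and database lookup are not shown on the slides.

```java
import java.io.File;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

public class JarLoadingSketch {
    public static void main(String[] args) throws Exception {
        // Path to a jar previously uploaded by the client (hypothetical location).
        File jar = new File("uploads/offloaded-code.jar");
        try (URLClassLoader loader = new URLClassLoader(
                new URL[] { jar.toURI().toURL() },
                JarLoadingSketch.class.getClassLoader())) {
            // Load a class from the jar at runtime and invoke one of its methods reflectively.
            Class<?> cls = loader.loadClass("com.example.OffloadedSolver");
            Object solver = cls.getDeclaredConstructor().newInstance();
            Method solve = cls.getMethod("solve");
            System.out.println("result: " + solve.invoke(solver));
        }
    }
}
```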
23
Running
1. Launch the corresponding program
2. COCA requests computation offload
3. The server retrieves the related classes from the database and loads the target classes
4. It performs the computation by calling the appropriate functions
5. The result is sent back to the smartphone
24
Experimental Evaluation
Overhead of AspectJ on Android – Target device: HTC Tattoo smartphone with a Qualcomm MSM7225 (528 MHz) – First approach: compare the latency of function calls with/without AspectJ Before/after advice – 195 ns per call Around advice – 290 ns per call – Second approach: the Android sample application "Amazed" – The overhead introduced by AspectJ is negligible
25
Experimental Evaluation Real-world Android Chess Game case – AI Capability Enhancement
26
Experimental Evaluation
Communication Cost
3G network: 120/509 kbps (up/down)
Transmitted data: 30 KB
COCA should work very well on current Wi-Fi networks
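A back-of-the-envelope check of the transfer time implied by these numbers (a sketch; it assumes the 30 KB travels in one direction at a time and ignores latency):

```latex
30\ \text{KB} \approx 240\ \text{kbit}:\quad
\frac{240\ \text{kbit}}{120\ \text{kbit/s}} = 2\ \text{s (uplink)},\qquad
\frac{240\ \text{kbit}}{509\ \text{kbit/s}} \approx 0.47\ \text{s (downlink)}
```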
27
Experimental Evaluation
Energy Saving – Measured with a Monsoon power monitor – Experiment on the Honzovy šachy AI computation – 56% energy reduction
28
Discussions
Arguments for working at the source level
– Additional overhead: none for the developer if they already code in AOP; users need to install a patched VM
– Modularized source code: the developer can cleanly isolate the mobile-side design from the cloud-side design, making maintenance much easier
29
Discussions
Pure vs. non-pure functions – Non-pure functions tend to access global variables (including primitive variables) and make static object calls – Offloading them requires synchronizing the function's state with the remote object – Serialization carries a severe cost
30
Discussions Potential Application – 3D image rendering 3D Games on mobile Related solutions – NVIDIA RealityServer – OTOY’s streaming platform – Amazon EC2 - EnFuzion
31
Related works