Published by Ralf Baldwin. Modified over 9 years ago.
1
Data Placement and Task Scheduling in the Cloud, Online and Offline
2014.11.27
Zhao Qing, Tianjin University of Science and Technology
zhaoqingtj@tust.edu.cn
2
Motivation
●Increase response speed and throughput
●Guarantee QoS
●Energy-efficient and green computing
3
Overview
●Data placement for data-intensive applications
●Task scheduling for QoS and energy efficiency
●Online task scheduling
4
1. Data Placement for Data-Intensive Applications
●Data clustering based on data correlation
—If any two data items are placed on different nodes, how much would the data transfer volume increase?
—BEA (Bond Energy Algorithm), hierarchical clustering tree
●Objective: place closely related data items together so as to decrease data transfers
●Contributions:
1. Introduced data size factors
2. Proposed “First Order Conduction Correlation”, derived from intermediate data
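The question on this slide (how much extra transfer results from separating two data items) can be illustrated with a small sketch. It is only an assumption-laden illustration, not the method from the talk: the function names `correlation_matrix` and `split_cost` are hypothetical, and the size weighting (each task that uses both items contributes the smaller of the two sizes) is an assumed stand-in for the "data size factors" mentioned above. The BEA step and the hierarchical clustering tree are not shown.

```python
from itertools import combinations


def correlation_matrix(tasks, sizes):
    """Size-weighted correlation between every pair of datasets.

    tasks: dict mapping task name -> set of dataset names the task uses
    sizes: dict mapping dataset name -> size (any consistent unit)
    """
    names = sorted(sizes)
    corr = {a: {b: 0.0 for b in names} for a in names}
    for used in tasks.values():
        for a, b in combinations(sorted(used), 2):
            # Assumed weighting: a task that needs both items would have to
            # move at least the smaller one if they sat on different nodes.
            weight = min(sizes[a], sizes[b])
            corr[a][b] += weight
            corr[b][a] += weight
    return corr


def split_cost(corr, group_a, group_b):
    """Correlation mass cut when the two groups go to different nodes."""
    return sum(corr[a][b] for a in group_a for b in group_b)


tasks = {"t1": {"d1", "d2"}, "t2": {"d1", "d2", "d3"}, "t3": {"d3", "d4"}}
sizes = {"d1": 10, "d2": 4, "d3": 6, "d4": 2}
corr = correlation_matrix(tasks, sizes)
cut = split_cost(corr, {"d1", "d2"}, {"d3", "d4"})
```

A clustering step would then prefer splits with a low `split_cost`, keeping strongly correlated items such as `d1` and `d2` on the same node.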
5
1. Data Placement for Data-Intensive Applications
●Data distribution
—Storage capacity, computation load balance
—“Tree-to-Tree” greedy allocation strategy
—Modified PSO algorithm
●Cloud platform modeling
—Physical network structure / BEA
●Objective: make frequent data movements happen on high-speed channels so as to improve network utilization and the efficiency of the whole cloud system.
6
1. Data Placement for Data-Intensive Applications
●Runtime data placement
—A newly generated dataset is saved to the data center with which it has the maximum dependency
—The cost of the re-distribution itself is also taken into account
●Results: obtained with the greedy allocation strategies
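The runtime rule above can be sketched as follows. This is a minimal illustration under stated assumptions: `place_new_dataset` and its parameters are hypothetical names, the dependency of a data center is assumed to be the sum of the dependencies between the new dataset and the datasets already stored there, and the re-distribution cost mentioned on the slide is not modeled.

```python
def place_new_dataset(new_deps, placement, free_space, new_size):
    """Choose the data center with the maximum dependency on a new dataset.

    new_deps:   dict existing dataset -> dependency with the new dataset
    placement:  dict existing dataset -> data center currently holding it
    free_space: dict data center -> remaining storage capacity
    new_size:   size of the new dataset
    Returns the chosen data center, or None if no center has enough space.
    """
    score = {dc: 0.0 for dc in free_space}
    for dataset, dep in new_deps.items():
        score[placement[dataset]] += dep
    # Only centers that can actually store the dataset are candidates.
    candidates = [dc for dc in free_space if free_space[dc] >= new_size]
    if not candidates:
        return None
    # Highest dependency wins; ties go to the center with more free space.
    return max(candidates, key=lambda dc: (score[dc], free_space[dc]))


placement = {"d1": "dcA", "d2": "dcA", "d3": "dcB"}
new_deps = {"d1": 3, "d2": 1, "d3": 5}
free_space = {"dcA": 50, "dcB": 8}
small = place_new_dataset(new_deps, placement, free_space, new_size=5)
large = place_new_dataset(new_deps, placement, free_space, new_size=10)
```

In this example `dcB` has the higher dependency (5 against 4), so the small dataset goes there, while the large one falls back to `dcA` because `dcB` lacks capacity.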
7
2. Task Scheduling and Virtual Machine Allocation
●Objective:
—Distribute tasks with strong data dependences to servers on high-bandwidth connections, and turn off servers with low utilization
—As a result, response time can be reduced, system-wide utilization can be improved, and some idle network devices can also be turned off
●Task clustering by
—Hypergraph partitioning
—BEA transformation
Efficient & energy saving!
8
2. Task Scheduling and Virtual Machine Allocation
●Requirements of tasks
—Storage requirement
—Computing resource requirement: represented by VMs
●Task scheduling and deadline constraint: decrease the number of VMs as much as possible, while ensuring users’ Service Level Agreements.
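One way to read "as few VMs as possible under a deadline" is sketched below. The assumptions are substantial and are not taken from the talk: tasks are treated as independent with known durations, all VMs are identical, and the schedule is built with a longest-processing-time-first heuristic. The names `makespan` and `min_vms_for_deadline` are hypothetical.

```python
import heapq


def makespan(durations, n_vms):
    """Schedule length of a longest-task-first greedy schedule on n_vms VMs."""
    loads = [0.0] * n_vms
    heapq.heapify(loads)
    for d in sorted(durations, reverse=True):
        lightest = heapq.heappop(loads)      # VM that becomes free first
        heapq.heappush(loads, lightest + d)
    return max(loads)


def min_vms_for_deadline(durations, deadline):
    """Smallest VM count whose greedy schedule meets the deadline, else None."""
    if not durations:
        return 0
    if max(durations) > deadline:
        return None  # a single task already violates the deadline
    for n in range(1, len(durations) + 1):
        if makespan(durations, n) <= deadline:
            return n
    return None


durations = [4, 3, 3, 2, 2, 2]
tight = min_vms_for_deadline(durations, deadline=6)
loose = min_vms_for_deadline(durations, deadline=8)
```

Because the greedy schedule is a heuristic, the returned count is an upper bound on the true minimum rather than a guaranteed optimum.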
9
2. Task Scheduling and Virtual Machine Allocation
●Physical machine allocation
—Optimization objectives: energy efficiency, high-bandwidth networks, load balance
—Greedy strategy: each server’s energy efficiency; TRD (Task Requirement Degree); top-down & bottom-up, to reduce data transfers and improve network utilization; load balance
—Constraint conditions: storage capacity, CPU and memory constraints
—Other methods: genetic algorithms, PSO algorithms
Optimal utilization level in terms of performance-per-watt: Commonly,
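A greedy allocation that respects the CPU and memory constraints listed above might look like the sketch below. It covers only one ingredient of the slide, namely preferring servers with better energy efficiency; TRD, the top-down and bottom-up passes, and load balancing are not modeled. The function `allocate_vms`, the `perf_per_watt` field, and the first-fit-decreasing ordering are all assumptions made for illustration.

```python
def allocate_vms(vms, servers):
    """Greedily place VMs on the most energy-efficient servers that fit.

    vms:     list of (name, cpu, mem) tuples
    servers: list of dicts with keys name, cpu, mem, perf_per_watt
    Returns a dict mapping VM name -> server name.
    """
    # Most efficient servers are tried first, so inefficient ones stay idle.
    order = sorted(servers, key=lambda s: s["perf_per_watt"], reverse=True)
    free = {s["name"]: [s["cpu"], s["mem"]] for s in order}
    assignment = {}
    # Largest VMs first (first-fit decreasing) to reduce fragmentation.
    for name, cpu, mem in sorted(vms, key=lambda v: (v[1], v[2]), reverse=True):
        for server in order:
            remaining = free[server["name"]]
            if remaining[0] >= cpu and remaining[1] >= mem:
                remaining[0] -= cpu
                remaining[1] -= mem
                assignment[name] = server["name"]
                break
        else:
            raise ValueError(f"no server can host VM {name}")
    return assignment


servers = [
    {"name": "s1", "cpu": 8, "mem": 16, "perf_per_watt": 5.0},
    {"name": "s2", "cpu": 8, "mem": 16, "perf_per_watt": 3.0},
    {"name": "s3", "cpu": 16, "mem": 32, "perf_per_watt": 1.0},
]
vms = [("a", 4, 8), ("b", 4, 8), ("c", 2, 4)]
assignment = allocate_vms(vms, servers)
idle = {s["name"] for s in servers} - set(assignment.values())
```

Servers left in `idle` are the candidates for being powered down.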
10
3. Online Scheduling
●Problems:
—How to schedule the tasks in a fine-grained workflow?
—How to deal with variable conditions at runtime?
●Reinforcement learning based methods
—The goal of RL is to find the optimal policy parameter
—Agent-environment loop: the agent observes state s and takes action a; the environment returns reward r
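The agent-environment loop and the search for a policy parameter can be illustrated on a toy problem. Everything in this sketch is an assumption chosen for brevity rather than the method from the talk: the one-dimensional environment, the linear policy a = theta * s, the reward r = -s^2, and the simple hill-climbing search (used here in place of a gradient-based RL algorithm). The names `rollout` and `search_policy` are hypothetical.

```python
def rollout(theta, steps=20):
    """Run one episode and return the total reward for policy parameter theta."""
    s, total = 1.0, 0.0
    for _ in range(steps):
        a = theta * s        # agent: the policy maps state s to action a
        s = s + a            # environment: the action moves the state
        total += -s * s      # environment: reward r penalizes distance from 0
    return total


def search_policy(theta=0.0, step=0.1, iters=50):
    """Hill-climb on theta, keeping any neighbor with a higher total reward."""
    best = rollout(theta)
    for _ in range(iters):
        for candidate in (theta + step, theta - step):
            reward = rollout(candidate)
            if reward > best:
                theta, best = candidate, reward
    return theta, best


theta, best = search_policy()
```

In this toy setting the best parameter is theta = -1, which drives the state to zero in one step; the search starts from theta = 0 and moves toward that value.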
11
3. Online Scheduling
Example: Cart-Pole Swing-up
●Task: swing up the pole by moving the cart
●State (2-D continuous): angle and velocity of the pole
●Action (1-D continuous): force applied to the cart
●Reward:
12
Thank you for your time!