Data Placement and Task Scheduling in Cloud, Online and Offline. 2014.11.27. Zhao Qing (赵青), Tianjin University of Science and Technology (天津科技大学)

1 Data Placement and Task Scheduling in Cloud, Online and Offline. 2014.11.27. Zhao Qing (赵青), Tianjin University of Science and Technology (天津科技大学). zhaoqingtj@tust.edu.cn

2 Motivation
● Increase response speed and throughput
● Guarantee QoS
● Energy-efficient and green computing

3 Overview
● Data placement for data-intensive applications
● Task scheduling for QoS and energy efficiency
● Online task scheduling

4 1. Data Placement for data-intensive applications
● Data clustering based on data correlation: if every two data items are placed on different nodes, how much would the data transfer amount increase?
● BEA; hierarchical clustering tree
Objective: Place closely related data items together so as to decrease data transfers.
Contributions:
1. Introduced data size factors
2. Proposed “First Order Conduction Correlation” from intermediate data
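
The question on this slide (how much extra transfer results if two data items sit on different nodes) can be illustrated with a small sketch. This is not the presenter's formula: it assumes, purely for illustration, that a task needing both items pulls the smaller one over the network, so each shared task adds `min(size_a, size_b)` to the pair's weight. The names `transfer_penalty`, `task_inputs`, and `sizes` are hypothetical.

```python
from itertools import combinations


def transfer_penalty(task_inputs, sizes):
    """Return {(a, b): extra transfer if a and b are placed on different nodes}.

    task_inputs: {task_id: set of dataset ids the task reads}
    sizes:       {dataset_id: size in arbitrary units}
    """
    penalty = {}
    for inputs in task_inputs.values():
        # Every pair of datasets read by the same task is correlated.
        for a, b in combinations(sorted(inputs), 2):
            # Assumption: the task moves the smaller of the two datasets.
            penalty[(a, b)] = penalty.get((a, b), 0) + min(sizes[a], sizes[b])
    return penalty


task_inputs = {"t1": {"d1", "d2"}, "t2": {"d1", "d2", "d3"}, "t3": {"d3"}}
sizes = {"d1": 10, "d2": 4, "d3": 7}
penalty = transfer_penalty(task_inputs, sizes)
```

A matrix built from these weights is the kind of input a clustering step could consume; pairs with a high weight are the ones worth keeping on the same node.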

5 1. Data Placement for data-intensive applications
● Data distribution: storage capacity, computation load balance; “Tree-to-Tree” greedy allocation strategy; modified PSO algorithm
● Cloud platform modeling: physical network structure / BEA
Objective: Make frequent data movements happen on high-speed channels so as to improve network utilization and the efficiency of the whole cloud system.

6 1. Data Placement for data-intensive applications
● Runtime data placement
—A newly generated dataset is saved to the data center with which it has the maximum dependency.
—The cost of the re-distribution itself is also taken into account.
● Results: by the greedy allocation strategies
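
The runtime rule above (save a new dataset where its dependency is highest) might look roughly like the following. It is a minimal sketch under stated assumptions: the dependency of a dataset on a data center is taken to be the sum of its pairwise dependencies with the datasets already stored there, a free-storage check is added, and the re-distribution cost mentioned on the slide is not modelled. `place_new_dataset` and the dictionary layout are hypothetical.

```python
def place_new_dataset(new_id, new_size, dependency, centers):
    """Pick the data center whose stored datasets depend most on new_id.

    dependency: {(dataset_a, dataset_b): weight}, looked up in either order
    centers:    {name: {"free": free_storage, "datasets": set of dataset ids}}
    Returns the chosen center name, or None if no center has room.
    """
    def dep(a, b):
        return dependency.get((a, b), dependency.get((b, a), 0))

    best, best_score = None, -1
    for name, info in centers.items():
        if info["free"] < new_size:
            continue  # assumed constraint: skip centers without enough storage
        score = sum(dep(new_id, d) for d in info["datasets"])
        if score > best_score:
            best, best_score = name, score
    if best is not None:
        centers[best]["datasets"].add(new_id)
        centers[best]["free"] -= new_size
    return best


centers = {
    "dc1": {"free": 20, "datasets": {"d1", "d2"}},
    "dc2": {"free": 50, "datasets": {"d3"}},
}
dependency = {("d4", "d1"): 3, ("d4", "d2"): 2, ("d3", "d4"): 4}
chosen = place_new_dataset("d4", 10, dependency, centers)
```

Here `d4` goes to `dc1` because its total dependency there (3 + 2) exceeds its dependency on `dc2` (4).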

7 2. Task Scheduling and Virtual Machine Allocation
● Objective:
—Distribute tasks with strong data dependencies to servers on a high-bandwidth connection, and turn off some of the servers with low utilization
—Therefore: response time can be reduced; system-wide utilization can be improved; some idle network devices can also be turned off
● Task clustering by
—Hypergraph partitioning
—BEA transformation
Efficient & energy saving!

8 2. Task Scheduling and Virtual Machine Allocation
● Requirements of tasks
—Storage requirement
—Computing resource requirement: represented by VMs
● Task scheduling and deadline constraint: decrease the number of VMs as much as possible while still meeting users’ Service Level Agreements.
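
One way to picture "as few VMs as possible under a deadline" is the sketch below. It is an illustration only, not the scheduler from the talk: it assumes independent tasks with known runtimes, identical VMs, and a longest-task-first greedy assignment. The names `makespan` and `min_vms_for_deadline` are hypothetical.

```python
import heapq


def makespan(runtimes, n_vms):
    """Finish time of a longest-task-first greedy schedule on n_vms identical VMs."""
    loads = [0.0] * n_vms  # a list of equal values is already a valid heap
    for r in sorted(runtimes, reverse=True):
        lightest = heapq.heappop(loads)      # least-loaded VM so far
        heapq.heappush(loads, lightest + r)  # give it the next-longest task
    return max(loads)


def min_vms_for_deadline(runtimes, deadline):
    """Smallest VM count for which the greedy schedule meets the deadline."""
    if not runtimes:
        return 0
    if max(runtimes) > deadline:
        return None  # a single task already violates the deadline
    for n in range(1, len(runtimes) + 1):
        if makespan(runtimes, n) <= deadline:
            return n


vms_needed = min_vms_for_deadline([4, 3, 3, 2, 2], deadline=8)
```

Because the greedy schedule is a heuristic, the count it returns is an upper bound on the true minimum rather than a guaranteed optimum.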

9 2. Task Scheduling and Virtual Machine Allocation
● Physical machine allocation
—Optimization objectives: energy efficiency, high-bandwidth networks, load balance
—Greedy strategy: each server’s energy efficiency; TRD (Task Requirement Degree); top-down & bottom-up: reduce data transfers and improve network utilization; load balance
—Constraint conditions: storage capacity, CPU, and memory constraints
—Other methods: genetic algorithms, PSO algorithms
Optimal utilization level in terms of performance-per-watt: Commonly,
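
A greedy allocation that prefers energy-efficient servers and respects CPU and memory limits could be sketched as follows. This covers only the energy-efficiency ordering and the capacity constraints; TRD, the top-down and bottom-up passes, and load balancing from the slide are not modelled. The function `allocate_vms` and the `perf_per_watt` field are assumptions made for the example.

```python
def allocate_vms(vms, servers):
    """Greedy first-fit of VMs onto servers ranked by performance-per-watt.

    vms:     list of (vm_id, cpu, mem)
    servers: list of dicts with keys name, cpu, mem, perf_per_watt
    Returns {vm_id: server name, or None if nothing fits}.
    """
    ranked = sorted(servers, key=lambda s: s["perf_per_watt"], reverse=True)
    free = {s["name"]: [s["cpu"], s["mem"]] for s in ranked}
    placement = {}
    # Place the most CPU-hungry VMs first so large VMs are not stranded.
    for vm_id, cpu, mem in sorted(vms, key=lambda v: v[1], reverse=True):
        for s in ranked:
            capacity = free[s["name"]]
            if capacity[0] >= cpu and capacity[1] >= mem:
                capacity[0] -= cpu
                capacity[1] -= mem
                placement[vm_id] = s["name"]
                break
        else:
            placement[vm_id] = None  # no server satisfies the constraints
    return placement


servers = [
    {"name": "A", "cpu": 8, "mem": 16, "perf_per_watt": 2.0},
    {"name": "B", "cpu": 8, "mem": 16, "perf_per_watt": 3.0},
    {"name": "C", "cpu": 4, "mem": 8, "perf_per_watt": 1.0},
]
vms = [("v1", 4, 8), ("v2", 4, 8), ("v3", 2, 4), ("v4", 2, 4)]
placement = allocate_vms(vms, servers)
idle = {s["name"] for s in servers} - set(placement.values())
```

In this example the least efficient server `C` receives no VM, so it is a candidate for being switched off.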

10 3. Online Scheduling
● Problems:
—How to schedule the tasks in a fine-grained workflow?
—How to deal with variable conditions at runtime?
● Reinforcement learning (RL) based methods
The goal of RL is to find the optimal policy parameter.
[Diagram: Agent and Environment; the agent observes state s, takes action a, and receives reward r]
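
The agent-environment loop and the idea of searching for a policy parameter can be shown on a toy problem. Everything here is assumed for illustration and unrelated to any scheduler in the talk: a 1-D state, a linear policy `a = -theta * s`, dynamics `s' = s + a`, reward `-s'^2`, and a plain grid search over `theta` instead of a real RL algorithm.

```python
def rollout(theta, s0=1.0, steps=10):
    """Run one episode of the agent-environment loop; return the total reward."""
    s, total = s0, 0.0
    for _ in range(steps):
        a = -theta * s   # agent: linear policy with parameter theta
        s = s + a        # environment: toy state transition (assumed)
        r = -(s ** 2)    # environment: reward penalises distance from 0
        total += r
    return total


def policy_search(candidates):
    """Return the candidate policy parameter with the highest episode return."""
    return max(candidates, key=rollout)


candidates = [i / 10 for i in range(21)]  # theta from 0.0 to 2.0
best_theta = policy_search(candidates)
```

With these toy dynamics the search settles on `theta = 1.0`, which drives the state to zero in a single step; a practical method would estimate the return from sampled episodes rather than enumerate parameters.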

11 3. Online Scheduling
Example: Cart-Pole Swing-up
● Task: swing up the pole by moving the cart
● State (2-D continuous): angle and velocity of the pole
● Action (1-D continuous): force applied to the cart
● Reward:

12 Thank you for your time!

