1
Dynamic Graph Partitioning Algorithm
Xiaojun Shang xs2236
2
Abstract: An algorithm to schedule jobs (represented by graphs) over a large cluster of servers while satisfying load-balancing and cost considerations at the same time. The algorithm comes from the paper "Scheduling Storms and Streams in the Cloud" by Javad Ghaderi, Sanjay Shakkottai and R. Srikant. My work: Study the paper and relevant references. Simulate the results of the alternative dynamic graph partitioning algorithm given by the paper. Tune the crucial parameters in the algorithm to study the trade-off between minimizing the average partitioning cost and the average queue length. Give suggestions about the choice of parameters to get better performance.
3
Introduction to the paper
The paper is motivated by emerging big streaming data processing paradigms such as Twitter Storm. In these problems, we need to handle jobs consisting of many different compute tasks. Each of these tasks should be executed separately and simultaneously, but the tasks are also connected to one another: there are data flows among them, so executing one task may require data from another. The paper uses a graph to represent this kind of job: the nodes of the graph represent the tasks of the job, and the edges represent the data flows between tasks.
4
Introduction to the paper
Examples of jobs represented by graphs:
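The examples on this slide are figures. As an illustrative stand-in (a hypothetical example, not the slide's original figures), a small Storm-style word-count topology can be written down as a node list and an edge list, where each node is a task and each edge is a data flow:

# Illustrative job graph (hypothetical word-count style topology,
# standing in for the figures on this slide): nodes are tasks,
# edges are data flows between tasks.
job_nodes = ["source", "split_1", "split_2", "count"]
job_edges = [
    ("source", "split_1"),   # raw sentences sent to splitter 1
    ("source", "split_2"),   # raw sentences sent to splitter 2
    ("split_1", "count"),    # words sent to the counter
    ("split_2", "count"),
]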
5
Introduction to the paper
To execute such a job, each node of its graph is mapped to a machine in a cloud cluster, and the communication fabric of the cluster supports the data flows corresponding to the graph edges. On the cloud side, there is a collection of machines interconnected by a communication network. Each machine can simultaneously support a finite number of graph nodes (the number is limited by resources: memory/processing/bandwidth). In Storm, these available resources are called "slots"; each slot hosts one node of a graph.
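A minimal way to model this cluster in a simulation is a fixed set of machines, each with a fixed number of slots; the sketch below uses illustrative values and names of my own choosing:

# Minimal cluster model: a fixed set of machines, each with a fixed
# number of slots; a slot hosts at most one graph node at a time.
NUM_MACHINES = 10          # illustrative values only
SLOTS_PER_MACHINE = 4

# free_slots[m] = number of currently unused slots on machine m
free_slots = [SLOTS_PER_MACHINE] * NUM_MACHINES

def can_host(machine, num_nodes):
    # A machine can take more graph nodes only while it has free slots.
    return free_slots[machine] >= num_nodes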
6
Introduction to the paper
Graphs arrive randomly over time at this cloud cluster and, upon completion, leave the cluster. The scheduling task is to map the nodes of an incoming graph onto the free slots of the machines so as to achieve efficient cluster operation.
7
Introduction to the paper
Two crucial points: 1. Delay of the system. When jobs arrive, they can either be served immediately, or queued and served at a later time. Thus, there is a set of queues representing the jobs in the system that are either waiting for service or receiving service. 2. Job partitioning cost. For any job, we assume that the cost of data exchange between two nodes placed inside the same machine is zero, and the cost of data exchange between two nodes of a graph placed on different machines is one.
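With this cost model (zero for an intra-machine edge, one for an inter-machine edge), the partitioning cost of a given node-to-machine assignment is simply the number of graph edges cut by that assignment. A small sketch, with names of my own choosing:

def partition_cost(edges, placement):
    """Partitioning cost of a placement: the number of graph edges whose
    two endpoints sit on different machines (each such edge costs 1)."""
    return sum(1 for u, v in edges if placement[u] != placement[v])

# Example: a 4-node graph split across machines 0 and 1.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
placement = {"a": 0, "b": 0, "c": 1, "d": 1}
print(partition_cost(edges, placement))  # edges (a, c) and (b, d) are cut -> 2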
8
Preparation for the algorithm
Templates: an important construct in this paper. A template corresponds to one possible way in which a graph Gj can be partitioned and distributed over the machines.
9
Preparation for the algorithm
Configuration: While there is an extremely large number of possible templates for each graph, only a limited number of templates can be present in the system at any instant of time (each slot can be used by at most one template at a time). The configuration is the collection of templates currently in the system. It contains two parts, actual templates and virtual templates. Actual templates: the templates in the system corresponding to the jobs that are being served. Virtual templates: when a job arrives or departs, the system can potentially create a new template, which is a pattern of empty slots that can later be filled with a job of a specific type.
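One possible way to represent templates and the configuration in code (an illustrative sketch; the field names are mine, not the paper's) is to record, for each template, which machine each graph node is assigned to and whether the template is actual or virtual:

from dataclasses import dataclass, field

@dataclass
class Template:
    """One way of partitioning a graph over the machines: a mapping from
    each graph node to the machine (slot) it would occupy."""
    job_type: int
    node_to_machine: dict      # e.g. {"a": 0, "b": 0, "c": 3}
    virtual: bool = True       # virtual until it is filled by a waiting job

@dataclass
class Configuration:
    """The collection of actual and virtual templates currently in the system."""
    templates: list = field(default_factory=list)

    def slots_used(self, machine):
        # Every node assigned to `machine` occupies one of its slots,
        # whether the template is actual or virtual.
        return sum(list(t.node_to_machine.values()).count(machine)
                   for t in self.templates)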
10
Description of the algorithm
Examples:
11
Preparation for the algorithm
Some important definitions: the cost of partitioning graph Gj according to template A, and a random variable denoting the fraction of time that template A is used in the steady state. The algorithm's goal is to minimize the average partitioning cost, i.e., each template's cost weighted by the fraction of time it is used, while keeping the queues short.
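In symbols (the notation below is reconstructed from the slide's wording, not copied from the paper): write c_j(A) for the cost of partitioning graph G_j according to template A, ȳ_j(A) for the steady-state fraction of time template A is in use, and 𝒜_j for the set of feasible templates for type-j graphs. The quantity the algorithm tries to keep small, while also keeping the queues short, is

\sum_{j} \sum_{A \in \mathcal{A}_j} c_j(A) \, \mathbb{E}\!\left[ \bar{y}_j(A) \right]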
12
Preparation for the algorithm
The paper develops a new class of low-complexity algorithms and analytically characterizes their delay and partitioning costs. In particular, the algorithms can converge to the optimal solution of the static graph partitioning problem by trading off delay and partitioning cost (the proof is in the paper, but there is no time to show it here). The ρ_j here is the load of graph type j, ρ_j = λ_j / μ_j.
13
Preparation for the algorithm
First, we assume that graphs of type j arrive according to a Poisson process with rate λ_j and remain in the system for an exponentially distributed amount of time with mean 1/μ_j.
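In a simulation, these assumptions translate directly into exponentially distributed inter-arrival and service times; a minimal sketch (variable names are mine):

import random

lambda_j = 4.0   # arrival rate of type-j graphs (the simulations below use 4)
mu_j = 1.0       # service rate; the mean time in the system is 1/mu_j = 1

def next_interarrival_time():
    # Poisson arrivals <=> exponentially distributed inter-arrival times.
    return random.expovariate(lambda_j)

def service_time():
    # Each admitted graph stays in the system an Exp(mu_j) amount of time.
    return random.expovariate(mu_j)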
14
Introduction to the algorithm
The algorithm shown in this presentation is an alternative description of the Dynamic Graph Partitioning algorithm that uses dedicated clocks. Each queue is assigned an independent Poisson clock whose rate is given by the first of the two key expressions discussed later. That means that at each time t, the time duration until the next tick of the clock is an exponential random variable with that rate as its parameter.
15
Description of the algorithm
At the ticks of the dedicated clocks: Suppose the dedicated clock of queue j makes a tick. Then: 1: A virtual template is chosen randomly from the currently feasible templates for graph Gj, given the current configuration, using the Random Partition Procedure, if possible. This template is added to the configuration with the acceptance probability given by the second key expression, and discarded otherwise. The virtual template leaves the system after an exponentially distributed time duration. 2: If there is a job of type j waiting to get service, and a virtual template of type j was created in step 1, this virtual template is filled by a job from the queue, which converts the virtual template into an actual template.
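Put together, one tick of queue j's dedicated clock might be handled as in the sketch below. The helper names random_partition and accept_probability are placeholders for the paper's Random Partition Procedure and acceptance probability, which the slides do not spell out; the configuration is treated as a plain list of templates:

import random

def on_clock_tick(j, config, waiting, in_service,
                  random_partition, accept_probability):
    """One tick of queue j's dedicated clock (sketch).
    waiting[j]    : list of type-j jobs waiting for service
    in_service[j] : number of type-j jobs currently being served
    random_partition(j, config)  : stand-in for the paper's Random
        Partition Procedure; returns a feasible virtual template or None
    accept_probability(template) : stand-in for the paper's acceptance
        probability for the chosen template"""
    # Step 1: try to create a virtual template for graph type j.
    template = random_partition(j, config)
    if template is None:
        return
    if random.random() >= accept_probability(template):
        return                       # template discarded
    config.append(template)          # virtual template joins the configuration
    # Step 2: a waiting type-j job, if any, fills the virtual template,
    # which converts it into an actual template.
    if waiting[j]:
        waiting[j].pop(0)
        in_service[j] += 1
        template.virtual = False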
16
Description of the algorithm
At arrival instances: Suppose a graph of type j arrives. The job is added to the queue of type j. At departure instances: 1: At the departure instance of an actual/virtual template, the algorithm removes the corresponding template from the configuration. 2: If this is the departure of an actual template, the corresponding job departs and its queue is updated.
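The arrival and departure events are simpler; a matching sketch, assuming templates carry the virtual flag and job_type field used in the earlier sketch:

def on_arrival(j, job, waiting):
    # A newly arrived graph of type j joins the queue of its type.
    waiting[j].append(job)

def on_template_departure(template, config, in_service):
    # Remove the departing template (actual or virtual) from the configuration.
    config.remove(template)
    if not template.virtual:
        # An actual template departs together with its job, so one fewer
        # job of that type remains in the system and its queue is updated.
        in_service[template.job_type] -= 1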
17
Two keys for the algorithm
Two keys: the rate of the Poisson clock for creating virtual templates, and the probability of keeping or discarding a created template. In the simulation, we can achieve different performance by tuning the parameters in these two expressions. In my simulation, I only tuned α, β and the number of servers, keeping everything else the same.
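Since the slides do not reproduce the two expressions themselves, the sketch below only shows how α and β act as knobs in a simulation; the concrete functional forms here (an exponential dependence on the queue length for the clock rate and on the template cost for the keep probability) are stand-ins of my own, not the paper's exact formulas:

import math

def clock_rate(queue_length, beta):
    # Hypothetical stand-in for the paper's clock-rate expression:
    # a larger beta makes the clock react more strongly to a long queue.
    return math.exp(beta * queue_length)

def keep_probability(template_cost, alpha):
    # Hypothetical stand-in for the paper's keep/discard probability:
    # a larger alpha makes costly templates less likely to be kept.
    return math.exp(-alpha * template_cost)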
18
Simulation: fix β and see the influence of changing α on the partitioning cost and delay. Start with a single-graph input (a seven-node balanced binary tree). The tree is mapped onto a cluster of 10 servers, each with 4 slots. The λ here is 4, and μ is 1.
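The single-graph setup on this slide can be written down directly; a small sketch of the input (names are mine):

# Seven-node balanced binary tree used as the single input graph type.
tree_nodes = [1, 2, 3, 4, 5, 6, 7]
tree_edges = [(1, 2), (1, 3),    # root and its two children
              (2, 4), (2, 5),    # leaves of the left subtree
              (3, 6), (3, 7)]    # leaves of the right subtree

NUM_SERVERS = 10        # cluster of 10 servers
SLOTS_PER_SERVER = 4    # 4 slots per server
ARRIVAL_RATE = 4.0      # lambda = 4
SERVICE_RATE = 1.0      # mu = 1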
19
Simulation: α = 0.005, β = 0.5, 10 servers, λ = 4.
20
Simulation: α = 0.5, β = 0.5, 10 servers, λ = 4.
21
Simulation: fix α = β and tune their common value to see its influence on the trade-off. Other conditions remain the same. Runs: α = 5, β = 5, 10 servers, λ = 4; α = 2, β = 2, 10 servers, λ = 4.
22
Simulation: α = 0.6, β = 0.6, 10 servers, λ = 4.
23
Simulation: now we fix all the other parameters and increase the number of servers (two runs, both with α = 0.5 and λ = 9).
24
Simulation: multi-graph input (three graph types share the servers). α = 0.5, λ = 6.
25
Simulation: α = 0.2, λ = 6.
26
Thank You Q&A