Authors: Xiaoqiao Meng, Vasileios Pappas and Li Zhang


1 Improving the Scalability of Data Center Networks with Traffic-aware Virtual Machine Placement
Authors: Xiaoqiao Meng, Vasileios Pappas and Li Zhang. Presented by: Jinpeng Liu

2 Introduction Modern virtualization-based data centers host a wide spectrum of applications. Bandwidth usage between VMs is growing rapidly, so the scalability of data center networks is becoming a concern. Techniques suggested so far: rich connectivity at the edge of the network, and dynamic routing protocols. These solutions require changes to the network architecture and the routing protocols.

3 Introduction This paper tackles the issue from a different perspective: VM placement. The current VM placement strategy has issues. Placement is decided by capacity planning tools that consider CPU, memory, and power consumption but ignore the consumption of network resources. As a result, VM pairs with heavy mutual traffic can be placed on hosts with a large network cost between them. How often does this pattern happen in practice?

4 Background DC traffic pattern Data collected:
A data warehouse (DWH) hosted by IBM Global Services: server resource utilization from hundreds of server farms. A server cluster with hundreds of VMs: aggregate traffic of 68 VMs. Traces were collected for 10 days.

5 Background DC traffic pattern
Uneven distribution of traffic volumes from VMs: while 80% of VMs have an average rate of less than 800 KBytes/min, 4% of them have a rate ten times higher. The heatmap shows that the inter-VM traffic rate varies significantly.

6 Background DC traffic pattern Stable per-VM traffic at large timescale
(b) The standard deviation (SD) of the traffic rate is no more than two times the mean. (c) A stable interval is one in which the rate is no more than one SD away from the mean over the entire interval. (d) The long tail indicates that, at the two large timescales, a large fraction of VMs' traffic is relatively constant. Figure annotations on the slide: "No less than 80%" and "82%".

7 Background DC traffic pattern
Weak correlation between traffic rate and latency. Based on measurements of the traffic rate and end-to-end latency among 68 VMs in a production cluster, there is visually no correlation; the correlation coefficient between the two matrices is -0.32, a weak correlation.

8 Background Tree architecture
Current data centers follow, to a great extent, a common three-tier network architecture: each server connects to one (or two) access switches, each access switch connects to one (or two) aggregation switches, and each aggregation switch connects to multiple core switches. Cons: topology scaling limitations (the network grows only by scaling up each individual switch, and the core tier accommodates only 8 switches), address space limitation, and higher server over-subscription. Source: Cisco Data Center Infrastructure 2.5 Design Guide.

9 Background VL2 Architecture
VL2 shares many features with Tree. The core tier and the aggregation tier form a Clos topology (a complete bipartite graph). Valiant load balancing: randomly selected core switches serve as intermediate destinations, and addressing is location independent. A bipartite graph is a graph whose vertices can be divided into two disjoint sets U and V such that every edge connects a vertex in U to one in V; if the edges connect every vertex in U with all vertices in V, it is a complete bipartite graph. Valiant load balancing ensures that load is balanced independently of the destinations of the traffic flows: an access switch first randomly selects an aggregation switch, then a core switch, and the traffic is then forwarded to the destination by MAC address.

10 Background PortLand Architecture (Fat-Tree)
Requires all switches to be identical, i.e., to have the same number of ports. Built on the concept of pods: a collection of access and aggregation switches that form a Clos topology. Pods and core switches form a second Clos topology, with the up-links evenly distributed. Pros: full bisection bandwidth (1:1 oversubscription ratio); low cost (commodity switches, low power and cooling). Cons: scalability, since the size of the network depends on the number of ports per switch; with 48-port switches the maximum is 27,648 hosts.
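The 27,648-host figure can be checked with a short calculation. This is a minimal sketch that assumes the standard fat-tree construction (k pods, k/2 edge switches per pod, k/2 hosts per edge switch, so k^3/4 hosts in total); the function name is hypothetical and not taken from the slides.

```python
def fat_tree_max_hosts(ports_per_switch: int) -> int:
    """Maximum number of hosts in a fat-tree built from identical k-port switches.

    Assumption (not stated on the slide): k pods, each with k/2 edge
    switches, each edge switch serving k/2 hosts, giving k^3 / 4 hosts.
    """
    k = ports_per_switch
    if k <= 0 or k % 2 != 0:
        raise ValueError("the port count must be a positive even number")
    return k ** 3 // 4


if __name__ == "__main__":
    print(fat_tree_max_hosts(48))
```

Under that assumption, 48-port switches give 48^3 / 4 = 27,648 hosts, matching the number on the slide.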

11 Background Modular Data Center (MDC)
Thousands of servers, interconnected by switches, packed into a shipping container. Vendors: Sun, HP, IBM, Dell, … Benefits: higher degree of mobility, higher system and power density, and lower cost (cooling and manufacturing).

12 Background BCube Architectures
Purpose: data-intensive computing; bandwidth-intensive communication among MDC servers; low-end COTS mini-switches; graceful performance degradation. BCube is server-centric: servers are part of the network, servers are not directly connected to each other, and the structure is defined recursively.

13 Background BCube: At level 0, $BCube_0$ consists of n servers connected by one n-port switch. A $BCube_k$ is constructed from n $BCube_{k-1}$s and $n^k$ n-port switches. Example with k = 1, n = 4: a $BCube_0$ is 4 servers connected by one 4-port switch, and $BCube_1$ is 4 $BCube_0$s connected by four 4-port switches.
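The recursive definition translates directly into a count of servers and switches. The sketch below follows only the rule stated on the slide (level 0 is n servers on one switch; level k is n copies of level k-1 plus n^k extra switches); the function name is hypothetical.

```python
def bcube_size(n: int, k: int) -> tuple[int, int]:
    """Return (servers, switches) of a BCube_k built from n-port switches.

    Follows the recursive rule: BCube_0 has n servers and 1 switch;
    BCube_k is n copies of BCube_{k-1} plus n**k additional switches.
    """
    servers, switches = n, 1
    for level in range(1, k + 1):
        switches = n * switches + n ** level
        servers = n * servers
    return servers, switches


if __name__ == "__main__":
    print(bcube_size(4, 0))
    print(bcube_size(4, 1))
```

For the slide's example (n = 4, k = 1) this yields 16 servers and 8 switches: four level-0 switches plus four level-1 switches.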

14 Background BCube: Servers are labeled according to their location in the BCube structure. Servers are connected at the i-th level if their labels differ at that level. [Figure: label table. Label 2.4: 4th at Level 0, 2nd at Level 1. Label 1.3: 3rd at Level 0, 1st at Level 1.]

15 Background BCube: (animation step, same text and labels as slide 14)

16 Background BCube: (animation step, same text as slide 14) [Figure: label table. Label 2.4: 4th at Level 0, 2nd at Level 1. Label 1.4: 1st at Level 1.]

17 Background BCube: (animation step, same text as slide 14) [Figure: label table. Label 2.4: 4th at Level 0, 2nd at Level 1. Label 1.4: 4th at Level 0, 1st at Level 1.]

18 Background BCube: (animation step, same text and labels as slide 17) The impact of the four architectures will be studied.

19 Virtual Machine Placement Problem
How should VMs be placed on a set of physical hosts? Assumptions: existing CPU/memory-based capacity tools have decided the number of VMs that a host can accommodate. A slot refers to one CPU/memory allocation on a host; a host can have multiple slots, and a VM can take any unoccupied slot. Routing is static and single-path. All external traffic is routed through a common gateway switch. Scenario: place n VMs in n slots.

20 Traffic-aware VM Placement Problem (TVMPP)
$C_{ij}$: fixed communication cost from slot i to slot j. $D_{ij}$: traffic rate from VM i to VM j. $e_i$: external traffic rate for VM i. $g_i$: communication cost between slot i and the gateway. $\pi: [1, \dots, n] \to [1, \dots, n]$: permutation function assigning n VMs to n slots. The TVMPP is defined as finding a $\pi$ that minimizes

$$\sum_{i,j=1}^{n} D_{ij}\, C_{\pi(i)\pi(j)} + \sum_{i=1}^{n} e_i\, g_{\pi(i)} \qquad (1)$$
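To make the objective concrete, here is a minimal sketch that evaluates it for a given placement and finds the best placement by exhaustive search. It assumes the objective is the sum of D[i][j] * C[pi(i)][pi(j)] over all VM pairs plus the sum of e[i] * g[pi(i)] over all VMs, with g indexed by slot; the function names and the small matrices are illustrative assumptions. Exhaustive search is only feasible for a handful of VMs.

```python
from itertools import permutations


def tvmpp_cost(D, C, e, g, pi):
    """Objective value when VM i is placed in slot pi[i]."""
    n = len(D)
    internal = sum(D[i][j] * C[pi[i]][pi[j]] for i in range(n) for j in range(n))
    external = sum(e[i] * g[pi[i]] for i in range(n))
    return internal + external


def brute_force_tvmpp(D, C, e, g):
    """Try every permutation; only practical for very small n."""
    n = len(D)
    return min(permutations(range(n)), key=lambda pi: tvmpp_cost(D, C, e, g, pi))


# Illustrative data: slots 0,1 share one access switch, slots 2,3 another.
C = [[0, 1, 3, 3], [1, 0, 3, 3], [3, 3, 0, 1], [3, 3, 1, 0]]
# VM pairs (0,2) and (1,3) exchange heavy traffic; all other pairs are light.
D = [[0, 1, 10, 1], [1, 0, 1, 10], [10, 1, 0, 1], [1, 10, 1, 0]]
e = [0, 0, 0, 0]  # no external traffic in this toy example
g = [1, 1, 1, 1]

best = brute_force_tvmpp(D, C, e, g)
print(best, tvmpp_cost(D, C, e, g, best))
```

In this toy instance the best placement puts each heavy-traffic pair behind the same access switch, which lowers the cost compared with the identity placement.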

21 Traffic-aware VM Placement Problem (TVMPP)
The meaning of the objective function (1) depends on the definition of $C_{ij}$. Here $C_{ij}$ is defined as the number of switches on the routing path from slot i to slot j. Then (1) is the sum of the traffic rates perceived by every switch. If (1) is normalized by the sum of the VM-to-VM bandwidth demand, it equals the average number of switches that a data unit traverses. Assuming equal delay at every switch, (1) can be interpreted as the average latency of a data unit traversing the network, so optimizing TVMPP is equivalent to minimizing traffic latency. (Presenter's note: how to explain this?)

22 Traffic-aware VM Placement Problem (TVMPP)
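What if the number of slots exceeds the number of VMs? Add dummy VMs that carry no traffic; they do not affect the placement of the real VMs. TVMPP can be simplified by ignoring the external-traffic term $\sum_i e_i g_{\pi(i)}$, which is relatively constant because the cost between every host and the gateway is the same. (Presenter's note: why is the cost between every host and the gateway the same?)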
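The dummy-VM trick amounts to padding the traffic matrix with zero rows and columns until it matches the number of slots. A minimal sketch, with a hypothetical helper name:

```python
def pad_with_dummy_vms(D, n_slots):
    """Extend traffic matrix D to n_slots x n_slots with zero-traffic dummy VMs."""
    n_vms = len(D)
    if n_slots < n_vms:
        raise ValueError("there must be at least as many slots as VMs")
    padded = [[0] * n_slots for _ in range(n_slots)]
    for i in range(n_vms):
        for j in range(n_vms):
            padded[i][j] = D[i][j]
    return padded


if __name__ == "__main__":
    print(pad_with_dummy_vms([[0, 5], [5, 0]], 3))
```

Because the added rows and columns are all zero, the dummy VMs contribute nothing to the pairwise traffic cost wherever they are placed.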

23 Traffic-aware VM Placement Problem (TVMPP)
Offline mode: data center operators estimate the traffic matrix, collect the network topology, and solve TVMPP to decide on which host(s) to create the VMs. Online mode: re-solve TVMPP periodically and reshuffle the VM placement when needed.

24 TVMPP Complexity Matrix notation for (1):
D is the traffic rate matrix, C is the communication cost matrix, and Π is the set of permutation matrices. TVMPP is a Quadratic Assignment Problem (QAP): there are a set of n facilities and a set of n locations. For each pair of locations a distance is specified, and for each pair of facilities a weight or flow is specified (e.g., the amount of supplies transported between the two facilities). The problem is to assign all facilities to different locations so as to minimize the sum of the distances multiplied by the corresponding flows.

25 TVMPP Complexity (animation step, same text as slide 24) The QAP is NP-hard.
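The transcript did not capture the matrix formula itself. As a hedged reconstruction, assume the external-traffic term is dropped and define a permutation matrix X by $X_{i,\pi(i)} = 1$ (this convention is an assumption made here, so the slide's own formula may order the factors differently). The pairwise part of the objective can then be written as a trace:

```latex
\sum_{i,j} D_{ij}\, C_{\pi(i)\pi(j)}
  = \sum_{i,j} D_{ij}\, \bigl(X C X^{T}\bigr)_{ij}
  = \operatorname{tr}\!\bigl(D\, X C^{T} X^{T}\bigr),
\qquad
\text{so TVMPP becomes} \quad \min_{X \in \Pi} \operatorname{tr}\!\bigl(D\, X C^{T} X^{T}\bigr).
```

The middle step uses $(X C X^{T})_{ij} = C_{\pi(i)\pi(j)}$ and the identity $\sum_{i,j} A_{ij} B_{ij} = \operatorname{tr}(A B^{T})$.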

26 TVMPP Complexity TVMPP is an instance of the general QAP.
No existing exact solution method scales to the size of current data centers.

27 Algorithms: Cluster-and-Cut
Proposition 1: Suppose $0 \le a_1 \le a_2 \le \dots \le a_n$ and $0 \le b_1 \le b_2 \le \dots \le b_n$. Then the following inequalities (the rearrangement inequality) hold for any permutation $\pi$ on $[1, \dots, n]$:

$$\sum_{i=1}^{n} a_i b_{n-i+1} \le \sum_{i=1}^{n} a_i b_{\pi(i)} \le \sum_{i=1}^{n} a_i b_i$$

TVMPP sums the products of each $C_{ij}$ with the corresponding $D_{ij}$, which leads to design principle 1: solving TVMPP is equivalent to finding a mapping of VMs to slots such that VM pairs with heavy mutual traffic are assigned to slot pairs with low-cost connections.
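The rearrangement inequality is easy to verify numerically on a small example. The sketch below (hypothetical function names, illustrative numbers) compares the two bounds with every possible pairing: the sum is smallest when the largest a is paired with the smallest b, and largest when both sequences are sorted the same way.

```python
from itertools import permutations


def rearrangement_bounds(a, b):
    """Return (lower, upper) bounds on sum(a[i] * b[pi(i)]) over all permutations."""
    a, b = sorted(a), sorted(b)
    lower = sum(x * y for x, y in zip(a, reversed(b)))  # opposite order
    upper = sum(x * y for x, y in zip(a, b))            # same order
    return lower, upper


def all_pairing_sums(a, b):
    """Brute-force every pairing, for checking the bounds on small inputs."""
    a = sorted(a)
    return [sum(x * y for x, y in zip(a, p)) for p in permutations(b)]


if __name__ == "__main__":
    print(rearrangement_bounds([1, 2, 3], [4, 5, 6]))
    print(sorted(all_pairing_sums([1, 2, 3], [4, 5, 6])))
```

Read with a as traffic rates and b as costs, the lower bound corresponds to giving the heaviest traffic the cheapest connections, which is the intuition behind design principle 1.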

28 Algorithms: Cluster-and-Cut
Design principle 2: divide and conquer. Partition the VMs into VM-clusters and the slots into slot-clusters. Map each VM-cluster to a slot-cluster by solving TVMPP at the cluster level. Then, within each mapped pair of VM-cluster and slot-cluster, map VMs to slots by solving TVMPP again.

29 Algorithms: Cluster-and-Cut
Design principle 2: divide and conquer. Partition the VMs into VM-clusters and the slots into slot-clusters. Clustering VMs: a classical min-cut graph algorithm is used, so that VM pairs with a high mutual traffic rate fall within the same VM-cluster. This is consistent with the earlier finding that traffic generated by a small group of VMs makes up a large fraction of the total traffic. Approximation ratio: $\frac{k-1}{k}\, n$.

30 Algorithms: Cluster-and-Cut
Design principle 2: divide and conquer. Partition the VMs into VM-clusters and the slots into slot-clusters. Clustering slots: a classical clustering algorithm is used, so that slot pairs with low-cost connections belong to the same slot-cluster. Networks contain many groups of densely connected end hosts. Approximation ratio: 2.
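The slide does not name the clustering algorithm. One classical choice with an approximation ratio of 2 (for cost matrices that satisfy the triangle inequality) is farthest-first traversal, so the sketch below uses it purely as an assumption; it is not confirmed to be the method used in the paper, and it does not enforce any cluster-size constraints that the full algorithm may need.

```python
def farthest_first_clusters(C, k):
    """Group slots into k clusters using farthest-first traversal.

    C is a symmetric slot-to-slot cost matrix. This is an illustrative
    stand-in for the "classical clustering algorithm" on the slide.
    """
    n = len(C)
    centers = [0]  # start from an arbitrary slot
    while len(centers) < k:
        # Pick the slot whose nearest existing center is farthest away.
        nxt = max(range(n), key=lambda s: min(C[s][c] for c in centers))
        centers.append(nxt)
    clusters = {c: [] for c in centers}
    for s in range(n):
        nearest = min(centers, key=lambda c: C[s][c])
        clusters[nearest].append(s)
    return list(clusters.values())


# Illustrative cost matrix: slots 0,1 are close, slots 2,3 are close.
slot_costs = [[0, 1, 3, 3], [1, 0, 3, 3], [3, 3, 0, 1], [3, 3, 1, 0]]
print(farthest_first_clusters(slot_costs, 2))
```

On the toy matrix the two low-cost groups come out as separate slot-clusters, which is the property the slide asks for.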

31 Algorithms: Cluster-and-Cut
Design principle 2:

32 Algorithms: Cluster-and-Cut
Design principle 2: the complexity annotations on the slide are $O(nk)$, $O(n^4)$, and $O(n^4)$. Presenter's question: how about a recursive variant?

33 Impact of Network Architectures & Traffic Patterns
Performance gains are affected by the cost matrix C (Tree, VL2, Fat-Tree, BCube) and the traffic matrix D (global traffic model, partitioned traffic model). From the problem formulation, the traffic and cost matrices are the two determining factors for optimizing the VM placement. Consequently, we seek to answer a fundamental question: given that traffic patterns and network architectures in data centers differ significantly, how are the performance gains from optimal VM placement affected?

34 Impact of Network Architectures & Traffic Patterns
Define the Tree cost matrix C (n x n). $p_0$: the fan-out of the access switches. $p_1$: the fan-out of the aggregation switches.

35 Impact of Network Architectures & Traffic Patterns
(animation step, same text as slide 34)

36 Impact of Network Architectures & Traffic Patterns
(animation step, same text as slide 34) Highlighted cost value: 1.

37 Impact of Network Architectures & Traffic Patterns
(animation step, same text as slide 34) Highlighted cost value: 3.

38 Impact of Network Architectures & Traffic Patterns
(animation step, same text as slide 34) Highlighted cost value: 5.
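The matrix itself did not survive in the transcript, but the highlighted values 1, 3, and 5 are consistent with counting switches on the path: 1 when two slots share an access switch, 3 when they share an aggregation switch, and 5 when traffic crosses the core. The sketch below builds such a matrix under those assumptions, with slots numbered consecutively from 0, one slot per server, and a hypothetical function name.

```python
def tree_cost_matrix(n, p0, p1):
    """Switch-count cost matrix for a three-tier tree (illustrative assumptions).

    p0: fan-out of the access switches (slots per access switch).
    p1: fan-out of the aggregation switches (access switches per aggregation switch).
    """
    def cost(i, j):
        if i == j:
            return 0
        if i // p0 == j // p0:                # same access switch
            return 1
        if i // (p0 * p1) == j // (p0 * p1):  # same aggregation switch
            return 3
        return 5                              # path goes through the core

    return [[cost(i, j) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for row in tree_cost_matrix(8, 2, 2):
        print(row)
```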

39 Impact of Network Architectures & Traffic Patterns
Define the VL2 cost matrix C (n x n). $p_0$: the fan-out of the access switches. $p_1$: the fan-out of the aggregation switches. Highlighted cost value: 5.

40 Impact of Network Architectures & Traffic Patterns
Define the Fat-Tree cost matrix C (n x n). k: the total number of ports on each switch. Highlighted cost value: 3.

41 Impact of Network Architectures & Traffic Patterns
(animation step, same text as slide 40) Highlighted cost value: 5.
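A fat-tree cost matrix can be sketched in the same switch-counting style. The assumptions here are the standard fat-tree layout (k/2 hosts per edge switch, k^2/4 hosts per pod), slots numbered consecutively from 0, and costs of 1, 3, and 5 for paths within an edge switch, within a pod, and across pods; only the values 3 and 5 appear on the slides, and the function name is hypothetical.

```python
def fat_tree_cost_matrix(n, k):
    """Switch-count cost matrix for a fat-tree of k-port switches (illustrative)."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("the port count must be a positive even number")
    per_edge = k // 2      # hosts under one edge switch
    per_pod = k * k // 4   # hosts in one pod

    def cost(i, j):
        if i == j:
            return 0
        if i // per_edge == j // per_edge:  # same edge switch
            return 1
        if i // per_pod == j // per_pod:    # same pod
            return 3
        return 5                            # different pods, via a core switch

    return [[cost(i, j) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    print(fat_tree_cost_matrix(16, 4)[0])
```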

42 Impact of Network Architectures & Traffic Patterns
Define the BCube cost matrix C (n x n), based on the Hamming distance between server addresses.

43 Impact of Network Architectures & Traffic Patterns
(animation step, same text as slide 42) Example shown: Hamming distance = 1.

44 Impact of Network Architectures & Traffic Patterns
(animation step, same text as slide 42) Example shown: Hamming distance = 2.
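The Hamming distance between two BCube server labels is simply the number of positions at which the labels differ. A minimal sketch, assuming labels are written as dot-separated digits such as "2.4" (the labels used earlier in the deck); how the slide maps this distance to a cost value was not captured, so the code stops at the distance.

```python
def hamming_distance(label_a: str, label_b: str) -> int:
    """Number of positions at which two dot-separated server labels differ."""
    a, b = label_a.split("."), label_b.split(".")
    if len(a) != len(b):
        raise ValueError("labels must have the same number of levels")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    print(hamming_distance("2.4", "1.4"))
    print(hamming_distance("2.4", "1.3"))
```

Labels 2.4 and 1.4 differ at one level only, so their distance is 1; labels 2.4 and 1.3 differ at both levels, so their distance is 2.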

45 Impact of Network Architectures & Traffic Patterns
Global traffic model: each VM communicates with every other VM at a constant rate. Under this model TVMPP can be solved by the Hungarian algorithm with complexity $O(n^3)$. $S_{opt}$: optimized objective value. $S_{rand}$: objective value of a random placement.

46 Impact of Network Architectures & Traffic Patterns
Global traffic model: at zero traffic variance, $S_{opt} = S_{rand}$ (e.g., a map-reduce type workload); otherwise $S_{opt} \le S_{rand}$ (e.g., a multi-tiered web application). The gaps indicate how much a random placement can be improved. BCube has the largest room for improvement, a benefit in terms of scalability; VL2 has the smallest. Setup: 1024 VMs; with 4-port switches, BCube has 4 levels of intermediate switches.
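The zero-variance case can be checked by brute force: if every VM pair exchanges traffic at the same rate, every placement has the same pairwise cost, so the optimal and random placements coincide. The sketch below ignores external traffic and uses small illustrative matrices; the function names are hypothetical.

```python
from itertools import permutations


def pairwise_cost(D, C, pi):
    """Pairwise traffic cost when VM i is placed in slot pi[i]."""
    n = len(D)
    return sum(D[i][j] * C[pi[i]][pi[j]] for i in range(n) for j in range(n))


def distinct_costs(D, C):
    """Set of cost values reached over all placements (small n only)."""
    return {pairwise_cost(D, C, pi) for pi in permutations(range(len(D)))}


C_tree = [[0, 1, 3, 3], [1, 0, 3, 3], [3, 3, 0, 1], [3, 3, 1, 0]]
uniform_D = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
skewed_D = [[0, 1, 10, 1], [1, 0, 1, 10], [10, 1, 0, 1], [1, 10, 1, 0]]

print(distinct_costs(uniform_D, C_tree))
print(sorted(distinct_costs(skewed_D, C_tree)))
```

With the uniform matrix only one cost value is possible, so there is nothing to optimize; with the skewed matrix the values spread out, and the gap between the smallest value and a typical one is the room for improvement the slide refers to.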

47 Impact of Network Architectures & Traffic Patterns
Partitioned traffic model: VMs form isolated partitions, and only VMs within the same partition communicate with each other. The pairwise traffic rate follows a normal distribution. GLB (the Gilmore-Lawler bound) is a lower bound for $S_{opt}$. The gaps indicate the potential performance improvement: $S_{rand}$ leaves room for improvement under every traffic variance, and BCube has the larger improvement potential.

48 Impact of Network Architectures & Traffic Patterns
At partition size 31, the VL2 curve overlaps with that of random placement. A smaller partition size has higher improvement potential, and there is more improvement potential in a system whose partitions have different sizes.

49 Impact of Network Architectures & Traffic Patterns
Greater benefits arise under these conditions: increased traffic variance, an increased number of partitions, and a multi-layer architecture.

50 Evaluation Compare Cluster-and-Cut to other QAP-solving algorithms:
Local Optimal Pairwise Interchange (LOPI) and Simulated Annealing (SA).

51 Evaluation (animation step, same text as slide 50)
Slide annotation: 10% smaller.

52 Evaluation (animation step, same text as slide 50)
Slide annotation: 50% less.

53 Summary Used traffic-aware virtual machine placement to improve network scalability. Formulated VM placement as an NP-hard optimization problem. Proposed the Cluster-and-Cut algorithm as an efficient solution. Evaluated the potential performance under different traffic patterns and network architectures.

54 Thank you !

