ElasticTree: Saving Energy in Data Center Networks 許倫愷 2013/5/28
About the paper Authors: Brandon Heller, Srini Seetharaman, Priya Mahadevan, Yiannis Yiakoumis, Puneet Sharma, Sujata Banerjee, Nick McKeown Venue: NSDI '10 (USENIX Symposium on Networked Systems Design and Implementation) Citation: pages
Outline The big picture Introduction ElasticTree system Analysis Conclusion
The motivation
Very inefficient!! Desired
Why is power wasted? Provisioning for peak; time-varying traffic demands; low efficiency at low loads
The goal of ElasticTree
The approach… Turn off unneeded links and switches
The challenge: performance, fault tolerance, scalability
Outline The big picture Introduction ElasticTree system Analysis Conclusion
Introduction What is ElasticTree? A system for dynamically adapting the energy consumption of a data center network. What does it do? It finds minimum-power network subsets across a range of traffic patterns. Trade-off: energy efficiency, performance, and robustness
Introduction
Data center network (Traditional) 2N Tree: One failure can cut the effective bisection bandwidth in half; two failures can disconnect servers
Data center network Fat tree: SIGCOMM 2008, "A Scalable, Commodity Data Center Network Architecture"
Data center network Networks are provisioned for peak workload, but traffic varies daily, weekly, monthly, and yearly.
Energy Proportionality The strategy: turn off the links and switches that we don’t need
Outline The big picture Introduction ElasticTree system Analysis Conclusion
ElasticTree ElasticTree is a system for dynamically adapting the energy consumption of a data center network
ElasticTree If 0.2 Gbps of traffic per host, 1 Gbps link…
ElasticTree 13/20 switches and 28/48 links stay active; ElasticTree reduces network power by 38%
ElasticTree The optimizer: find the minimum-power network subset that satisfies current traffic conditions
Optimizer As traffic conditions change, the optimizer continuously recomputes the optimal network subset. Three approaches: Formal Model, Greedy Bin-Packing, Topology-aware Heuristic
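The denominators on this slide (20 switches, 48 links) match a k = 4 fat tree when host links are counted. A minimal sketch, assuming the standard fat-tree sizing formulas; the function name `fat_tree_size` is hypothetical and not taken from the paper:

```python
def fat_tree_size(k):
    """Return (switches, hosts, links) for a k-ary fat tree.

    Assumes the standard construction: (k/2)^2 core switches and k pods,
    each with k/2 aggregation and k/2 edge switches. Host links are counted.
    """
    if k % 2 != 0 or k < 2:
        raise ValueError("k must be an even integer >= 2")
    core = (k // 2) ** 2
    agg = k * k // 2
    edge = k * k // 2
    hosts = k ** 3 // 4
    # Host-edge, edge-aggregation, and aggregation-core layers
    # each contribute k^3/4 links.
    links = 3 * hosts
    return core + agg + edge, hosts, links


switches, hosts, links = fat_tree_size(4)
# Fractions of the network left on in the slide's example (13 switches, 28 links).
active_switch_fraction = 13 / switches
active_link_fraction = 28 / links
```

The 38% power saving itself depends on per-switch and per-port power figures that are not given here, so the sketch stops at element counts.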
Optimizer comparison
Formal model Finding the optimal flow assignment alone is an NP-complete problem for integer flows. Derived from the standard multi-commodity flow (MCF) problem. The model outputs a subset of the original topology, plus the routes taken by each flow to satisfy the traffic matrix. O(n^3.5+)
Greedy Bin-Packing Strategy: for each flow, choose the leftmost path with sufficient capacity. O(n^2+) (figure label: 1G link)
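One way to write down an MCF-based minimum-power model is sketched below. The notation is an assumption made for illustration and may differ from the paper's: $X_{u,v}$ and $Y_u$ are binary on/off variables for links and switches, $a(u,v)$ and $b(u)$ their power costs, $c(u,v)$ the link capacity, and $f_i(u,v)$ the flow of commodity $i$ (source $s_i$, sink $t_i$, demand $d_i$) on link $(u,v)$.

```latex
\begin{align*}
\min \quad & \sum_{(u,v)\in E} X_{u,v}\, a(u,v) \;+\; \sum_{u\in V} Y_u\, b(u) \\
\text{s.t.} \quad
& \sum_{i} f_i(u,v) \le c(u,v)\, X_{u,v}
    && \forall (u,v)\in E \quad \text{(capacity; off links carry no flow)} \\
& \sum_{w} f_i(u,w) - \sum_{w} f_i(w,u) = 0
    && \forall i,\ \forall u \notin \{s_i, t_i\} \quad \text{(flow conservation)} \\
& \sum_{w} f_i(s_i,w) - \sum_{w} f_i(w,s_i) = d_i
    && \forall i \quad \text{(demand satisfaction)} \\
& X_{u,v} = X_{v,u}
    && \forall (u,v)\in E \quad \text{(links are on in both directions)} \\
& X_{u,v} \le Y_u, \quad X_{u,v} \le Y_v
    && \forall (u,v)\in E \quad \text{(an on link needs both switches on)} \\
& X_{u,v},\, Y_u \in \{0,1\}, \quad f_i(u,v) \ge 0
\end{align*}
```

Requiring each flow to follow a single path (integer flows) adds further constraints, which is where the NP-completeness noted on this slide comes from.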
Greedy Bin-Packing 1G link
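The leftmost-fit strategy can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function `greedy_assign`, the `paths_for` callback, and the link identifiers are all hypothetical, and a flow that fits nowhere is simply left unrouted.

```python
def greedy_assign(flows, paths_for, capacity=1.0):
    """Route each flow on the leftmost candidate path with enough spare capacity.

    flows:     list of (flow_id, src, dst, rate_gbps)
    paths_for: function (src, dst) -> candidate paths ordered left to right,
               each path being a tuple of link ids (assumed interface)
    capacity:  capacity of every link in Gbps (1G links, as on the slide)
    """
    load = {}    # link id -> traffic already placed on it
    routes = {}  # flow id -> chosen path, or None if nothing fits
    for flow_id, src, dst, rate in flows:
        routes[flow_id] = None
        for path in paths_for(src, dst):
            if all(load.get(link, 0.0) + rate <= capacity + 1e-9 for link in path):
                for link in path:
                    load[link] = load.get(link, 0.0) + rate
                routes[flow_id] = path
                break
    # Links that carry traffic stay on; everything else can be powered off.
    active_links = {link for link, used in load.items() if used > 0}
    return routes, active_links


# Two candidate paths between hosts A and B, listed left to right.
LEFT = ("A-s1", "s1-B")
RIGHT = ("A-s2", "s2-B")
demo_flows = [("f1", "A", "B", 0.6), ("f2", "A", "B", 0.3), ("f3", "A", "B", 0.5)]
demo_routes, demo_active = greedy_assign(demo_flows, lambda s, d: [LEFT, RIGHT])
```

In the demo, the first two flows share the left path (0.9 Gbps in total) and the third no longer fits there, so it moves one path to the right.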
Topo-aware Heuristic 1. Does not compute the set of flow routes. 2. Assumes perfectly divisible flows => taken literally, this would pack every link to full utilization and reduce TCP bandwidth, so the result is used as a starter subset. Decoupling power optimization from routing => can be applied alongside any fat tree routing algorithm
Topo-aware Heuristic An edge switch doesn’t care which aggregation switches are active, but instead, how many are active
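Because only the number of active aggregation switches matters to an edge switch, that number can be estimated from traffic totals alone. A simplified sketch for one pod, assuming perfectly divisible flows and looking only at edge-to-aggregation traffic (aggregation-to-core demand is ignored here); the function name and arguments are hypothetical:

```python
import math


def active_agg_switches(edge_up, edge_down, link_rate=1.0):
    """Estimate how many aggregation switches must stay on in one pod.

    edge_up[i]:   traffic (Gbps) edge switch i sends up to the aggregation layer
    edge_down[i]: traffic (Gbps) edge switch i receives from the aggregation layer
    link_rate:    capacity of each edge-aggregation link in Gbps
    """
    needed = 1  # keep at least one switch on for connectivity
    for up, down in zip(edge_up, edge_down):
        # Each active aggregation switch gives this edge switch one uplink,
        # so the busier direction determines how many uplinks it needs.
        uplinks = math.ceil(max(up, down) / link_rate)
        needed = max(needed, uplinks)
    return needed
```

For example, two edge switches each exchanging 0.4 Gbps with the aggregation layer need one aggregation switch, while an edge switch receiving 2.5 Gbps over 1 Gbps links pushes the count to three.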
Topo-aware Heuristic Decoupling power optimization from routing
Optimizer comparison
Outline The big picture Introduction ElasticTree system Analysis Conclusion
How to test k = 6 fat tree, OpenFlow
Analysis Traffic patterns: Near: servers communicate only with other servers through their edge switch. Far: servers communicate only with servers in other pods
Analysis Random demand: Individual aggregation/core switches turning on/off
Analysis 70% of traffic to outside, 30% inside the DCN; different traffic loads
Analysis: redundancy If only the MST (minimum spanning tree) is on => no redundancy => no fault tolerance
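The size of a spanning-tree subset of a fat tree can be counted directly. A minimal sketch, assuming the tree keeps every edge switch, one aggregation switch per pod, and a single core switch; the function name is hypothetical:

```python
def fat_tree_mst_size(k):
    """Return (switches, links) in a spanning-tree subset of a k-ary fat tree.

    Assumed structure: all k^2/2 edge switches stay on (hosts attach to them),
    plus one aggregation switch per pod and one core switch.
    """
    if k % 2 != 0 or k < 2:
        raise ValueError("k must be an even integer >= 2")
    hosts = k ** 3 // 4
    edge = k * k // 2
    agg = k   # one per pod
    core = 1
    # One link per host, one uplink per edge switch, one uplink per pod.
    links = hosts + edge + agg
    return edge + agg + core, links
```

For k = 4 this gives 13 switches and 28 links, the same counts as the active subset in the earlier 0.2 Gbps example. Any single switch or link failure in such a subset disconnects part of the network, which is the fault-tolerance concern on this slide.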
Analysis: redundancy +MST: additive cost, multiplicative benefit
Analysis: latency A safety margin is needed! Ethernet overheads (preamble, inter-frame spacing, and the CRC) cause the egress buffer to fill up; packets are either dropped or significantly delayed
Analysis: latency Safety margin: the amount of capacity reserved at every link by the optimizer. Traffic overload: the amount each host sends and receives beyond the original traffic matrix. Trade-off between energy and performance
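The effect of a safety margin on packing decisions can be illustrated with a small sketch. The function names and the 1000 Mbps / 100 Mbps defaults are assumptions chosen for illustration, not values from the paper:

```python
def usable_capacity(link_rate_mbps, safety_margin_mbps):
    """Capacity the optimizer may fill once the safety margin is reserved."""
    if not 0 <= safety_margin_mbps < link_rate_mbps:
        raise ValueError("margin must be non-negative and below the link rate")
    return link_rate_mbps - safety_margin_mbps


def fits(flow_rates_mbps, link_rate_mbps=1000, safety_margin_mbps=100):
    """True if the flows can share one link without eating into the margin."""
    return sum(flow_rates_mbps) <= usable_capacity(link_rate_mbps, safety_margin_mbps)
```

With a 100 Mbps margin, flows totalling 850 Mbps fit on a 1 Gbps link but flows totalling 950 Mbps do not, so the optimizer has to keep another link on. A larger margin absorbs more traffic overload at the cost of more active elements.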
Outline The big picture Introduction ElasticTree system Analysis Conclusion
Summary
Reference The paper; the slides (by the author); a YouTube video (also by the author)
Questions
Thank you!