Wei Bai with Li Chen, Kai Chen, Dongsu Han, Chen Tian, Hao Wang SING HKUST Information-Agnostic Flow Scheduling for Commodity Data Centers 1 SJTU,

Slides:



Advertisements
Similar presentations
Finishing Flows Quickly with Preemptive Scheduling
Advertisements

B 黃冠智.
Deconstructing Datacenter Packet Transport Mohammad Alizadeh, Shuang Yang, Sachin Katti, Nick McKeown, Balaji Prabhakar, Scott Shenker Stanford University.
CSIT560 Internet Infrastructure: Switches and Routers Active Queue Management Presented By: Gary Po, Henry Hui and Kenny Chong.
Fixing TCP in Datacenters Costin Raiciu Advanced Topics in Distributed Systems 2011.
Improving Datacenter Performance and Robustness with Multipath TCP Costin Raiciu, Sebastien Barre, Christopher Pluntke, Adam Greenhalgh, Damon Wischik,
PFabric: Minimal Near-Optimal Datacenter Transport Mohammad Alizadeh Shuang Yang, Milad Sharif, Sachin Katti, Nick McKeown, Balaji Prabhakar, Scott Shenker.
Improving TCP Performance over Mobile Ad Hoc Networks by Exploiting Cross- Layer Information Awareness Xin Yu Department Of Computer Science New York University,
Balajee Vamanan et al. Deadline-Aware Datacenter TCP (D 2 TCP) Balajee Vamanan, Jahangir Hasan, and T. N. Vijaykumar.
Mohammad Alizadeh Adel Javanmard and Balaji Prabhakar Stanford University Analysis of DCTCP:Analysis of DCTCP: Stability, Convergence, and FairnessStability,
The War Between Mice and Elephants LIANG GUO, IBRAHIM MATTA Computer Science Department Boston University ICNP (International Conference on Network Protocols)
The War Between Mice and Elephants Presented By Eric Wang Liang Guo and Ibrahim Matta Boston University ICNP
Bertha & M Sadeeq.  Easy to manage the problems  Scalability  Real time and real environment  Free data collection  Cost efficient  DCTCP only covers.
Congestion control in data centers
Defense: Christopher Francis, Rumou duan Data Center TCP (DCTCP) 1.
1 Emulating AQM from End Hosts Presenters: Syed Zaidi Ivor Rodrigues.
Efficient Internet Traffic Delivery over Wireless Networks Sandhya Sumathy.
Evaluating System Performance in Gigabit Networks King Fahd University of Petroleum and Minerals (KFUPM) INFORMATION AND COMPUTER SCIENCE DEPARTMENT Dr.
Computer Networking Lecture 17 – Queue Management As usual: Thanks to Srini Seshan and Dave Anderson.
Information-Agnostic Flow Scheduling for Commodity Data Centers
UNIVERSITY OF ELECTRONIC SCIENCE & TECHNOLOGY OF CHINA IEEE INFOCOM 2015, Hong Kong RAPIER: Integrating Routing and Scheduling for Coflow-aware Data Center.
A Scalable, Commodity Data Center Network Architecture Mohammad Al-Fares, Alexander Loukissas, Amin Vahdat Presented by Gregory Peaker and Tyler Maclean.
Mohammad Alizadeh, Abdul Kabbani, Tom Edsall,
ICTCP: Incast Congestion Control for TCP in Data Center Networks∗
Practical TDMA for Datacenter Ethernet
Mohammad Alizadeh Stanford University Joint with: Abdul Kabbani, Tom Edsall, Balaji Prabhakar, Amin Vahdat, Masato Yasuda HULL: High bandwidth, Ultra Low-Latency.
Curbing Delays in Datacenters: Need Time to Save Time? Mohammad Alizadeh Sachin Katti, Balaji Prabhakar Insieme Networks Stanford University 1.
Detail: Reducing the Flow Completion Time Tail in Datacenter Networks SIGCOMM PIGGY.
On the Data Path Performance of Leaf-Spine Datacenter Fabrics Mohammad Alizadeh Joint with: Tom Edsall 1.
B 李奕德.  Abstract  Intro  ECN in DCTCP  TDCTCP  Performance evaluation  conclusion.
A.SATHEESH Department of Software Engineering Periyar Maniammai University Tamil Nadu.
CQRD: A Switch-based Approach to Flow Interference in Data Center Networks Guo Chen Dan Pei, Youjian Zhao Tsinghua University, Beijing, China.
1 Fair Queuing Hamed Khanmirza Principles of Network University of Tehran.
Jiaxin Cao, Rui Xia, Pengkun Yang, Chuanxiong Guo,
Block-Based Packet Buffer with Deterministic Packet Departures Hao Wang and Bill Lin University of California, San Diego HSPR 2010, Dallas.
Spring Computer Networks1 Congestion Control Sections 6.1 – 6.4 Outline Preliminaries Queuing Discipline Reacting to Congestion Avoiding Congestion.
MMPTCP: A Multipath Transport Protocol for Data Centres 1 Morteza Kheirkhah University of Edinburgh, UK Ian Wakeman and George Parisis University of Sussex,
Shuihai Hu, Wei Bai, Kai Chen, Chen Tian (NJU), Ying Zhang (HP Labs), Haitao Wu (Microsoft) Sing Hong Kong University of Science and Technology.
Scalable Congestion Control Protocol based on SDN in Data Center Networks Speaker : Bo-Han Hua Professor : Dr. Kai-Wei Ke Date : 2016/04/08 1.
Congestion Control Mechanisms for Data Center Networks
Data Center TCP (DCTCP)
Low-Latency Software Rate Limiters for Cloud Networks
Incast-Aware Switch-Assisted TCP Congestion Control for Data Centers
Resilient Datacenter Load Balancing in the Wild
6.888 Lecture 5: Flow Scheduling
How I Learned to Stop Worrying About the Core and Love the Edge
Chris Cai, Shayan Saeed, Indranil Gupta, Roy Campbell, Franck Le
Experiments with Data Center Congestion Control Research
HyGenICC: Hypervisor-based Generic IP Congestion Control for Virtualized Data Centers Conference Paper in Proceedings of ICC16 By Ahmed M. Abdelmoniem,
Queue Management Jennifer Rexford COS 461: Computer Networks
Chapter 6 Queuing Disciplines
Congestion-Aware Load Balancing at the Virtual Edge
Credit-Scheduled Delay-Bounded Congestion Control for Datacenters
Scaling Deep Reinforcement Learning to Enable Datacenter-Scale
Hamed Rezaei, Mojtaba Malekpourshahraki, Balajee Vamanan
Augmenting Proactive Congestion Control with Aeolus
SoftRing: Taming the Reactive Model for Software Defined Networks
TimeTrader: Exploiting Latency Tail to Save Datacenter Energy for Online Search Balajee Vamanan, Hamza Bin Sohail, Jahangir Hasan, and T. N. Vijaykumar.
Providing QoS through Active Domain Management
AMP: A Better Multipath TCP for Data Center Networks
Network Simulation NET441
SICC: SDN-Based Incast Congestion Control For Data Centers Ahmed M
Centralized Arbitration for Data Centers
Iroko A Data Center Emulator for Reinforcement Learning
Lecture 16, Computer Networks (198:552)
Congestion-Aware Load Balancing at the Virtual Edge
Lecture 17, Computer Networks (198:552)
2019/10/9 A Weighted ECMP Load Balancing Scheme for Data Centers Using P4 Switches Presenter:Hung-Yen Wang Authors:Jin-Li Ye, Yu-Huang Chu, Chien Chen.
Rethinking Transport Layer Design for Distributed Machine Learning
Queueing Problem The performance of network systems rely on different delays. Propagation/processing/transmission/queueing delays Which delay is affected.
Presentation transcript:

Wei Bai with Li Chen, Kai Chen, Dongsu Han, Chen Tian, Hao Wang SING HKUST Information-Agnostic Flow Scheduling for Commodity Data Centers 1 SJTU, June 1st, 2015

Data Centers 2

This talk is about the transport inside the data center 3

Data Center Networks 4 High bandwidth Low latency Commodity switches Single administrative domain

Data Center Transport Cloud applications – Desire low latency for short messages Goal: Minimize flow completion time (FCT) – Many flow scheduling proposals… 5

The State-of-the-art Solutions PDQ [SIGCOMM’12] pFabric [SIGCOMM’13] PASE [SIGCOMM’14] … All assume prior knowledge of flow size information to approximate ideal preemptive Shortest Job First (SJF) with customized network elements 6

The State-of-the-art Solutions PDQ [SIGCOMM’12] pFabric [SIGCOMM’13] PASE [SIGCOMM’14] … All assume prior knowledge of flow size information to approximate ideal preemptive Shortest Job First (SJF) with customized network elements Not feasible for some applications 7

The State-of-the-art Solutions PDQ [SIGCOMM’12] pFabric [SIGCOMM’13] PASE [SIGCOMM’14] … All assume prior knowledge of flow size information to approximate ideal preemptive Shortest Job First (SJF) with customized network elements Hard to deploy in practice 8

Question Without prior knowledge of flow size information, how to minimize FCT in commodity data centers? 9

Design Goal 1 Without prior knowledge of flow size information, how to minimize FCT in commodity data centers? Information-agnostic: not assume a priori knowledge of flow size information available from the applications 10

Design Goal 2 Without prior knowledge of flow size information, how to minimize FCT in commodity data centers? FCT minimization: minimize average and tail FCTs of short flows & not adversely affect FCTs of large flows 11

Design Goal 3 Without prior knowledge of flow size information, how to minimize FCT in commodity data centers? Readily-deployable: work with existing commodity switches & be compatible with legacy network stacks 12

Question Without prior knowledge of flow size information, how to minimize FCT in commodity data centers? Our answer: PIAS 13

PIAS’S DESIGN 14

Design Rationale PIAS performs Multi-Level Feedback Queue (MLFQ) to emulate Shortest Job First Priority 1 Priority 2 Priority K …… High Low 15

Design Rationale PIAS performs Multi-Level Feedback Queue (MLFQ) to emulate Shortest Job First Priority 1 Priority 2 Priority K …… 16

Simple Example Illustrating PIAS Congestion 17

Simple Example Illustrating PIAS Priority 1 Priority 2 Priority 4 Flow 1 with 10 packets and flow 2 with 2 packets arrive Priority 3 18

Simple Example Illustrating PIAS Priority 1 Priority 2 Priority 4 Flow 1 and 2 transmit simultaneously Priority 3 19

Simple Example Illustrating PIAS Priority 1 Priority 2 Priority 4 Flow 2 finishes while flow 1 is demoted to priority 2 Priority 3 20

Simple Example Illustrating PIAS Priority 1 Priority 2 Priority 4 Flow 3 with 2 packets arrives Priority 3 21

Simple Example Illustrating PIAS Priority 1 Priority 2 Priority 4 Flow 3 and 1 transmit simultaneously Priority 3 22

Simple Example Illustrating PIAS Priority 1 Priority 2 Priority 4 Flow 3 finishes while flow 1 is demoted to priority 3 Priority 3 23

Simple Example Illustrating PIAS Priority 1 Priority 2 Priority 4 Flow 4 with 2 packets arrives Priority 3 24

Simple Example Illustrating PIAS Priority 1 Priority 2 Priority 4 Flow 4 and 1 transmit simultaneously Priority 3 25

Simple Example Illustrating PIAS Priority 1 Priority 2 Priority 4 Priority 3 Flow 4 finishes while flow 1 is demoted to priority 4 26

Simple Example Illustrating PIAS Priority 1 Priority 2 Priority 4 Priority 3 Eventually, flow 1 finishes in priority 4 With MLFQ, PIAS can emulate Shortest Job First without prior knowledge of flow size information 27

How to implement? Priority 1 Priority 2 Priority K …… Strict priority queueing on switches Packet tagging as a shim layer at end hosts 28

How to implement? Priority 1 Priority 2 Priority K …… Strict priority queueing on switches Packet tagging as a shim layer at end hosts

How to implement? Priority 1 Priority 2 Priority K …… Strict priority queueing on switches Packet tagging as a shim layer at end hosts 30

How to implement? Priority 1 Priority 2 Priority K …… Strict priority queueing on switches Packet tagging as a shim layer at end hosts

Thresholds depend on: – Flow size distribution – Traffic load Solution: – Solve a FCT minimization problem to calculate demotion thresholds Problem: – Traffic is highly dynamic Determine Thresholds Traffic variations -> Mismatched thresholds 32

Impact of Mismatches High Low 10MB 20KB 33

When the threshold is perfect (20KB) Impact of Mismatches 34

When the threshold is too small (10KB) Impact of Mismatches Increased latency for short flows 35

When the threshold is too large (1MB) Impact of Mismatches Increased latency for short flows 36

When the threshold is too large (1MB) Impact of Mismatches Increased latency for short flows Leverage ECN to keep low buffer occupation 37

When the threshold is too small (10KB) Handle Mismatches If we enable ECN 38

When the threshold is too small (10KB) Handle Mismatches ECN can keep low latency 39

When the threshold is too large (1MB) Handle Mismatches If we enable ECN 40

When the threshold is too large (1MB) Handle Mismatches ECN can keep low latency 41

PIAS in 1 Slide PIAS packet tagging – Maintain flow states and mark packets with priority PIAS switches – Enable strict priority queueing and ECN PIAS rate control – Employ Data Center TCP to react to ECN 42

Testbed Experiments PIAS prototype – Testbed Setup – A Gigabit Pronto-3295 switch – 16 Dell servers Benchmarks – Web search (DCTCP paper) – Data mining (VL2 paper) – Memcached 43

Small Flows (<100KB) Web SearchData Mining Compared to DCTCP, PIAS reduces average FCT of small flows by up to 47% and 45% 47% 45% 44

NS2 Simulation Setup Topology – 144-host leaf-spine fabric with 10G/40G links Workloads – Web search (DCTCP paper) – Data mining (VL2 paper) Schemes – Information-agnostic: PIAS, DCTCP and L2DCT – Information-aware: pFabric 45

Overall Performance Web SearchData Mining PIAS has an obvious advantage over DCTCP and L2DCT in both workloads. 46

Small Flows (<100KB) Simulations confirm testbed experiment results Web SearchData Mining 40% - 50% improvement 47

Comparison with pFabric PIAS only has 4.9% performance gap to pFabric for small flows in data mining workload Web SearchData Mining 48

Conclusion PIAS: practical and effective – Not assume flow information from applications – Enforce Multi-Level Feedback Queue scheduling – Use commodity switches & legacy network stacks Information-agnostic FCT minimization Readily deployable 49

Thanks! 50

Starvation Measurement – 5000 flows, 5.7 million MTU-sized packets – 200 timeouts, 31 two consecutive timeouts Solutions – Per-port ECN pushes back high priority flows when many low priority flow get starved – Treating a long-term starved flow as a new flow 51

Persistent Connections Solution: periodically reset flow states based on more behaviors of traffic – When a flow idles for some time, we reset the bytes sent of this flow to 0. – Define a flow as packets demarcated by incoming packets with payload within a single connection 52