Data Center Networks and Switching and Queueing

Slides:



Advertisements
Similar presentations
Congestion Control and Fairness Models Nick Feamster CS 4251 Computer Networking II Spring 2008.
Advertisements

1 CONGESTION CONTROL. 2 Congestion Control When one part of the subnet (e.g. one or more routers in an area) becomes overloaded, congestion results. Because.
TELE202 Lecture 8 Congestion control 1 Lecturer Dr Z. Huang Overview ¥Last Lecture »X.25 »Source: chapter 10 ¥This Lecture »Congestion control »Source:
1.  Congestion Control Congestion Control  Factors that Cause Congestion Factors that Cause Congestion  Congestion Control vs Flow Control Congestion.
Congestion Control Created by M Bateman, A Ruddle & C Allison As part of the TCP View project.
PHY Covert Channels: Can you see the Idles? Ki Suh Lee Cornell University Joint work with Han Wang, and Hakim Weatherspoon 1 첩자첩자 Chupja.
4-1 Network layer r transport segment from sending to receiving host r on sending side encapsulates segments into datagrams r on rcving side, delivers.
10 - Network Layer. Network layer r transport segment from sending to receiving host r on sending side encapsulates segments into datagrams r on rcving.
1 689 Lecture 2 Review of Last Lecture Networking basics TCP/UDP review.
1 TCP Transport Control Protocol Reliable In-order delivery Flow control Responds to congestion “Nice” Protocol.
L13: Sharing in network systems Dina Katabi Spring Some slides are from lectures by Nick Mckeown, Ion Stoica, Frans.
Timing is Everything: Accurate, Minimum Overhead, Available Bandwidth Estimation in High-speed Wired Networks Han Wang, Ki Suh Lee, Erluo Li, Chiun Lin.
Data Center Traffic and Measurements: Available Bandwidth Estimation Hakim Weatherspoon Assistant Professor, Dept of Computer Science CS 5413: High Performance.
Chapter 4 Queuing, Datagrams, and Addressing
Computer Networks Performance Metrics. Performance Metrics Outline Generic Performance Metrics Network performance Measures Components of Hop and End-to-End.
1 Lecture 14 High-speed TCP connections Wraparound Keeping the pipeline full Estimating RTT Fairness of TCP congestion control Internet resource allocation.
MaxNet NetLab Presentation Hailey Lam Outline MaxNet as an alternative to TCP Linux implementation of MaxNet Demonstration of fairness, quick.
Chapter 12 Transmission Control Protocol (TCP)
11 Experimental and Analytical Evaluation of Available Bandwidth Estimation Tools Cesar D. Guerrero and Miguel A. Labrador Department of Computer Science.
Forwarding.
Efficient Gigabit Ethernet Switch Models for Large-Scale Simulation Dong (Kevin) Jin David Nicol Matthew Caesar University of Illinois.
Spring Computer Networks1 Congestion Control Sections 6.1 – 6.4 Outline Preliminaries Queuing Discipline Reacting to Congestion Avoiding Congestion.
Network Layer4-1 Chapter 4 Network Layer All material copyright J.F Kurose and K.W. Ross, All Rights Reserved Computer Networking: A Top Down.
Ethernet Packet Filtering – Part 2 Øyvind Holmeide 10/28/2014 by.
Graciela Perera Department of Computer Science and Information Systems Slide 1 of 18 INTRODUCTION NETWORKING CONCEPTS AND ADMINISTRATION CSIS 3723 Graciela.
Chapter 3 Part 3 Switching and Bridging
Congestion Control in Data Networks and Internets
Chapter 4 Network Layer All material copyright
Topics discussed in this section:
Routing and Switching Fabrics
Empirically Characterizing the Buffer Behaviour of Real Devices
Buffer Management in a Switch
Queue Management Jennifer Rexford COS 461: Computer Networks
Transport Protocols over Circuits/VCs
EEC-484/584 Computer Networks
Chapter 4: Network Layer
Chapter 3 Part 3 Switching and Bridging
CS 1652 The slides are adapted from the publisher’s material
Chapter 4-1 Network layer
CONGESTION CONTROL.
Transport Layer Unit 5.
What’s “Inside” a Router?
Lecture 19 – TCP Performance
Network Core and QoS.
EEC-484/584 Computer Networks
Congestion Control in Data Networks and Internets
Congestion Control (from Chapter 05)
COS 461: Computer Networks
PRESENTATION COMPUTER NETWORKS
Computer Science Division
EEC-484/584 Computer Networks
Congestion Control (from Chapter 05)
EE 122: Lecture 7 Ion Stoica September 18, 2001.
Chapter 4 Network Layer Computer Networking: A Top Down Approach 5th edition. Jim Kurose, Keith Ross Addison-Wesley, April Network Layer.
Distributed Systems CS
Congestion Control (from Chapter 05)
Chapter 3 Part 3 Switching and Bridging
TCP Congestion Control
Network Performance Definitions
Congestion Control (from Chapter 05)
Congestion Control (from Chapter 05)
Computer Networks: Switching and Queuing
Congestion Control (from Chapter 05)
Routing and Switching Fabrics
Congestion Control (from Chapter 05)
Congestion Control (from Chapter 05)
Chapter 4: Network Layer
Network Core and QoS.
Lecture 6, Computer Networks (198:552)
Distributed Systems CS
Presentation transcript:

Data Center Networks and Switching and Queueing Hakim Weatherspoon Associate Professor, Dept of Computer Science CS 5413: High Performance Computing and Networking March 6, 2017

Network Queueing What happens when a burst of packets arrive at a switch destined to the same output port? switch fabric switch fabric at t, packets more from input to output one packet time later buffering when arrival rate via switch exceeds output line speed queueing (delay) and loss due to output port buffer overflow!

Network Queueing TCP congestion control and avoidance attempts to share bandwidth and prevent buffering (queueing) switch fabric switch fabric at t, packets more from input to output one packet time later Ideally, each flow sends at its “fair share”, 1/n the network path capacity TCP uses packet drops to signal congestion and flows adjust rates, AIMD How do we proactively estimate available bandwidth?

Available Bandwidth Estimation Basic building block Network Protocol Networked Systems Distributed Systems Which of the two paths has more available bandwidth? 1’20” How do I measure with minimum overhead? End-to-end: How to measure without access to anything in the network?

Available Bandwidth Estimation Passive Measurement Polling counters: Port Stats or Flow Stats Active Measurement Probe Packets: Packet Pair or Packet Train Probe Packets: We are going to use active probe because we assume autonomous system.

Active Measurement Narrow link: least capacity Tight link: least available bandwidth Narrow link Quick example Narrow link is 2 Gbps Tight link is Network Capacity Cross Traffic Tight link Available Bandwidth

Active Measurement Estimate available bandwidth by saturating the tight link Saturate/Congest Tight Link Estimate Available Bandwidth? Packet Queuing Delays Increase Measuring (Increased) Queuing Delay Narrow link Network Capacity Cross Traffic Tight link Available Bandwidth 7

Active Measurement Estimate available bandwidth by saturating the tight link LA8 < LB8, congestion! LA8 LB8 8Gbps 6Gbps LA4 == LB4, no congestion LA4 LB4 4Gbps LA1 == LB1, no congestion LA1 LB1 1Gbps Rate Probe Train # Narrow link Can also put the actual figure up. ----- Meeting Notes (11/6/14 13:26) ----- Circle the knee, 4 figures. Network Capacity Cross Traffic Tight link Available Bandwidth

By measuring the increase in packet train length By measuring the increase in packet train length*, we can compute the queuing delay experienced, hence estimate the available bandwidth. * Increase in packet train length == increase in sum of inter-packet gap Increase

Challenge Cannot Control at 100ps Cannot Measure at 100ps LA8 LB8 State-of-art (software) tools do not work at high speed because they cannot control and capture inter-packet spacing with required precision. ? Summary: two basic issues. At the sender end, in order to generate accurate probe rate, need fine control of packet timing at 100ps. What can commodity machine do? forwarding. Incoming rate is accurately controlled with increase rate. Linux kernel forwarding lost control on timing. Similarly, intel dpdk does better at lower data rate, but still have no control over high data rate. Takeaway, with commodity machine, obtaining the level of control to generate accurate probe rate is not possible. Similarly, the problem of measuring packet timing at the receiver end is also difficult. ----- Meeting Notes (11/6/14 13:57) ----- 10Gbps every bit is 100ps. In userspace we generated packet with instantenous rate from 0.1 Gbps, each pair of packet Experiment is simple, we have one forwarding element. The batching slow down and clear. Two lines to emphasize

Accurate available bandwidth estimation [IMC 2014] Application Transport Network Data Link Physical Packet i Packet i+1 Packet i+2 ----- Meeting Notes (10/17/14 17:10) ----- Modulating packet at PHY layer ----- Meeting Notes (10/20/14 16:35) ----- always sending bits fill with idle. key is access / control idle in phy layer given this, we can precisely generate packet train, and measure packet train. ----- Meeting Notes (10/21/14 21:24) ----- Need to mention that it is sufficient to measure with packet gap

Accurate available bandwidth estimation [IMC 2014] Idle Characters (/I/) Each bit ~100 picoseconds 7~8 bit special character in the physical layer 700~800 picoseconds to transmit Only in PHY Application Transport Network Data Link Physical Packet i Packet i+1 Packet i+2 ----- Meeting Notes (10/17/14 17:10) ----- Modulating packet at PHY layer ----- Meeting Notes (10/20/14 16:35) ----- always sending bits fill with idle. key is access / control idle in phy layer given this, we can precisely generate packet train, and measure packet train. ----- Meeting Notes (10/21/14 21:24) ----- Need to mention that it is sufficient to measure with packet gap

Accurate available bandwidth estimation [IMC 2014] Probe Generation 1500 B 382 /I/s 8Gbps Rate 8.0Gbps 1500 B 2290 /I/s 4Gbps 4.0Gbps 1.0Gbps Emphasize exact amount. That is what makes us accurate. 1500 B 11374 /I/s 1Gbps Probe # By modulating Inter-packet gap at PHY layer, we can generate accurate probe rate.

Questions: Can MinProbe accurately estimate avail-bw at 10Gbps? Can existing estimation algorithms work with MinProbe? How do the following parameters affect accuracy? Packet train length probe packet size distribution cross packet size distribution cross packet burstiness Does MinProbe work in the wild, Internet? Does MinProbe work in rate limiting environments? Can MinProbe accurately estimate avail-bw at 10Gbps? Can existing estimation algorithms work with MinProbe? How do the following parameters affect accuracy? Packet train length probe packet size distribution cross packet size distribution cross packet burstiness Does MinProbe work in the wild, Internet? Does MinProbe work in rate limiting environments?

Experiment Setup Controlled Environment National Lambda Rail NYC Cornell(NYC) Boston Chicago Cleveland Controlled Environment National Lambda Rail

Accurate available bandwidth estimation [IMC 2014] Cross Traffic Probe Traffic Probe Traffic MinProbe tight link MinProbe 10Gbps Cross Traffic, paced at 1.0, 3.0, 6.0, 8.0Gbps

Accurate available bandwidth estimation [IMC 2014] Cross Traffic Probe Traffic 0.1Gbps Rate … 9.6 Gbps 0.2Gbps MinProbe tight link MinProbe 10Gbps Cross Traffic, paced at 1.0, 3.0, 6.0, 8.0Gbps

Accurate available bandwidth estimation [IMC 2014] 0.1Gbps Rate … 9.6 Gbps 0.2Gbps ----- Meeting Notes (11/6/14 14:13) ----- Show blue Show orange, Show little green Show all green. Available Bandwidth = 2Gbps Cross Traffic 8Gbps 0.1Gbps 8.1Gbps 10Gbps

Accurate available bandwidth estimation [IMC 2014] 0.1Gbps Rate … 9.6 Gbps 0.2Gbps ----- Meeting Notes (11/6/14 14:13) ----- Show blue Show orange, Show little green Show all green. Available Bandwidth = 2Gbps Cross Traffic 8Gbps 0.2Gbps 8.2Gbps 10Gbps

Accurate available bandwidth estimation [IMC 2014] 0.1Gbps Rate … 9.6 Gbps 0.2Gbps ----- Meeting Notes (11/6/14 14:13) ----- Show blue Show orange, Show little green Show all green. Available Bandwidth = 2Gbps Cross Traffic 8Gbps 0.3Gbps 8.3Gbps 10Gbps

Accurate available bandwidth estimation [IMC 2014] 0.1Gbps Rate … 9.6 Gbps 0.2Gbps ----- Meeting Notes (11/6/14 14:13) ----- Show blue Show orange, Show little green Show all green. 5.0Gbps Queued 10.0Gbps 6.0Gbps Queued 10.0Gbps 4.0Gbps Queued 10.0Gbps 2.0Gbps Queued 10.0Gbps 1.0Gbps 9.0Gbps 3.0Gbps Queued 10.0Gbps 1.9Gbps 9.9Gbps Available Bandwidth = 2Gbps Cross Traffic 8Gbps 10Gbps

Accurate available bandwidth estimation [IMC 2014] 0.1Gbps Rate … 9.6 Gbps 0.2Gbps Available Bandwidth = 4Gbps Cross Traffic 6Gbps ----- Meeting Notes (11/6/14 14:13) ----- Show blue Show orange, Show little green Show all green. 10Gbps

Accurate available bandwidth estimation [IMC 2014] 0.1Gbps Rate … 9.6 Gbps 0.2Gbps Available Bandwidth = 7Gbps Cross Traffic 3Gbps ----- Meeting Notes (11/6/14 14:13) ----- Show blue Show orange, Show little green Show all green. 10Gbps

Accurate available bandwidth estimation [IMC 2014] 0.1Gbps Rate … 9.6 Gbps 0.2Gbps Available Bandwidth = 9Gbps Cross Traffic 1Gbps ----- Meeting Notes (11/6/14 14:13) ----- Show blue Show orange, Show little green Show all green. 10Gbps

Accurate available bandwidth estimation [IMC 2014] ----- Meeting Notes (1/24/14 10:03) ----- Robustness: Forward error correction ----- Meeting Notes (10/20/14 16:35) ----- Map Show one at a time. ----- Meeting Notes (11/6/14 14:13) ----- Make this bigger. 10Gbps

Accurate available bandwidth estimation [IMC 2014] Chicago Boston Cleveland Cornell(NYC) NYC ----- Meeting Notes (1/24/14 10:03) ----- Robustness: Forward error correction ----- Meeting Notes (10/20/14 16:35) ----- Map Show one at a time. Cornell(Ithaca) 10Gbps

Questions: Can MinProbe accurately estimate avail-bw at 10Gbps? Can existing estimation algorithms work with MinProbe? How do the following parameters affect accuracy? Packet train length probe packet size distribution cross packet size distribution cross packet burstiness Does MinProbe work in the wild, Internet? Does MinProbe work in rate limiting environments?

Perspective Modulation of probe packets in PHY Accurate control & measure of packet timing Enabled available bandwidth estimation in 10Gbps Accurate, Minimum Overhead, Available Bandwidth Estimation in High Speed Wired Networks

Before Next time Background for next time (not required reading) Project Continue to make progress. Intermediate project report due Mar 22. BOOM proposal due Mar 29. HW2 Chat Server Due this Friday, March 10 Background for next time (not required reading) Queues Don’t Matter When You Can JUMP Them!, Matthew P. Grosvenor, Malte Schwarzkopf, Ionel Gog, Robert N. M. Watson, Andrew W. Moore, Steven Hand, and Jon Crowcroft. Appears in the Proceedings of the USENIX symposium on Networked Systems Design and Implementation (NSDI), pp 1--14, May 2015. Check website for updated schedule