1 Internet Routers Stochastics Network Seminar February 22nd, 2002 Nick McKeown Professor of Electrical Engineering and Computer Science, Stanford University

2 What a Router Looks Like. Cisco GSR 12416: 6 ft x 19 in x 2 ft, capacity 160 Gb/s, power 4.2 kW. Juniper M160: 3 ft x 19 in x 2.5 ft, capacity 80 Gb/s, power 2.6 kW.

3 Points of Presence (POPs). [Diagram: backbone routers A–F interconnecting POP1–POP8.]

4 Basic Architectural Components of an IP Router. Control plane: routing protocols and routing table. Datapath (per-packet processing): forwarding table and switching.

5 Per-packet processing in an IP Router
1. Accept packet arriving on an ingress line.
2. Lookup packet destination address in the forwarding table, to identify outgoing interface(s).
3. Manipulate packet header: e.g., decrement TTL, update header checksum.
4. Send packet to outgoing interface(s).
5. Queue until line is free.
6. Transmit packet onto outgoing line.
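To make these datapath steps concrete, here is a minimal sketch of steps 2–4 in Python, assuming a longest-prefix-match table held as a plain dictionary; the table entries, interface names, and the forward() helper are illustrative, not any router's actual API.

```python
import ipaddress

# Illustrative forwarding table: prefix -> outgoing interface.
FORWARDING_TABLE = {
    ipaddress.ip_network("10.0.0.0/8"): "eth1",
    ipaddress.ip_network("10.1.0.0/16"): "eth2",
    ipaddress.ip_network("0.0.0.0/0"): "eth0",   # default route
}

def lookup(dst):
    """Longest-prefix match over the table (step 2)."""
    dst = ipaddress.ip_address(dst)
    matches = [net for net in FORWARDING_TABLE if dst in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return FORWARDING_TABLE[best]

def forward(packet):
    # Step 2: look up the destination to identify the outgoing interface.
    egress = lookup(packet["dst"])
    # Step 3: manipulate the header, e.g. decrement TTL.
    packet["ttl"] -= 1
    if packet["ttl"] <= 0:
        return None          # drop; a real router would send ICMP Time Exceeded
    # Steps 4-6: hand the packet to the egress interface for queueing and transmission.
    return egress, packet

print(forward({"dst": "10.1.2.3", "ttl": 64}))   # ('eth2', {...})
```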

6 Generic Router Architecture. [Diagram: per-packet datapath. Header processing: lookup the IP address in an off-chip DRAM address table (~1M prefixes, mapping IP address to next hop) and update the header. Buffering: queue the packet in off-chip DRAM buffer memory (~1M packets).]

7 Generic Router Architecture. [Diagram: the same datapath replicated across N line cards, each with its own header processing and address table, and its own buffer manager and buffer memory.]

8 Packet processing is getting harder. [Plot: CPU instructions per minimum-length packet, since 1996.]

9 Performance metrics
1. Capacity: "maximize C, s.t. volume < 2 m^3 and power < 5 kW."
2. Throughput: operators like to maximize usage of expensive long-haul links. This would be trivial with work-conserving output-queued routers.
3. Controllable delay: some users would like predictable delay. This is feasible with output queueing plus weighted fair queueing (WFQ).
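Since WFQ is named here as the mechanism for controllable delay, here is a minimal sketch of weighted fair queueing's finish-time bookkeeping; the flow weights, packet lengths, and the simplified virtual clock are assumptions for illustration, not the exact GPS-tracking WFQ algorithm.

```python
import heapq

class WFQ:
    """Minimal weighted fair queueing sketch: serve the packet with the
    smallest virtual finish time. Uses the served finish time as a simplified
    virtual clock; real WFQ tracks the fluid GPS system more carefully."""
    def __init__(self, weights):
        self.weights = weights                     # flow id -> weight
        self.finish = {f: 0.0 for f in weights}    # last finish time per flow
        self.virtual_time = 0.0
        self.heap = []                             # (finish time, flow, packet)

    def enqueue(self, flow, packet, length):
        start = max(self.virtual_time, self.finish[flow])
        self.finish[flow] = start + length / self.weights[flow]
        heapq.heappush(self.heap, (self.finish[flow], flow, packet))

    def dequeue(self):
        if not self.heap:
            return None
        finish, flow, packet = heapq.heappop(self.heap)
        self.virtual_time = finish
        return flow, packet

q = WFQ({"voice": 4.0, "bulk": 1.0})
q.enqueue("bulk", "b0", length=1500)
q.enqueue("voice", "v0", length=1500)
print(q.dequeue())   # voice wins: higher weight gives the earlier finish time
```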

10 The Problem. Output queued switches are impractical: the buffer memory (DRAM) at each output must accept data at rate NR from the N inputs while draining at line rate R onto the output. Can't I just use N separate memory devices per output?
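A back-of-the-envelope calculation (with assumed port count and line rate) makes the memory-bandwidth problem explicit:

```python
# Output-queued switch memory bandwidth, with illustrative numbers.
N = 32                      # ports (assumed)
R = 10e9                    # line rate per port: 10 Gb/s (assumed)
write_bw = N * R            # worst case: all N inputs target one output
read_bw = R                 # the output line drains the queue at rate R
total = write_bw + read_bw
print(f"Required memory bandwidth per output: {total/1e9:.0f} Gb/s")
# -> 330 Gb/s for a 32-port, 10 Gb/s switch; far beyond a single commodity DRAM.
```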

11 Memory Bandwidth. Commercial DRAM: it's hard to keep up with Moore's Law; the bottleneck is memory speed, and memory speed is not keeping up with Moore's Law. Growth rates: DRAM speed 1.1x / 18 months; Moore's Law 2x / 18 months; router capacity 2.2x / 18 months; line capacity 2x / 7 months.

12 Generic Router Architecture. [Diagram: N line cards, each with header processing (lookup IP address, update header; address table) and a packet queue in buffer memory, connected through a switch fabric governed by a scheduler to N outputs.]

13 Outline of next two talks  What’s known about throughput  Today: Survey of ways to achieve 100% throughput  What’s known about controllable delay  Next week (Sundar): Controlling delay in routers with a single stage of buffering.

14 Potted history 1. [Karol et al. 1987] Throughput limited to approximately 58% by head-of-line blocking for Bernoulli IID uniform traffic. 2. [Tamir 1989] Observed that with "Virtual Output Queues" (VOQs), head-of-line blocking is reduced and throughput goes up.

15 Potted history 3. [Anderson et al. 1993] Observed analogy to maximum size matching in a bipartite graph. 4. [M et al. 1995] (a) Maximum size match cannot guarantee 100% throughput. (b) But maximum weight match can – O(N^3). 5. [Mekkittikul and M 1998] A carefully picked maximum size match can give 100% throughput; matching complexity O(N^2.5).

16 Potted history Speedup 5. [Chuang, Goel et al. 1997] Precise emulation of a central shared memory switch is possible with a speedup of two and a “stable marriage” scheduling algorithm. 6. [Prabhakar and Dai 2000] 100% throughput possible for maximal matching with a speedup of two.

17 Potted history Newer approaches 7. [Tassiulas 1998] 100% throughput possible for simple randomized algorithm with memory. 8. [Giaccone et al. 2001] “Apsara” algorithms. 9. [Iyer and M 2000] Parallel switches can achieve 100% throughput and emulate an output queued switch. 10. [Chang et al. 2000] A 2-stage switch with a TDM scheduler can give 100% throughput. 11. [Iyer, Zhang and M 2002] Distributed shared memory switches can emulate an output queued switch.

18 Scheduling crossbar switches to achieve 100% throughput 1. Basic switch model. 2. When traffic is uniform (Many algorithms…) 3. When traffic is non-uniform, but traffic matrix is known. Technique: Birkhoff-von Neumann decomposition. 4. When matrix is not known. Technique: Lyapunov function. 5. When algorithm is pipelined, or information is incomplete. Technique: Lyapunov function. 6. When algorithm does not complete. Technique: Randomized algorithm. 7. When there is speedup. Technique: Fluid model. 8. When there is no algorithm. Technique: 2-stage load-balancing switch. Technique: Parallel Packet Switch.

19 Basic Switch Model. [Diagram: N x N input-queued crossbar. Arrivals A_1(n), ..., A_N(n) at the inputs; per-destination arrival processes A_11(n), ..., A_NN(n) into virtual output queues with occupancies L_11(n), ..., L_NN(n); crossbar configuration S(n); departures D_1(n), ..., D_N(n) at the outputs.]

20 Some definitions. Queue occupancies: L_11(n), ..., L_NN(n).

21 Some definitions of throughput When traffic is admissible

22 Scheduling algorithms to achieve 100% throughput 1. Basic switch model. 2. When traffic is uniform (Many algorithms…) 3. When traffic is non-uniform, but traffic matrix is known Technique: Birkhoff-von Neumann decomposition. 4. When matrix is not known. Technique: Lyapunov function. 5. When algorithm is pipelined, or information is incomplete. Technique: Lyapunov function. 6. When algorithm does not complete. Technique: Randomized algorithm. 7. When there is speedup. Technique: Fluid model. 8. When there is no algorithm. Technique: 2-stage load-balancing switch. Technique: Parallel Packet Switch.

23 Algorithms that give 100% throughput for uniform traffic. Quite a few algorithms give 100% throughput when traffic is uniform (1). For example: maximum size bipartite match; maximal size match (e.g. PIM, iSLIP, WFA); deterministic, and a few variants; wait-until-full. (1) "Uniform": the destination of each cell is picked independently and uniformly at random (uar) from the set of all outputs.

24 Maximum size bipartite match. Intuition: maximizes instantaneous throughput for uniform traffic. [Diagram: the "request" graph has an edge from input i to output j whenever L_ij(n) > 0 (e.g. L_11(n) > 0, L_N1(n) > 0); the schedule is a maximum size match on this bipartite graph.]
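A minimal sketch of a maximum size match computed on the request graph (edge i to j present whenever L_ij(n) > 0), using the classic augmenting-path algorithm; the occupancy matrix below is made up for illustration.

```python
def maximum_size_match(L):
    """Maximum size bipartite match on the request graph: edge (i, j) exists
    iff L[i][j] > 0. Classic augmenting-path algorithm, O(N^3) as written
    (Hopcroft-Karp would give the O(N^2.5) quoted in the potted history)."""
    n = len(L)
    match_out = [-1] * n          # output j -> matched input, or -1

    def augment(i, visited):
        for j in range(n):
            if L[i][j] > 0 and j not in visited:
                visited.add(j)
                if match_out[j] == -1 or augment(match_out[j], visited):
                    match_out[j] = i
                    return True
        return False

    for i in range(n):
        augment(i, set())
    return {match_out[j]: j for j in range(n) if match_out[j] != -1}

# Illustrative 3x3 VOQ occupancies L_ij(n):
L = [[2, 0, 1],
     [3, 0, 0],
     [0, 0, 4]]
print(maximum_size_match(L))   # {1: 0, 0: 2}: two connections; input 2 waits
```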

25 Aside: Maximal Matching  A maximal matching is one in which each edge is added one at a time, and is not later removed from the matching.  i.e. no augmenting paths allowed (they remove edges added earlier).  No input and output are left unnecessarily idle.
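By contrast, a maximal match needs only one greedy pass. A minimal sketch follows; the scan order is an arbitrary assumption, and practical schedulers such as PIM and iSLIP differ mainly in how they resolve contention among requests.

```python
def maximal_match(L):
    """Greedy maximal match: scan requests once, add an edge whenever both
    its input and output are still free, and never remove an edge already added."""
    n = len(L)
    free_in, free_out, match = set(range(n)), set(range(n)), {}
    for i in range(n):
        for j in range(n):
            if L[i][j] > 0 and i in free_in and j in free_out:
                match[i] = j
                free_in.discard(i)
                free_out.discard(j)
    return match

# A request pattern where the greedy pass falls short of the maximum:
L = [[1, 1, 0],
     [1, 0, 0],
     [0, 0, 1]]
print(maximal_match(L))
# -> {0: 0, 2: 2}: size 2 and maximal (no idle input/output pair with a request),
#    yet a maximum matching of size 3 exists (0->1, 1->0, 2->2).
```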

26 Aside: Example of Maximal Size Matching. [Diagram: the same bipartite request graph shown twice, once with a maximal matching and once with a larger maximum matching.]

27 Algorithms that give 100% throughput for uniform traffic. Quite a few algorithms give 100% throughput when traffic is uniform. For example: maximum size bipartite match; maximal size match (e.g. PIM, iSLIP, WFA); deterministic, and a few variants; wait-until-full.

28 Deterministic Scheduling Algorithm. If arriving traffic is i.i.d. with destinations picked uar across outputs, then a round-robin schedule gives 100% throughput. [Diagram: the round-robin sequence of permutations applied in turn.] Variation 1: if permutations are picked uar from the set of all N! permutations, this also gives 100% throughput. Variation 2: if permutations are picked uar from the round-robin permutations above, this also gives 100% throughput.
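A minimal sketch of the deterministic round-robin schedule and its two variations, assuming "the permutations above" means the N cyclic shifts:

```python
import random

def cyclic_shift(n, k):
    """Permutation connecting input i to output (i + k) mod n."""
    return [(i + k) % n for i in range(n)]

def schedule(n, t, variation=0):
    if variation == 0:                        # deterministic round robin
        return cyclic_shift(n, t % n)
    if variation == 1:                        # uar over all N! permutations
        p = list(range(n)); random.shuffle(p); return p
    if variation == 2:                        # uar over the N cyclic shifts
        return cyclic_shift(n, random.randrange(n))

for t in range(4):
    print(t, schedule(4, t))   # rotates: [0,1,2,3], [1,2,3,0], [2,3,0,1], ...
```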

29 A simple wait-until-full algorithm. The following algorithm appears to be stable for Bernoulli i.i.d. uniform arrivals: 1. If any VOQ is empty, do nothing (i.e. serve no queues). 2. If no VOQ is empty, pick a permutation uar (either from a fixed sequence of permutations or from all permutations).
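The same rule as a minimal sketch; drawing the permutation from the cyclic shifts is an assumption, since the slide allows either choice.

```python
import random

def wait_until_full(voq_lengths):
    """voq_lengths[i][j] is the occupancy of VOQ (i, j).
    Serve nothing unless every VOQ is non-empty; then pick a permutation uar."""
    n = len(voq_lengths)
    if any(voq_lengths[i][j] == 0 for i in range(n) for j in range(n)):
        return None                              # idle this cell time
    k = random.randrange(n)
    return [(i + k) % n for i in range(n)]       # a cyclic-shift permutation

print(wait_until_full([[1, 2], [0, 1]]))   # None: one VOQ is empty
print(wait_until_full([[1, 2], [3, 1]]))   # e.g. [0, 1] or [1, 0]
```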

30 Some simple algorithms that achieve 100% throughput

31 Some observations  A maximum size match (MSM) maximizes instantaneous throughput.  But an MSM is complex – O(N^2.5).  It turns out that there are many simple algorithms that give 100% throughput for uniform traffic.  So what happens if the traffic is non-uniform?

32 Why doesn’t maximizing instantaneous throughput give 100% throughput for non-uniform traffic? Three possible matches, S(n):

33 Simulation of simple 3x3 example

34 Scheduling algorithms to achieve 100% throughput 1. Basic switch model. 2. When traffic is uniform (Many algorithms…) 3. When traffic is non-uniform, but traffic matrix is known Technique: Birkhoff-von Neumann decomposition. 4. When matrix is not known. Technique: Lyapunov function. 5. When algorithm is pipelined, or information is incomplete. Technique: Lyapunov function. 6. When algorithm does not complete. Technique: Randomized algorithm. 7. When there is speedup. Technique: Fluid model. 8. When there is no algorithm. Technique: 2-stage load-balancing switch. Technique: Parallel Packet Switch.

35 Example 1: (Trivial) scheduling to achieve 100% throughput  Assume we know the traffic matrix, and the arrival pattern is deterministic:  Then we can simply choose:

36 Example 2: With random arrivals, but known traffic matrix  Assume we know the traffic matrix, and the arrival pattern is random:  Then we can simply choose:  In general, if we know the traffic matrix Λ, can we pick a sequence S(n) to achieve 100% throughput?

37 Birkhoff–von Neumann Decomposition. Any doubly stochastic traffic matrix Λ can be decomposed into a convex combination of permutation matrices (M_1, ..., M_r): Λ = Σ_k φ_k M_k, with φ_k ≥ 0 and Σ_k φ_k = 1.
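A minimal sketch of the decomposition: repeatedly find a permutation supported on the positive entries of the remaining matrix (reusing the maximum_size_match sketch given after slide 24) and peel off the smallest matched entry. The example matrix is illustrative, and the input is assumed doubly stochastic.

```python
def birkhoff_von_neumann(Lam, tol=1e-9):
    """Decompose a doubly stochastic matrix Lam into sum_k phi_k * M_k,
    where each M_k is a permutation and sum_k phi_k = 1.
    Uses maximum_size_match() (slide 24 sketch) to find each permutation;
    for a doubly stochastic matrix that match is perfect (Birkhoff)."""
    n = len(Lam)
    Lam = [row[:] for row in Lam]            # work on a copy
    terms = []
    while max(max(row) for row in Lam) > tol:
        match = maximum_size_match(Lam)      # input -> output permutation
        phi = min(Lam[i][match[i]] for i in match)
        for i, j in match.items():
            Lam[i][j] -= phi                 # peel off phi * M_k
        terms.append((phi, dict(match)))
    return terms

Lam = [[0.5, 0.5, 0.0],
       [0.5, 0.0, 0.5],
       [0.0, 0.5, 0.5]]
for phi, perm in birkhoff_von_neumann(Lam):
    print(phi, perm)          # two permutations, each with weight 0.5
```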

38 In practice…  Unfortunately, we usually don't know the traffic matrix Λ a priori, so we can:  measure or estimate Λ, or  not use Λ.  In what follows, we will assume we don't know or use Λ.

39 Scheduling algorithms to achieve 100% throughput 1. Basic switch model. 2. When traffic is uniform (Many algorithms…) 3. When traffic is non-uniform, but traffic matrix is known Technique: Birkhoff-von Neumann decomposition. 4. When traffic matrix is not known. Technique: Lyapunov function. 5. When algorithm is pipelined, or information is incomplete. Technique: Lyapunov function. 6. When algorithm does not complete. Technique: Randomized algorithm. 7. When there is speedup. Technique: Fluid model. 8. When there is no algorithm. Technique: 2-stage load-balancing switch. Technique: Parallel Packet Switch.

40 When the traffic matrix is not known

41 Problem

42 Maximum weight matching. [Diagram: the switch model of slide 19, with arrivals A_1(n), ..., A_N(n), VOQ occupancies L_11(n), ..., L_NN(n), and departures D_1(n), ..., D_N(n). The "request" graph now carries the occupancies L_11(n), ..., L_N1(n) as edge weights, and S*(n) is the maximum weight match on that bipartite graph.]
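A minimal sketch of the maximum weight match, using SciPy's assignment solver with the VOQ occupancies L_ij(n) as edge weights; the occupancies below are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def max_weight_match(L):
    """S*(n): the permutation maximizing sum_ij L_ij(n) * S_ij(n)."""
    L = np.asarray(L, dtype=float)
    rows, cols = linear_sum_assignment(L, maximize=True)
    # Drop pairs with no queued cells: they carry zero weight.
    return {int(i): int(j) for i, j in zip(rows, cols) if L[i, j] > 0}

# Illustrative non-uniform occupancies:
L = [[10, 1, 0],
     [ 0, 0, 0],
     [ 0, 9, 0]]
print(max_weight_match(L))
# -> {0: 0, 2: 1}: the permutation with the largest total queue weight (19).
```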

43 Outline of Proof

44 Choosing the weight

45 Scheduling algorithms to achieve 100% throughput 1. Basic switch model. 2. When traffic is uniform (Many algorithms…) 3. When traffic is non-uniform, but traffic matrix is known. Technique: Birkhoff-von Neumann decomposition. 4. When matrix is not known. Technique: Lyapunov function. 5. When algorithm is pipelined, or information is incomplete. Technique: Lyapunov function. 6. When algorithm does not complete. Technique: Randomized algorithm. 7. When there is speedup. Technique: Fluid model. 8. When there is no algorithm. Technique: 2-stage load-balancing switch. Technique: Parallel Packet Switch.

46 100% throughput with pipelining

47 100% throughput with incomplete information

48 Scheduling algorithms to achieve 100% throughput 1. Basic switch model. 2. When traffic is uniform (Many algorithms…) 3. When traffic is non-uniform, but traffic matrix is known. Technique: Birkhoff-von Neumann decomposition. 4. When matrix is not known. Technique: Lyapunov function. 5. When algorithm is pipelined, or information is incomplete. Technique: Lyapunov function. 6. When algorithm does not complete. Technique: Randomized algorithm. 7. When there is speedup. Technique: Fluid model. 8. When there is no algorithm. Technique: 2-stage load-balancing switch. Technique: Parallel Packet Switch.

49 Achieving 100% when algorithm does not complete Randomized algorithms: 1. Basic idea (Tassiulas) 2. Reducing delay (Shah, Giaccone and Prabhakar)
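A minimal sketch of the Tassiulas-style randomized scheduler with memory: each cell time, draw a single permutation uar and keep whichever of the new draw and the previous schedule has the larger weight. The occupancy matrix and starting schedule are illustrative.

```python
import random

def weight(match, L):
    """Total queued cells served by the permutation `match` (match[i] = output of input i)."""
    return sum(L[i][j] for i, j in enumerate(match))

def randomized_schedule(L, prev_match):
    """One cell time: draw one permutation uar, keep it only if it outweighs
    the previous schedule (the 'memory' that makes the algorithm stable)."""
    n = len(L)
    candidate = list(range(n))
    random.shuffle(candidate)
    return max(candidate, prev_match, key=lambda m: weight(m, L))

L = [[10, 1, 0],
     [ 0, 5, 0],
     [ 0, 0, 7]]
match = [1, 0, 2]                    # illustrative starting schedule, weight 8
for _ in range(5):
    match = randomized_schedule(L, match)
print(match, weight(match, L))
# Weight never decreases from slot to slot; it drifts toward the MWM (22 here).
```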

50 Scheduling algorithms to achieve 100% throughput 1. Basic switch model. 2. When traffic is uniform (Many algorithms…) 3. When traffic is non-uniform, but traffic matrix is known. Technique: Birkhoff-von Neumann decomposition. 4. When matrix is not known. Technique: Lyapunov function. 5. When algorithm is pipelined, or information is incomplete. Technique: Lyapunov function. 6. When algorithm does not complete. Technique: Randomized algorithm. 7. When there is speedup. Technique: Fluid model. 8. When there is no algorithm. Technique: 2-stage load-balancing switch. Technique: Parallel Packet Switch.

51 Speedup and Combined Input Output Queueing (CIOQ). [Diagram: the switch model of slide 19.] With speedup s, the matching is performed s times per cell time, and up to s cells are removed from each VOQ. Therefore, output queues are required.

52 Fluid Model [Dai and Prabhakar]

53 Scheduling algorithms to achieve 100% throughput 1. Basic switch model. 2. When traffic is uniform (Many algorithms…) 3. When traffic is non-uniform, but traffic matrix is known. Technique: Birkhoff-von Neumann decomposition. 4. When matrix is not known. Technique: Lyapunov function. 5. When algorithm is pipelined, or information is incomplete. Technique: Lyapunov function. 6. When algorithm does not complete. Technique: Randomized algorithm. 7. When there is speedup. Technique: Fluid model. 8. When there is no algorithm. Technique: 2-stage load-balancing switch.

54 2-stage switch and no scheduler Motivation: 1. If traffic is uniformly distributed, then even a deterministic schedule gives 100% throughput. 2. So why not force non-uniform traffic to be uniformly distributed?

55 2-stage switch and no scheduler. [Diagram: arrivals A_1(n), ..., A_N(n) enter a bufferless load-balancing stage with configuration S_1(n); its outputs A'_1(n), ..., A'_N(n) feed VOQs with occupancies L_11(n), ..., L_NN(n) in a buffered switching stage with configuration S_2(n), producing departures D_1(n), ..., D_N(n).]
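A minimal sketch of the two-stage idea, with a deterministic cyclic-shift sequence in both stages (the original proposal uses a TDM sequence in each stage; the cell format and queue handling here are assumptions): the first stage spreads even badly skewed traffic across the middle ports, so the second stage needs no scheduler at all.

```python
from collections import deque

def cyclic_shift(n, k):
    return [(i + k) % n for i in range(n)]

def two_stage_switch(arrivals, n):
    """arrivals[t] is a list of (input, final_output) cells at time t.
    Stage 1 (bufferless) load-balances cells across the middle ports;
    stage 2 (buffered) serves the VOQs with a fixed round-robin sequence."""
    voq = [[deque() for _ in range(n)] for _ in range(n)]   # middle-stage VOQs
    departures = []
    for t, cells in enumerate(arrivals):
        s1 = cyclic_shift(n, t % n)               # stage 1 configuration S1(n)
        for inp, out in cells:
            voq[s1[inp]][out].append((inp, out))  # spread across middle ports
        s2 = cyclic_shift(n, t % n)               # stage 2 configuration S2(n)
        for mid in range(n):
            out = s2[mid]
            if voq[mid][out]:
                departures.append((t, voq[mid][out].popleft()))
    return departures

# All traffic aimed at output 0 still gets spread over the middle stage:
arrivals = [[(0, 0), (1, 0)] for _ in range(4)]
print(two_stage_switch(arrivals, n=3))
```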

56 2-stage switch with no scheduler

57 Scheduling algorithms to achieve 100% throughput 1. Basic switch model. 2. When traffic is uniform (Many algorithms…) 3. When traffic is non-uniform, but traffic matrix is known. Technique: Birkhoff-von Neumann decomposition. 4. When matrix is not known. Technique: Lyapunov function. 5. When algorithm is pipelined, or information is incomplete. Technique: Lyapunov function. 6. When algorithm does not complete. Technique: Randomized algorithm. 7. When there is speedup. Technique: Fluid model. 8. When there is no algorithm. Technique: 2-stage load-balancing switch.

58 Throughput results
Theory:
- Input queueing (IQ): 58% [Karol, 1987]
- IQ + VOQ, maximum weight matching: 100% [M et al., 1995]
- Different weight functions, incomplete information, pipelining; randomized algorithms: 100% [Tassiulas, 1998], 100% [Various]
- IQ + VOQ, maximal size matching, speedup of two: 100% [Dai & Prabhakar, 2000]
Practice:
- Input queueing (IQ)
- IQ + VOQ, sub-maximal size matching, e.g. PIM, iSLIP
- Various heuristics, distributed algorithms, and amounts of speedup

59 Outline of next talk Sundar Iyer  What’s known about controllable delay  Emulation of Output queued switches  PIFOs and WFQ  Single-buffered switches: Parallel packet switches, and distributed shared memory switches.