1 Building a big router from lots of little routers. Nick McKeown, Assistant Professor of Electrical Engineering and Computer Science, Stanford University
2 Current/Recent Projects. 1. Packet switching: scheduling crossbar switches; speedup and QoS; parallel packet switches. 2. Router algorithms: IP lookups; IP packet classification. 3. Internet: evolving the Internet to circuit switching and optics; Internet traffic analysis. 4. Router architectures: the Fork Join Router; incorporating optics into routers; the design of fast packet buffers. 5. Teaching tools: network hardware lab (NetFPGA); network software lab (Virtual Router). More details:
3 Why might this be interesting? Widely held assumption: electronic IP routers will not keep up with link capacity. Background: router capacity = (number of lines) × (line rate). Biggest router capacity 4 years ago ≈ 10 Gb/s; 2 years ago ≈ 40 Gb/s; today ≈ 160 Gb/s. Next couple of generations: ~1-40 Tb/s.
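The capacity history above implies a steady rate of growth; a quick back-of-the-envelope sketch (the function and its extrapolation are illustrative assumptions, not a forecast):

```python
# The slide's history -- 10 -> 40 -> 160 Gb/s over four years -- is a
# 4x jump every two years. Extrapolating that rate lands in the
# ~1-40 Tb/s range the slide projects for the next generations.
def capacity_gbps(years_from_now: float, today_gbps: float = 160.0) -> float:
    """Extrapolate router capacity assuming 4x growth every 2 years."""
    return today_gbps * 4 ** (years_from_now / 2)

print(capacity_gbps(4))  # 2560.0 Gb/s (~2.5 Tb/s)
print(capacity_gbps(8))  # 40960.0 Gb/s (~41 Tb/s)
```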
4 Larger overall routing capacity. [Figure: growth of processing power vs. link speed.]
5 Larger overall routing capacity: do we need more? [Figure: fiber capacity (Gbit/s) under TDM and DWDM vs. processing power over time. Processing power doubles every 2 years; link (fiber) speed doubles every 7 months. Source: SPEC95Int & David Miller, Stanford.]
6 [Figure: instructions per packet vs. time. What we'd like: more instructions per packet, enabling more features (QoS, multicast, security, …). What will happen: fewer instructions per packet, as link speed grows faster than processing power.]
7 What limits a router's capacity? It's a packet switch: it must be able to buffer every packet for an unpredictable amount of time. It routes hop by hop: once per ~1,000 bits it must index into a forwarding table with ~100k entries. Both operations are limited by memory random-access time.
8 What limits capacity today? Memory bandwidth for packet buffers: shared memory needs B = 2NR; input queued needs B = 2R; we would like B << R, perhaps via intelligent load-balancing. Memory bandwidth for IP lookups: a lookup must be performed for each packet, so B ~ R.
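The buffer-bandwidth relations on this slide can be sketched numerically; the function names and the 16-port, 10 Gb/s example below are illustrative assumptions:

```python
# Sketch of the slide's buffer-bandwidth relations.
def shared_memory_bw(n_ports: int, line_rate: float) -> float:
    """Shared memory: B = 2NR -- all N ports write to and read from one pool."""
    return 2 * n_ports * line_rate

def input_queued_bw(line_rate: float) -> float:
    """Input queued: B = 2R -- each buffer sees one write and one read at rate R."""
    return 2 * line_rate

# Example: a 16-port router at 10 Gb/s per line.
print(shared_memory_bw(16, 10.0))  # 320.0 Gb/s for the shared pool
print(input_queued_bw(10.0))       # 20.0 Gb/s per input buffer
```

The gap between the two is why the shared-memory architecture stops scaling first: its memory bandwidth grows with the port count N, not just the line rate R.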
9 What I'd like. [Diagram: the building blocks I'd like to use, an N×N interconnect of small routers with links at rate R.]
10 Why this might be a good approach. Larger overall routing capacity: it reduces the capacity required of any one router. Slower memory per router. Lower power per router. Faster line rates. Redundancy. Familiarity: "after all, this is how the Internet is built."
11 Why this might be a bad idea. Manageability: it's harder to manage, maintain, and upgrade a large number of small systems than one large one. The total space and power might be larger. The interconnect between the routers might be highly redundant. How will it perform? (Throughput.) Can it provide delay guarantees? (QoS.)
12 The interconnect might be highly redundant and wasteful. [Diagram: N routers, fully interconnected.] "Excess" links contribute to power. Today, big routers are limited by their overall power, and chip-to-chip connections make up approximately 50% of it.
13 I'll be considering load-balancing architectures. [Diagram: external links at rate R, load-balanced over k internal routers, with internal links at rate R/k.]
14 Method #1: Random packet load-balancing. Method: as packets arrive, they are randomly distributed, packet by packet, over the routers. Advantages: the load-balancer is simple; the load-balancer needs no packet buffering. Disadvantages: random fluctuations in traffic mean each router is loaded differently; packets within a flow may become mis-sequenced; it is hard to predict the system performance.
15 Method #2: Random flow load-balancing. Method: each new flow (e.g. a TCP connection) is randomly assigned to a router; all packets in a flow follow the same path. Advantages: the load-balancer is simple (e.g. hashing of the flow ID); the load-balancer needs no packet buffering; no mis-sequencing of packets within a flow. Disadvantages: random fluctuations in traffic mean each router is loaded differently; it is hard to predict the system performance.
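A minimal sketch of the hashing idea mentioned above (CRC32 over a textual 5-tuple is an assumption for illustration; any stable hash of the flow ID would do):

```python
import zlib

def assign_router(flow_id: bytes, k: int) -> int:
    """Pick one of k routers by hashing the flow ID (e.g. the TCP/IP 5-tuple).
    Every packet of a flow hashes the same way, so the whole flow follows
    one path and its packets cannot be mis-sequenced."""
    return zlib.crc32(flow_id) % k

flow = b"10.0.0.1:1234->10.0.0.2:80/tcp"
router = assign_router(flow, 8)
# The assignment is stable: every packet of this flow picks the same router.
assert all(assign_router(flow, 8) == router for _ in range(100))
```

The disadvantage is visible in the code too: nothing bounds how many heavy flows hash to the same router, which is exactly the worst-case problem the next slide raises.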
16 Observations. With random packet load-balancing, it's hard to predict system performance. With flow-by-flow load-balancing, worst-case performance is very poor. If designers, system builders, network operators, etc. need to know the worst-case performance, random load-balancing will not suffice.
17 Method #3: Intelligent packet load-balancing. Goal: each packet is carefully assigned to a middle-stage router so that packets within a flow are not mis-sequenced, throughput is maximized and understood, and packet delays are controlled. We call this "Parallel Packet Switching".
18 Method #3: Intelligent packet load-balancing, i.e. Parallel Packet Switching. [Diagram: N ports at rate R feed a bufferless load-balancer, which spreads packets over k routers via links at rate R/k.] A packet keeps a link at speed R/k busy for k times longer than a link of speed R.
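The closing observation is simple arithmetic: serializing a fixed-size packet onto a link at rate R/k takes exactly k times as long as onto a link at rate R. The 8,000-bit packet and 160 Gb/s line rate below are illustrative assumptions:

```python
def transmit_time_ns(packet_bits: int, rate_gbps: float) -> float:
    """Time to serialize a packet onto a link; bits / (Gb/s) gives ns."""
    return packet_bits / rate_gbps

R, k = 160.0, 8                        # illustrative line rate and layer count
fast = transmit_time_ns(8_000, R)      # 50 ns on the full-rate link
slow = transmit_time_ns(8_000, R / k)  # 400 ns on one internal link
assert slow == k * fast                # the slow link stays busy k times longer
```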
19 Parallel Packet Switching. Advantages: a single stage of buffering; 1/k the power per subsystem; 1/k the memory bandwidth; 1/k the lookup rate.
20 Parallel Packet Switch. Questions: Switching — what is the performance? IP lookups — how do they work?
21 A Parallel Packet Switch. [Diagram: N ports at rate R; each arriving packet is tagged with its egress port, then spread over k output-queued switches.]
22 Performance Questions. 1. Can its outputs be busy all the time, i.e. can it be work-conserving? Can it achieve 100% throughput? 2. Can it emulate a single big shared memory switch? 3. Can it support delay guarantees, strict priorities, WFQ, …?
23 Work Conservation and 100% Throughput. [Diagram: k shared-memory switches; external links at rate R, internal links at rate R/k. Two constraints apply: the input link constraint and the output link constraint.]
24 Work Conservation. [Diagram: the output link constraint, with external links at rate R and internal links at rate R/k.]
25 Work Conservation. [Diagram: N ports at rate R spread over k shared-memory switches, with internal links sped up to S(R/k).]
26 Precise Emulation of a Shared Memory Switch. [Diagram: can an N-port parallel packet switch precisely emulate an N-port shared memory switch?]
27 Parallel Packet Switch Theorems. 1. If S > 2k/(k+2) ≈ 2, then a parallel packet switch can be work-conserving for all traffic. 2. If S > 2k/(k+2) ≈ 2, then a parallel packet switch can precisely emulate a FCFS shared memory switch for all traffic.
28 Parallel Packet Switch Theorems. 3. If S > 3k/(k+3) ≈ 3, then a parallel packet switch can precisely emulate a switch with WFQ, strict priorities, and other types of QoS, for all traffic.
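Both speedup bounds can be evaluated directly: as k grows, 2k/(k+2) approaches 2 and 3k/(k+3) approaches 3, which is why the slides annotate them with those limits. This snippet is only a numerical check of the stated formulas:

```python
def s_fcfs(k: int) -> float:
    """Speedup bound for work conservation / FCFS emulation: 2k/(k+2)."""
    return 2 * k / (k + 2)

def s_qos(k: int) -> float:
    """Speedup bound for WFQ / strict-priority emulation: 3k/(k+3)."""
    return 3 * k / (k + 3)

for k in (2, 8, 32, 1024):
    print(k, round(s_fcfs(k), 3), round(s_qos(k), 3))
# Both bounds increase with k but stay strictly below 2 and 3 respectively.
```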
29 Parallel Packet Switch Theorems. 4. If S = 1, then a parallel packet switch with a small co-ordination buffer running at rate R can precisely emulate a FCFS shared memory switch for all traffic.
30 Co-ordination buffers. [Diagram: co-ordination buffers of size Nk, running at rate R, sit in front of the k shared-memory switches, whose internal links run at R/k.]
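The stated buffer size is notable because it depends only on the port count N and the layer count k, not on the line rate R. A trivial sketch (the 32-port, 8-layer example is an assumption for illustration):

```python
def coord_buffer_cells(n_ports: int, k_layers: int) -> int:
    """Co-ordination buffer size Nk from the slide: grows with ports and
    layers, but is independent of the line rate R."""
    return n_ports * k_layers

print(coord_buffer_cells(32, 8))  # 256 cells, whatever the line rate
```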
31 Parallel Packet Switch. Questions: Switching — what is the performance? Forwarding lookups — how do they work?
32 Parallel Packet Switch: Lookahead Lookups. Each packet is tagged with its egress port at the next router, so the lookup can be performed in parallel at rate R/k.
33 Parallel Packet Switch. [Diagram: N ports at rate R spread over k routers.] Possibly >100 Tb/s aggregate capacity; line rates in excess of 100 Gb/s.