1
INF5050 – Protocols and Routing in Internet (Friday 3.3.2017)
Subject: IP-router architecture. Presented by Tor Skeie.
2
This presentation is based on slides from Nick McKeown (Professor of Electrical Engineering and Computer Science, Stanford University; Stanford High Performance Networking Group), with updates.
3
Outline
- Background: What is a router? Why do we need faster routers? Why are they hard to build?
- Architectures and techniques: the evolution of router architecture, IP address lookup, packet buffering, switching.
- The Future
4
What is Routing?
[Figure: routers R1-R5 connecting end hosts A-F; the forwarding table maps each destination to a next hop, e.g. E -> R3, F -> R5.]
5
What is Routing?
[Figure: the same network, with the 20-byte IPv4 header expanded: Version, Header Length, Type of Service, Total Packet Length, Fragment ID, Flags, Fragment Offset, TTL, Protocol, Header Checksum, Source Address, Destination Address, Options (if any), then Data. The router forwards on the Destination Address using its Destination -> Next Hop table.]
6
Points of Presence (POPs)
[Figure: end hosts A-F reached through a network of eight POPs, POP1-POP8.]
7
Where High Performance Routers are Used
[Figure: a backbone of routers R1-R16; links run at 400 Gb/s in the core and 2.5 Gb/s toward the edge.]
8
What a Router Looks Like
Cisco CRS-X (16-slot single shelf pictured): 2.14 m x 0.60 m x 0.91 m, capacity 12.8 Tb/s, power 11.2 kW, weight 723 kg.
Juniper M320 (M160 pictured): 0.88 m x 0.65 m x 0.44 m, capacity 160 Gb/s, power 3.5 kW.
Capacity is the sum of the rates of the linecards.
9
Some Multi-rack Routers
Juniper TX8/T640, Alcatel 7670 RSP, Chiaro, Avici TSR.
10
Generic Router Architecture
[Figure: per-packet datapath. Header processing: look up the destination IP address in the address table (~1M prefixes, off-chip DRAM) and update the header; then queue the packet in buffer memory (~1M packets, off-chip DRAM).]
11
Generic Router Architecture
[Figure: the same datapath replicated on every linecard: each card has its own header processing (address lookup and header update), buffer manager, and buffer memory.]
12
Why do we Need Faster Routers?
- To prevent routers from becoming the bottleneck in the Internet.
- To increase POP capacity, and to reduce cost, size and power.
13
Why we Need Faster Routers 1: To prevent routers from being the bottleneck
[Graph, 1985-2000: packet-processing power grows roughly 2x every 18 months, while fiber capacity (Gbit/s) grows roughly 2x every 7 months, first with TDM and then with DWDM. Source: SPECint95 & David Miller, Stanford.]
More recently, transmission speeds of a petabit per second have been demonstrated in labs using multicore fiber.
14
Why we Need Faster Routers 2: To reduce cost, power & complexity of POPs
[Figure: a POP built from many smaller routers versus one built from a few large routers.]
Ports: price >$100k, power >400W. It is common for 50-60% of ports to be used for interconnection.
15
Why are Fast Routers Difficult to Make?
It's hard to keep up with Moore's Law: the bottleneck is memory speed, and memory speed is not keeping up with Moore's Law.
16
Why are Fast Routers Difficult to Make? Speed of Commercial DRAM
It's hard to keep up with Moore's Law: the bottleneck is memory speed, and memory speed is not keeping up with Moore's Law.
[Graph: logic follows Moore's Law (2x / 18 months), while commercial DRAM speed improves only about 1.1x / 18 months. DDR4 (2017): speed 3200 Mb/s, latency ~13 ns, capacity 64 GB.]
17
Why are Fast Routers Difficult to Make?
It's hard to keep up with Moore's Law: the bottleneck is memory speed, and memory speed is not keeping up with Moore's Law.
Moore's Law is too slow: routers need to improve faster than Moore's Law.
18
Router Performance Exceeds Moore's Law
Growth in capacity of commercial routers:
- 1992: ~2 Gb/s
- 1995: ~10 Gb/s
- 1998: ~40 Gb/s
- 2001: ~160 Gb/s
- 2003: ~640 Gb/s
- 2008: ~100 Tb/s
- 2013: ~922 Tb/s
Average growth rate: 2.2x / 18 months, but over the last 5 years: 2.8x / 18 months.
2013: the Cisco CRS-X multishelf router has a capacity of 921.6 Tb/s (1152 ports).
19
Outline
- Background: What is a router? Why do we need faster routers? Why are they hard to build?
- Architectures and techniques: the evolution of router architecture, IP address lookup, packet buffering, switching.
- The Future
20
First Generation Routers
[Figure: line interfaces (MAC) attached over a shared backplane to a central CPU with the route table and buffer memory.]
Typically <0.5 Gb/s aggregate capacity.
21
Second Generation Routers
[Figure: line cards, each with its own MAC, forwarding cache and buffer memory, connected to a central CPU holding the route table and buffer memory.]
Typically <5 Gb/s aggregate capacity.
22
Third Generation Routers
[Figure: line cards with local buffer memory, forwarding table and MAC, plus a CPU card with the routing table, interconnected by a switched backplane.]
Typically <50 Gb/s aggregate capacity.
23
Fourth Generation Routers/Switches: Optics inside a router for the first time
[Figure: linecards connected to a separate switch core over optical links hundreds of metres long.]
100s of Tb/s routers available / in development.
24
Outline
- Background: What is a router? Why do we need faster routers? Why are they hard to build?
- Architectures and techniques: the evolution of router architecture, IP address lookup, packet buffering, switching.
- The Future
25
Generic Router Architecture
[Figure: as before, per-linecard header processing (address lookup and header update), buffer manager, and buffer memory.]
26
IP Address Lookup
Why it's thought to be hard:
- It's not an exact match: it's a longest prefix match.
- The table is large: about 650,000 entries today, and growing.
- The lookup must be fast: about 2 ns for a 140 Gb/s line.
27
Longest Prefix Match is Harder than Exact Match
The destination address of an arriving packet does not carry with it the information needed to determine the length of the longest matching prefix. Hence one must search the space of all prefix lengths, as well as the space of all prefixes of a given length.
28
IP Lookups find Longest Prefixes
[Figure: the 32-bit address space (0 to 2^32-1) covered by overlapping prefixes of different lengths, e.g. /8, /16, /19, /21, /24.]
Routing lookup: find the longest matching prefix (aka the most specific route) among all prefixes that match the destination address.
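As a concrete illustration, here is a minimal Python sketch of longest-prefix matching by linear search over a small table. The prefixes and next-hop names are made up for the example; real routers use the trie and direct-lookup structures on the following slides instead of a linear scan.

```python
import ipaddress

# Illustrative forwarding table: prefix -> next hop (not from the slides).
TABLE = {
    ipaddress.ip_network("10.0.0.0/8"):    "R1",
    ipaddress.ip_network("10.1.0.0/16"):   "R2",
    ipaddress.ip_network("10.1.128.0/19"): "R3",
    ipaddress.ip_network("10.1.130.0/24"): "R4",
}

def longest_prefix_match(dst):
    """Return the next hop of the most specific prefix containing dst."""
    addr = ipaddress.ip_address(dst)
    best, best_len = None, -1
    for prefix, next_hop in TABLE.items():
        if addr in prefix and prefix.prefixlen > best_len:
            best, best_len = next_hop, prefix.prefixlen
    return best

print(longest_prefix_match("10.1.130.7"))  # -> R4 (the /8, /16, /19 and /24 all match; /24 wins)
print(longest_prefix_match("10.2.0.1"))    # -> R1 (only the /8 matches)
```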
29
Address Tables are Large
30
Lookups Must be Fast
Year / aggregate line-rate / arriving rate of 40B packets:
- 1997: 622 Mb/s, 1.56 Mpps
- 2001: 10 Gb/s, 31.25 Mpps
- 2006: 80 Gb/s, 250 Mpps
- 2010: 140 Gb/s, 437.5 Mpps
- 2013: 400 Gb/s, 1250 Mpps
The lookup mechanism must be simple and easy to implement; memory access time is the bottleneck.
250 Mpps x 2 lookups/pkt = 500 Mlookups/sec, i.e. 2 ns per lookup.
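Spelling out where those numbers come from (the 2 ns budget corresponds to the 80 Gb/s row; at 400 Gb/s the same reasoning shrinks the per-lookup budget to 0.4 ns with two lookups per packet):

\[
\text{packet rate} = \frac{\text{line rate}}{40\,\text{B} \times 8\,\tfrac{\text{b}}{\text{B}}}
\;\Rightarrow\;
\frac{80\ \text{Gb/s}}{320\ \text{b}} = 250\ \text{Mpps},
\qquad
250\ \text{Mpps} \times 2\ \tfrac{\text{lookups}}{\text{pkt}} = 500\ \tfrac{\text{Mlookups}}{\text{s}}
\;\Rightarrow\; 2\ \text{ns per lookup}.
\]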
31
IP Address Lookup: Binary tries
[Figure: a binary trie storing example prefixes a-j; the surviving labels include d = 001, e = 0101, f = 011, g = 100, h = 1010 and i = 1100. Each bit of the destination address selects the 0-child or 1-child, and the deepest prefix node passed on the way down is the longest match.]
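Since the slide's own prefix list is partly lost in extraction, the Python sketch below (illustrative class and function names) builds a unibit trie from the prefixes that do survive and walks it bit by bit, remembering the deepest prefix seen:

```python
class TrieNode:
    __slots__ = ("children", "next_hop")
    def __init__(self):
        self.children = [None, None]   # child for bit 0, child for bit 1
        self.next_hop = None           # set if a prefix ends at this node

def insert(root, prefix_bits, next_hop):
    """Insert a prefix given as a bit string, e.g. '0101'."""
    node = root
    for bit in prefix_bits:
        b = int(bit)
        if node.children[b] is None:
            node.children[b] = TrieNode()
        node = node.children[b]
    node.next_hop = next_hop

def lookup(root, addr_bits):
    """Walk the trie bit by bit, remembering the last prefix passed (the longest match)."""
    node, best = root, root.next_hop
    for bit in addr_bits:
        node = node.children[int(bit)]
        if node is None:
            break
        if node.next_hop is not None:
            best = node.next_hop
    return best

root = TrieNode()
for bits, hop in [("001", "d"), ("0101", "e"), ("011", "f"),
                  ("100", "g"), ("1010", "h"), ("1100", "i")]:
    insert(root, bits, hop)

print(lookup(root, "01011100"))  # longest matching prefix is '0101' -> 'e'
print(lookup(root, "10101111"))  # longest matching prefix is '1010' -> 'h'
```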
32
Multi-bit Tries
Binary trie: depth = W, degree = 2, stride = 1 bit.
Multi-ary trie: depth = W/k, degree = 2^k, stride = k bits.
Lookup time ~ W/k; storage ~ (N*W/k) * 2^(k-1), where W = longest prefix length and N = number of prefixes.
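In symbols, the tradeoff the slide quotes is (the worked case k = 4, W = 32 is added here as an illustration):

\[
T_{\text{lookup}} \sim \frac{W}{k}\ \text{memory accesses},
\qquad
S \sim \frac{N\,W}{k}\,2^{\,k-1},
\]

so going from a unibit trie (k = 1) to a stride of k = 4 with W = 32 cuts the worst-case lookup from 32 to 8 memory accesses, while roughly doubling the storage estimate (2^{k-1}/k = 8/4 = 2).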
33
Prefix Length Distribution
[Histogram of announced prefix lengths. Source: Geoff Huston, Oct 2001.]
99.5% of prefixes are 24 bits or shorter.
34
24-8 Direct Lookup Trie
[Figure: a first-level table indexed directly by the top 24 address bits (entries 0 to 2^24-1), with a second-level table indexed by the remaining 8 bits (entries 0 to 2^8-1) for longer prefixes.]
When pipelined, this allows one lookup per memory access. It uses memory inefficiently, though.
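A rough Python sketch of the idea, in the spirit of the Gupta/Lin/McKeown scheme cited in the references. A dict stands in for the dense 2^24-entry first-level table, prefixes are assumed to be installed from shortest to longest, and the helper names are made up; this is not the published scheme in full.

```python
TBL24 = {}     # top 24 bits -> next hop, or ("LONG", top24) for /25-/32 routes
TBL_LONG = {}  # (top24, low 8 bits) -> next hop

def add_prefix(prefix, length, next_hop):
    """prefix is a 32-bit int with host bits zero; install from short to long."""
    if length <= 24:
        start = prefix >> 8
        for idx in range(start, start + (1 << (24 - length))):
            TBL24[idx] = next_hop
    else:
        top = prefix >> 8
        if not isinstance(TBL24.get(top), tuple):
            # First long prefix in this /24: seed all 256 low-byte slots with the /24's hop.
            short_hop = TBL24.get(top)
            for lo in range(256):
                TBL_LONG[(top, lo)] = short_hop
            TBL24[top] = ("LONG", top)
        lo_start = prefix & 0xFF
        for lo in range(lo_start, lo_start + (1 << (32 - length))):
            TBL_LONG[(top, lo)] = next_hop

def lookup(addr):
    """One memory access for /24-or-shorter routes, two for longer ones."""
    entry = TBL24.get(addr >> 8)
    if isinstance(entry, tuple):
        return TBL_LONG[(entry[1], addr & 0xFF)]
    return entry

add_prefix(0x0A000000, 8, "R1")    # 10.0.0.0/8
add_prefix(0x0A010100, 26, "R2")   # 10.1.1.0/26
print(lookup(0x0A010105))          # 10.1.1.5   -> R2
print(lookup(0x0A0101C8))          # 10.1.1.200 -> R1 (falls back to the /8)
```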
35
Outline
- Background: What is a router? Why do we need faster routers? Why are they hard to build?
- Architectures and techniques: the evolution of router architecture, IP address lookup, packet buffering, switching.
- The Future
36
Conceptual architecture
[Figure: line cards hosting one or more bi-directional ports, connected through one or more non-blocking switching cores under arbitration/control.]
37
Conceptual Packet Buffering
[Figure: the same conceptual architecture with input buffering: packets are queued at the bi-directional ports on the input side before crossing the non-blocking switching core(s) under arbitration/control.]
38
Arbitration
[Figure: N input linecards, each with header processing (address lookup, header update) and its own packet queue and buffer memory; an arbiter decides which queued packets may cross the switch in each time slot.]
39
Head of Line Blocking 44
40
A Router with Input Queues: Head of Line Blocking
[Chart: throughput of an input-queued router with FIFO queues, compared against the best that any queueing system can achieve.]
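To make the effect concrete, here is a small, self-contained Python simulation (not from the slides) of FIFO input queues under uniform traffic. The classical analysis bounds the saturation throughput of this setup near 2 - sqrt(2) ≈ 58.6% for large N, and the simulation lands in that neighbourhood; the port count, load and slot count are illustrative.

```python
import random

def hol_throughput(n_ports=16, load=1.0, slots=20_000, seed=1):
    """Discrete-time model: one FIFO per input, uniform destinations,
    each output serves at most one head-of-line packet per time slot."""
    rng = random.Random(seed)
    fifos = [[] for _ in range(n_ports)]
    delivered = 0
    for _ in range(slots):
        # Arrivals: with probability `load`, each input gets a packet for a random output.
        for fifo in fifos:
            if rng.random() < load:
                fifo.append(rng.randrange(n_ports))
        # Service: every output picks one input whose HEAD packet wants it.
        # Packets stuck behind a blocked head cannot be served (HoL blocking).
        contenders = {}
        for i, fifo in enumerate(fifos):
            if fifo:
                contenders.setdefault(fifo[0], []).append(i)
        for _out, inputs in contenders.items():
            fifos[rng.choice(inputs)].pop(0)
            delivered += 1
    return delivered / (n_ports * slots)

print(hol_throughput())   # roughly 0.6: far below what an output-queued switch achieves
```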
41
The Best Performance
The best that any queueing system can achieve.
42
Conceptual Packet Buffering
[Figure: the same conceptual architecture with a central buffer: packets from the bi-directional ports are queued in one shared memory inside the non-blocking switching core(s), under arbitration/control.]
43
Fast Packet Buffers (http://yuba.stanford.edu/fastbuffers/)
Example: a 40 Gb/s packet buffer. Size = RTT * BW = 10 Gb; 40-byte packets.
[Figure: a buffer manager in front of the buffer memory, with a write rate R and a read rate R of one packet every 8 ns each.]
How fast does the memory have to be?
- Use SRAM? + fast enough random access time, but - too low density to store 10 Gb of data.
- Use DRAM? + high density means we can store the data, but - too slow (~15 ns random access time).
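Working out the slide's numbers (the 10 Gb figure implies the common rule-of-thumb round-trip time of 0.25 s):

\[
B = RTT \times BW = 0.25\,\text{s} \times 40\,\text{Gb/s} = 10\,\text{Gb},
\qquad
t_{\text{pkt}} = \frac{40\,\text{B} \times 8\,\tfrac{\text{b}}{\text{B}}}{40\,\text{Gb/s}} = 8\,\text{ns},
\]

so a minimum-size packet must be written to and read from the buffer every 8 ns.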
44
Packet Caches
[Figure: a hybrid packet buffer. The buffer manager keeps a small ingress SRAM cache of the tails of every FIFO queue and an SRAM cache of the heads; the bulk of each queue lives in DRAM buffer memory. Arriving packets fill the tail cache and departing packets drain the head cache, while transfers between SRAM and DRAM move b >> 1 packets at a time.]
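A toy Python model of the tail-cache half of this scheme (class names, queue count and the batch size are made up; the matching head cache, which prefetches batches of packets ahead of the reads, is left out):

```python
from collections import deque

B = 8   # packets moved per DRAM access ("b >> 1 packets at a time" on the slide)

class HybridBuffer:
    """Per-queue SRAM tail cache in front of a slow, wide DRAM."""
    def __init__(self, num_queues):
        self.sram_tail = [deque() for _ in range(num_queues)]  # small, fast
        self.dram = [deque() for _ in range(num_queues)]       # large, slow
        self.dram_writes = 0

    def enqueue(self, q, pkt):
        self.sram_tail[q].append(pkt)
        if len(self.sram_tail[q]) >= B:
            # One wide DRAM access moves a whole batch of B packets,
            # so the DRAM needs only 1/B-th of the per-packet access rate.
            for _ in range(B):
                self.dram[q].append(self.sram_tail[q].popleft())
            self.dram_writes += 1

buf = HybridBuffer(num_queues=4)
for i in range(100):
    buf.enqueue(i % 4, f"pkt{i}")
print(buf.dram_writes)   # 12 batched writes instead of one DRAM write per packet
```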
45
Conceptual Packet Buffering
[Figure: the same conceptual architecture with output buffering: packets cross the non-blocking switching core(s) first and are queued at the bi-directional ports on the output side, under arbitration/control.]
46
Output buffering
[Figure: the generic per-linecard datapath (header processing, buffer manager, buffer memory), with the buffering placed at the outputs.]
47
Speed-up
[Figure: N linecards, each with header processing and a packet queue; the memory and fabric between the inputs and the queues run at N times the line rate.]
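The N-times figure comes from the worst case in which all N inputs send to the same output in one time slot. A sketch of the resulting memory-bandwidth requirement for an output queue, counting one read per slot as well (an assumption not spelled out on the slide):

\[
BW_{\text{mem}} \;\ge\; \underbrace{N R}_{\text{writes}} + \underbrace{R}_{\text{read}} \;=\; (N+1)\,R .
\]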
48
Conceptual Packet Buffering
[Figure: the same conceptual architecture with input buffering organised as virtual output queues: each input port keeps a separate queue per output, ahead of the non-blocking switching core(s) and arbitration/control.]
49
Virtual Output Queues 56
50
Matching
A matching on a graph is a subset of edges such that no two of them share a vertex.
[Figure: an example graph with a vertex, an edge, and a matching highlighted.]
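In a VOQ router the arbiter computes such a matching between inputs and outputs every time slot. A minimal Python sketch follows; a randomized greedy maximal matching stands in for real schedulers such as iSLIP or maximum-weight matching, and the variable names are illustrative.

```python
import random

def greedy_match(voq):
    """voq[i][j] = packets queued at input i for output j.
    Return a set of (input, output) pairs with no input or output used twice."""
    n = len(voq)
    requests = [(i, j) for i in range(n) for j in range(n) if voq[i][j] > 0]
    random.shuffle(requests)
    used_in, used_out, match = set(), set(), []
    for i, j in requests:
        if i not in used_in and j not in used_out:
            match.append((i, j))      # edge (i, j) shares no vertex with edges already chosen
            used_in.add(i)
            used_out.add(j)
    return match

voq = [[2, 0, 1],
       [0, 3, 0],
       [1, 0, 0]]
for i, j in greedy_match(voq):
    voq[i][j] -= 1                    # one packet crosses the crossbar per matched pair
print(voq)
```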
51
A Router with Virtual Output Queues
[Chart: throughput of a router with virtual output queues, compared against the best that any queueing system can achieve.]
52
Current Internet Router Technology Summary
There are three potential bottlenecks: address lookup, packet buffering, and switching.
Techniques exist today for 100s of Tb/s Internet routers with 400 Gb/s linecards.
But what comes next...?
53
Outline
- Background: What is a router? Why do we need faster routers? Why are they hard to build?
- Architectures and techniques: the evolution of router architecture, IP address lookup, packet buffering, switching.
- The Future: more parallelism, eliminating schedulers, introducing optics into routers.
54
The Future
55
Complex linecards
[Figure: a typical IP router linecard: physical layer, framing & maintenance, packet processing with lookup tables, buffer management & scheduling with buffer & state memory on both the ingress and egress paths, arbitration, optics, and the interface to the switch fabric.]
A 10 Gb/s linecard: about 30M gates, 2 Gbits of memory, cost >$20k, power 300W.
56
External Parallelism: Multiple Parallel Routers
What we'd like: an NxN IP router with a capacity of 100s of Tb/s.
The building blocks we'd like to use: routers of capacity R.
57
Multiple parallel routers: Load Balancing
[Figure: traffic arriving at rate R is spread over k parallel routers, numbered 1 to k, each running at rate R/k.]
58
Intelligent Packet Load-balancing: Parallel Packet Switching
[Figure: N external ports at rate R. On the input side, a bufferless demultiplexor per port spreads packets over k parallel routers, so each internal link runs at rate R/k; on the output side, a multiplexor per port recombines the traffic onto the external line at rate R.]
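A Python sketch of the demultiplexor's job (illustrative only: real parallel-packet-switch designs choose the layer per cell under constraints that preserve order and emulate an output queue, whereas plain round-robin is used here just to show the rate division; K and the names are made up):

```python
from itertools import cycle

K = 4                          # number of parallel center routers
layer_of = cycle(range(K))     # round-robin layer choice

def demux(packets):
    """Assign each arriving packet to one of the K slower center routers."""
    per_layer = [[] for _ in range(K)]
    for pkt in packets:
        per_layer[next(layer_of)].append(pkt)
    return per_layer

arrivals = [f"cell{i}" for i in range(12)]
for layer, cells in enumerate(demux(arrivals)):
    # Each layer sees 1/K of the traffic, so its memory, lookups and links
    # can run K times slower than the external line rate.
    print(layer, cells)
```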
59
Parallel Packet Switching
Advantages:
- A single stage of buffering.
- No excess link capacity.
- 1/k-th the power per subsystem.
- 1/k-th the memory bandwidth.
- 1/k-th the lookup rate.
60
Parallel Packet Switching
Advantages:
- Load-balancing: output links are less congested.
- Scalability: a new router can be added dynamically.
- Redundancy.
61
Parallel Packet Switch Theorem
If Speed-up > 2 then a parallel packet switch can precisely emulate a single big router. 78
62
Eliminating schedulers: Two-Stage Switch
[Figure: N external inputs connect through a first, load-balancing stage to N internal inputs, and through a second stage to N external outputs. Both stages step through fixed round-robin connection patterns ("first round-robin", "second round-robin") instead of running a scheduler.]
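A tiny Python sketch of those rotating connection patterns, mirroring the load-balanced Birkhoff-von Neumann idea cited in the references. The pattern function and N are illustrative, and the per-output queues held at the internal inputs are omitted.

```python
N = 4

def stage_pattern(t):
    """Fixed round-robin permutation used by either stage at time slot t."""
    return {i: (i + t) % N for i in range(N)}

for t in range(N):
    first = stage_pattern(t)     # external input -> internal input
    second = stage_pattern(t)    # internal input -> external output
    print(f"t={t}: stage1 {first}  stage2 {second}")
    # Over N consecutive slots every external input is connected to every
    # internal input exactly once, which is what spreads (balances) the load
    # without any centralized scheduling decision.
```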
63
Optics in routers
[Figure: linecards connected to the switch core by optical links.]
64
The Stanford Phicticious Optical Router
A 100 Tb/s IP router: 625 linecards, each operating at 160 Gb/s.
[Figure: linecards #1 to #625, each performing line termination, IP packet processing and packet buffering on 40 Gb/s streams, connected to a 2-stage switch over 160-320 Gb/s optical links.]
65
References: General
- J. S. Turner, "Design of a Broadcast Packet Switching Network", IEEE Trans. Communications, June 1988.
- C. Partridge et al., "A Fifty Gigabit per Second IP Router", IEEE Trans. Networking, 1998.
- N. McKeown, M. Izzard, A. Mekkittikul, W. Ellersick, M. Horowitz, "The Tiny Tera: A Packet Switch Core", IEEE Micro Magazine, Jan-Feb 1997.
References: Fast Packet Buffers
- Sundar Iyer, Ramana Rao, Nick McKeown, "Design of a Fast Packet Buffer", IEEE HPSR 2001, Dallas.
66
References: IP Lookups
- A. Brodnik, S. Carlsson, M. Degermark, S. Pink, "Small Forwarding Tables for Fast Routing Lookups", Sigcomm 1997, pp. 3-14.
- B. Lampson, V. Srinivasan, G. Varghese, "IP Lookups Using Multiway and Multicolumn Search", Infocom 1998, vol. 3.
- M. Waldvogel, G. Varghese, J. Turner, B. Plattner, "Scalable High Speed IP Routing Lookups", Sigcomm 1997.
- P. Gupta, S. Lin, N. McKeown, "Routing Lookups in Hardware at Memory Access Speeds", Infocom 1998, vol. 3.
- S. Nilsson, G. Karlsson, "Fast Address Lookup for Internet Routers", IFIP Intl. Conf. on Broadband Communications, Stuttgart, Germany, April 1-3, 1998.
- V. Srinivasan, G. Varghese, "Fast IP Lookups Using Controlled Prefix Expansion", Sigmetrics, June 1998.
67
References: Switching
- N. McKeown, A. Mekkittikul, V. Anantharam, J. Walrand, "Achieving 100% Throughput in an Input-Queued Switch", IEEE Transactions on Communications, 47(8), Aug 1999.
- A. Mekkittikul, N. McKeown, "A Practical Algorithm to Achieve 100% Throughput in Input-Queued Switches", Proc. IEEE INFOCOM '98, March 1998.
- L. Tassiulas, "Linear Complexity Algorithms for Maximum Throughput in Radio Networks and Input Queued Switches", Proc. IEEE INFOCOM '98, San Francisco, CA, April 1998.
- D. Shah, P. Giaccone, B. Prabhakar, "An Efficient Randomized Algorithm for Input-Queued Switch Scheduling", Proc. Hot Interconnects 2001.
- J. Dai, B. Prabhakar, "The Throughput of Data Switches With and Without Speedup", Proc. IEEE INFOCOM '00, Tel Aviv, Israel, March 2000.
- C.-S. Chang, D.-S. Lee, Y.-S. Jou, "Load Balanced Birkhoff-von Neumann Switches", Proc. IEEE HPSR '01, May 2001, Dallas, Texas.
68
References: Future
- C.-S. Chang, D.-S. Lee, Y.-S. Jou, "Load Balanced Birkhoff-von Neumann Switches", Proc. IEEE HPSR '01, May 2001, Dallas, Texas.
- P. Molinero-Fernández, N. McKeown, "TCP Switching: Exposing Circuits to IP", Hot Interconnects IX, Stanford University, August 2001.
- S. Iyer, N. McKeown, "Making Parallel Packet Switches Practical", Proc. IEEE INFOCOM '01, April 2001, Alaska.
- I. Keslassy et al., "Scaling Internet Routers Using Optics", Proc. ACM SIGCOMM '03, August 2003, Germany.