Router Design (Nick Feamster) February 11, 2008

2 Today’s Lecture The design of big, fast routers Partridge et al., A 50 Gb/s IP Router Design constraints –Speed –Size –Power consumption Components Algorithms –Lookups and packet processing (classification, etc.) –Packet queuing –Switch arbitration

3 What’s In A Router Interfaces –Input/output of packets Switching fabric –Moving packets from input to output Software –Routing –Packet processing –Scheduling –Etc.

4 What a Router Chassis Looks Like –Cisco CRS-1: roughly 6 ft x 2 ft x 19 in; capacity 1.2 Tb/s; power 10.4 kW; weight 0.5 ton; cost ~$500k –Juniper M320: roughly 3 ft x 2 ft x 17 in; capacity 320 Gb/s; power 3.1 kW

5 What a Router Line Card Looks Like –1-Port OC48 (2.5 Gb/s), for the Juniper M40 –4-Port 10 GigE, for the Cisco CRS-1 –Power: about 150 Watts; roughly 21 in x 10 in x 2 in

6 Big, Fast Routers: Why Bother? Faster link bandwidths Increasing demands Larger network size (hosts, routers, users)

7 Summary of Routing Functionality Router gets packet Looks at packet header for destination Looks up routing table for output interface Modifies header (TTL, IP header checksum) Passes packet to output interface
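The per-packet steps above can be sketched in a few lines of Python. This is only an illustration of the data-plane work (parse destination, decrement TTL, recompute the header checksum, hand off to an output interface); the `routing_table.longest_prefix_match` call is a placeholder for the lookup structures discussed later, not an API from the paper.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of 16-bit words (checksum field must be zeroed first)."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def forward(packet: bytes, routing_table):
    """Illustrative per-packet forwarding steps from the slide."""
    ihl = (packet[0] & 0x0F) * 4                       # header length in bytes
    ttl = packet[8]
    if ttl <= 1:
        return None                                    # punt: ICMP time exceeded
    dst = struct.unpack("!I", packet[16:20])[0]        # destination address
    out_if = routing_table.longest_prefix_match(dst)   # placeholder lookup
    hdr = bytearray(packet[:ihl])
    hdr[8] = ttl - 1                                   # decrement TTL
    hdr[10:12] = b"\x00\x00"                           # zero checksum, then recompute
    struct.pack_into("!H", hdr, 10, ipv4_checksum(hdr))
    return out_if, bytes(hdr) + packet[ihl:]
```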

8 Generic Router Architecture –Header processing: look up the IP address in an address table (~1M prefixes, off-chip DRAM) and update the header –Queue the packet in buffer memory (~1M packets, off-chip DRAM) –Question: What is the difference between this architecture and that in today’s paper?

9 Innovation #1: Each Line Card Has the Routing Tables Prevents central table from becoming a bottleneck at high speeds Complication: Must update forwarding tables on the fly. –How does the BBN router update tables without slowing the forwarding engines?

10 Generic Router Architecture –Each line card performs its own header processing (IP address lookup, header update) against a local address table –Each line card has a buffer manager and buffer memory for queuing packets –An interconnection fabric moves packets from input to output line cards

11 First Generation Routers –Shared bus; central CPU with the route table and off-chip buffer memory –Line interfaces (MAC) pass every packet over the bus to the CPU –Typically <0.5 Gb/s aggregate capacity

12 Second Generation Routers –Central CPU still holds the route table –Line cards have forwarding caches and local buffer memory, so common-case packets are forwarded without the central CPU –Typically <5 Gb/s aggregate capacity

13 Third Generation Routers –Switched backplane (“crossbar”) instead of a shared bus –Line cards have local buffer memory and forwarding tables; a CPU card holds the routing table –Typically <50 Gb/s aggregate capacity

14 Innovation #2: Switched Backplane Every input port has a connection to every output port During each timeslot, each input connected to zero or one outputs Advantage: Exploits parallelism Disadvantage: Need scheduling algorithm

15 Head-of-Line Blocking –Problem: With one FIFO per input, the packet at the front of the queue contends for its output and blocks all packets behind it, even those destined for idle outputs –Maximum throughput in such a switch (uniform traffic): 2 – sqrt(2) ≈ 58.6% –M. J. Karol, M. G. Hluchyj, and S. P. Morgan, “Input Versus Output Queueing on a Space-Division Packet Switch,” IEEE Transactions on Communications, Vol. COM-35, No. 12, December 1987.
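A small Monte Carlo sketch of the head-of-line effect, assuming saturated inputs and uniform random destinations (my own simulation setup, not from the paper): each input always has a head-of-line packet, each output serves at most one contender per slot, and blocked inputs keep the same packet. The measured throughput approaches 2 - sqrt(2) ≈ 0.586 as the port count grows.

```python
import random

def hol_throughput(n_ports=32, slots=20000, seed=0):
    """Saturated FIFO input queues with uniform random destinations."""
    rng = random.Random(seed)
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]  # HOL destination per input
    served = 0
    for _ in range(slots):
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        for out, inputs in contenders.items():
            winner = rng.choice(inputs)               # output picks one HOL packet
            hol[winner] = rng.randrange(n_ports)      # winner advances to a new packet
            served += 1
    return served / (n_ports * slots)

print(hol_throughput())   # roughly 0.6 for N=32; tends toward 2 - sqrt(2) as N grows
```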

16 Speedup What if the crossbar could have a “speedup”, i.e., run faster than the line rate? Key result: Given a crossbar with 2x speedup, any maximal matching can achieve 100% throughput; i.e., it does as well as a switch with Nx speedup. S.-T. Chuang, A. Goel, N. McKeown, and B. Prabhakar, “Matching Output Queueing with a Combined Input Output Queued Switch,” Proceedings of INFOCOM, 1998.

17 Combined Input-Output Queuing –Advantages: easy to build; 100% throughput can be achieved with limited speedup –Disadvantages: harder to design algorithms; two congestion points (input and output queues); flow control needed between input and output interfaces across the crossbar

18 Solution: Virtual Output Queues –Maintain N virtual queues at each input, one per output –N. McKeown, A. Mekkittikul, V. Anantharam, and J. Walrand, “Achieving 100% Throughput in an Input-Queued Switch,” IEEE Transactions on Communications, Vol. 47, No. 8, August 1999.
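A minimal sketch of the VOQ data structure (class and method names are mine): each input keeps one queue per output, so a packet for a busy output never blocks packets for other outputs. The request vector produced here is what the arbitration schemes on the next slides consume.

```python
from collections import deque

class VOQInput:
    """One input port with a virtual queue per output."""
    def __init__(self, n_outputs: int):
        self.voq = [deque() for _ in range(n_outputs)]

    def enqueue(self, packet, output: int):
        self.voq[output].append(packet)          # no HOL blocking across outputs

    def request_vector(self):
        """Which outputs this input wants in the next timeslot (for the arbiter)."""
        return [len(q) > 0 for q in self.voq]
```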

19 Router Components and Functions Route processor –Routing –Installing forwarding tables –Management Line cards –Packet processing and classification –Packet forwarding Switched bus (“Crossbar”) –Scheduling

20 Crossbar Switching –Conceptually: N inputs, N outputs (actually, inputs are also outputs) –In each timeslot, a one-to-one mapping between inputs and outputs –Goal: a maximal matching of the bipartite graph of traffic demands L_ij(n) (ideally a maximum weight match)
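As a toy illustration of the matching goal (my own sketch, not a scheduler from the lecture), a greedy pass over the demand matrix produces a maximal, though not necessarily maximum-weight, matching of inputs to outputs:

```python
def greedy_maximal_match(demand):
    """Greedy maximal matching: pair each input with at most one output.

    demand[i][j] > 0 means input i has traffic for output j.
    Returns {input: output}; maximal (no edge can be added), not maximum.
    """
    n = len(demand)
    match, used_outputs = {}, set()
    for i in range(n):
        for j in range(n):
            if demand[i][j] > 0 and j not in used_outputs:
                match[i] = j
                used_outputs.add(j)
                break
    return match

print(greedy_maximal_match([[0, 3, 0], [2, 1, 0], [0, 0, 5]]))  # {0: 1, 1: 0, 2: 2}
```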

21 Early Crossbar Scheduling Algorithm Wavefront algorithm Problems: Fairness, speed, …

22 Alternatives to the Wavefront Scheduler PIM: Parallel Iterative Matching –Request: Each input sends requests to all outputs for which it has packets –Grant: Output selects an input at random and grants –Accept: Input selects from its received grants Problem: Matching may not be maximal Solution: Run several times Problem: Matching may not be “fair” Solution: Grant/accept in round robin instead of random
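A sketch of one request/grant/accept round of PIM as described above (function and variable names are mine). Running further iterations over the still-unmatched ports grows the match toward maximal; replacing the random choices with per-port round-robin pointers gives the fairer variant the slide mentions.

```python
import random

def pim_iteration(requests, rng=random):
    """One request/grant/accept round of Parallel Iterative Matching.

    requests[i] is the set of outputs input i has packets for.
    Returns a partial matching {input: output}.
    """
    n = len(requests)
    # Grant: each requested output picks one requesting input at random.
    grants = {}
    for out in range(n):
        requesters = [i for i in range(n) if out in requests[i]]
        if requesters:
            grants.setdefault(rng.choice(requesters), []).append(out)
    # Accept: each input picks one of the grants it received at random.
    return {inp: rng.choice(outs) for inp, outs in grants.items()}

print(pim_iteration([{0, 1}, {0}, {1, 2}]))
```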

23 Processing: Fast Path vs. Slow Path Optimize for common case –BBN router: 85 instructions for fast-path code –Fits entirely in L1 cache Non-common cases handled on slow path –Route cache misses –Errors (e.g., ICMP time exceeded) –IP options –Fragmented packets –Multicast packets
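A dispatch sketch of the split above. The checks paraphrase the slide's list of non-common cases; the packet attribute names are hypothetical, and this is not the BBN router's actual code.

```python
def classify_path(pkt) -> str:
    """Send a packet to the fast or slow path (illustrative checks only)."""
    if pkt.ihl > 5:
        return "slow"          # IP options present
    if pkt.frag_offset or pkt.more_fragments:
        return "slow"          # fragmented packet
    if pkt.ttl <= 1:
        return "slow"          # must generate ICMP time exceeded
    if pkt.dst_is_multicast:
        return "slow"          # multicast handling
    if not pkt.route_cached:
        return "slow"          # route cache miss
    return "fast"
```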

24 Recent Trends: Programmability –NetFPGA: 4-port interface card, plugs into PCI bus (Stanford); customizable forwarding; appearance of many virtual interfaces (with VLAN tags) –Programmability with network processors (Washington U.): line cards with processing elements (PEs) connected by a switch

25 Scheduling and Fairness What is an appropriate definition of fairness? –One notion: Max-min fairness –Disadvantage: Compromises throughput Max-min fairness gives priority to low data rates/small values Is it guaranteed to exist? Is it unique?

26 Max-Min Fairness An allocation of rates is max-min fair if no rate x can be increased without decreasing some other rate y that is already smaller than or equal to x. How to share equally with different resource demands –small users get all they want –large users evenly split the rest More formally, perform this procedure: –resources are allocated to customers in order of increasing demand –no customer receives more than requested –customers with unsatisfied demands split the remaining resource

27 Example Demands: 2, 2.6, 4, 5; capacity: 10 –10/4 = 2.5, but the 1st user needs only 2, leaving an excess of 0.5 to distribute among the other 3 (0.5/3 ≈ 0.167) –Now the allocations are [2, 2.67, 2.67, 2.67], and customer #2 needs only 2.6, leaving an excess of about 0.07 –Divide that between the last two, giving [2, 2.6, 2.7, 2.7] –This maximizes the minimum share given to each customer whose demand is not fully satisfied
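The progressive-filling procedure from the previous slide can be written in a few lines (a minimal sketch, names mine); run on the example it reproduces the allocation worked out above.

```python
def max_min_fair(demands, capacity):
    """Allocate capacity max-min fairly: serve demands in increasing order,
    never give more than requested, split the remainder among the rest."""
    alloc = [0.0] * len(demands)
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    remaining, left = float(capacity), len(demands)
    for i in order:
        share = remaining / left               # equal split of what is left
        alloc[i] = min(demands[i], share)      # but never more than requested
        remaining -= alloc[i]
        left -= 1
    return alloc

print(max_min_fair([2, 2.6, 4, 5], 10))   # approximately [2, 2.6, 2.7, 2.7]
```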

28 How to Achieve Max-Min Fairness Take 1: Round-Robin –Problem: Packets may have different sizes Take 2: Bit-by-Bit Round Robin –Problem: Feasibility Take 3: Fair Queuing –Service packets according to soonest “finishing time” Adding QoS: Add weights to the queues…
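A simplified sketch of the "finishing time" bookkeeping behind fair queuing, with weights added for QoS: each arriving packet gets a finish time F = max(flow's last finish, now) + length/weight, and the scheduler always serves the smallest F. Real WFQ tracks a virtual time rather than the wall-clock "now" used here; the class and field names are mine.

```python
import heapq

class FairQueue:
    """Simplified weighted fair queuing: serve the smallest finish time."""
    def __init__(self):
        self.last_finish = {}    # flow id -> finish time of its last packet
        self.heap = []           # (finish_time, flow, length)

    def enqueue(self, flow, length, weight, now):
        start = max(self.last_finish.get(flow, 0.0), now)
        finish = start + length / weight        # heavier weight -> earlier finish
        self.last_finish[flow] = finish
        heapq.heappush(self.heap, (finish, flow, length))

    def dequeue(self):
        return heapq.heappop(self.heap) if self.heap else None
```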

29 Why QoS? Internet currently provides one single class of “best-effort” service –No assurances about delivery Existing applications are elastic –Tolerate delays and losses –Can adapt to congestion Future “real-time” applications may be inelastic

30 IP Address Lookup Challenges: 1. Longest-prefix match (not exact). 2. Tables are large and growing. 3. Lookups must be fast.

31 IP Lookups find Longest Prefixes –Routing lookup: Find the longest matching prefix (aka the most specific route) among all prefixes that match the destination address.
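A tiny demonstration of these semantics using Python's standard ipaddress module (the prefixes and interface names are illustrative, not from the slide's figure): among all prefixes containing the destination, the longest one wins.

```python
import ipaddress

prefixes = {
    "10.0.0.0/8":  "if0",
    "10.1.0.0/16": "if1",
    "10.1.2.0/24": "if2",
}
table = [(ipaddress.ip_network(p), nh) for p, nh in prefixes.items()]

def lookup(dst: str):
    addr = ipaddress.ip_address(dst)
    matches = [(net.prefixlen, nh) for net, nh in table if addr in net]
    return max(matches)[1] if matches else None   # most specific route wins

print(lookup("10.1.2.3"))   # if2 -- the /24 beats the /16 and /8
print(lookup("10.9.9.9"))   # if0 -- only the /8 matches
```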

32 IP Address Lookup Challenges: 1. Longest-prefix match (not exact). 2. Tables are large and growing. 3. Lookups must be fast.

33 Address Tables are Large

34 IP Address Lookup Challenges: 1. Longest-prefix match (not exact). 2. Tables are large and growing. 3. Lookups must be fast.

35 Lookups Must be Fast –Worst case is minimum-size (40-byte) packets: OC-12 (622 Mb/s) ≈ 1.94 Mpkt/s; OC-48 (2.5 Gb/s) ≈ 7.81 Mpkt/s; OC-192 (10 Gb/s) ≈ 31.25 Mpkt/s; OC-768 (40 Gb/s) ≈ 125 Mpkt/s –OC-768 is still pretty rare outside of research networks –Cisco CRS-1 1-Port OC-768C (line rate: 42.1 Gb/s)
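The arithmetic behind those numbers, and the per-packet time budget it implies, is a one-liner:

```python
def min_packet_rate(line_rate_gbps: float, pkt_bytes: int = 40) -> float:
    """Worst-case packet rate (Mpkt/s) for minimum-size packets."""
    return line_rate_gbps * 1e9 / (pkt_bytes * 8) / 1e6

for name, gbps in [("OC-12", 0.622), ("OC-48", 2.5), ("OC-192", 10), ("OC-768", 40)]:
    rate = min_packet_rate(gbps)
    print(f"{name}: {rate:.2f} Mpkt/s ({1e3 / rate:.1f} ns per lookup)")
# OC-768 works out to 125 Mpkt/s, i.e. an 8 ns budget per lookup.
```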

36 IP Address Lookup: Binary Tries Example Prefixes: a) b) c) d) 001 e) 0101 f) 011 g) 100 h) 1010 i) 1100 j) e f g h i j 01 a bc d

37 IP Address Lookup: Patricia Trie –Same example prefixes as the binary trie: a) …, b) …, c) …, d) 001, e) 0101, f) 011, g) 100, h) 1010, i) 1100, j) … –Path compression: chains of single-child nodes are collapsed, with a “skip” count recording how many bits to skip –Problem: Lots of (slow) memory lookups
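A minimal binary trie for longest-prefix match, built from the legible prefixes of the example (d through i); it makes the slide's complaint concrete: every bit of the address costs one node visit, i.e., one memory access, before compression.

```python
class TrieNode:
    __slots__ = ("children", "next_hop")
    def __init__(self):
        self.children = [None, None]
        self.next_hop = None

def insert(root, prefix_bits: str, next_hop):
    node = root
    for b in prefix_bits:
        i = int(b)
        if node.children[i] is None:
            node.children[i] = TrieNode()
        node = node.children[i]
    node.next_hop = next_hop

def longest_prefix_match(root, addr_bits: str):
    node, best = root, root.next_hop
    for b in addr_bits:                    # one node visit (memory access) per bit
        node = node.children[int(b)]
        if node is None:
            break
        if node.next_hop is not None:
            best = node.next_hop           # remember the most specific match so far
    return best

root = TrieNode()
for p, nh in [("001", "d"), ("0101", "e"), ("011", "f"),
              ("100", "g"), ("1010", "h"), ("1100", "i")]:
    insert(root, p, nh)
print(longest_prefix_match(root, "010111"))   # 'e' (prefix 0101)
print(longest_prefix_match(root, "001000"))   # 'd' (prefix 001)
```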

38 Address Lookup: Direct Trie –Address bits directly index large lookup arrays (e.g., a wide first-stage table indexed by the leading address bits, then a second stage indexed by the next 8 bits) –When pipelined, one lookup per memory access –Inefficient use of memory
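A sketch of the direct-index idea, assuming (for illustration only) a 16-bit first stage and omitting the second stage: one array index replaces the bit-by-bit trie walk, at the cost of a large, sparsely used table and prefix expansion.

```python
class DirectLookup16:
    """Direct-index lookup on the top 16 address bits (illustrative split)."""
    def __init__(self):
        self.level1 = [None] * (1 << 16)   # one slot per /16, mostly empty

    def add_route(self, prefix: int, prefix_len: int, next_hop):
        """Add routes from shortest to longest prefix so more-specific
        routes overwrite less-specific ones (controlled prefix expansion)."""
        if prefix_len > 16:
            raise NotImplementedError("second stage omitted in this sketch")
        base = prefix >> 16
        span = 1 << (16 - prefix_len)      # how many /16 slots the prefix covers
        for i in range(base, base + span):
            self.level1[i] = next_hop

    def lookup(self, addr: int):
        return self.level1[addr >> 16]     # a single memory access

t = DirectLookup16()
t.add_route(0x0A000000, 8, "if0")          # 10.0.0.0/8
t.add_route(0x0A010000, 16, "if1")         # 10.1.0.0/16 (more specific, added after)
print(t.lookup(0x0A010203))                # if1
```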

39 Faster LPM: Alternatives Content addressable memory (CAM) –Hardware-based route lookup –Input = tag, output = value –Requires exact match with tag With binary CAMs: multiple cycles (1 per prefix length) with a single CAM, or multiple CAMs (1 per prefix length) searched in parallel Ternary CAM: (0, 1, don’t care) values in the tag allow prefix matches; priority (i.e., longest prefix) by order of entries Historically, this approach has not been very economical.
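A software model of the TCAM idea (a sketch, not a hardware description): entries carry a value and a care mask, every entry is conceptually compared at once, and priority goes to the first (i.e., highest-priority) matching entry, so longer prefixes are placed first.

```python
class TernaryCAM:
    """Software model of a TCAM: ternary entries, priority by entry order."""
    def __init__(self):
        self.entries = []                   # (value, care_mask, result)

    def add(self, value: int, care_mask: int, result):
        self.entries.append((value & care_mask, care_mask, result))

    def search(self, key: int):
        for value, mask, result in self.entries:   # hardware does this in one cycle
            if key & mask == value:
                return result
        return None

tcam = TernaryCAM()
tcam.add(0x0A010200, 0xFFFFFF00, "if2")   # 10.1.2.0/24  (most specific first)
tcam.add(0x0A010000, 0xFFFF0000, "if1")   # 10.1.0.0/16
tcam.add(0x0A000000, 0xFF000000, "if0")   # 10.0.0.0/8
print(tcam.search(0x0A010203))            # if2
```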

40 Faster Lookup: Alternatives Caching –Packet trains exhibit temporal locality –Many packets to same destination Cisco Express Forwarding

41 IP Address Lookup: Summary Lookup limited by memory bandwidth. Lookup uses high-degree trie. State of the art: 10Gb/s line rate. Scales to: 40Gb/s line rate.

42 Fourth-Generation: Collapse the POP –High reliability and scalability enable “vertical” POP simplification –Reduces CapEx and operational cost –Increases network stability

43 Fourth-Generation Routers –Switch fabric plus linecards; limit today is roughly 2.5 Tb/s –Limiting factors: electronics; the scheduler scales less than 2x every 18 months; opto-electronic conversion

44 Multi-rack routers –Linecard racks (WAN in/out) connected to a separate switch-fabric rack

45 Future: 100Tb/s Optical Router –625 electronic linecards, each at 160 Gb/s (100 Tb/s = 625 * 160 Gb/s), performing line termination, IP packet processing, and packet buffering –Linecards connected through an optical switch, with electronic arbitration via request/grant –McKeown et al., Scaling Internet Routers Using Optics, ACM SIGCOMM 2003

46 Challenges with Optical Switching Mis-sequenced packets Pathological traffic patterns Rapidly configuring switch fabric Failing components