Building Gigabit-rate Routers with the NetFPGA: NICTA Tutorial at UNSW

Presented by: John W. Lockwood, Jad Naous, Glen Gibb (Stanford University)
Hosted by: Lavy Libman (NICTA) and Philip Allen (UNSW)
February 6, 2008, 9am-5pm, Lab 343A, Electrical Engineering Building (G17), Kensington Campus, University of New South Wales, Sydney, Australia
http://NetFPGA.org

What is the NetFPGA?
- PC with NetFPGA: networking software running on a standard PC (CPU, memory, PCI bus)
- NetFPGA board: a hardware accelerator built with a Field Programmable Gate Array (FPGA) and memory, driving Gigabit Ethernet (1GE) network links

Introduction
- Who uses the NetFPGA
  - Teachers
  - Students
  - Researchers
- How they use the NetFPGA
  - To run the Router Kit
  - To build modular reference designs: IPv4 router, 4-port NIC, Ethernet switch, …
  - To create new systems

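Usage #1: Running the Router Kit
- User-space development, 4x1GE line-rate forwarding
- Software (CPU, memory): user-space routing protocols (OSPF, BGP, "My Protocol") above the kernel routing table
- Hardware (FPGA, memory, 4x1GE): IPv4 router with forwarding table and packet buffer
- The kernel routing table is "mirrored" across PCI into the hardware forwarding table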

Usage #2: Building Modular Router Modules
- Software (CPU, memory): NetFPGA driver, PW-OSPF, Java GUI front panel (extensible)
- Hardware (FPGA, memory, 4x1GE): Verilog modules interconnected by FIFO interfaces (In Q Mgmt, L2 Parse, L3 Parse, IP Lookup, Out Q Mgmt), plus "My Block"
- Verilog EDA tools (Xilinx, Mentor, etc.): design, simulate, synthesize, download

Usage #3: Creating new systems
- Software (CPU, memory): NetFPGA driver
- Hardware (FPGA, memory, 4x1GE): "My Design" (the 1GE MAC is soft/replaceable)
- Verilog EDA tools (Xilinx, Mentor, etc.): design, simulate, synthesize, download

Tutorial Outline
- Background
  - Basics of an IP Router
  - The NetFPGA Platform
- The Stanford Base Reference Router
  - Demo 1: Reference Router running on the NetFPGA
  - Inside the NetFPGA hardware
  - Breakneck introduction to Verilog
  - Exercise 1: Build your own Reference Router
- The Enhanced Reference Router
  - Motivation: Understanding buffer size requirements in a router
  - Demo 2: Observing and controlling the queue size
  - Using NetFPGA for research and teaching
  - Exercise 2: Enhancing the Reference Router
- The Life of a Packet Through the NetFPGA

Basic Operation of an IP Router
[Figure: network of hosts and routers, with a forwarding table listing Destination and Next Hop]

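What does a router do?
[Figure: hosts A-F connected by routers R1-R5; each packet carries an IP header in front of its data, and the router's forwarding table maps Destination to Next Hop]
IP header layout (32-bit rows, 20 bytes without options):
    Ver | HLen | T.Service | Total Packet Length
    Fragment ID | Flags | Fragment Offset
    TTL | Protocol | Header Checksum
    Source Address
    Destination Address
    Options (if any)
    Data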

What does a router do?
[Figure: hosts A-F connected by routers R1-R5]

Basic Components of an IP Router
- Control plane (software): management & CLI, routing protocols, routing table
- Datapath, per-packet processing (hardware): forwarding table, switching

Per-packet processing in an IP Router
1. Accept packet arriving on an incoming link.
2. Look up the packet's destination address in the forwarding table to identify the outgoing port(s).
3. Manipulate the IP header: e.g., decrement TTL, update header checksum.
4. Send the packet to the outgoing port(s).
5. Buffer the packet in the output queue.
6. Transmit the packet onto the outgoing link.
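The header manipulation step can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the NetFPGA implementation: the function names and the sample header values are made up for this example, and the checksum is recomputed from scratch rather than updated incrementally as hardware typically would.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """One's-complement of the one's-complement sum of the header's 16-bit words."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                      # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def forward_header(header: bytes) -> bytes:
    """Decrement TTL (byte 8) and rewrite the checksum (bytes 10-11)."""
    hdr = bytearray(header)
    if hdr[8] <= 1:
        raise ValueError("TTL expired: packet must not be forwarded")
    hdr[8] -= 1
    hdr[10:12] = b"\x00\x00"                # checksum field is zero while summing
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)

# Hypothetical 20-byte header: 192.168.1.1 -> 192.168.2.1, TTL 64, protocol 6 (TCP).
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                     bytes([192, 168, 1, 1]), bytes([192, 168, 2, 1]))
sample = sample[:10] + struct.pack("!H", ipv4_checksum(sample)) + sample[12:]
forwarded = forward_header(sample)
```

Summing a header that already contains a valid checksum yields zero, which is how a receiver verifies it.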

Generic Datapath Architecture
- Header processing: look up the IP address in the forwarding table (IP address to next hop), then update the header
- Queue packet: store the packet (header and data) in buffer memory

CIDR and Longest Prefix Matches
- The IP address space (0 to 2^32 - 1) is broken into line segments.
- Each line segment is described by a prefix.
- A prefix is of the form x/y, where x gives the leading bits shared by all addresses in the segment and y is the number of those leading bits (the prefix length). The segment therefore contains 2^(32-y) addresses.
- Example: the prefix 128.9/16 represents the line segment of 2^16 addresses in the range 128.9.0.0 … 128.9.255.255.
[Figure: address line from 0 to 2^32 - 1 showing the segments 65/8, 128.9/16, and 142.12/19, with the address 128.9.16.14 falling inside 128.9/16]

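Classless Interdomain Routing (CIDR)
- Most specific route = “longest matching prefix”
[Figure: address line from 0 to 2^32 - 1 showing 128.9/16 and the more specific prefixes 128.9.16/20, 128.9.176/20, 128.9.19/24, and 128.9.25/24; the address 128.9.16.14 is marked]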
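A minimal sketch of longest prefix matching, using the prefixes that appear on these two slides and Python's standard `ipaddress` module. It is the plain linear search, written for clarity rather than speed; the table contents and function name are illustrative only.

```python
import ipaddress

# Prefixes taken from the slides, written out in full dotted form.
TABLE = [ipaddress.ip_network(p) for p in (
    "65.0.0.0/8",
    "128.9.0.0/16",
    "128.9.16.0/20",
    "128.9.176.0/20",
    "128.9.19.0/24",
    "128.9.25.0/24",
    "142.12.0.0/19",
)]

def longest_prefix_match(table, address):
    """Return the most specific network containing address, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for net in table:
        if addr in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return best

match = longest_prefix_match(TABLE, "128.9.16.14")
```

The address 128.9.16.14 lies inside both 128.9/16 and 128.9.16/20, so the /20 wins as the most specific route.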

Techniques for LPM in hardware
- Linear search
- Direct lookup
  - Currently requires too much memory
  - Updating a prefix leads to many changes
- Tries
  - Deterministic lookup time
  - Easily pipelined
  - But requires multiple memories/references
- TCAM (Ternary CAM)
  - Simple and widely used
  - But low-density, high-power
  - Gradually being replaced by new algorithms
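To make the trie option concrete, here is a small software sketch of a one-bit-per-level binary trie. It is not how the NetFPGA reference router performs lookups; the class names and next-hop labels are invented for illustration. It does show the two properties listed above: a lookup visits at most 32 nodes regardless of table size, and each level is a separate memory reference.

```python
import ipaddress

class TrieNode:
    def __init__(self):
        self.children = [None, None]   # child for bit 0 and bit 1
        self.next_hop = None           # set only where a prefix ends

class BinaryTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix, next_hop):
        net = ipaddress.IPv4Network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):             # walk the prefix, MSB first
            bit = (bits >> (31 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = TrieNode()
            node = node.children[bit]
        node.next_hop = next_hop

    def lookup(self, address):
        bits = int(ipaddress.IPv4Address(address))
        node = self.root
        best = node.next_hop                       # default route, if any
        for i in range(32):                        # at most 32 node visits
            node = node.children[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop               # remember the longest match so far
        return best

trie = BinaryTrie()
trie.insert("128.9.0.0/16", "hop-A")      # next-hop labels are hypothetical
trie.insert("128.9.16.0/20", "hop-B")
trie.insert("128.9.19.0/24", "hop-C")
```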

An IP Router on NetFPGA
- Software, Linux user-level processes: management & CLI, routing protocols, exception processing, routing table
- Hardware, Verilog on the NetFPGA PCI board: forwarding table, switching

NetFPGA Router
- Function
  - 4 Gigabit Ethernet ports
  - Fully programmable FPGA hardware
  - Low cost
- Open-source FPGA hardware
  - Verilog base design
- Open-source software
  - Drivers in C and C++

NetFPGA Platform: Major Components
- Interfaces
  - 4 Gigabit Ethernet ports
  - PCI host interface
- Memories
  - 36 Mbits static RAM
  - 512 Mbits DDR2 dynamic RAM
- FPGA resources
  - Block RAMs
  - Configurable Logic Blocks (CLBs)
  - Memory-mapped registers

NetFPGA System
[Figure: a host PC runs user-space software (CAD tools, monitor software, web & video server, browser & video client) above the Linux kernel; the kernel reaches the NetFPGA router hardware (packet forwarding table, four GE ports, nf2c0 .. 3) over PCI and a NIC (GE ports eth1 .. 2) over PCI-e]

NetFPGA Hardware

NetFPGA System Implementation
- NetFPGA blocks
  - Virtex-2 Pro FPGA
  - 4.5 MB ZBT SRAM
  - 64 MB DDR2 DRAM
  - PCI host interface
  - 4 Gigabit Ethernet ports
- Intranet test ports
  - Dual or quad Gigabit Ethernet on PCI-e
- Internet
  - Gigabit Ethernet on motherboard
- Processor
  - Dual-core CPU
- Operating system
  - Linux CentOS 4.4

NetFPGA Lab Setup
- Host (CPU x2): client, server, NetFPGA control software, CAD tools
- Dual NIC on PCI-e (eth1 .. 2)
  - eth1: local host
  - eth2: server
- NetFPGA Internet router hardware on PCI
  - nf2c0: adjacent
  - nf2c1: adjacent
  - nf2c2: local host
  - nf2c3: adjacent server

NetFPGA Hardware Set for Demo #1
- Server delivers streaming HD video through a chain of NetFPGA routers
[Figure: a video server host (CPU x2, NIC on PCI-e, NetFPGA Internet router hardware on PCI) connected through a chain of NetFPGA routers to a video display host with the same hardware]

Demo 1: Topology of NetFPGA Routers
[Figure: NetFPGA routers connecting the video server to the HD display]

Demo 1: Setup for the Reference Router
- Each NetFPGA card has four ports
- Port 2 connected to client / server
- Ports 0 and 3 connected to adjacent NetFPGA cards
[Figure: video server and video client attached to NetFPGA cards]

Demo 1: Logical Topology
[Figure: NetFPGA routers connecting the Video Server and the Video Client, with interfaces labelled by subnet and host suffix (.1.1 through .30.2) and the shortest path highlighted]
Presenter notes: Explain why we chose this topology. Explain how we will run the video: will it be projected on the screen, or will we ask users to do it themselves?

Demo 1: Working IP Router
- Objectives
  - Become familiar with the Stanford Reference Router
  - Observe PW-OSPF re-routing traffic around a failure

Demo 1: Streaming Video through the NetFPGA
- Video server
  - Source files: /var/www/html/video
  - Network URL: http://192.168.Net.Host/Video
- Video client
  - Windows Media Player
  - Linux mplayer
- Video traffic
  - MPEG2 HDTV (35 Mbps)
  - MPEG2 TV (9 Mbps)
  - DVI (3 Mbps)
  - WMF (1.7 Mbps)

Demo 1, Step 1: Observe the Routing Tables
- The router is already configured and running on your machines
- The routing table has converged to the routing decisions with the minimum number of hops
- Next, break a link …

Demo 1, Step 2: Dynamic Re-routing
- Break the link between the video server and the video client
- Routers re-route traffic around the broken link, and the video continues playing
[Figure: Demo 1 logical topology with one link broken]

Integrated Circuit Technology
- Full-custom design
  - Complementary Metal Oxide Semiconductor (CMOS)
- Semi-custom ASIC design
  - Gate array
  - Standard cell
- Programmable logic device
  - Programmable Array Logic
  - Field Programmable Gate Arrays
- Processors

Look-Up Tables
- Combinatorial logic is stored in Look-Up Tables (LUTs)
  - Also called Function Generators (FGs)
  - Capacity is limited only by the number of inputs, not by the complexity of the function
  - Delay through the LUT is constant
[Figure: a block of combinatorial logic with inputs A, B, C, D and output Z, next to the equivalent truth table. Diagram from Xilinx, Inc.]
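The point that capacity depends only on the number of inputs can be shown with a short software model. The sketch below treats a 4-input LUT as a 16-entry truth table; the helper names and the example function are made up for illustration and say nothing about how Xilinx tools actually pack logic.

```python
def make_lut(func, n_inputs=4):
    """Precompute func for every input combination: 2**n_inputs stored bits."""
    table = []
    for index in range(2 ** n_inputs):
        bits = [(index >> i) & 1 for i in range(n_inputs)]   # bits[0] is input A
        table.append(func(*bits) & 1)
    return table

def lut_read(table, *inputs):
    """Evaluate the stored function: the inputs simply form a memory address."""
    index = sum(bit << i for i, bit in enumerate(inputs))
    return table[index]

def example_logic(a, b, c, d):
    # Arbitrary example function Z = (A AND B) OR (C XOR D).
    return (a & b) | (c ^ d)

lut = make_lut(example_logic)
z = lut_read(lut, 1, 1, 0, 0)
```

Any function of four inputs, however complicated, occupies the same 16 bits, and a read is a single table access, which is why the delay is constant.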

Xilinx CLB Structure
- Each slice has four outputs
  - Two registered outputs, two non-registered outputs
  - Two BUFTs associated with each CLB, accessible by all 16 CLB outputs
- Carry logic runs vertically
  - Signals run upwards
  - Two independent carry chains per CLB
- The major parts of a slice include two look-up tables (LUTs), two sequential elements, and carry logic. The LUTs are known as the F LUT and the G LUT. The sequential elements can be programmed to be either registers or latches.
[Figure: slice 0 with two LUTs, carry logic, and two D flip-flops (CE, PRE, CLR). Diagram from Xilinx, Inc. (courtesy Jeff Weintraub)]

Field Programmable Gate Arrays
- CLB: primitive element of the FPGA
- Routing module
  - Global routing
  - Local interconnect
- Macro blocks
  - Block memories
  - Microprocessor
- I/O block

NetFPGA Block Diagram

Details of NetFPGA
- Fits into a standard PCI slot
  - Standard bus: 32 bits, 33 MHz
- Provides interfaces for processing network packets
  - 4 Gigabit Ethernet ports
- Allows hardware-accelerated processing
  - Implemented with Field Programmable Gate Array (FPGA) logic
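A quick calculation with the figures on this slide suggests why line-rate forwarding is done on the board rather than across the host bus. The numbers below are theoretical peaks only; real PCI throughput is lower because of arbitration and protocol overhead, which this sketch ignores.

```python
PCI_WIDTH_BITS = 32        # from the slide
PCI_CLOCK_HZ = 33e6        # from the slide (33 MHz)
GE_PORTS = 4
GE_RATE_BPS = 1e9          # 1 Gb/s per port

pci_peak_bps = PCI_WIDTH_BITS * PCI_CLOCK_HZ           # shared by all transfers
ethernet_aggregate_bps = GE_PORTS * GE_RATE_BPS        # one direction only

print(f"PCI peak:          {pci_peak_bps / 1e9:.3f} Gb/s")
print(f"4 x 1GE aggregate: {ethernet_aggregate_bps / 1e9:.3f} Gb/s")
```

Even at its peak, the PCI bus carries roughly a quarter of what the four Ethernet ports can receive, so only exception packets and control traffic can reasonably go through the host.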

Hardware Description Languages
- Concurrent
  - By default, Verilog statements are evaluated concurrently
- Express fine-grain parallelism
  - Allows gate-level parallelism
- Provide a precise description
  - Eliminates ambiguity about operation
- Synthesizable
  - Generates hardware from the description

Verilog Data Types
    reg [7:0] A;                // 8-bit register, bit 7 is the MSB
                                // (preferred bit order for NetFPGA)
    reg [0:15] B;               // 16-bit register, bit 0 is the MSB
    B = {A[7:0], A[7:0]};       // assignment of bits by concatenation; a part-select
                                // must follow the declared index direction of A
    reg [31:0] Mem [0:1023];    // 1K-word memory
    integer Count;              // simple signed 32-bit integer
    integer K[1:64];            // an array of 64 integers
    time Start, Stop;           // two 64-bit time variables
From: CSCI 320 Computer Architecture Handbook on Verilog HDL, by Dr. Daniel C. Hyde: http://eesun.free.fr/DOC/VERILOG/verilog-manual.html

Signal Multiplexers
Two-input multiplexer (using if / else):
    reg y;
    always @*
       if (select)
          y = a;
       else
          y = b;
Two-input multiplexer (using the ternary operator ?:):
    wire t = (select ? a : b);
From: http://eesun.free.fr/DOC/VERILOG/synvlg.html

Larger Multiplexers
Three-input multiplexer:
    reg s;
    always @*
       begin
       case (select2)
          2'b00: s = a;
          2'b01: s = b;
          default: s = c;
       endcase
       end

Synchronous Storage Elements
- Values change at times governed by the clock
- Clock: input to the circuit
- Clock event: a clock transition, for example the rising edge
- Flip/flop: transfers the value from Din to Dout on the clock event
[Figure: D flip/flop with inputs Din and Clock and output Dout, with a timing diagram over t=0, 1, 2 in which Din takes the values A, B, C and Dout starts in state S0 and then takes the values A, B]

Finite State Machines

Synthesizable Verilog: Delay Flip/Flops
D-type flip-flop:
    reg q;
    always @ (posedge clk)
      q <= d;
D-type flip-flop with data enable:
    reg q;
    always @ (posedge clk)
      if (enable)
        q <= d;
From: http://eesun.free.fr/DOC/VERILOG/synvlg.html
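For readers more used to software, the behaviour of the enabled flip-flop can be mimicked with a small Python model. This is only a behavioural analogy with invented names, not something the NetFPGA tools use: each call to `posedge` stands for one rising clock edge.

```python
class DFlipFlop:
    """Software model of: always @(posedge clk) if (enable) q <= d;"""

    def __init__(self, q=0):
        self.q = q                      # stored state, visible on the output

    def posedge(self, d, enable=True):
        """Apply one rising clock edge and return the new output."""
        if enable:
            self.q = d                  # capture the input
        return self.q                   # otherwise hold the old value

ff = DFlipFlop()
# (d, enable) pairs presented on four successive clock edges.
trace = [ff.posedge(d, en) for d, en in [(1, True), (0, False), (0, True), (1, False)]]
```

The output only changes on edges where enable is high; between edges, and on disabled edges, the stored value is held.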

Exercise 1: Reference Router Pipeline
- Five stages
  - Input
  - Input arbitration
  - Routing decision and packet modification
  - Output queuing
  - Output
- Packet-based module interface
- Pluggable design
[Figure: pipeline of MAC and CPU RxQs, Input Arbiter, Output Port Lookup, Output Queues, and MAC and CPU TxQs]

Exercise 1: Make your own router
- Objectives
  - Learn how to build hardware
  - Run the software
  - Explore router architecture
- Execution
  - Start synthesis
  - Rerun the GUI with the new hardware
  - Test connectivity and statistics with pings
  - Explore the pipeline in the details page
  - Explore detailed statistics in the details page

Exercise 1, Step 1: Build the hardware
- Start a terminal and cd to “NF2/projects/tutorial_router/synth”
- Start synthesis with “make”

Exercise 1, Step 2: Run Homemade Router
- cd to “NF2/projects/tutorial_router/sw”
- Type “tutorial_router_gui.pl” to use the just-built router hardware
- The same interface should start again

Exercise 1, Step 4: Connectivity and Statistics
- Ping any address 192.168.x.y, where x is from 1 to 20 and y is 1 or 2
- Open the statistics tab in the Quickstart window to see some statistics
- Explore more statistics in modules under the details tab

Exercise 1, Step 5: Explore Router Architecture
- Click the Details tab of the Quickstart window
- This is the reference router pipeline: a canonical, simple-to-understand, modular router pipeline

Exercise 1, Step 6: Explore Output Queues
- Click on the Output Queues module in the Details tab
- The page gives configuration details and statistics

Buffer Requirements in a Router
- Buffer size matters
  - Small queues reduce delay
  - Large buffers are expensive
- Theoretical tools predict requirements
  - Queuing theory
  - Large deviation theory
  - Mean field theory
- Yet, there is no direct answer
  - Flows have a closed-loop nature
  - It is unclear whether the focus should be on the equilibrium state or the transient state
Presenter notes: Having said that, one might think the buffer sizing problem must be very well understood. After all, we are equipped with tools like queueing theory, large deviations theory, and mean field theory, which are focused on solving exactly this type of problem. You would think this is simply a matter of understanding the random process that describes the queue occupancy over time. Unfortunately, this is not the case. Because the flows are closed-loop and react to the state of the system, control-theoretic tools are needed, but those tools emphasize the equilibrium state of the system and fail to describe transient delays.

Rule-of-thumb
- Universally applied rule-of-thumb: a router needs a buffer of size B = 2T × C
  - 2T is the two-way propagation delay (or just 250 ms)
  - C is the capacity of the bottleneck link
- Context
  - Mandated in backbone and edge routers
  - Appears in RFPs and IETF architectural guidelines
  - Already known by inventors of TCP [Van Jacobson, 1988]
  - Has major consequences for router design
[Figure: Source, Router, Destination; the bottleneck link has capacity C and the two-way propagation delay is 2T]
Presenter notes: So if the problem is not easy, what do people do in practice? Buffer sizes in today’s Internet routers are set based on a rule-of-thumb which says that if we want the core routers to have 100% utilization, the buffer size should be greater than or equal to 2T × C. Here 2T is the two-way propagation delay of packets going through the router, and C is the capacity of the target link. This rule is mandated in backbone and edge routers, and appears in RFPs and IETF architectural guidelines. It has been known almost since the time TCP was invented. Note that if the capacity of the network is increased, this rule requires the buffer size to increase linearly with capacity. We don’t expect the propagation delay to change much over time, but we expect the capacity to grow very rapidly. Therefore, this rule can have major consequences for router design, and that is exactly why today’s routers have so much buffering, as I showed you a few moments ago.

The Story So Far
    Rule   Buffer size         Assumption                                                # packets at 10Gb/s
    1      2T × C              100% utilization                                          1,000,000
    2      2T × C / sqrt(N)    Large number of desynchronized flows; 100% utilization    10,000
    3      O(log W)            Large number of desynchronized flows; <100% utilization   20
Presenter notes: After this relatively long introduction, let me give an overview of the rest of my presentation. I'll talk about three different rules for sizing router buffers. The first rule is the rule-of-thumb which I just described. As I mentioned, this rule is based on the assumption that we want to have 100% link utilization at the core links. The second rule is a more recent result proposed by Appenzeller, Keslassy, and McKeown, which challenges the original rule-of-thumb. Based on this rule, if we have N flows going through the router, we can reduce the buffer size by a factor of sqrt(N). The underlying assumption is that we have a large number of flows, and that the flows are desynchronized. Finally, the third rule, which I’ll talk about today, says that if we are willing to sacrifice a very small amount of throughput, i.e. if a throughput of less than 100% is acceptable, we might be able to reduce the buffer sizes significantly, to just O(log(W)) packets. Here W is the maximum congestion window size. If we apply each of these rules to a 10Gb/s link, we will need to buffer 1,000,000 packets based on the first rule, about 10,000 packets based on the second, and only 20 packets based on the third. For the rest of this presentation I’ll show you the intuition behind each of these rules, and will provide some evidence that validates each rule. Let’s start with the rule-of-thumb.
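The first two rules are simple enough to evaluate directly. The sketch below uses 2T = 250 ms and C = 10 Gb/s from the slides; the flow count N = 10,000 is an assumption chosen here because it reproduces the factor of 100 between the first two packet counts quoted above. The third rule is given only as O(log W), with no constant, so it is not computed.

```python
import math

def rule_of_thumb_bits(two_t_seconds, capacity_bps):
    """Rule 1: B = 2T x C."""
    return two_t_seconds * capacity_bps

def small_buffer_bits(two_t_seconds, capacity_bps, n_flows):
    """Rule 2: B = 2T x C / sqrt(N), for many desynchronized flows."""
    return two_t_seconds * capacity_bps / math.sqrt(n_flows)

TWO_T = 0.250          # seconds, from the slides
C = 10e9               # bits per second, from the slides
N_FLOWS = 10_000       # assumed for illustration, not stated in the slides

b1 = rule_of_thumb_bits(TWO_T, C)
b2 = small_buffer_bits(TWO_T, C, N_FLOWS)
print(f"Rule 1: {b1 / 8e6:.1f} MB")
print(f"Rule 2: {b2 / 8e6:.3f} MB ({b1 / b2:.0f}x smaller)")
```

Converting bits to packets requires an average packet size, which the slides do not state, so the output is left in bytes.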

Using NetFPGA to explore buffer size
- Need to reduce buffer size and measure occupancy
- Alas, not possible in commercial routers
- So, we will use NetFPGA instead
- Objective: use NetFPGA to understand how large a buffer we need for a single TCP flow

Why 2TxC for a single TCP Flow?
- Only W packets may be outstanding
- Rule for adjusting W
  - If an ACK is received: W ← W + 1/W
  - If a packet is lost: W ← W/2
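The two window rules can be written down directly. This is a minimal sketch of the additive-increase, multiplicative-decrease rule exactly as stated on the slide, with invented function names; it leaves out slow start, timeouts, and everything else a real TCP stack does.

```python
def on_ack(w):
    """ACK received: W <- W + 1/W."""
    return w + 1.0 / w

def on_loss(w):
    """Packet lost: W <- W/2."""
    return w / 2.0

def one_round_trip(w):
    """Roughly W ACKs arrive per round trip, so W grows by about one packet."""
    for _ in range(int(w)):
        w = on_ack(w)
    return w

w_after_rtt = one_round_trip(10.0)
w_after_loss = on_loss(280.0)
```

Growing by about one packet per round trip and halving on loss is what produces the sawtooth in the congestion window.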

Time Evolution of a Single TCP Flow
[Figure, two panels: time evolution of a single TCP flow through a router when the buffer is 2T × C, and when the buffer is < 2T × C. Each panel plots the congestion window (top) and the queue occupancy (bottom).]
Presenter notes: Here is a simulation of a single TCP flow with a buffer size equal to the bandwidth-delay product. The congestion window follows a sawtooth shape and varies between 140 and 280. The bottom graph shows the queue occupancy. When the congestion window is halved, the buffer occupancy becomes zero, and the two curves change similarly. Since the pipe is full at all times, the link utilization remains at 100%. On the other hand, when the buffer size is less than the bandwidth-delay product, the queue occupancy goes to zero when the congestion window is halved, and remains at zero for a while until the congestion window has grown enough to fill the pipe again. During this time, we see a reduction in link utilization.
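The argument in the notes can be condensed into a small model. It assumes an idealized single flow: a loss occurs when the window exceeds what the pipe (2T × C) plus the buffer can hold, and the window is then halved. The function names are illustrative, and reading the 140 to 280 sawtooth as a bandwidth-delay product of 140 packets is an inference from the notes, not something the slides state.

```python
def window_range(bdp_packets, buffer_packets):
    """Return (W_min, W_max) of the sawtooth in this idealized model."""
    w_max = bdp_packets + buffer_packets     # pipe and buffer both full
    return w_max / 2, w_max                  # window is halved on loss

def link_stays_busy(bdp_packets, buffer_packets):
    """The link is never idle if the halved window still fills the pipe."""
    w_min, _ = window_range(bdp_packets, buffer_packets)
    return w_min >= bdp_packets

full = window_range(140, 140)     # buffer equal to 2T x C
small = window_range(140, 70)     # buffer smaller than 2T x C
```

Solving (2T × C + B) / 2 ≥ 2T × C for B gives B ≥ 2T × C, which is the rule-of-thumb for a single flow.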

NetFPGA Hardware Set for Demo #2
- Server delivers streaming HD video to adjacent client
[Figure: adjacent hosts, each with CPU x2, a NIC on PCI-e, and NetFPGA Internet router hardware on PCI, linked by Gigabit Ethernet; one host runs the video server and its neighbor the video client]

Setup for Demo 2
- Each NetFPGA card has four ports
- Port 2 connected to local host
- Port 3 connected to adjacent server
[Figure: NetFPGA card connected to the local host and the adjacent server]

Topology for Second Demonstration
- Routers connected in a point-to-point topology
- Port 3 connects to local host
- Port 1 connects to adjacent neighbor
- Ports 0 and 2 unused
[Figure: routers connected point-to-point, with interfaces labelled by subnet and host suffix (.1.1 through .29.2)]

Demo 2: Enhanced Router
- Objectives
  - Observe router with new modules
  - New modules: rate limiting, delay, event capture
- Execution
  - Run event capture router
  - Look at routing tables
  - Explore details pane
  - Start TCP transfer, look at queue occupancy
  - Change rate/delay, look at queue occupancy

Demo 2, Step 1: Run Pre-made Enhanced Router
- Start a terminal and cd to “NF2/projects/tutorial_router/sw/”
- Type “./tut_adv_router_gui.pl”
- A familiar GUI should start

Demo 2, Step 3: Explore Enhanced Router
- Click on the Details tab
- A pipeline similar to the one seen previously is shown, with some additions

Demo 2: Enhanced Router Pipeline
- Two modules added
  - Event Capture, to capture output queue events (writes, reads, drops)
  - Rate Limiter, to create a bottleneck
[Figure: pipeline of MAC and CPU RxQs, Input Arbiter, Output Port Lookup, Output Queues, and MAC and CPU TxQs, with the Event Capture and Rate Limiter modules added]

Demo 2, Step 4: Decrease the Link Rate
- To create a bottleneck and show the TCP “sawtooth”, the link rate is decreased
- In the Details tab, click the “Rate Limit” module
- Check Enable
- Set the link rate to 1.953 Mbps

Demo 2, Step 5: Decrease Queue Size
- Go back to the Details panel and click on “Output Queues”
- Select the “Output Queue 2” tab
- Move the slider for the output queue size in packets to 16
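To get a feel for what these two settings mean together, the sketch below estimates how long a full 16-packet queue takes to drain at 1.953 Mbps. The 1500-byte packet size is an assumption made for this example (the slides give the queue size in packets only), and the function name is illustrative.

```python
LINK_RATE_BPS = 1.953e6     # rate limit set in Step 4
QUEUE_PACKETS = 16          # queue size set in Step 5
PACKET_BYTES = 1500         # assumed full-size Ethernet payload, not from the slides

def drain_time_s(packets, packet_bytes, rate_bps):
    """Time to transmit a queue of equal-size packets at the given rate."""
    return packets * packet_bytes * 8 / rate_bps

full_queue_delay = drain_time_s(QUEUE_PACKETS, PACKET_BYTES, LINK_RATE_BPS)
print(f"Full queue drains in about {full_queue_delay * 1000:.1f} ms")
```

Under that assumption, a full queue adds on the order of 100 ms of queuing delay, which is large enough for the sawtooth to be easy to see in the event capture plots.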

Demo 2, Step 6: Start Event Capture
- Click on the Event Capture module under the Details tab
- This should start the configuration page

Demo 2, Step 7: Configure Event Capture
- Check “Send to local host” to receive events on the local host
- Check “Monitor Queue 2” to monitor the output queue of MAC port 1
- Check “Enable Capture” to start event capture

Demo 2, Step 8: Start TCP Transfer
- We will use iperf to run a large TCP transfer and look at queue evolution
- Start a terminal and cd to “NF2/projects/tutorial_router/sw”
- Type “iperf.sh”

Demo 2, Step 9: Look at Event Capture Results
- Click on the Event Capture module under the Details tab
- The sawtooth pattern should now be visible

Queue Occupancy Charts

NetFPGA in the Classroom
- Stanford CS344: “Build an Internet Router”
  - Courseware will be available later in 2007
  - Students work in teams of three (2 software, 1 hardware)
  - Design and implement hardware and software in 8 weeks
    - Software: CLI, PW-OSPF
  - Show interoperability with other groups
  - Add new features in remaining two weeks
    - Firewall, NAT, DRR, packet capture, data generator, …

Networked FPGAs in Research
- RCP: congestion control
  - New module for parsing and overwriting new packet
  - New software to calculate explicit rates
- Packet monitoring (ICSI)
  - Network Shunt
- Deep packet inspection (FPX)
  - TCP/IP flow reconstruction
  - Regular expression matching
  - Bloom filters
- Ethane: network security
  - New switch (“managed flow-table”) deployed
- Buffer sizing
  - Reduce buffer size and measure effect on network performance
  - Need a way to set buffer size and measure buffer occupancy

Exercise 2: Enhance Your Router
- Objectives
  - Add new modules to datapath
  - Synthesize and test router
- Execution
  - Open user_data_path.v, uncomment delay/rate/event capture modules
  - Synthesize
  - After synthesis, test the new system

An aside: xemacs Tips
- We will be modifying the Verilog source code
- Slides show xemacs, but vim is also available
- xemacs:
  - To undo, use ctrl+shift+'-'
  - To cancel a multi-keystroke command, just type ctrl+g
  - To select lines, hold shift and press the arrow keys
  - To comment some selected lines, type ctrl+c+c
  - To uncomment a commented block, move the cursor to one of the lines inside the commented block and type ctrl+c+u
  - To save, type ctrl+x+s
  - To search, type ctrl+s search_pattern

Step 1 - Open the Source (Exercise 2). We will modify the Verilog source code to add event capture, rate limiter, and delay modules. We will simply comment and uncomment existing code. Open a terminal and type “xemacs NF2/projects/tutorial_router/src/user_data_path.v”.

Step 2 - Add wires (Exercise 2). Now we need to add wires to connect the new modules. Search for “new wires” (ctrl+s new wires), then press Enter. Uncomment the wires (ctrl+c+u).

Step 3 - Connect Event Capture (Exercise 2). Search for opl_output (ctrl+s opl_output), then press Enter. Comment the four lines above (up, shift + up + up + up + up, ctrl+c+c). Uncomment the block below to connect the outputs (ctrl+s opl_out, ctrl+c+u).

Step 4 - Add the Event Capture Module (Exercise 2). Search for evt_capture_top (ctrl+s evt_capture_top), then press Enter. Uncomment the block (ctrl+c+u).

Step 5 - Connect the Output Queue to the Rate Limiter (Exercise 2). Search for port_outputs (ctrl+s ports_outputs, Enter). Comment the 4 lines above (select the four lines by using shift+arrow keys, then type ctrl+c+c). Uncomment the commented block by scrolling down into the block and typing ctrl+c+u.

Step 6 - Add Rate Limiter (Exercise 2). Scroll down until you reach the next “Excluded” block. Uncomment the block containing the rate limiter instantiations (scroll into the block and type ctrl+c+u). Save (ctrl+x+s).

Step 7 - Build the hardware (Exercise 2). Start a terminal and cd to “NF2/projects/tutorial_router/synth”. Start synthesis with “make”.

Full System Components [Diagram: software (PW-OSPF, Java GUI) on top of the driver, which exposes interfaces nf2c0, nf2c1, nf2c2, nf2c3 and ioctl; the PCI Bus connects the host to the NetFPGA; on the NetFPGA, DMA and Registers (nf2_reg_grp), four CPU RxQ/TxQ pairs and four MAC RxQ/TxQ pairs surround the user data path; the MAC queues connect to Ethernet.]

Life of a Packet through the hardware [Diagram: an IP packet travels between 192.168.1.x and 192.168.2.y through router ports port0 and port2.]

Router Stages Again [Diagram: MAC RxQ and CPU RxQ, Input Arbiter, Output Port Lookup, Output Queues, MAC TxQ and CPU TxQ.]

Inter-module Communication Using “Module Headers”. Each word on the datapath is a Ctrl Word (8 bits) plus a Data Word (64 bits). Module headers contain information such as packet length, input port, output port, … Word sequence (ctrl: data): x: Module Hdr; …: …; y: Last Module Hdr; then Eth Hdr, IP Hdr, …; 0x10: Last word of packet.
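
The word format above can be modelled in software to see how a packet is framed. The following is a minimal sketch, not the NetFPGA implementation: the field layout inside the header word, the zero ctrl value on ordinary data words, and the one-hot "last valid byte" encoding of the final ctrl word are all assumptions made for illustration. Only the 8-bit/64-bit split and the 0xff ctrl value come from the slides.

```python
from typing import List, Tuple

Word = Tuple[int, int]  # (ctrl: 8 bits, data: 64 bits)

IO_QUEUE_CTRL = 0xFF  # ctrl value shown on the slides for the length/input-port header


def frame_packet(payload: bytes, input_port: int) -> List[Word]:
    """Turn raw packet bytes into (ctrl, data) words preceded by one module header."""
    # Assumed layout: low 16 bits = length in bytes, next 16 bits = input port.
    header = (input_port << 16) | len(payload)
    words: List[Word] = [(IO_QUEUE_CTRL, header)]

    # Pad the packet to a whole number of 8-byte data words.
    padded = payload + bytes(-len(payload) % 8)
    n_words = len(padded) // 8
    for i in range(n_words):
        data = int.from_bytes(padded[8 * i: 8 * i + 8], "big")
        ctrl = 0  # assumed: ordinary data words carry ctrl = 0
        if i == n_words - 1:
            # Assumed: one-hot marker of the last valid byte in the final word.
            valid_bytes = len(payload) - 8 * i  # between 1 and 8
            ctrl = 0x80 >> (valid_bytes - 1)
        words.append((ctrl, data))
    return words


if __name__ == "__main__":
    for ctrl, data in frame_packet(bytes(range(12)), input_port=0):
        print(f"ctrl=0x{ctrl:02x} data=0x{data:016x}")
```

Under these assumptions a 12-byte packet ends with four valid bytes in its last word, which yields the same 0x10 final ctrl value that appears on the slide.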

Inter-module Communication [Diagram: Module i and Module i+1 connected by the signals data, ctrl, wr and rdy.]

MAC Rx Queue. Incoming packet: Eth Hdr: Dst MAC = port 0, Ethertype = IP; IP Hdr: IP Dst: 192.168.2.3, TTL: 64, Csum: 0x3ab4; Data.

Rx Queue. The queue prepends a module header (ctrl 0xff): Pkt length, input port = 0. Then: Eth Hdr: Dst MAC = port 0, Ethertype = IP; IP Hdr: IP Dst: 192.168.2.3, TTL: 64, Csum: 0x3ab4; Data.

Input Arbiter [Diagram: packets waiting in Rx Q 0, Rx Q 1, …, Rx Q 7 feed the Input Arbiter, which passes one packet at a time to the next stage.]
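
A small model may help picture what the arbiter does with its eight receive queues. The sketch below assumes a round-robin service order, which the slide does not state; the class and method names are hypothetical and exist only for this illustration.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class InputArbiter:
    """Toy round-robin arbiter over a fixed set of receive queues."""

    def __init__(self, num_queues: int = 8) -> None:
        self.queues: List[Deque[str]] = [deque() for _ in range(num_queues)]
        self.next_queue = 0  # queue that gets the first chance on the next pick

    def enqueue(self, queue_id: int, pkt: str) -> None:
        self.queues[queue_id].append(pkt)

    def next_packet(self) -> Optional[Tuple[int, str]]:
        """Return (queue_id, packet) from the next non-empty queue, or None."""
        n = len(self.queues)
        for offset in range(n):
            q = (self.next_queue + offset) % n
            if self.queues[q]:
                # Resume after the queue just served so that no queue starves.
                self.next_queue = (q + 1) % n
                return q, self.queues[q].popleft()
        return None


if __name__ == "__main__":
    arb = InputArbiter()
    arb.enqueue(0, "a0")
    arb.enqueue(0, "a1")
    arb.enqueue(1, "b0")
    arb.enqueue(7, "h0")
    while (item := arb.next_packet()) is not None:
        print(item)
```

A whole packet is served per pick here; the second packet waiting in queue 0 is only served after queues 1 and 7 have had their turn.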

Output Port Lookup. Packet entering the stage: module header (ctrl 0xff): Pkt length, input port = 0; Eth Hdr: Dst MAC = 0, Src MAC = x, Ethertype = IP; IP Hdr: IP Dst: 192.168.2.3, TTL: 64, Csum: 0x3ab4; Data.

Output Port Lookup. Steps: 1- Check input port matches Dst MAC; 2- Check TTL, checksum; 3- Lookup next hop IP & output port (LPM); 4- Lookup next hop MAC address (ARP); 5- Add output port module header; 6- Modify MAC Dst and Src addresses; 7- Decrement TTL and update checksum. Before: module header (ctrl 0xff): Pkt length, input port = 0; Eth Hdr: Dst MAC = 0, Src MAC = x, Ethertype = IP; IP Hdr: IP Dst: 192.168.2.3, TTL: 64, Csum: 0x3ab4; Data. After: module headers (ctrl 0x04): output port = 4 and (ctrl 0xff): Pkt length, input port = 0; Eth Hdr: Dst MAC = nextHop, Src MAC = port 4, Ethertype = IP; IP Hdr: IP Dst: 192.168.2.3, TTL: 63, Csum: 0x3ac2; Data.
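
The lookup steps listed on this slide can be sketched in software as a longest-prefix match followed by an ARP table lookup. This is an illustrative model only: the table contents and the MAC address are made up, the `Route` and `Decision` types are hypothetical, and two behaviours are assumptions rather than statements from the slides: that a route or ARP miss is handed to the CPU, and that the CPU queue paired with a MAC input queue is the next queue index (suggested by the slides' example of input port 0 and CPU output port 1).

```python
import ipaddress
from typing import Dict, List, NamedTuple, Optional


class Route(NamedTuple):
    prefix: ipaddress.IPv4Network
    next_hop: ipaddress.IPv4Address
    out_queue: int


class Decision(NamedTuple):
    out_queue: int
    dst_mac: Optional[str]  # None when the packet goes to the CPU
    ttl: int


def cpu_queue(input_queue: int) -> int:
    # Assumed pairing: MAC queue q is served by CPU queue q + 1.
    return input_queue + 1


def lpm_lookup(routes: List[Route], dst: ipaddress.IPv4Address) -> Optional[Route]:
    """Longest-prefix match: the most specific prefix containing dst wins."""
    matches = [r for r in routes if dst in r.prefix]
    return max(matches, key=lambda r: r.prefix.prefixlen, default=None)


def output_port_lookup(routes: List[Route],
                       arp: Dict[ipaddress.IPv4Address, str],
                       dst_ip: str, ttl: int, input_queue: int) -> Decision:
    dst = ipaddress.IPv4Address(dst_ip)
    if ttl <= 1:
        # Exception path: software generates the ICMP response.
        return Decision(cpu_queue(input_queue), None, ttl)
    route = lpm_lookup(routes, dst)
    mac = arp.get(route.next_hop) if route is not None else None
    if mac is None:
        # Assumed: route or ARP misses are also left to software.
        return Decision(cpu_queue(input_queue), None, ttl)
    return Decision(route.out_queue, mac, ttl - 1)


if __name__ == "__main__":
    routes = [
        Route(ipaddress.ip_network("192.168.2.0/24"),
              ipaddress.ip_address("192.168.2.3"), 4),
        Route(ipaddress.ip_network("192.168.0.0/16"),
              ipaddress.ip_address("192.168.1.1"), 2),
    ]
    arp = {ipaddress.ip_address("192.168.2.3"): "00:00:00:00:00:02"}
    print(output_port_lookup(routes, arp, "192.168.2.3", 64, 0))
    print(output_port_lookup(routes, arp, "192.168.2.3", 1, 0))
```

A linear scan over the routes is enough for a sketch; hardware would typically use a TCAM or a trie for the same match.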

Output Queues [Diagram: output queues OQ0 … OQ7; the packet is stored in OQ4.]

MAC Tx Queue. Packet arriving from the Output Queues: module headers (ctrl 0x04): output port = 4 and (ctrl 0xff): Pkt length, input port = 0; Eth Hdr: Dst MAC = nextHop, Src MAC = port 4, Ethertype = IP; IP Hdr: IP Dst: 192.168.2.3, TTL: 63 (was 64), Csum: 0x3ac2 (was 0x3ab4); Data.
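
The packet leaving the router carries a decremented TTL and an updated header checksum. One common way to update the checksum without re-summing the whole header is the incremental formula from RFC 1624; whether the reference router uses exactly this method is not stated on the slides. The sketch below builds its own made-up header (the identification, length and source address are arbitrary) rather than reproducing the slide's checksum values.

```python
import struct


def fold(total: int) -> int:
    """Fold carries back in to obtain a 16-bit one's-complement sum."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ipv4_checksum(header: bytes) -> int:
    """Checksum of an IPv4 header, computed with the checksum field zeroed."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    words = struct.unpack("!%dH" % (len(zeroed) // 2), zeroed)
    return ~fold(sum(words)) & 0xFFFF


def sample_header(ttl: int) -> bytes:
    """Made-up 20-byte header: 192.168.1.2 -> 192.168.2.3, protocol 17."""
    hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 84, 0x1C46, 0x4000, ttl, 17, 0,
                      bytes([192, 168, 1, 2]), bytes([192, 168, 2, 3]))
    return hdr[:10] + ipv4_checksum(hdr).to_bytes(2, "big") + hdr[12:]


def decrement_ttl(header: bytes) -> bytes:
    """Decrement TTL and patch the checksum: HC' = ~(~HC + ~m + m')."""
    hdr = bytearray(header)
    if hdr[8] == 0:
        raise ValueError("TTL is already zero")
    old_word = (hdr[8] << 8) | hdr[9]      # m: the TTL/protocol word before
    hdr[8] -= 1
    new_word = (hdr[8] << 8) | hdr[9]      # m': the same word after
    old_csum = (hdr[10] << 8) | hdr[11]    # HC
    total = (~old_csum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    hdr[10:12] = (~fold(total) & 0xFFFF).to_bytes(2, "big")
    return bytes(hdr)


if __name__ == "__main__":
    before = sample_header(64)
    after = decrement_ttl(before)
    print("checksum before: 0x%04x" % int.from_bytes(before[10:12], "big"))
    print("checksum after:  0x%04x" % int.from_bytes(after[10:12], "big"))
    print("matches full recomputation:",
          int.from_bytes(after[10:12], "big") == ipv4_checksum(after))
```

Because the TTL sits in the high byte of its 16-bit word, decrementing it by one raises the checksum by 0x0100 (with end-around carry), which is why the update is cheap enough to do at line rate.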

Exception Packet. Example: TTL = 0 or TTL = 1. The packet has to be sent to the CPU, which will generate an ICMP packet as a response. The difference from the normal path starts at the Output Port Lookup stage.

Exception Packet Path [Diagram: Ethernet, MAC, user data path, CPU TxQ, DMA across the PCI Bus, driver (nf2c0, nf2c1, nf2c2, nf2c3, ioctl), software (PW-OSPF, Java GUI); the NetFPGA side also shows CPU RxQ, Registers and nf2_reg_grp.]

Output Port Lookup (exception case). Steps: 1- Check input port matches Dst MAC; 2- Check TTL, checksum: EXCEPTION!; 3- Add output port module header. Resulting packet: module headers (ctrl 0x04): output port = 1 and (ctrl 0xff): Pkt length, input port = 0; Eth Hdr: Dst MAC = 0, Src MAC = x, Ethertype = IP; IP Hdr: IP Dst: 192.168.2.3, TTL: 1, Csum: 0x3ab4; Data.

Output Queues [Diagram: output queues OQ0, OQ1, OQ2, …, OQ7; the packet is stored in OQ1.]

CPU Tx Queue. Packet arriving from the Output Queues: module headers (ctrl 0x04): output port = 1 and (ctrl 0xff): Pkt length, input port = 0; Eth Hdr: Dst MAC = 0, Src MAC = x, Ethertype = IP; IP Hdr: IP Dst: 192.168.2.3, TTL: 1, Csum: 0x3ab4; Data.

ICMP Packet. The ICMP packet arrives at the CPU Rx Queue from the PCI Bus. It follows the same path as a packet from the MAC until the Output Port Lookup. The OPL module, seeing that the packet is from CPU Rx Queue 1, sets the output port directly to 0. The packet then continues on the same path as the non-exception packet to the Output Queues and then to MAC Tx Queue 0.
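
For reference, the ICMP response that software builds for an expired TTL is a Time Exceeded message (RFC 792: type 11, code 0, followed by the offending packet's IP header and its first 8 payload bytes). The sketch below only constructs the ICMP message itself; wrapping it in new IP and Ethernet headers is omitted, and the function names are hypothetical rather than taken from the tutorial's router software.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by IP and ICMP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def icmp_time_exceeded(original_packet: bytes) -> bytes:
    """Build an ICMP Time Exceeded message quoting the original IP packet."""
    ihl = (original_packet[0] & 0x0F) * 4       # IP header length in bytes
    quoted = original_packet[: ihl + 8]         # IP header + first 64 bits of data
    # Type 11, code 0 (TTL exceeded in transit), zero checksum, 4 unused bytes.
    unchecked = struct.pack("!BBHI", 11, 0, 0, 0) + quoted
    csum = internet_checksum(unchecked)
    return struct.pack("!BBHI", 11, 0, csum, 0) + quoted


if __name__ == "__main__":
    # Minimal stand-in for the expired packet: 20-byte header plus 16 data bytes.
    expired = bytes([0x45]) + bytes(19) + bytes(range(16))
    msg = icmp_time_exceeded(expired)
    print(len(msg), msg[:8].hex())
```

A receiver can validate the message by running the same checksum over the whole ICMP message, which gives zero when nothing was corrupted.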

ICMP Packet Path [Diagram: software (PW-OSPF, Java GUI), driver (nf2c0, nf2c1, nf2c2, nf2c3, ioctl), DMA across the PCI Bus, CPU RxQ, user data path, MAC, Ethernet; the NetFPGA side also shows CPU TxQ, Registers and nf2_reg_grp.]

NetFPGA-Host Interaction. The Linux driver interfaces with the hardware. Packet interface: via the standard Linux network stack. Register reads/writes: via the ioctl system call (with convenience wrapper functions): readReg(nf2device *dev, int address, unsigned *rd_data); writeReg(nf2device *dev, int address, unsigned *wr_data). E.g.: readReg(&nf2, OQ_NUM_PKTS_STORED_0, &val);

NetFPGA-Host Interaction: NetFPGA to host packet transfer. 1. Packet arrives; the forwarding table sends it to the CPU queue. 2. Interrupt notifies the driver of packet arrival. 3. Driver sets up and initiates the DMA transfer. [Diagram: host and NetFPGA connected by the PCI Bus.]

NetFPGA-Host Interaction: NetFPGA to host packet transfer (cont). 4. NetFPGA transfers the packet via DMA. 5. Interrupt signals completion of DMA. 6. Driver passes the packet to the network stack. [Diagram: host and NetFPGA connected by the PCI Bus.]

NetFPGA-Host Interaction: Host to NetFPGA packet transfers. 1. Software sends the packet via network sockets; the packet is delivered to the driver. 2. Driver sets up and initiates the DMA transfer. 3. Interrupt signals completion of DMA. [Diagram: host and NetFPGA connected by the PCI Bus.]

NetFPGA-Host Interaction: Register access. 1. Software makes an ioctl call on a network socket; the ioctl is passed to the driver. 2. Driver performs a PCI memory read/write. [Diagram: host and NetFPGA connected by the PCI Bus.]

NetFPGA-Host Interaction. The packet transfers shown use the DMA interface. Alternative: use programmed IO to transfer packets via register reads/writes; this is slower but eliminates the need to deal with network sockets.

Step 8 - Perfect the Router (Exercise 2). If interested, go back to “Demo 2: Step 1” after synthesis is done and redo the steps with your own router. You can also change the bandwidth and queue size settings to see how that affects the evolution of the queue occupancy. To run your router: 1- cd NF2/projects/tutorial_router/sw 2- type “./tut_adv_router_gui.pl --use_bin ../../../bitfiles/tutorial_router.bit”
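
Before changing the settings on the hardware, it can help to build intuition with a toy model of how the output bandwidth and the queue size shape the occupancy trace. The following sketch is a discrete-time simulation with made-up units (packets per tick); it is not derived from the tutorial router's rate limiter, and the function name and parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def simulate_queue(arrivals: Sequence[int], drain_per_tick: int,
                   queue_limit: int) -> Tuple[List[int], int]:
    """Return (occupancy after each tick, total packets dropped)."""
    occupancy = 0
    dropped = 0
    trace: List[int] = []
    for arriving in arrivals:
        # Tail drop: accept only what fits in the remaining buffer space.
        accepted = min(arriving, queue_limit - occupancy)
        dropped += arriving - accepted
        occupancy += accepted
        # The rate-limited output removes at most drain_per_tick packets.
        occupancy -= min(occupancy, drain_per_tick)
        trace.append(occupancy)
    return trace, dropped


if __name__ == "__main__":
    burst = [3, 3, 3, 0, 0]
    for limit in (4, 8):
        trace, dropped = simulate_queue(burst, drain_per_tick=2, queue_limit=limit)
        print(f"limit={limit} trace={trace} dropped={dropped}")
```

Lowering `drain_per_tick` or `queue_limit` in the model plays the same role as lowering the bandwidth or queue size setting in the demo: the queue fills sooner and drops begin earlier.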

We’re done! Congratulations!

Acknowledgements NetFPGA Team : January 2007 Jianying Luo, Glen Gibb, Nick McKeown, Greg Watson, Jim Weaver, Jad Naous, Ramanan Raghuraman, Paul Hartke, John Lockwood

Acknowledgements. Support for the NetFPGA project has been provided by the following companies and institutions. [Sponsor logos not reproduced.] Disclaimer: Any opinions, findings, conclusions, or recommendations expressed in this material do not necessarily reflect the views of the National Science Foundation or of any other sponsors supporting this project.

Reference on the Web NetFPGA homepage http://NetFPGA.org