NetFPGA Project: 4-Port Layer 2/3 Switch
Ankur Singla, Gene Juknevicius


Agenda
- NetFPGA Development Board
- Project Introduction
- Design Analysis
  - Bandwidth Analysis
  - Top Level Architecture
  - Data Path Design Overview
  - Control Path Design Overview
  - Verification and Synthesis Update
- Conclusion

NetFPGA Development Board

Project Introduction
- 4-port Layer-2/3 output-queued switch design
- Ethernet (Layer-2), IPv4, ICMP, and ARP
- Programmable routing tables: Longest Prefix Match, Exact Match
- Register support for switch forwarding on/off, statistics, queue status, etc.
- Layer-2 broadcast and limited Layer-3 multicast support
- Limited support for access control
- Highly modular design for future expandability

Bandwidth Analysis
- Available Data Bandwidth
  - Memory bandwidth: 32 bits * 25 MHz = 800 Mbits/sec
  - CFPGA to Ingress FIFO/Control Block bandwidth: 32 bits * 25 MHz / 4 = 200 Mbits/sec
  - Packet Queue to Egress bandwidth: 32 bits * 25 MHz / 4 = 200 Mbits/sec
- Packet Processing Requirements
  - 4 ports operating at 10 Mbits/sec => 40 Mbits/sec
  - Minimum size packet 64 Byte => 512 bits
  - 512 bits / 40 Mbits/sec = 12.8 us
  - Internal clock is 25 MHz
  - 12.8 us * 25 MHz = 320 clocks to process one packet
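The budget on this slide can be re-derived with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical and not part of the design. It also shows where the 78 Kpps requirement quoted later in the deck comes from (40 Mbits/sec divided by 512-bit packets).

```python
def bandwidth_mbps(bus_bits, clock_mhz, sharers=1):
    """Bus bandwidth in Mbits/sec, optionally divided among `sharers`."""
    return bus_bits * clock_mhz / sharers


def cycle_budget(ports, port_mbps, min_packet_bytes, clock_mhz):
    """Clock cycles available per minimum-size packet at full aggregate load."""
    aggregate_mbps = ports * port_mbps
    packet_bits = min_packet_bytes * 8
    # bits / (Mbits/sec) gives microseconds; microseconds * MHz gives cycles.
    return packet_bits * clock_mhz / aggregate_mbps


def required_pps(ports, port_mbps, min_packet_bytes):
    """Worst-case packet rate: aggregate bit rate over minimum packet size."""
    return ports * port_mbps * 1_000_000 // (min_packet_bytes * 8)


memory_bw = bandwidth_mbps(32, 25)             # 800 Mbits/sec
ingress_bw = bandwidth_mbps(32, 25, sharers=4) # 200 Mbits/sec
budget = cycle_budget(4, 10, 64, 25)           # 320 cycles per packet
rate = required_pps(4, 10, 64)                 # 78125 packets/sec
```

Both the 200 Mbits/sec ingress and egress paths exceed the 40 Mbits/sec aggregate line rate, which is why the 320-cycle budget is the constraint that matters for the processing engine.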

Top Level Architecture

Data Flow Diagram
- Output Queued Shared Memory Switch
- Round Robin Scheduling
- Packet Processing Engine provides L2/L3 functionality
- Coarsely pipelined architecture at the block level

Master Arbiter
- Round Robin scheduling of service to each input and output
- Interfaces the rest of the design with the Control FPGA
- Coordinates the activities of all high-level blocks
- Maintains queue status for each output
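Round-robin service of this kind can be sketched in software as follows. This is a generic behavioral model, not the project's RTL; the class name and the four-requester default are assumptions for illustration.

```python
class RoundRobinArbiter:
    """Grants one requester per call, starting the search just after the
    most recently granted requester so that service rotates fairly."""

    def __init__(self, num_requesters=4):
        self.n = num_requesters
        self.last = num_requesters - 1  # so requester 0 is checked first

    def grant(self, requests):
        """`requests` is a sequence of booleans, one per requester.
        Returns the granted index, or None if nobody is requesting."""
        for offset in range(1, self.n + 1):
            idx = (self.last + offset) % self.n
            if requests[idx]:
                self.last = idx
                return idx
        return None


arbiter = RoundRobinArbiter(4)
all_busy = [True, True, True, True]
order = [arbiter.grant(all_busy) for _ in range(5)]  # rotates 0, 1, 2, 3, 0
```

Because the search always resumes after the last grant, a port that requests continuously cannot starve the others.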

Ingress FIFO Control Block
- Interfaces three blocks
  - Control FPGA
  - Forwarding Engine
  - Packet Buffer Controller
- Dual packet memories for coarse pipelining
- Responsible for packet replication for broadcast

Packet Processing Engine Overview
- Goals
  - Features: L3/L2/ICMP/ARP processing
  - Performance requirement: 78 Kpps
  - Fit within 60% of a single User FPGA (UFPGA)
  - Block modularity / scalability
  - Verification / design ease
- Actual
  - Support for all required features, plus L2 broadcast, L3 multicast, LPM, statistics and policing (coarse access control)
  - Performance achieved: 234 Kpps (worst case 69 Kpps for 1500-byte ICMP echo requests)
  - Requires only 12% of a single UFPGA's resources
  - Highly modular design for ease of design, verification, and scaling

Pkt Processing Engine Block Diagram
[Figure: block diagram. Blocks shown: Forwarding Master State Machine, First Level Parsing, Packet Memory0, Packet Memory1, ARP Processing, L3 Processing, ICMP Processing, L2 Processing, Statistics and Policing. Labeled signals: From CFPGA, Native Packet, To Packet Buffer.]

Forwarding Master State Machine
- Responsible for controlling the individual processing blocks
- Request/Grant scheme for future expandability
- Initiates a request for a packet to the Ingress FIFO, then assigns the packet to the responsible agent based on packet contents
- The MSM can be replicated to provide more throughput
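Assigning a packet to an agent based on its contents can be pictured with the following sketch. It is not the project's state machine: the function name, the order of checks, and the handling of short frames are assumptions, and it assumes untagged Ethernet II frames. The EtherType and IPv4 protocol values are the standard ones.

```python
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806
IPPROTO_ICMP = 1

ETH_HEADER_LEN = 14  # destination MAC (6) + source MAC (6) + EtherType (2)


def classify(frame, router_ip):
    """Pick the processing agent for an untagged Ethernet II frame.

    `router_ip` is the router's own IPv4 address as 4 bytes.
    Returns one of "ARP", "ICMP", "L3", "L2".
    """
    if len(frame) < ETH_HEADER_LEN:
        return "L2"  # assumption: anything unparseable falls back to L2
    ethertype = int.from_bytes(frame[12:14], "big")
    if ethertype == ETHERTYPE_ARP:
        return "ARP"
    if ethertype == ETHERTYPE_IPV4 and len(frame) >= ETH_HEADER_LEN + 20:
        protocol = frame[ETH_HEADER_LEN + 9]                     # IPv4 protocol
        dest_ip = frame[ETH_HEADER_LEN + 16:ETH_HEADER_LEN + 20]  # IPv4 dest
        if protocol == IPPROTO_ICMP and dest_ip == router_ip:
            return "ICMP"
        return "L3"
    return "L2"
```

The ICMP condition mirrors the rule stated for the L3 engine: the request goes to the ICMP agent only when the destination address is the router's own and the protocol type is ICMP.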

L3 Processing Engine
- Parsing of the L3 information: Src/Dest Addr, Protocol Type, Checksum, Length, TTL
- Longest Prefix Match Engine
  - Mask bits represent the prefix; the lookup key is the Dest Addr
  - Associated Info Table (AIT) is indexed using the entry hit
  - AIT provides Destination Port Map, Destination L2 Addr, Statistics Bucket Index
  - Request/Done scheme to allow for expandability (e.g. future m-way trie implementation project)
- ICMP Support
  - Engine Request (if Dest Addr is the router's IP address and Protocol Type is ICMP)
- Total of 85 cycles for packet processing, with 80% of the cycles spent on table lookup
  - If using a 4-way trie, total processing time can be reduced to less than 30 cycles
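A mask-based longest prefix match with an associated info table can be sketched as below. This is a software model under stated assumptions: the class and field names are hypothetical, the table is scanned linearly, and the AIT fields follow the three items listed on the slide. It does not reproduce the hardware's cycle behavior.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class AitEntry:
    port_map: int      # one bit per output port
    dest_mac: str      # destination L2 address
    stats_bucket: int  # statistics bucket index


class LpmTable:
    """Each entry stores a prefix and its mask bits; a hit indexes the AIT."""

    def __init__(self):
        self.entries = []  # list of (prefix, mask) as 32-bit integers
        self.ait = []      # AitEntry at the same index as its table entry

    def add(self, cidr, info):
        net = ipaddress.ip_network(cidr)
        self.entries.append((int(net.network_address), int(net.netmask)))
        self.ait.append(info)

    def lookup(self, dest_addr):
        key = int(ipaddress.ip_address(dest_addr))
        best_index, best_mask = None, -1
        for index, (prefix, mask) in enumerate(self.entries):
            # Contiguous masks compare numerically in prefix-length order.
            if key & mask == prefix and mask > best_mask:
                best_index, best_mask = index, mask
        return None if best_index is None else self.ait[best_index]


table = LpmTable()
table.add("10.0.0.0/8", AitEntry(0b0001, "00:00:00:00:00:01", 0))
table.add("10.1.0.0/16", AitEntry(0b0010, "00:00:00:00:00:02", 1))
hit = table.lookup("10.1.2.3")  # matches both entries; the /16 wins
```

A linear scan like this is the software analogue of the lookup cost noted on the slide; a multi-way trie replaces the scan with a bounded number of steps.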

L2 Processing Engine
- If there are any processing problems with ARP, ICMP, and/or L3, then L2 switching is done
- Exact Match Engine
  - Re-uses the LPM match engine, but with the mask bits set to all 1's
  - Associated Info Table (AIT) is indexed using the entry hit
  - AIT provides Destination Port Map and Statistics Bucket Index
  - Request/Done scheme to allow for expandability (e.g. future hash implementation project)
- Learning Engine removed because of Switch/Router hardware verification problems (HP Switch bug)
- Total of 76 cycles for packet processing, with over 80% of the cycles spent on table lookup
  - If using a hashing function, total processing time can be reduced to less than 20 cycles
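The reuse of a masked-compare engine for exact matching, and the hash-based alternative mentioned as a future project, can be contrasted in a short sketch. All names here are hypothetical, and the Python dict merely stands in for whatever hash structure a hardware implementation would use.

```python
MAC_MASK_ALL_ONES = (1 << 48) - 1  # every bit of the 48-bit address must match


def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def masked_lookup(entries, key):
    """Scan (value, mask) entries; return the index of the first hit or None.
    With an all-ones mask this degenerates to an exact match."""
    for index, (value, mask) in enumerate(entries):
        if key & mask == value & mask:
            return index
    return None


def build_hash_index(entries):
    """Hash-based variant: one probe instead of a scan over the table."""
    return {value: index for index, (value, _mask) in enumerate(entries)}


entries = [
    (mac_to_int("00:11:22:33:44:55"), MAC_MASK_ALL_ONES),
    (mac_to_int("00:11:22:33:44:66"), MAC_MASK_ALL_ONES),
]
key = mac_to_int("00:11:22:33:44:66")
scan_hit = masked_lookup(entries, key)          # index found by scanning
hash_hit = build_hash_index(entries).get(key)   # same index, single probe
```

Both paths return the same entry index, which is then used to read the AIT; only the cost of finding the index differs.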

Packet Buffer Interface
- Interfaces with the Master Arbiter and the Forwarding Engine
- Output Queued Switch
  - Statically assigned single queue per port
- Off-chip ZBT SRAM on the NetFPGA board
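One way to picture statically assigned per-port queues in a shared memory is to split the address space into equal regions, each managed as a ring. The sketch below is purely illustrative: the memory size, word granularity, region layout, and drop-on-full policy are assumptions, not details taken from the design.

```python
class StaticPortQueues:
    """Equal, fixed memory regions, one ring-buffer queue per output port."""

    def __init__(self, total_words, ports=4):
        self.region = total_words // ports  # words reserved for each port
        self.head = [0] * ports             # read offset within the region
        self.tail = [0] * ports             # write offset within the region
        self.count = [0] * ports            # words currently queued

    def base(self, port):
        """First address of the region owned by `port`."""
        return port * self.region

    def enqueue(self, port, words):
        """Reserve `words` for a packet; return its start address,
        or None if the port's region is full (packet dropped)."""
        if self.count[port] + words > self.region:
            return None
        address = self.base(port) + self.tail[port]
        self.tail[port] = (self.tail[port] + words) % self.region
        self.count[port] += words
        return address

    def dequeue(self, port, words):
        """Release `words` from the front; return the start address read,
        or None if fewer than `words` are queued."""
        if words > self.count[port]:
            return None
        address = self.base(port) + self.head[port]
        self.head[port] = (self.head[port] + words) % self.region
        self.count[port] -= words
        return address


queues = StaticPortQueues(total_words=1024, ports=4)
first = queues.enqueue(1, 16)  # lands at the start of port 1's region
```

Static partitioning keeps the address arithmetic trivial, at the cost of one congested port being unable to borrow space from idle ones.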

Control Block
- Typical register read/write functionality
  - Status Register
  - Control Register (forwarding disable, reset)
  - Router's IP Addresses (ports 1-4)
  - Queue Size Registers
  - Statistics Registers
  - Layer-2 Table Programming Registers
  - Layer-3 Table Programming Registers

Verification
- Three levels of verification performed
  - Simulations
    - Module Level: to verify the module design intent and bus functional model
    - System Level: using the NetFPGA verification environment for packet-level simulations
  - Hardware Verification
    - Ported System Level tests to create tcpdump files for the NetFPGA traffic server
    - Very good success on hardware, with all System Level tests passing
    - Only one modification required (reset generation) after hardware porting
- Demo: Greg can provide lab access to anyone interested

Synthesis Overview
- Design was ported to an Altera EP20K400 device
- Logic Elements utilized: 5833 (35% of total LEs)
- RAM ESBs used: 21% of total ESBs
- Max design clock frequency: ~31 MHz
- No timing violations
[Table: per-block Flip-flops (Actual), RAM bits (Actual), and Gates (Actual) for Main Arbiter, Memory Controller, Control Block, Ingress FIFO Controller, Switching and Routing Engine, and Total.]

Conclusion
- Easy to achieve "required" performance in an OQ Shared Memory Switch in NetFPGA
- Modularity of the design allows more interesting and challenging future projects
- Design/Verification environment was essential to meet the schedule
- NetFPGA is an excellent design exploration platform