1G eth UDP IP stack SIMPLIFIED IMPLEMENTATION FROM THE FIX.QRL STABLES (CONTRIBUTOR – PETER FALL) V2.0.


FEATURES
- Implements UDP, IPv4, ARP protocols
- Zero latency between UDP and MAC layer
    - (combinatorial transfer during user data phase)
    - See simulation diagram below
- Allows full control of UDP src & dst ports on TX
- Provides access to UDP src & dst ports on RX (user filtering)
- Couples directly to Xilinx Tri-Mode eth Mac via AXI interface
- Separate building blocks to create custom stacks
- Easy to tap into the IP layer directly
- Supports TX and RX with IP layer broadcast address
- Separate clock domains for tx & rx paths
- Choice of smaller single slot ARP or multislot up to 255 slots
- Tested for 1Gbit Ethernet, but applicable to 100M and 10M

SIMULATION DIAGRAM SHOWING ZERO LATENCY ON RECEIVE

LIMITATIONS
- Does not handle segmentation and reassembly
    - Assumes packets offered for transmission will fit in a single Ethernet frame
    - Discards received packets if they require reassembly
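Because oversized packets are not segmented, user logic may want to check the payload length before starting a transmission. The package below is a minimal sketch of such a check, not part of the stack. The names `udp_frame_limits` and `fits_single_frame` are hypothetical, and the figures assume a standard 1500-byte Ethernet payload, a 20-byte IPv4 header without options and the 8-byte UDP header.

```vhdl
-- Sketch only: largest UDP payload that fits in one un-fragmented frame.
package udp_frame_limits is
  constant ETH_MAX_PAYLOAD : natural := 1500;  -- standard Ethernet payload (assumed, no jumbo frames)
  constant IPV4_HDR_BYTES  : natural := 20;    -- IPv4 header without options (assumed)
  constant UDP_HDR_BYTES   : natural := 8;     -- fixed UDP header size

  -- 1500 - 20 - 8 = 1472 bytes of user data
  constant UDP_MAX_PAYLOAD : natural :=
    ETH_MAX_PAYLOAD - IPV4_HDR_BYTES - UDP_HDR_BYTES;

  -- True when a payload of the given size needs no segmentation.
  function fits_single_frame (payload_bytes : natural) return boolean;
end package udp_frame_limits;

package body udp_frame_limits is
  function fits_single_frame (payload_bytes : natural) return boolean is
  begin
    return payload_bytes <= UDP_MAX_PAYLOAD;
  end function fits_single_frame;
end package body udp_frame_limits;
```

A user FSM could evaluate `fits_single_frame(len)` before asserting its transmit request and report an error otherwise.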

OVERALL BLOCK DIAGRAM
[Block diagram: UDP_Complete_nomac]
Interfaces: UDP TX bus; UDP RX bus; IP RX bus; Clocks, controls & reset; MAC TX bus; MAC RX bus; Our IP & MAC addr; Arp & IP pkt count
Generics (see block level descriptions): CLOCK_FREQ, ARP_TIMEOUT, ARP_MAX_PKT_TMO, MAX_ARP_ENTRIES

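STRUCTURAL DECOMPOSITION
[Block diagram: structural decomposition of UDP_Complete_nomac]
Blocks shown: UDP_Complete_nomac, UDP_TX, UDP_RX, IP_Complete_nomac, Tx_arbitrator, arp, IPv4 (IPV4_TX, IPV4_RX)
Internal connections labelled: req, rsp, rx, tx
External interfaces: UDP TX bus; UDP RX bus; IP RX bus; Clocks, controls & reset; Our IP & MAC addr; Arp & IP pkt count; MAC TX bus; MAC RX bus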

ARP BLOCK OPTIONS
ARP can be instantiated in one of the following options:
- arp: simple 1-slot ARP layer with timeout
- arpv2: multislot ARP layer with timeout
These can be selected in the IP_Complete_nomac.vhd file by commenting out the appropriate line:

    --for arp_layer : arp use entity work.arp;    -- single slot arbitrator
    for arp_layer : arp use entity work.arpv2;    -- multislot arbitrator

ARP V2 BLOCK
[Block diagram: arpv2]
Sub-blocks shown: req, sync, store, arp_TX, arp_RX
Signals and buses shown: REQ, RSP, control, IP RX bus, Arp & IP pkt count, TX, RX, clear, lookup, write
Legend: RX clock domain, TX clock domain

INTERFACE

    entity UDP_Complete_nomac is
        Port (
            -- UDP TX signals
            udp_tx_start          : in std_logic;                        -- indicates req to tx UDP
            udp_txi               : in udp_tx_type;                      -- UDP tx cxns
            udp_tx_result         : out std_logic_vector (1 downto 0);   -- tx status (changes during tx)
            udp_tx_data_out_ready : out std_logic;                       -- indicates udp_tx is ready to take data
            -- UDP RX signals
            udp_rx_start          : out std_logic;                       -- indicates receipt of udp header
            udp_rxo               : out udp_rx_type;
            -- IP RX signals
            ip_rx_hdr             : out ipv4_rx_header_type;
            -- system signals
            rx_clk                : in STD_LOGIC;
            tx_clk                : in STD_LOGIC;
            reset                 : in STD_LOGIC;
            our_ip_address        : in STD_LOGIC_VECTOR (31 downto 0);
            our_mac_address       : in std_logic_vector (47 downto 0);
            control               : in udp_control_type;
            -- status signals
            arp_pkt_count         : out STD_LOGIC_VECTOR(7 downto 0);    -- count of arp pkts received
            ip_pkt_count          : out STD_LOGIC_VECTOR(7 downto 0);    -- number of IP pkts received for us
            -- MAC Transmitter
            mac_tx_tdata          : out std_logic_vector(7 downto 0);    -- data byte to tx
            mac_tx_tvalid         : out std_logic;                       -- tdata is valid
            mac_tx_tready         : in std_logic;                        -- mac is ready to accept data
            mac_tx_tfirst         : out std_logic;                       -- indicates first byte of frame
            mac_tx_tlast          : out std_logic;                       -- indicates last byte of frame
            -- MAC Receiver
            mac_rx_tdata          : in std_logic_vector(7 downto 0);     -- data byte received
            mac_rx_tvalid         : in std_logic;                        -- indicates tdata is valid
            mac_rx_tready         : out std_logic;                       -- tells mac that we are ready to take data
            mac_rx_tlast          : in std_logic                         -- indicates last byte of the frame
        );
    end UDP_Complete_nomac;

THE AXI INTERFACE
This implementation makes extensive use of the AXI interface (axi.vhd):

    package axi is
        type axi_in_type is record
            data_in       : STD_LOGIC_VECTOR (7 downto 0);
            data_in_valid : STD_LOGIC;    -- indicates data_in valid on clock
            data_in_last  : STD_LOGIC;    -- indicates last data in frame
        end record;

        type axi_out_type is record
            data_out_valid : std_logic;   -- indicates data out is valid
            data_out_last  : std_logic;   -- indicates last byte of a frame
            data_out       : std_logic_vector (7 downto 0);
        end record;
    end axi;
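To illustrate how a producer might drive a record of this shape, here is a minimal sketch of a byte source that emits one frame of FRAME_BYTES bytes. Everything except the record fields is an assumption: the entity `axi_byte_source`, its `start` and `ready` ports, and the local package `axi_sketch` (a copy of `axi_out_type` so that the example compiles on its own) are hypothetical names and not part of the stack.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Local copy of the record from axi.vhd so the sketch is self-contained.
package axi_sketch is
  type axi_out_type is record
    data_out_valid : std_logic;
    data_out_last  : std_logic;
    data_out       : std_logic_vector (7 downto 0);
  end record;
end package axi_sketch;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.axi_sketch.all;

entity axi_byte_source is
  generic ( FRAME_BYTES : positive := 4 );
  port (
    clk   : in  std_logic;
    reset : in  std_logic;
    start : in  std_logic;      -- pulse to begin a frame (hypothetical)
    ready : in  std_logic;      -- consumer takes a byte this cycle (hypothetical)
    axi_o : out axi_out_type
  );
end entity axi_byte_source;

architecture rtl of axi_byte_source is
  signal active : std_logic := '0';
  signal count  : natural range 0 to FRAME_BYTES - 1 := 0;
begin
  -- The payload is simply the running byte index (test pattern).
  axi_o.data_out       <= std_logic_vector(to_unsigned(count mod 256, 8));
  axi_o.data_out_valid <= active;
  axi_o.data_out_last  <= '1' when active = '1' and count = FRAME_BYTES - 1
                          else '0';

  process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        active <= '0';
        count  <= 0;
      elsif active = '0' then
        if start = '1' then
          active <= '1';
          count  <= 0;
        end if;
      elsif ready = '1' then              -- current byte was accepted
        if count = FRAME_BYTES - 1 then   -- that was the last byte
          active <= '0';
          count  <= 0;
        else
          count <= count + 1;
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```

The valid and last flags are held until the consumer signals `ready`, so a byte is never dropped when the consumer stalls.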

MAC INTERFACE
The MAC interface is fairly simple, with separate clocks for receiver and transmitter. Each interface (RX and TX) is based on the AXI interface and has an 8-bit data bus, a valid signal, a last byte signal, and a backchannel signal to indicate that the other end is ready to accept data.
The transmit interface has an additional signal (mac_tx_tfirst) which can be used by MAC blocks that need something to indicate the start of frame. This signal is asserted simultaneously with the first byte to be transmitted (provided that tready is high).
On the following diagram, tx_clk and rx_clk are shown sourced from the MAC transmit and receive blocks, but they can come from an independent clock generator that feeds clocks to both the MAC blocks and the UDP_IP_stack. Data is clocked on the rising edge.
[Diagram: UDP_IP_Stack connected to MAC Transmit via Data(7..0), valid, first, last, ready and tx_clk; and to MAC Receive via Data(7..0), valid, last, ready and rx_clk]
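The start-of-frame behaviour described above can be sketched as a small helper that derives a "first" flag from valid, ready and last. This is an illustration under assumptions, not the stack's own logic: the entity name `tfirst_gen` is hypothetical, and gating the flag with tready is one reading of "provided that tready is high".

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tfirst_gen is
  port (
    tx_clk : in  std_logic;
    reset  : in  std_logic;
    tvalid : in  std_logic;   -- byte on the bus is valid
    tready : in  std_logic;   -- MAC accepts a byte this cycle
    tlast  : in  std_logic;   -- byte on the bus ends the frame
    tfirst : out std_logic    -- high with the first byte of each frame
  );
end entity tfirst_gen;

architecture rtl of tfirst_gen is
  -- '1' once the first byte of a frame has been transferred,
  -- cleared again when the last byte is transferred.
  signal in_frame : std_logic := '0';
begin
  -- Combinatorial, so the flag coincides with the first byte itself.
  tfirst <= tvalid and tready and (not in_frame);

  process (tx_clk)
  begin
    if rising_edge(tx_clk) then        -- data is clocked on the rising edge
      if reset = '1' then
        in_frame <= '0';
      elsif tvalid = '1' and tready = '1' then
        in_frame <= not tlast;         -- a one-byte frame leaves in_frame low
      end if;
    end if;
  end process;
end architecture rtl;
```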

SYNTHESIS STATS
451 occupied slices on Xilinx xc6vlx240t (1%) (687 flipflops, 1294 LUTs)
Test synthesis using Xilinx ISE 13.4

Architecture      | Slices | FF / LUTs | Block RAMs | % slices used
Arp (1 slot)      | 490684/ %
Arpv2 (255 slot)  | / %

MODULE DESCRIPTION: UDP_COMPLETE_NOMAC
Simply wires up the following blocks:
- UDP_TX
- UDP_RX
- IP_Complete_nomac
Propagates the IP RX header info to the UDP_complete_nomac module interface.

MODULE DESCRIPTION: UDP_TX AND UDP_RX
UDP_TX:
- Very simple FSM to capture data from the supplied UDP TX header, and send out a UDP header.
- Asserts data ready when in user data phase, and copies bytes from the user supplied data.
- Assumes the user will supply the UDP checksum (the specs allow the checksum to be zero).
UDP_RX:
- Very simple FSM to parse the UDP header from data supplied from the IP layer, and then to send user data from the IP layer to the interface (asserts udp_rxo.data.data_in_valid).
- Discards IP pkts until it gets one with protocol = x"11" (decimal 17, a UDP pkt).
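For reference, the 8-byte header that a UDP transmitter has to emit can be sketched as a pure function. This is not the UDP_TX implementation; the package and function names are hypothetical. The field order and the rule that the length field covers header plus data follow the UDP specification, and the checksum is passed through unchanged because the user is expected to supply it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package udp_header_sketch is
  -- Returns the 64-bit UDP header, first transmitted byte in bits 63..56.
  -- data_length is the number of user data bytes (must be <= 65527).
  function udp_header (
    src_port    : std_logic_vector (15 downto 0);
    dst_port    : std_logic_vector (15 downto 0);
    data_length : natural;
    checksum    : std_logic_vector (15 downto 0) := x"0000")
    return std_logic_vector;
end package udp_header_sketch;

package body udp_header_sketch is
  function udp_header (
    src_port    : std_logic_vector (15 downto 0);
    dst_port    : std_logic_vector (15 downto 0);
    data_length : natural;
    checksum    : std_logic_vector (15 downto 0) := x"0000")
    return std_logic_vector is
    variable len : unsigned (15 downto 0);
  begin
    -- The UDP length field counts the 8 header bytes plus the data bytes.
    len := to_unsigned(data_length + 8, 16);
    return src_port & dst_port & std_logic_vector(len) & checksum;
  end function udp_header;
end package body udp_header_sketch;
```

For example, `udp_header(x"07D0", x"07D0", 4)` describes a datagram between ports 2000 and 2000 carrying 4 data bytes, with a length field of 12 and a zero checksum.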

MODULE DESCRIPTION: IPV4
Simply wires up the following blocks:
- IPv4
- ARP
- Tx_arbitrator
ARP reads the MAC RX data in parallel with the IPv4 RX path. ARP is looking for ARP pkts, while IPv4 is looking for IP pkts.
IPv4 interacts directly with the ARP block during TX to ensure that the transmit destination MAC address is known.
Tx_arbitrator controls access to the MAC TX layer, as both ARP and IPv4 may want to transmit at the same time.

MODULE DESCRIPTION: IPV4_TX
IPv4_TX comprises two simple FSMs:
- to control transmission of the header and user data
- to calculate the header checksum
To use:
- Set the TX header, and assert ip_tx_start.
- The block begins to calculate the header checksum and transmit the header.
- Once in the user data stage, the block asserts ip_tx_data_out_ready and copies user data over to the MAC TX output.
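The header checksum mentioned above is the standard IPv4 one: the ones' complement of the ones' complement sum of the ten 16-bit header words, computed with the checksum field set to zero. The function below is a behavioural sketch of that arithmetic only. It is not the FSM used in IPv4_TX, the package and function names are hypothetical, and it assumes a 20-byte header without options.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package ipv4_checksum_sketch is
  -- hdr holds the 20 header bytes with the checksum field zeroed.
  function ipv4_hdr_checksum (hdr : std_logic_vector (159 downto 0))
    return std_logic_vector;
end package ipv4_checksum_sketch;

package body ipv4_checksum_sketch is
  function ipv4_hdr_checksum (hdr : std_logic_vector (159 downto 0))
    return std_logic_vector is
    -- 20 bits are enough: ten words of at most 0xFFFF sum to 0x9FFF6.
    variable sum : unsigned (19 downto 0) := (others => '0');
  begin
    -- Add the ten 16-bit words; their order does not affect the sum.
    for w in 0 to 9 loop
      sum := sum + resize(unsigned(hdr (16*w + 15 downto 16*w)), 20);
    end loop;
    -- Fold the carries back in (end-around carry); twice covers the
    -- case where the first fold itself produces a carry.
    sum := resize(sum (15 downto 0), 20) + resize(sum (19 downto 16), 20);
    sum := resize(sum (15 downto 0), 20) + resize(sum (19 downto 16), 20);
    -- The checksum is the ones' complement of the folded sum.
    return std_logic_vector(not sum (15 downto 0));
  end function ipv4_hdr_checksum;
end package body ipv4_checksum_sketch;
```

As a check, the header words 4500 0073 0000 4000 4011 0000 C0A8 0001 C0A8 00C7 sum to 0x2479C, which folds to 0x479E and yields a checksum of 0xB861.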

MODULE DESCRIPTION: IPV4_RX
Simple FSM to parse both the ethernet frame header and the IP v4 header.
Ignores packets that:
- Are not v4 IP packets
- Require reassembly
- Are not for our ip address and are not for the broadcast address
Once all these checks are satisfied, the rx header data (ip_rx.hdr) is valid and the module asserts ip_rx_start.
Received user data is available through the ip_rx.data record.
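The three acceptance checks can be summarised as a single predicate. The sketch below is illustrative rather than a copy of the IPv4_RX FSM, which examines the fields byte by byte as they arrive. The package and function names are hypothetical, and two points are assumptions: "requires reassembly" is taken to mean the More Fragments flag is set or the fragment offset is non-zero, and the broadcast address is taken to be 255.255.255.255.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package ipv4_rx_filter_sketch is
  function accept_packet (
    version    : std_logic_vector (3 downto 0);    -- IP version nibble
    flags_frag : std_logic_vector (15 downto 0);   -- 3 flag bits + 13-bit offset
    dst_ip     : std_logic_vector (31 downto 0);
    our_ip     : std_logic_vector (31 downto 0)) return boolean;
end package ipv4_rx_filter_sketch;

package body ipv4_rx_filter_sketch is
  function accept_packet (
    version    : std_logic_vector (3 downto 0);
    flags_frag : std_logic_vector (15 downto 0);
    dst_ip     : std_logic_vector (31 downto 0);
    our_ip     : std_logic_vector (31 downto 0)) return boolean is
    -- Limited broadcast address 255.255.255.255 (assumption).
    constant BROADCAST : std_logic_vector (31 downto 0) := (others => '1');
    variable is_v4, is_whole, is_for_us : boolean;
  begin
    is_v4     := version = "0100";
    -- Bit 13 is More Fragments, bits 12..0 the fragment offset;
    -- both must be zero for an unfragmented packet.
    is_whole  := unsigned(flags_frag (13 downto 0)) = 0;
    is_for_us := dst_ip = our_ip or dst_ip = BROADCAST;
    return is_v4 and is_whole and is_for_us;
  end function accept_packet;
end package body ipv4_rx_filter_sketch;
```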

MODULE DESCRIPTION: ARP (SINGLE SLOT VERSION)
- Handles receipt of ARP packets
- Handles transmission of ARP requests and timeout if no response received
- Handles request resolution (check ARP cache and request resolution if not found)
- Three FSMs, one for each of the above functions
- ARP mapper cache is only 1 deep in this implementation
    - which means that it is only really good for point-point comms.
    - Use ARPv2 if you want an implementation with more slots
- Input signals to module indicate our IP and MAC addresses
- ARP timeout is configured by generics in the ARP, IP, and UDP modules:

    CLOCK_FREQ  : integer := ;
    ARP_TIMEOUT : integer := 60

CLOCK_FREQ is used to scale the rx_clk to produce a 1Hz signal for timing.
ARP_TIMEOUT specifies the timeout in seconds.
Note: on timeout, ARP does not retransmit the ARP req, but reports a transmit error. Send again to send extra ARP requests.
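The way the two generics combine can be sketched as a pair of counters: one divides the clock by CLOCK_FREQ to mark each elapsed second, the other counts seconds up to ARP_TIMEOUT. This is a simplified illustration, not the ARP module's own code. The entity name and the `start`, `reply` and `timed_out` ports are hypothetical, and the CLOCK_FREQ default of 125 MHz is an assumed value chosen for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity arp_timeout_sketch is
  generic (
    CLOCK_FREQ  : integer := 125000000;  -- clock cycles per second (assumed value)
    ARP_TIMEOUT : integer := 60          -- timeout in seconds
  );
  port (
    clk       : in  std_logic;
    reset     : in  std_logic;
    start     : in  std_logic;   -- pulse: ARP request sent (hypothetical)
    reply     : in  std_logic;   -- pulse: matching reply received (hypothetical)
    timed_out : out std_logic    -- one-cycle pulse on timeout
  );
end entity arp_timeout_sketch;

architecture rtl of arp_timeout_sketch is
  signal tick_count : integer range 0 to CLOCK_FREQ - 1 := 0;
  signal seconds    : integer range 0 to ARP_TIMEOUT := 0;
  signal waiting    : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      timed_out <= '0';                        -- default: no timeout this cycle
      if reset = '1' or reply = '1' then
        waiting <= '0';                        -- cancel any pending timeout
      elsif start = '1' then
        waiting    <= '1';
        tick_count <= 0;
        seconds    <= 0;
      elsif waiting = '1' then
        if tick_count = CLOCK_FREQ - 1 then    -- one second has elapsed
          tick_count <= 0;
          if seconds = ARP_TIMEOUT - 1 then    -- ARP_TIMEOUT seconds elapsed
            waiting   <= '0';
            timed_out <= '1';                  -- report; do not retransmit
          else
            seconds <= seconds + 1;
          end if;
        else
          tick_count <= tick_count + 1;
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```

In simulation, small generic values (for example CLOCK_FREQ => 10, ARP_TIMEOUT => 2) make the timeout observable within a few dozen clock cycles.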

MODULE DESCRIPTION: ARPV2 (MULTI SLOT VERSION)
- Handles receipt of ARP packets
- Handles transmission of ARP requests and timeout if no response received
- Handles request resolution (check ARP cache and request resolution if not found)
- Decomposed into modules:
    - req: handles request response protocol and contains a single slot cache for fast lookup
    - store: maintains a map of IP->MAC addresses, configurable size to 255
    - tx: encodes the "I have" and "who has" ARP tx formats
    - rx: decodes the ARP protocols "I have" and "who has"
    - sync: performs clock sync between the RX and TX clock domains
- ARPV2 mapper cache is configurable up to 255 slots.
- Input signals to module indicate our IP and MAC addresses
- ARP behaviour is configured by generics in the ARP, IP, and UDP modules:

    CLOCK_FREQ      : integer := ;
    ARP_TIMEOUT     : integer := 60
    ARP_MAX_PKT_TMO : integer := 5
    MAX_ARP_ENTRIES : integer := 255

CLOCK_FREQ is used to scale the rx_clk to produce a 1Hz signal for timing.
ARP_TIMEOUT specifies the timeout in seconds.
ARP_MAX_PKT_TMO specifies the number of received "I have" ARP responses which don't satisfy our request before timeout.
MAX_ARP_ENTRIES specifies the number of slots in the ARP cache (max 255).
Note: on timeout, ARP does not retransmit the ARP req, but reports a transmit error. Send again to send extra ARP requests.
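The IP->MAC map kept by the store can be pictured as a table of entries searched by IP address. The package below is a behavioural sketch of such a lookup and is suited to simulation or reasoning about the data structure. It does not reflect how the store block is actually implemented (a hardware store would typically search over several clock cycles), and all type and function names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package arp_store_sketch is
  constant MAX_ARP_ENTRIES : integer := 255;   -- number of cache slots

  type arp_entry_type is record
    valid : std_logic;                         -- slot holds a mapping
    ip    : std_logic_vector (31 downto 0);
    mac   : std_logic_vector (47 downto 0);
  end record;

  type arp_table_type is array (0 to MAX_ARP_ENTRIES - 1) of arp_entry_type;

  type lookup_result_type is record
    found : boolean;
    mac   : std_logic_vector (47 downto 0);
  end record;

  -- Returns the MAC address mapped to ip, if any slot holds it.
  function lookup (tbl : arp_table_type;
                   ip  : std_logic_vector (31 downto 0))
    return lookup_result_type;
end package arp_store_sketch;

package body arp_store_sketch is
  function lookup (tbl : arp_table_type;
                   ip  : std_logic_vector (31 downto 0))
    return lookup_result_type is
    variable result : lookup_result_type :=
      (found => false, mac => (others => '0'));
  begin
    for i in tbl'range loop
      if tbl(i).valid = '1' and tbl(i).ip = ip then
        result := (found => true, mac => tbl(i).mac);
        exit;                                  -- first match wins
      end if;
    end loop;
    return result;
  end function lookup;
end package body arp_store_sketch;
```

When `found` is false the caller would fall back to sending a "who has" request, as described for request resolution above.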

MODULE DESCRIPTION: TX_ARBITRATOR
FSM to arbitrate access to the MAC TX layer by:
- IP TX path
- ARP TX path
One of the sources requests access and must wait until it is granted.
Priority is given to the IP path, as it is expected to have the highest request rate.
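A fixed-priority arbiter of this kind can be sketched as a three-state FSM. The code below illustrates the idea and is not the Tx_arbitrator source. The entity and port names are hypothetical, and it assumes that a requester holds its request high for the whole frame and releases the MAC by dropping the request.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tx_arbiter_sketch is
  port (
    clk       : in  std_logic;
    reset     : in  std_logic;
    ip_req    : in  std_logic;   -- IP TX path wants the MAC
    arp_req   : in  std_logic;   -- ARP TX path wants the MAC
    ip_grant  : out std_logic;
    arp_grant : out std_logic
  );
end entity tx_arbiter_sketch;

architecture rtl of tx_arbiter_sketch is
  type owner_type is (IDLE, IP, ARP);
  signal owner : owner_type := IDLE;
begin
  ip_grant  <= '1' when owner = IP  else '0';
  arp_grant <= '1' when owner = ARP else '0';

  process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        owner <= IDLE;
      else
        case owner is
          when IDLE =>
            if ip_req = '1' then         -- IP wins simultaneous requests
              owner <= IP;
            elsif arp_req = '1' then
              owner <= ARP;
            end if;
          when IP =>
            if ip_req = '0' then         -- released: back to arbitration
              owner <= IDLE;
            end if;
          when ARP =>
            if arp_req = '0' then
              owner <= IDLE;
            end if;
        end case;
      end if;
    end if;
  end process;
end architecture rtl;
```

Because a grant is only withdrawn when the owner releases it, a frame in progress is never interrupted, even by the higher-priority IP path.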

SIMULATION
Every VHDL module has a corresponding RTL simulation test bench. Additionally, there are simulation test benches for various module integrations.
In this version, verification is not completely automatic. The test benches test for some things, but much is left to manual inspection via the simulator waveforms.

TESTBENCH - HW
The HW testbench is built around the Xilinx ML-605 prototyping card. It directly uses the card's 200MHz clocks, Eth PHY (copper) and LEDs to indicate status.
A simple VHDL driver module for the stack replies with a canned response whenever it receives a UDP pkt on a particular IP addr and port number.
The Xilinx LogiCORE IP Virtex-6 FPGA Embedded Tri-Mode Ethernet MAC v2.1 is used to couple the UDP/IP stack to the board's Ethernet PHY. This is used with the standard FIFO user buffering (which adds a one-frame delay). It should also be possible to remove this FIFO to reduce latency.
A laptop provides stimulus by way of one of two Java programs:
- UDPTest.java: writes one UDP pkt and waits for a response then prints it
- UDPTestStream.java: writes a number of UDP pkts and prints responses
The test network is a single twisted CAT-6 cable between the laptop and the ML-605 board.
Wireshark (on the laptop) is used to capture the traffic on the wire (sample pcap files are included).

TEST SETUP
[Diagram: test setup]
On the Xilinx ML605 board: UDP_integration_example containing UDP_Complete_nomac (UDP TX, UDP RX, Clocks & reset, IP & MAC set, Arp & IP pkt count: 4 leds each), TX response process, Async TX Pushbutton, Xilinx mac_block, Eth PHY
Connected over the network to: Java Test Code running on Laptop

TESTBENCH HW - ML605 MODULES
- UDP_Complete: integration of UDP with a mac layer
- IP Complete: integration of IP layer only with a mac layer
- UDP_Integration_Example: test example with vhdl process to reply to received UDP packets

TEST RESULTS
The Xilinx MAC layer used contains a FIFO, which introduces a 1-frame delay.
- For tightly coupled low latency requirements, this can be removed.
Output from UDPTest:

    Sending packet: 1=45~34=201~18=23~ on port 2000
    Got

Output from UDPTestStream:

    …
    Sending price tick 205
    Sending price tick 204
    Sending price tick 203
    Sending price tick 202
    Got
    Got
    Got
    Got
    …