John DeHart ONL NP Router Block Design Review: Lookup (Part of the PLC Block)


2 - John DeHart - 9/9/2015 Revision History 4/12/07 (JDD): »Started

3 - John DeHart - 9/9/2015 ONL NP Router (block diagram)
Blocks: Rx (2 ME); Mux (1 ME); Parse, Lookup, Copy (3 MEs); QM (1 ME); HdrFmt (1 ME); Tx (1 ME); Stats (1 ME); FreeList Mgr (1 ME); Plugin0, Plugin1, Plugin2, Plugin3, Plugin4; XScale
Memories: TCAM with Assoc. Data ZBT-SRAM; SRAM
Interconnect labels: NN Ring, SRAM Ring, Scratch Ring; "64KW Each"
Legend: New; Needs A Lot Of Mod.; Needs Some Mod.; Mostly Unchanged

4 - John DeHart - 9/9/2015 Contents
Overview
Design
Latency Analysis
Code Locations
Test Procedures
Implementation Status

5 - John DeHart - 9/9/2015 Overview
Initialization
  »Control Plane initializes TCAM and Route and Filter DBs
Runtime Updates
  »Control Plane updates to Route and Filter DBs
Design – in upcoming slides
Processing – in upcoming slides
Lookup will be written in C
  »There are many things about writing IXP code in “C” that I need to learn. Here are some of them:
    Performing multiple memory operations in parallel and waiting on a set of signals (if needed for performance reasons)
    Performing timestamp waits
    Calling IDT microcode macros

6 - John DeHart - 9/9/2015 Lookup: Design -- Databases
Three Databases:
  »Route Lookup:
    Unicast
      Ø Sorted by DAddr Prefix Length
    Multicast
      Ø Exact match on DAddr and prefix of SAddr
  »Primary Filter
    Filters should be sorted in the DB with higher priority filters first
  »Auxiliary Filter
    Filters should be sorted in the DB with higher priority filters first
Priority between Primary Filter and Route Lookup
  »A priority will be stored with each Primary Filter
  »A priority will be assigned to RLs (all routes have the same priority)
  »PF priority and RL priority are compared after the results are retrieved. One of them will be selected based on this priority comparison.
Auxiliary Filters:
  »If matched, cause a copy of the packet to be sent out according to the Aux Filter’s result.

7 - John DeHart - 9/9/2015 Lookup: Design -- Results
Use SRAM Bank 0 (2 MB per NPU) for Results
  »B0 Byte Address Range: 0x000000 – 0x1FFFFF
    21 bits
  »B0 Word Address Range: 0x000000 – 0x1FFFFC
    19 significant bits
    2 trailing 0’s
Store result in two parts:
  »32-bit Associated Data SRAM result, which holds the address of the actual Result:
    TCAM Control Bits (3b)
      Ø Done: 1b
      Ø Hit: 1b
      Ø MHit: 1b
    Priority: 8b
      Ø Present for Primary Filters; should be 0 for RL and Aux Filters
    SRAM B0 Word Address: 21b
      Ø 2 spare bits if needed for anything else
  »3 Words (<= 96 bits) of Result in SRAM Bank0
Use Multi-Database Lookup (MDL) Indirect for searching all 3 DBs
  »Order of fields in Key is important.
Each thread will need one TCAM context
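The 32-bit Associated Data word described above (3 control bits, 8 priority bits, 21 address bits) could be packed and unpacked roughly as follows. This is only a sketch: the bit positions are an assumption, since the slide gives field widths but not their placement, and the names `ad_result_t`, `ad_pack`, and `ad_unpack` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED layout (widths are from the slide, positions are not):
 *   [31:24] Priority, [23] Done, [22] Hit, [21] MHit, [20:0] SRAM Bank 0 address */
typedef struct {
    uint8_t  priority; /* nonzero only for Primary Filter results */
    uint8_t  done;     /* TCAM control bit: result is ready */
    uint8_t  hit;      /* TCAM control bit: a matching entry was found */
    uint8_t  mhit;     /* TCAM control bit: more than one entry matched */
    uint32_t addr;     /* 21-bit address of the 3-word result in SRAM Bank 0 */
} ad_result_t;

/* Build one Associated Data word from its fields. */
uint32_t ad_pack(uint8_t priority, int done, int hit, int mhit, uint32_t addr)
{
    assert(addr <= 0x1FFFFFu); /* address must fit in 21 bits */
    return ((uint32_t)priority << 24)
         | ((uint32_t)(done != 0) << 23)
         | ((uint32_t)(hit  != 0) << 22)
         | ((uint32_t)(mhit != 0) << 21)
         | addr;
}

/* Split one Associated Data word back into its fields. */
ad_result_t ad_unpack(uint32_t w)
{
    ad_result_t r;
    r.priority = (uint8_t)(w >> 24);
    r.done     = (uint8_t)((w >> 23) & 1u);
    r.hit      = (uint8_t)((w >> 22) & 1u);
    r.mhit     = (uint8_t)((w >> 21) & 1u);
    r.addr     = w & 0x1FFFFFu;
    return r;
}
```

Because results are word aligned, the two low address bits are always 0 under this layout, which is where the "2 spare bits" come from.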

8 - John DeHart - 9/9/2015 Lookup Processing
write KEY to TCAM
use timestamp delay to wait appropriate time
    make delay long enough that we are as sure as possible that we will have to read the 1st word of the Results MB only once
while !DoneBit    // DONE Bit BUG Fix requires reading just first word
    read 1 word from Results Mailbox and check DoneBit
done
read words 2 and 3 from Results Mailbox
if (PrimaryFilter AND RouteLookup results HIT) {
    PrimaryResult.Valid <- TRUE
    compare priorities
    store higher priority result as PrimaryResult (read result from SRAM Bank0)
}
else if (PrimaryFilter results HIT) {
    PrimaryResult.Valid <- TRUE
    PrimaryResult.* <- PrimaryFilter.* (read result from SRAM Bank0)
}
else if (RouteLookup results HIT) {
    PrimaryResult.Valid <- TRUE
    PrimaryResult.* <- RouteLookup.* (read result from SRAM Bank0)
}
else
    PrimaryResult.Valid <- FALSE
if (AuxiliaryFilter result HIT) {
    AuxiliaryResult.Valid <- TRUE
    AuxiliaryResult.* <- (read result from SRAM Bank0)
}
else
    AuxiliaryResult.Valid <- FALSE
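The selection step of the pseudo-code above can be sketched in plain C, leaving out the TCAM and SRAM accesses. Everything named here (`db_result_t`, `select_primary`, `select_auxiliary`, `pf_beats_rl`) is hypothetical, and two points are assumptions the slides do not settle: that a numerically smaller value means higher priority, and that a tie goes to the Primary Filter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Outcome for one database, as read from the results mailbox. */
typedef struct {
    bool     hit;      /* Hit bit for this database */
    uint8_t  priority; /* stored priority (Primary Filter only) */
    uint32_t addr;     /* SRAM Bank 0 address of the full result */
} db_result_t;

typedef enum { SRC_NONE, SRC_PRIMARY_FILTER, SRC_ROUTE_LOOKUP, SRC_AUX_FILTER } result_src_t;

typedef struct {
    bool         valid;  /* false when nothing matched */
    result_src_t source; /* which database supplied the result */
    uint32_t     addr;   /* where to read the full result from */
} chosen_result_t;

/* ASSUMPTION: smaller value = higher priority; ties go to the filter. */
bool pf_beats_rl(uint8_t pf_prio, uint8_t rl_prio)
{
    return pf_prio <= rl_prio;
}

/* Pick the Primary Result from the Primary Filter and Route Lookup outcomes.
 * rl_prio is the single priority shared by all routes. */
chosen_result_t select_primary(db_result_t pf, db_result_t rl, uint8_t rl_prio)
{
    chosen_result_t out = { false, SRC_NONE, 0 };
    if (pf.hit && rl.hit) {
        bool use_pf = pf_beats_rl(pf.priority, rl_prio);
        out.valid  = true;
        out.source = use_pf ? SRC_PRIMARY_FILTER : SRC_ROUTE_LOOKUP;
        out.addr   = use_pf ? pf.addr : rl.addr;
    } else if (pf.hit) {
        out.valid = true; out.source = SRC_PRIMARY_FILTER; out.addr = pf.addr;
    } else if (rl.hit) {
        out.valid = true; out.source = SRC_ROUTE_LOOKUP; out.addr = rl.addr;
    }
    assert(out.valid == (pf.hit || rl.hit)); /* valid exactly when something hit */
    return out;
}

/* The Auxiliary Result is independent of the primary selection. */
chosen_result_t select_auxiliary(db_result_t af)
{
    chosen_result_t out = { false, SRC_NONE, 0 };
    if (af.hit) {
        out.valid = true; out.source = SRC_AUX_FILTER; out.addr = af.addr;
    }
    return out;
}
```

Keeping the comparison in `pf_beats_rl` means the priority convention can be flipped in one place once it is decided.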

9 - John DeHart - 9/9/2015 Lookup Key and Results Formats
140 Bit Key (RL, PF and AF):
  IP DAddr (32b), IP SAddr (32b), DPort (16b), SPort (16b), Proto (8b), TCP Flags (12b), Exceptions (16b), P (3b), P Tag (5b)
32 Bit Result in TCAM Assoc. Data SRAM:
  PF: Prio (8b), D (1b), H (1b), MH (1b), Address (21b)
  AF: Res (8b), D (1b), H (1b), MH (1b), Address (21b)
  RL: Res (8b), D (1b), H (1b), MH (1b), Address (21b)
  TCAM Ctrl Bits: D: Done, H: HIT, MH: Multi-Hit
96 Bit Result in QDR SRAM Bank0:
  PF: QID (16b), Stats Index (16b), UCast/MCast (12b), V (4b), then NH_MAC (48b) or NH_IP (32b) + Res (16b)
  AF: QID (16b), Stats Index (16b), UniCast (8b), V (4b), S B (2b), Res (2b), then NH_MAC (48b) or NH_IP (32b) + Res (16b)
  RL: QID (16b), Stats Index (16b), UCast/MCast (12b), V (4b), then NH_MAC (48b) or NH_IP (32b) + Res (16b)
  V (4b): Entry Valid (1b), NH IP Valid (1b), NH MAC Valid (1b), IP MC Valid (1b)
  UCast/MCast (12b):
    If IP MC Valid = 1: Multicast Copy Vector (11b), PPS (1b)
    If IP MC Valid = 0: D (1b), PPS (1b), UCast Out Port (3b), UCast Out Plugin (3b), Reserved (4b)

10 - John DeHart - 9/9/2015 Exception Bits in Lookup Key
140 Bit Key (RL, PF and AF):
  IP DAddr (32b), IP SAddr (32b), DPort (16b), SPort (16b), Proto (8b), TCP Flags (12b), Exceptions (16b), P (3b), P Tag (5b)
Exceptions (16b): Non-IP (1b), ARP (1b), IP Opt (1b), TTL (1b), Reserved (12b)
Exception Bits:
  »TTL: TTL has expired. It was 0 or 1 on the arriving packet
  »IP Opt: IP packet contained Options
  »ARP: Ethertype field in Ethernet header was ARP
  »Non-IP: Ethertype field in Ethernet header was NOT IP
NOTE: An ARP packet will have both the ARP bit and the Non-IP bit set
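As an illustration of how the four exception bits above might be derived from header fields, here is a minimal sketch. The bit positions within the 16-bit Exceptions field are an assumption (the slide names the bits but not their positions), and `exception_bits` is a hypothetical helper. The EtherType values 0x0800 (IPv4) and 0x0806 (ARP) are the standard ones, and an IPv4 header length (IHL) above 5 words means options are present.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED bit positions within the 16-bit Exceptions field. */
#define EXC_TTL     0x0001u /* TTL was 0 or 1 on arrival */
#define EXC_IP_OPT  0x0002u /* IP header carried options */
#define EXC_ARP     0x0004u /* EtherType was ARP */
#define EXC_NON_IP  0x0008u /* EtherType was not IP */

#define ETHERTYPE_IPV4 0x0800u
#define ETHERTYPE_ARP  0x0806u

/* Compute the exception bits for one packet.
 * ttl and ihl are only examined when the frame carries IPv4;
 * ihl is the IPv4 header length in 32-bit words. */
uint16_t exception_bits(uint16_t ethertype, uint8_t ttl, uint8_t ihl)
{
    unsigned e = 0;

    if (ethertype != ETHERTYPE_IPV4) {
        e |= EXC_NON_IP;
        if (ethertype == ETHERTYPE_ARP)
            e |= EXC_ARP; /* ARP sets both ARP and Non-IP, per the NOTE */
    } else {
        if (ttl <= 1)
            e |= EXC_TTL;
        if (ihl > 5)
            e |= EXC_IP_OPT;
    }

    assert((e & 0xFFF0u) == 0); /* the 12 reserved bits stay clear */
    return (uint16_t)e;
}
```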

11 - John DeHart - 9/9/2015 Performance
What is our performance target?
  »To hit 5 Gb/sec rate:
    Minimum Ethernet frame: 76B
      Ø 64B frame + 12B InterFrame Spacing
    5 Gb/sec * 1B/8b * packet/76B = 8.22 Mpkt/sec
  »IXP ME processing:
    1.4 GHz clock rate
    1.4 Gcycle/sec * 1 sec/8.22 Mpkt ≈ 170 cycles per packet
    compute budget: (MEs*170)
      Ø 1 ME: 170 cycles
      Ø 2 ME: 340 cycles
      Ø 3 ME: 510 cycles
      Ø 4 ME: 680 cycles
    latency budget: (threads*170)
      Ø 1 ME: 8 threads: 1360 cycles
      Ø 2 ME: 16 threads: 2720 cycles
      Ø 3 ME: 24 threads: 4080 cycles
      Ø 4 ME: 32 threads: 5440 cycles
slide taken from ONL_NProuter.ppt
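The budget arithmetic on this slide can be reproduced with a few small helpers. This is a sketch for checking the numbers, not part of the router code; the function names are hypothetical, and the inputs (5 Gb/sec, 76-byte minimum frame, 1.4 GHz clock, 8 threads per ME) are the ones stated on the slide.

```c
#include <assert.h>

enum { THREADS_PER_ME = 8 }; /* from the slide: 1 ME = 8 threads */

/* Packets per second at a given line rate and per-packet wire size. */
double packets_per_sec(double line_rate_bps, unsigned frame_bytes)
{
    assert(frame_bytes > 0);
    return line_rate_bps / 8.0 / (double)frame_bytes;
}

/* ME clock cycles available per packet arrival. */
double cycles_per_packet(double clock_hz, double pps)
{
    assert(pps > 0.0);
    return clock_hz / pps;
}

/* Compute budget: instruction cycles available per packet across all MEs. */
unsigned compute_budget(unsigned mes, unsigned cycles_per_pkt)
{
    return mes * cycles_per_pkt;
}

/* Latency budget: elapsed cycles a packet may spend in the block when
 * every thread of every ME works on a different packet. */
unsigned latency_budget(unsigned mes, unsigned cycles_per_pkt)
{
    return mes * THREADS_PER_ME * cycles_per_pkt;
}
```

With the slide's inputs, `packets_per_sec(5e9, 76)` comes out at roughly 8.22 million and `cycles_per_packet(1.4e9, ...)` at roughly 170, which matches the tables above when 170 is used as the per-packet figure.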

12 - John DeHart - 9/9/2015 Lookup Block Diagram
Stages, in order:
  Setup Lookup Key
  Write Lookup Key to TCAM (Write: 5W)
  TimeStamp Delay
  Read 1W Result from AD SRAM (SRAM Read: 1W, 150 cycles memory access latency), ctx_swap
  Check Done Bit
  Read 2W Result from AD SRAM (SRAM Read: 2W, 150 cycles), ctx_swap
  Read 2 Full Results from QDR SRAM (two SRAM Reads of 3W, 150 cycles each), ctx_swap
  Setup Results for Copy
Other diagram label: 315 cycles
TOTAL (no optimization): 915 cycles
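The 915-cycle total is consistent with four 150-cycle SRAM reads plus the 315-cycle figure shown on the diagram. The sketch below checks that sum and compares it with the latency budget from the Performance slide (8 threads per ME, 170 cycles per packet). Treating the 315 cycles as everything other than the four reads is an assumption, since the transcript does not say what that figure covers, and the names are hypothetical.

```c
#include <assert.h>

enum {
    SRAM_ACCESS_CYCLES = 150, /* per-read latency from the diagram */
    NUM_SRAM_READS     = 4,   /* 1W, 2W, and two 3W reads */
    OTHER_CYCLES       = 315, /* ASSUMED: all remaining work */
    CYCLES_PER_PACKET  = 170, /* from the Performance slide */
    THREADS_PER_ME     = 8
};

/* Unoptimized end-to-end latency of one lookup, in ME cycles. */
unsigned lookup_latency_cycles(void)
{
    return NUM_SRAM_READS * SRAM_ACCESS_CYCLES + OTHER_CYCLES;
}

/* Nonzero when the latency fits the budget for the given number of MEs. */
int fits_latency_budget(unsigned latency_cycles, unsigned mes)
{
    assert(mes > 0);
    return latency_cycles <= mes * THREADS_PER_ME * CYCLES_PER_PACKET;
}
```

Under these numbers the unoptimized lookup fits within the 1360-cycle latency budget of a single ME, though the real block shares its MEs with Parse and Copy.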

13 - John DeHart - 9/9/2015 File locations (in …/ONL_Router/)
Code
  »src/applications/ONL_Router/src/plc/ONL/lookup.c
Include Paths
  »src/applications/ONL_Router/src/dispatch_loop/ONL/
    dl_source.h and dl_source.c
      Ø dl_source() and dl_sink() functions
  »src/IDT_NSE/data_place_IXP2XXX/include
    IDT IIPC defines and macros
  »others?

14 - John DeHart - 9/9/2015 Test and Validation

15 - John DeHart - 9/9/2015 Implementation Status Still in pseudo-code Bugs Untested Optimizations:

16 - John DeHart - 9/9/2015 Extra Slides
The rest of the slides either provide extra supporting information or are old and will be deleted when I am convinced they are no longer needed.

17 - John DeHart - 9/9/2015 Route Lookup
Route Lookup Key (72b)
  »Port (3b): Can be a wildcard (for Unicast, probably not for Multicast)
    Value of 111b in Port field can be used to denote a packet that originated from the XScale
    Value of 110b in Port field can be used to denote a packet that originated from a Plugin
    Ports numbered 0-4
  »PluginTag (5b): Can be a wildcard (for Unicast, probably not for Multicast)
    Plugins numbered 0-4
  »DAddr (32b)
    Prefixed for Unicast
    Exact Match for Multicast
  »SAddr (32b)
    Unicast entries always have this and its mask set to 0
    Prefixed for Multicast
Route Lookup: Result (96b)
  »Unicast/Multicast Fields (13b), determined by IP_MCast_Valid bit (1: MCast, 0: Unicast)
    IP_MCast Valid (1b)
    MulticastFields (12b)
      Ø Plugin/Port Selection Bit (1b):
        –0: Send pkt to both Port and Plugin. Does it get the MCast CopyVector?
        –1: Send pkt to all Plugin bits set, include MCast CopyVector in data going to plugins
      Ø MCast CopyVector (11b)
        –One bit for each of the 5 ports and 5 plugins and one bit for the XScale. To drop a MCast, set MCast CopyVector to all 0’s
    UnicastFields (8b)
      Ø Drop Bit (1b)
        –0: handle normally
        –1: Drop Unicast pkt
      Ø Plugin/Port Selection Bit (1b):
        –0: Send packet to port indicated by Unicast Output Port field
        –1: Send packet to plugin indicated by Unicast Output Plugin field. Unicast Output Port, QID, Stats Index, and NH fields also get sent to plugin
      Ø Unicast Output Port (3b): Port or XScale
        –0: Port0, 1: Port1, 2: Port2, 3: Port3, 4: Port4
      Ø Unicast Output Plugin (3b):
        –0: Plugin0, 1: Plugin1, 2: Plugin2, 3: Plugin3, 4: Plugin4
        –5: XScale (treated like a plugin)
  »QID (16b)
  »Stats Index (16b)
  »NH_IP/NH_MAC (48b): At most one of NH_IP or NH_MAC should be valid
  »Valid Bits (3b): At most one of the following three bits should be set
    IP_MCast Valid (1b) (also included above)
    NH_IP_Valid (1b)
    NH_MAC_Valid (1b)
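To make the Unicast/Multicast field description concrete, the sketch below decodes the 12-bit field both ways, with the caller choosing the interpretation from the IP_MCast Valid bit. The bit positions are an assumption (the slides give widths only), the type and function names are hypothetical, and the CopyVector is left undecoded because the slides do not give the order of its port, plugin, and XScale bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED unicast layout of the 12-bit field:
 *   [11] Drop, [10] Plugin/Port Selection, [9:7] Output Port,
 *   [6:4] Output Plugin, [3:0] Reserved */
typedef struct {
    bool    drop;       /* 1: drop the packet */
    bool    to_plugin;  /* 0: send to out_port, 1: send to out_plugin */
    uint8_t out_port;   /* 0-4: Port0..Port4 */
    uint8_t out_plugin; /* 0-4: Plugin0..Plugin4, 5: XScale */
} ucast_fields_t;

/* ASSUMED multicast layout: [11:1] CopyVector, [0] Plugin/Port Selection */
typedef struct {
    uint16_t copy_vector; /* 11 bits: 5 ports, 5 plugins, XScale */
    bool     pps;         /* Plugin/Port Selection bit */
} mcast_fields_t;

/* Interpret the field as unicast (IP_MCast Valid = 0). */
ucast_fields_t ucast_decode(uint16_t f)
{
    ucast_fields_t u;
    assert(f <= 0x0FFFu); /* field is 12 bits wide */
    u.drop       = ((f >> 11) & 1u) != 0;
    u.to_plugin  = ((f >> 10) & 1u) != 0;
    u.out_port   = (uint8_t)((f >> 7) & 7u);
    u.out_plugin = (uint8_t)((f >> 4) & 7u);
    return u;
}

/* Interpret the field as multicast (IP_MCast Valid = 1). */
mcast_fields_t mcast_decode(uint16_t f)
{
    mcast_fields_t m;
    assert(f <= 0x0FFFu);
    m.copy_vector = (uint16_t)((f >> 1) & 0x07FFu);
    m.pps         = (f & 1u) != 0;
    return m;
}

/* Per the slide, an all-zero CopyVector means the multicast is dropped. */
bool mcast_is_drop(mcast_fields_t m)
{
    return m.copy_vector == 0;
}
```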

18 - John DeHart - 9/9/2015 Primary Filter
Primary Filter Lookup Key (140b)
  »Port (3b): Can be a wildcard (for Unicast, probably not for Multicast)
    Value of 111b in Port field to denote coming from the XScale
    Ports numbered 0-4
  »PluginTag (5b): Can be a wildcard (for Unicast, probably not for Multicast)
    Plugins numbered 0-4
  »DAddr (32b)
  »SAddr (32b)
  »Protocol (8b)
  »DPort (16b)
  »SPort (16b)
  »TCP Flags (12b)
  »Exception Bits (16b): Allow for directing of packets based on defined exceptions
Primary Filter Result (104b)
  »Unicast/Multicast Fields (13b), determined by IP_MCast_Valid bit (1: MCast, 0: Unicast)
    IP_MCast Valid (1b)
    MulticastFields (12b)
      Ø Plugin/Port Selection Bit (1b):
        –0: Send pkt to ports and plugins indicated by MCast CopyVector.
        –1: Send pkt to plugin(s) indicated by MCast CopyVector but not ports, and send Plugin(s) the MulticastFields bits
      Ø MCast CopyVector (11b)
        –One bit for each of the 5 ports and 5 plugins and one bit for the XScale. To drop a MCast, set MCast CopyVector to all 0’s
    UnicastFields (8b)
      Ø Drop Bit (1b)
        –0: handle normally
        –1: Drop pkt
      Ø Plugin/Port Selection Bit (1b):
        –0: Send packet to port indicated by Unicast Output Port field
        –1: Send packet to plugin indicated by Unicast Output Plugin field. Unicast Output Port, QID, Stats Index, and NH fields also get sent to plugin
      Ø Unicast Output Port (3b): Port or XScale
        –0: Port0, 1: Port1, 2: Port2, 3: Port3, 4: Port4
      Ø Unicast Output Plugin (3b):
        –0: Plugin0, 1: Plugin1, 2: Plugin2, 3: Plugin3, 4: Plugin4
        –5: XScale (treated like a plugin)
  »QID (16b)
  »Stats Index (16b)
  »NH IP(32b)/MAC(48b) (48b): At most one of NH_IP or NH_MAC should be valid
  »Valid Bits (3b): At most one of the following three bits should be set
    IP_MCast Valid (1b) (also included above)
    NH IP Valid (1b)
    NH MAC Valid (1b)
  »Priority (8b)

19 - John DeHart - 9/9/2015 Auxiliary Filter
Auxiliary Filter Lookup Key (140b)
  »Port (3b): Can be a wildcard (for Unicast, probably not for Multicast)
    Value of 111b in Port field to denote coming from the XScale
    Ports numbered 0-4
  »PluginTag (5b): Can be a wildcard (for Unicast, probably not for Multicast)
    Plugins numbered 0-4
  »DAddr (32b)
  »SAddr (32b)
  »Protocol (8b)
  »DPort (16b)
  »SPort (16b)
  »TCP Flags (12b)
  »Exception Bits (16b)
    Allow for directing of packets based on defined exceptions
    Can be wildcarded.
Auxiliary Filter Lookup Result (93b)
  »Unicast Fields (8b): (No Multicast fields)
    Drop Bit (1b) (Should never actually be set by control software, but kept here for symmetry with other Unicast Fields)
      Ø 0: handle normally
      Ø 1: Drop pkt
    Plugin/Port Selection Bit (1b):
      Ø 0: Send packet to port indicated by Unicast Output Port field
      Ø 1: Send packet to plugin indicated by Unicast Output Plugin field. Unicast Output Port, QID, Stats Index, and NH fields also get sent to plugin
    Unicast Output Port (3b): Port or XScale
      Ø 0: Port0, 1: Port1, 2: Port2, 3: Port3, 4: Port4
    Unicast Output Plugin (3b):
      Ø 0: Plugin0, 1: Plugin1, 2: Plugin2, 3: Plugin3, 4: Plugin4
      Ø 5: XScale
  »QID (16b)
  »Stats Index (16b)
  »NH IP(32b)/MAC(48b) (48b): At most one of NH_IP or NH_MAC should be valid
  »Valid Bits (3b): At most one of the following three bits should be set
    NH IP Valid (1b)
    NH MAC Valid (1b)
    IP_MCast Valid (1b): Should always be 0 for AF Result
  »Sampling bits (2b): For Aux Filters only
    00: “Sample All”
    01: Use Random Number generator 1
    10: Use Random Number generator 2
    11: Use Random Number generator 3