An NP-Based Router for the Open Network Lab


1 An NP-Based Router for the Open Network Lab
John DeHart

2 Schedule
April 10: Header Format (Mike)
April 17: Parse, Lookup and Copy (Jing and John)
April 24: Stats and FreelistMgr (Dave and John)
May 1: Mux (Mart)
May 8: XScale (Charlie)
May 15: Plugins (Charlie and Shakir)

3 Svn with remote server To connect to our current svnserve configuration from a remote machine outside of WU, you will need to use an SSH tunnel.

4 Svn with remote server

5 Svn with remote server Then, from Cygwin, use this command:
svn checkout svn://localhost:7071/techX
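The tunnel command itself (the figure on slide 4 is missing from this transcript) presumably looks something like the following; the server name and login are placeholders, and svnserve's default port 3690 is an assumption, while local port 7071 matches the checkout URL above:

```shell
# Hypothetical SSH tunnel for reaching svnserve from outside WU.
# "svnhost.arl.wustl.edu" and "username" are placeholders; 3690 is
# svnserve's default port; local port 7071 matches the svn:// URL on slide 5.
ssh -L 7071:localhost:3690 username@svnhost.arl.wustl.edu
```

With the tunnel up, `svn checkout svn://localhost:7071/techX` on the local machine is forwarded through SSH to svnserve on the server.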

6 Notes from 3/27/07
Schedule?
Bank1: check SRAM access bw with QM and Rings
QID: Use 1-5 for port numbers in top 3 bits so that a DG qid of 0 will not result in a full QID=0
NSP: Should we allow users to assign traffic to a DG queue?
Do we need the Drop bit, or is it sufficient to have a copy_vector of 0 to indicate a drop?
Define how/where sampling filters are implemented and how the sampling happens.
QIDs and Plugins are still fuzzy. Should there be a restricted set of QIDs that the Plugins can allocate that the RLI and XScale are not allowed to allocate?
Can we put Output Port in data going to the XScale?
Buffer Offset: Should it point to the start of the ethernet header or the start of the IP pkt?
Length fields in the buffer descriptor: are they ethernet frame length or IP packet length?
Add Buffer Chaining slides.

7 JST: Objectives for ONL Router
Reproduce approximately the same functionality as the current hardware router: routes, filters (including sampling filters), stats, plugins.
Extensions: multicast, explicit-congestion marking.
Use each NPU as a separate 5 port router, each responsible for half the external ports.
xScale on each NPU implements CP functions: access to control variables and memory-resident statistics; updating of routes and filters; interaction with plugins through shared memory; simple message buffer interface for request/response.

8 JST: Unicast, ARP and Multicast
Each port has an Ethernet header with a fixed source MAC address; there are several cases for the destination MAC address.
Case 1: unicast packet with destination on an attached subnet. Requires ARP to map dAdr to a MAC address; the ARP cache holds mappings; issue an ARP request on a cache miss.
Case 2: other unicast packets. Lookup must provide the next-hop IP address, then use ARP to obtain the MAC address, as in case 1.
Case 3: multicast packet. Lookup specifies copy-vector and QiD; the destination MAC address is formed from the IP multicast address.
Could avoid ARP in some cases, e.g. a point-to-point link, but there is little advantage, since the ARP mechanism is required anyway.
Do we learn MAC Addresses from received pkts?

9 JST: Proposed Approach
Lookup does a separate route lookup and filter lookup: at most one match for a route, up to two for filters (primary, aux). Combine route lookup with ARP cache lookup; the xScale adds routes for multi-access subnets, based on ARP.
Route lookup for unicast: stored keys are (rcv port)+(dAdr prefix); lookup key is (rcv port)+(dAdr); result includes Port/Plugin, QiD, next-hop IP or MAC address, valid next-hop bit.
For multicast: stored keys are (rcv port)+(dAdr)+(sAdr prefix); lookup key is (rcv port)+(dAdr)+(sAdr); result includes 10 bit copy vector, QiD.
Filter lookup: stored key is IP 5-tuple + TCP flags, with arbitrary bit masks allowed; lookup key is IP 5-tuple + flags if applicable; result includes Port/Plugin or copy vector, QiD, next-hop IP or MAC address, valid next-hop bit, primary-aux bit, priority.
Destination MAC address passed through the QM by being written in the buffer descriptor? Do we have 48 bits to spare? Yes, we actually have 14 free bytes: enough for a full (non-vlan) ethernet header.

10 JST: Lookup Processing
On receiving a unicast packet, do route & filter lookups. If the MAC address returned by the route (or by a higher priority primary filter) is valid, queue the packet and continue; else, pass the packet to the xScale, marking it as no-MAC. Leave it to the xScale to generate the ARP request, handle the reply, insert the route and re-inject the packet into the data path.
On receiving a multicast packet, do route & filter lookups. Take the higher priority result from the route lookup or primary filter, format the MAC multicast address, and copy to the queues specified by the copy vector. If there is a matching auxiliary filter, the filter supplies the MAC address.

11 ONL NP Router (Jon’s Original)
xScale xScale add large SRAM ring TCAM SRAM Rx (2 ME) Mux (1 ME) Parse, Lookup, Copy (3 MEs) Queue Manager (1 ME) HdrFmt (1 ME) Tx (2 ME) Stats (1 ME) large SRAM ring Each output has common set of QiDs Multicast copies use same QiD for all outputs QiD ignored for plugin copies Plugin Plugin Plugin Plugin Plugin xScale SRAM large SRAM ring

12 ONL NP Router (JDD) xScale xScale TCAM SRAM Rx (2 ME) Mux (1 ME)
Ring Scratch Ring TCAM Assoc. Data ZBT-SRAM SRAM NN NN Ring 64KW Rx (2 ME) Mux (1 ME) Parse, Lookup, Copy (3 MEs) QM (1 ME) HdrFmt (1 ME) Tx (1 ME) NN Mostly Unchanged 64KW SRAM 64KW Each New NN NN NN NN Plugin0 Plugin1 Plugin2 Plugin3 Plugin4 xScale SRAM Needs A Lot Of Mod. Needs Some Mod. Stats (1 ME) Tx, QM Parse Plugin XScale QM Copy Plugins FreeList Mgr (1 ME) SRAM

13 Project Assignments
XScale daemons, etc: Charlie, with Design and Policy help from Fred and Ken
PLC (Parse, Lookup and Copy): Jing and JohnD, with consulting from Brandon
QM: Dave and JohnD
Rx: Dave
Tx: Dave
Stats: Dave
Header Format: Mike
Mux: Mart?
Freelist_Mgr: JohnD
Plugin Framework: Charlie and Shakir, with consulting from Ken
Dispatch loop and utilities: All (dl_sink_to_Stats, dl_sink_to_freelist_mgr; these should take in a signal and not wait)
Documentation: Ken, with help from All
Test cases and test pkt generation: Brandon

14 Project Level Stuff
Upgrade to IXA SDK 4.3.1: Techx/Development/IXP_SDK_4.3/{cd1,cd2,4-3-1_update}
Project Files: we're working on them right now.
C vs. uc: probably any new blocks should be written in C; existing code (Rx, Tx, QM, Stats) can remain as uc; Freelist Mgr might go either way.
Stubs: do we need them this time around?
SRAM rings: we need to understand the implications of using them. No way to pre-test for empty/full?
Subversion: do we want to take this opportunity to upgrade?
Current version: Cygwin (my laptop): Linux (bang.arl.wustl.edu): 1.3.2
Available: Cygwin: subversion.tigris.org: 1.4.3

15 Hardware Promentum™ ATCA-7010 (NP Blade): Two Intel IXP2850 NPs
1.4 GHz Core, 700 MHz XScale
Each NPU has:
3x256MB RDRAM, 533 MHz, 3 Channels; address space is striped across all three
4 QDR II SRAM Channels; Channels 1, 2 and 3 populated with 8MB each, running at 200 MHz
16KB of Scratch Memory
16 Microengines; Instruction Store: 8K 40-bit wide instructions; Local Memory: 640 32-bit words
TCAM: Network Search Engine (NSE) on SRAM channel 0; each NPU has a separate LA-1 Interface; Part Number: IDT75K72234, 18Mb TCAM
Rear Transition Module (RTM): connects via ATCA Zone 3; 10 1GE Physical Interfaces; supports Fiber or Copper interfaces using SFP modules

16 Hardware ATCA Chassis NP Blade RTM

17 NP Blades

18 ONL Router Architecture
Each NPU is one 5-port Router
ONL Chassis has no switch blade
1Gb/s Links on RTM connect to external ONL switch(es)
(Diagram: NPUA and NPUB each provide 5x1Gb/s links, connected over SPI to the RTM on the 7010 Blade)

19 Performance What is our performance target? To hit 5 Gb rate:
Minimum Ethernet frame: 76B (64B frame + 12B InterFrame Spacing)
5 Gb/sec * 1B/8b * packet/76B = 8.22 Mpkt/sec
IXP ME processing: 1.4 GHz clock rate
1.4 Gcycle/sec * 1 sec / 8.22 Mpkt = ~170 cycles per packet
Compute budget (MEs*170): 1 ME: 170 cycles; 2 MEs: 340 cycles; 3 MEs: 510 cycles; 4 MEs: 680 cycles
Latency budget (threads*170): 1 ME: 8 threads: 1360 cycles; 2 MEs: 16 threads: 2720 cycles; 3 MEs: 24 threads: 4080 cycles; 4 MEs: 32 threads: 5440 cycles
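The slide's arithmetic can be checked with a short sketch (all constants are taken from the slide; only the rounding to a whole 170 cycles is mine):

```python
# Back-of-the-envelope check of the performance-budget numbers.
LINK_RATE_BPS = 5e9      # target aggregate rate: 5 Gb/s
MIN_FRAME_BYTES = 76     # 64B minimum frame + 12B inter-frame spacing
ME_CLOCK_HZ = 1.4e9      # microengine clock: 1.4 GHz

pkt_rate = LINK_RATE_BPS / 8 / MIN_FRAME_BYTES   # ~8.22 Mpkt/sec
cycle_budget = ME_CLOCK_HZ / pkt_rate            # ~170 cycles per packet per ME

# A stage with n MEs gets n*170 cycles of compute per packet; with 8 threads
# per ME, a packet may spend n*8*170 cycles of latency inside the stage.
compute_budget = {n: n * round(cycle_budget) for n in (1, 2, 3, 4)}
latency_budget = {n: n * 8 * round(cycle_budget) for n in (1, 2, 3, 4)}
```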

20 ONL NP Router xScale xScale TCAM SRAM Rx (2 ME) Mux (1 ME) Parse,
Assoc. Data ZBT-SRAM SRAM 64KW Rx (2 ME) Mux (1 ME) Parse, Lookup, Copy (3 MEs) QM (1 ME) HdrFmt (1 ME) Tx (1 ME) NN 64KW SRAM 64KW Each SRAM Ring NN NN NN NN Plugin0 Plugin1 Plugin2 Plugin3 Plugin4 Scratch Ring xScale SRAM NN NN Ring Stats (1 ME) QM Copy Plugins SRAM FreeList Mgr (1 ME) Tx, QM Parse Plugin XScale

21 Inter Block Rings
Scratch Rings (sizes in 32b Words: 128, 256, 512, 1024)
XScale -> MUX: 3 Words per pkt, 256 Word Ring: 256/3 pkts
PLC -> XScale; MUX -> PLC -> QM: 3 Words per pkt, 1024 Word Ring: 1024/3 pkts
HF -> TX: 5 Words per pkt: 256/5 pkts
-> Stats: 1 Word per pkt: 256 pkts
-> Freelist Mgr
Total Scratch Size: 4KW (16KB); Total Used in Rings: 2.5 KW

22 Inter Block Rings
SRAM Rings (sizes in 32b KW: 0.5, 1, 2, 4, 8, 16, 32, 64)
RX -> MUX: 2 Words per pkt, 64KW Ring: 32K pkts
PLC -> Plugins (5 of them): 3 Words per pkt, 64KW Rings: ~21K pkts
Plugins -> MUX (1 serving all plugins)
NN Rings (128 32b words)
QM -> HF: 1 Word per pkt: 128 pkts
Plugin N -> Plugin N+1 (for N=1 to N=4): Words per pkt is plugin dependent
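A quick sketch of the capacity arithmetic behind these figures (ring names and sizes from the slide):

```python
# Ring capacity in packets = ring size in words // words consumed per packet.
def ring_capacity(ring_words, words_per_pkt):
    return ring_words // words_per_pkt

assert ring_capacity(64 * 1024, 2) == 32768   # RX -> MUX SRAM ring: "32K Pkts"
assert ring_capacity(64 * 1024, 3) == 21845   # PLC -> Plugin SRAM ring: "~21K Pkts"
assert ring_capacity(128, 1) == 128           # QM -> HF NN ring: "128 Pkts"
```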

23 ONL SRAM Buffer Descriptor
Problem: With the use of Filters, Plugins and recycling back around for reclassification, we can end up with an arbitrary number of copies of one packet in the system at a time. Each copy of a packet could end up going to an output port and need a different MAC DAddr from all the other copies. Having one buffer descriptor per packet, regardless of the number of copies, will not be sufficient.
Solution: When there are multiple copies of the packet in the system, each copy will need a separate Header buffer descriptor which will contain the MAC DAddr for that copy.
SRAM buffer descriptors are the scarce resource and we want to optimize their use, so we do NOT want to always prepend a Header buffer descriptor. When the Copy block gets a packet for which it only needs to send one copy to the QM, it will read the current reference count; if this copy is the ONLY copy in the system, it will not prepend the Header buffer descriptor. Otherwise, Copy will prepend a Header buffer descriptor to each copy going to the QM.
Copy does NOT need to prepend a Header buffer descriptor to copies going to plugins.
Copy does NOT need to prepend a Header buffer descriptor to a copy going to the XScale.
The Header buffer descriptors will come from the same pool (freelist 0) as the Packet/Payload buffer descriptors. There is no advantage to associating these Header buffer descriptors with small DRAM buffers: DRAM is not the scarce resource, SRAM buffer descriptors are.
We want to avoid a descriptor coming in to PLC for reclassification with the Header buffer descriptor chained in front of the payload buffer descriptor. Plugins and the XScale should append a Header buffer descriptor when they are sending something that has copies directly to the QM, or to Mux and PLC for PassThrough.

24 ONL SRAM Buffer Descriptor
LW0: Buffer_Next (32b)
LW1: Buffer_Size (16b) | Offset (16b)
LW2: Packet_Size (16b) | Free_list (4b) | Reserved (4b) | Ref_Cnt (8b)
LW3: MAC DAddr_47_32 (16b) | Stats Index (16b)
LW4: MAC DAddr_31_00 (32b)
LW5: EtherType (16b) | Reserved (16b)
LW6: Reserved (32b)
LW7: Packet_Next (32b)
(Figure annotations: Ref_Cnt is written by Rx, added to by Copy, and decremented by the Freelist Mgr; the other fields are variously written by the Freelist Mgr, Rx, Copy, Rx and Plugins, and the QM.)
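As an illustration of the LW2 word, here is a hypothetical pack/unpack helper; the assumption that fields sit most-significant first, in the order listed (Packet_Size, Free_list, Reserved, Ref_Cnt), is mine and not stated on the slide:

```python
# Hypothetical packing of LW2 of the SRAM buffer descriptor, assuming
# Packet_Size occupies bits 31:16, Free_list 15:12, Reserved 11:8, Ref_Cnt 7:0.
def pack_lw2(packet_size, free_list, ref_cnt):
    assert packet_size < 1 << 16 and free_list < 1 << 4 and ref_cnt < 1 << 8
    return (packet_size << 16) | (free_list << 12) | ref_cnt

def unpack_lw2(lw2):
    return (lw2 >> 16) & 0xFFFF, (lw2 >> 12) & 0xF, lw2 & 0xFF

# Round trip: a 1500B packet from freelist 0 with a single reference.
assert unpack_lw2(pack_lw2(1500, 0, 1)) == (1500, 0, 1)
```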

25 ONL DRAM Buffer and SRAM Buffer Descriptor
SRAM Buffer Descriptor Fields:
Buffer_Next: ptr to next buffer in a multi-buffer packet
Buffer_Size: number of bytes in the associated DRAM buffer
Packet_Size: total number of bytes in the pkt; QM (dequeue) uses this to decrement qlength
Offset: byte offset into the DRAM buffer where the packet (ethernet frame) starts. From Rx: 0x180 is the constant offset to the start of the Ethernet Hdr; 0x18E is the constant offset to the start of the IP/ARP/etc hdr. However, Plugins can do ANYTHING, so we cannot depend on the constant offsets. The following slides will, however, assume that nothing funny has happened.
Freelist: Id of the freelist that this buffer came from and should be returned to when it is freed
Ref_Cnt: number of copies of this buffer currently in the system
MAC_DAddr: Ethernet MAC Destination Address that should be used for this packet
Stats Index: index into the statistics counters that should be used for this packet
EtherType: Ethernet Type field that should be used for this packet
Packet_Next: ptr to next packet in the queue when this packet is queued by the QM
(Figure: descriptor word layout and DRAM buffer map: 0x000 Empty, 0x180 Ethernet Hdr, 0x18E IP Packet, up to 0x800.)

26 ONL DRAM Buffer and SRAM Buffer Descriptor
Normal Unicast case: one copy of the packet being sent to one output port.
SRAM Buffer Descriptor Fields:
Buffer_Next: NULL
Buffer_Size: IP_Pkt_Length
Packet_Size: IP_Pkt_Length
Offset: 0x18E
Freelist: 0
Ref_Cnt: 1
MAC_DAddr: <result of lookup>
Stats Index: <from lookup result>
EtherType: 0x0800 (IP)
Packet_Next: <as used by QM>
(Figure: single buffer descriptor and DRAM buffer: 0x000 Empty, 0x180 Ethernet Hdr, 0x18E IP Packet, up to 0x800.)

27 ONL DRAM Buffer and SRAM Buffer Descriptor
Multi-copy case: >1 copy of the packet in the system. This copy going from Copy to QM to go out on an output port.
(Figure: a Header Buf Descriptor chained via Buffer_Next to the Payload Buf Descriptor; the header descriptor's DRAM buffer is empty, while the payload descriptor's buffer holds the Ethernet Hdr at 0x180 and the IP Packet at 0x18E, up to 0x800.)

28 ONL DRAM Buffer and SRAM Buffer Descriptor
Multi-copy case (continued): >1 copy of the packet in the system. This copy going from Copy to QM to go out on an output port.
Header Buf Descriptor SRAM fields:
Buffer_Next: ptr to payload buf desc
Buffer_Size: 0 (Don't Care)
Packet_Size: IP_Pkt_Length
Offset: (Don't Care)
Freelist: 0
Ref_Cnt:
MAC_DAddr: <result of lookup>
Stats Index: <from lookup result>; different copies of the same packet may actually have different Stats Indices
EtherType: 0x0800 (IP)
Packet_Next: <as used by QM>
(Figure: Header Buf Descriptor chained to Payload Buf Descriptor as on the previous slide.)

29 ONL DRAM Buffer and SRAM Buffer Descriptor
Multi-copy case (continued): >1 copy of the packet in the system. This copy going from Copy to QM to go out on an output port.
Payload Buf Descriptor SRAM fields:
Buffer_Next: NULL
Buffer_Size: IP_Pkt_Length
Packet_Size: IP_Pkt_Length
Offset: 0x18E
Freelist: 0
Ref_Cnt: <number of copies currently in system>
MAC_DAddr: <don't care>
Stats Index: <should not be used>
EtherType: <don't care>
Packet_Next: <should not be used>
(Figure: Header Buf Descriptor chained to Payload Buf Descriptor as on the previous slides.)

30 ONL SRAM Buffer Descriptor
Rx writes: Buffer_size <- ethernet frame length; Packet_size <- ethernet frame length; Offset <- 0x180; Freelist <- 0
Mux Block writes: Buffer_size <- (frame length from Rx) - 14; Packet_size <- (frame length from Rx) - 14; Offset <- 0x18E; Ref_cnt <- 1
Copy Block initializes a newly allocated Hdr desc:
Buffer_Next <- ptr to original payload buffer
Buffer_size <- 0 (don't care, no one should be using this field)
Packet_size <- IP Pkt Length (should be the length from the input ring)
Offset <- 0 (don't care, no one should be using this field)
Stats_Index <- from lookup result
MAC DAddr <- from lookup result (or calculated for Mcast)
EtherType <- 0x0800 (IP); if Copy is making copies then we must have done a classification, so it must have been an IP packet
Packet_Next <- 0
The QM will now be using the IP Pkt length for its qlength increments and decrements.

31 SRAM Usage
What will be using SRAM?
Buffer descriptors: current MR supports 229,376 buffers; 32 Bytes per SRAM buffer descriptor: 7 MBytes
Queue Descriptors: current MR supports 65,536 queues; 16 Bytes per Queue Descriptor: 1 MByte
Queue Parameters: 16 Bytes per Queue Params (actually only 12 used in SRAM)
QM Scheduling structure: current MR supports batch buffers per QM ME; 44 Bytes per batch buffer
QM Port Rates: 4 Bytes per port
Plugin "scratch" memory: how much per plugin?
Large inter-block rings: Rx -> Mux; -> Plugins; Plugins ->
Stats/Counters: currently 64K sets, 16 bytes per set: 1 MByte
Lookup Results
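The stated totals are consistent with the per-entry sizes; a quick check (the 65,536-queue count is implied by 1 MByte at 16 Bytes per Queue Descriptor):

```python
# Sanity-checking the SRAM sizing figures on the slide.
MB = 1 << 20

assert 229_376 * 32 == 7 * MB        # buffer descriptors: 7 MBytes
assert (1 * MB) // 16 == 65_536      # queue descriptors: 1 MByte holds 64K queues
assert 64 * 1024 * 16 == 1 * MB      # stats: 64K sets * 16 bytes = 1 MByte
```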

32 SRAM Bank Allocation
SRAM Banks:
Bank 0: 4 MB total, 2MB per NPU; same interface/bus as the TCAM
Banks 1-3: 8 MB each
Criteria for how SRAM banks should be allocated?
Size
SRAM Bandwidth: how many SRAM accesses per packet are needed for the various SRAM uses?
QM needs buffer desc and queue desc in the same bank

33 Proposed SRAM Bank Allocation
SRAM Bank 0: TCAM Lookup Results
SRAM Bank 1 (2.5MB/8MB):
QM Queue Params (1MB)
QM Scheduling Struct (0.5 MB)
QM Port Rates (20B)
Large Inter-Block Rings (1MB); SRAM Rings are of sizes (in Words): 0.5K, 1K, 2K, 4K, 8K, 16K, 32K, 64K
Rx -> Mux (2 Words per pkt): 64KW (32K pkts): 128KB
PLC -> Plugin (3 Words per pkt): 64KW each (21K pkts each): 640KB
Plugin -> Mux (3 Words per pkt): 64KW (21K pkts): 256KB
SRAM Bank 2 (8MB/8MB):
Buffer Descriptors (7MB)
Queue Descriptors (1MB)
SRAM Bank 3 (6MB/8MB):
Stats Counters (1MB)
Global Registers (256 * 4B)
Plugin "scratch" memory (5MB, 1MB per plugin)

34 Queues and QIDs
Assigned Queues vs. Datagram Queues:
A flow or set of flows can be assigned to a specific Queue by assigning a specific QID to its/their filter(s) and/or route(s).
A flow can be assigned to use a Datagram queue by assigning QID=0 to its filter(s) and/or route(s). There are 64 datagram queues.
If it sees a lookup result with QID=0, the PLC block will calculate the datagram QID for the result based on the following hash function:
DG QID = SA[9:8] SA[6:5] DA[6:5]
That is, concatenate IP src addr bits 9 and 8, IP src addr bits 6 and 5, and IP dst addr bits 6 and 5.
Who/What assigns QIDs to flows?
The ONL User can assign QIDs to flows or sets of flows using the RLI.
The XScale daemon can assign QIDs to flows on behalf of the User/RLI if so requested: the User indicates that they want an assigned QID but they want the system to pick it for them and report it back.
The ONL User indicates that they want to use a datagram queue, and the data path (Copy block) calculates the QID using a defined hash function.
Using the same QID for all copies of a multicast does not work: the QM does not partition QIDs across ports. We cannot assume that the User will partition the QIDs, so we will have to enforce a partitioning.
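A sketch of the datagram-QID hash as described on the slide (bit numbering is assumed to be 0-based from the least significant bit of each address):

```python
def dg_qid(saddr, daddr):
    """Datagram QID hash from the slide: concatenate IP source address
    bits [9:8] and [6:5] with IP destination address bits [6:5]."""
    sa_98 = (saddr >> 8) & 0x3
    sa_65 = (saddr >> 5) & 0x3
    da_65 = (daddr >> 5) & 0x3
    return (sa_98 << 4) | (sa_65 << 2) | da_65   # 6 bits -> one of 64 queues

# Example: 192.168.1.1 -> 10.0.0.1 hashes to queue 16.
assert dg_qid(0xC0A80101, 0x0A000001) == 16
assert all(0 <= dg_qid(s, 0) <= 63 for s in range(1024))
```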

35 Queues and QIDs (continued)
Proposed partitioning of QIDs:
QID[15:13]: Port Number 0-4 (numbered 1-5); the Copy block will add these bits
QID[12:0]: per port queues
8128 Reserved queues per port
64 Datagram queues per port (yyy xx xxxx: Datagram queues for port <yyy>)
QIDs 64-8191: per port Reserved Queues
QIDs 0-63: per port Datagram Queues
With this partitioning, only 13 bits of the QID should be made available to the ONL User.
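A minimal sketch of the proposed partitioning, assuming QID[15:13] holds port+1 (per the 3/27/07 note that a DG qid of 0 must not produce a full QID of 0):

```python
# Full 16b QID = (port number + 1) in bits [15:13] | 13b per-port queue id.
def full_qid(port, per_port_qid):
    assert 0 <= port <= 4 and 0 <= per_port_qid < 1 << 13
    return ((port + 1) << 13) | per_port_qid

assert full_qid(0, 0) != 0               # datagram queue 0 on port 0 is non-zero
assert full_qid(4, 63) == (5 << 13) | 63 # last datagram queue on the last port
```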

36 Lookups How will lookups be structured?
Three Databases:
Route Lookup: containing Unicast and Multicast entries.
Unicast: Port can be wildcarded; Longest Prefix Match on DAddr; routes should be sorted in the DB with longest prefixes first.
Multicast: Port can be wildcarded?; Exact Match on DAddr; Longest Prefix Match on SAddr; routes should be sorted in the DB with longest prefixes first.
Primary Filter: filters should be sorted in the DB with higher priority filters first.
Auxiliary Filter.
Priority between Primary Filter and Route Lookup:
A priority will be stored with each Primary Filter.
A priority will be assigned to RLs (all routes have the same priority).
The PF priority and RL priority are compared after the results are retrieved, and one of them is selected based on this priority comparison.
Auxiliary Filters: if matched, cause a copy of the packet to be sent out according to the Aux Filter's result.

37 Route Lookup
Route Lookup Key (72b):
Port (3b): can be a wildcard (for Unicast, probably not for Multicast); value 111b in the Port field can be used to denote a packet that originated from the XScale; value 110b can be used to denote a packet that originated from a Plugin; ports numbered 0-4
PluginTag (5b): can be a wildcard (for Unicast, probably not for Multicast); plugins numbered 0-4
DAddr (32b): prefixed for Unicast; exact match for Multicast
SAddr (32b): Unicast entries always have this and its mask set to 0; prefixed for Multicast
Route Lookup Result (96b):
Unicast/Multicast Fields (13b), determined by the IP_MCast_Valid bit (1: MCast, 0: Unicast):
IP_MCast Valid (1b)
MulticastFields (12b):
Plugin/Port Selection Bit (1b): 0: send pkt to both Port and Plugin (does it get the MCast CopyVector?); 1: send pkt to all Plugin bits set, and include the MCast CopyVector in the data going to plugins
MCast CopyVector (11b): one bit for each of the 5 ports and 5 plugins and one bit for the XScale; to drop a MCast, set the MCast CopyVector to all 0's
UnicastFields (8b):
Drop Bit (1b): 0: handle normally; 1: drop Unicast pkt
Plugin/Port Selection Bit (1b): 0: send packet to the port indicated by the Unicast Output Port field; 1: send packet to the plugin indicated by the Unicast Output Plugin field; Unicast Output Port, QID, Stats Index, and NH fields also get sent to the plugin
Unicast Output Port (3b): Port or XScale; 0: Port0, 1: Port1, 2: Port2, 3: Port3, 4: Port4
Unicast Output Plugin (3b): 0: Plugin0, 1: Plugin1, 2: Plugin2, 3: Plugin3, 4: Plugin4, 5: XScale (treated like a plugin)
QID (16b)
Stats Index (16b)
NH_IP/NH_MAC (48b): at most one of NH_IP or NH_MAC should be valid
Valid Bits (3b): at most one of the following three bits should be set: IP_MCast Valid (1b) (also included above), NH_IP_Valid (1b), NH_MAC_Valid (1b)

38 Lookup Key and Results Formats
140 Bit Key (PF and AF; RL uses the Port, Plugin Tag, DAddr, and SAddr fields): P (3b) | P Tag (5b) | IP DAddr (32b) | IP SAddr (32b) | Proto (8b) | DPort (16b) | SPort (16b) | TCP Flags (12b) | Exceptions (16b)
32 Bit Result in TCAM Assoc. Data SRAM: D (1b) | H (1b) | MH (1b) | Prio (8b, PF only; Res for RL and AF) | Address (21b); plus Entry Valid (1b). TCAM Ctrl Bits: D: Done, H: Hit, MH: Multi-Hit.
96 Bit Result in QDR SRAM Bank0 (RL, PF, AF): UCast/MCast (12b) (for AF: UniCast (8b) | SB (2b) | Res (2b)) | QID (16b) | Stats Index (16b) | NH_MAC (48b) | NH_IP (32b) | V (4b: NH IP / MAC / MC valid) | Res (16b)
UCast/MCast field: if IP MC Valid = 0: D (1b) | PPS | UCast Out Port (3b) | Out Plugin | Reserved (4b); if IP MC Valid = 1: Multicast Copy Vector (11b) | PPS (1b)

39 Multicast Copy Vector (11b)
Route Lookup: format of the UCast/MCast fields in ring data going to the XScale and Plugins:
Multicast (IP MCV = 1): Reserved (3b) | IP MCV | PPS (1b) | Multicast Copy Vector (11b)
Unicast (IP MCV = 0): Reserved (3b) | IP MCV | PPS | D (1b) | UCast Out Port | UCast Out Plugin (4b)

40 Primary Filter
Primary Filter Lookup Key (140b):
Port (3b): can be a wildcard (for Unicast, probably not for Multicast); value 111b in the Port field denotes coming from the XScale; ports numbered 0-4
PluginTag (5b): can be a wildcard (for Unicast, probably not for Multicast); plugins numbered 0-4
DAddr (32b); SAddr (32b); Protocol (8b); DPort (16b); SPort (16b); TCP Flags (12b)
Exception Bits (16b): allow for directing of packets based on defined exceptions
Primary Filter Result (104b):
Unicast/Multicast Fields (13b), determined by the IP_MCast_Valid bit (1: MCast, 0: Unicast):
IP_MCast Valid (1b)
MulticastFields (12b):
Plugin/Port Selection Bit (1b): 0: send pkt to the ports and plugins indicated by the MCast Copy Vector; 1: send pkt to the plugin(s) indicated by the MCast Copy Vector but not the ports, and send the Plugin(s) the MulticastFields bits
MCast CopyVector (11b): one bit for each of the 5 ports and 5 plugins and one bit for the XScale; to drop a MCast, set the MCast CopyVector to all 0's
UnicastFields (8b):
Drop Bit (1b): 0: handle normally; 1: drop pkt
Plugin/Port Selection Bit (1b): 0: send packet to the port indicated by the Unicast Output Port field; 1: send packet to the plugin indicated by the Unicast Output Plugin field; Unicast Output Port, QID, Stats Index, and NH fields also get sent to the plugin
Unicast Output Port (3b): Port or XScale; 0: Port0, 1: Port1, 2: Port2, 3: Port3, 4: Port4
Unicast Output Plugin (3b): 0: Plugin0, 1: Plugin1, 2: Plugin2, 3: Plugin3, 4: Plugin4, 5: XScale (treated like a plugin)
QID (16b)
Stats Index (16b)
NH IP(32b)/MAC(48b) (48b): at most one of NH_IP or NH_MAC should be valid
Valid Bits (3b): at most one of the following three bits should be set: IP_MCast Valid (1b) (also included above), NH IP Valid (1b), NH MAC Valid (1b)
Priority (8b)

41 Auxiliary Filter
Auxiliary Filter Lookup Key (140b):
Port (3b): can be a wildcard (for Unicast, probably not for Multicast); value 111b in the Port field denotes coming from the XScale; ports numbered 0-4
PluginTag (5b): can be a wildcard (for Unicast, probably not for Multicast); plugins numbered 0-4
DAddr (32b); SAddr (32b); Protocol (8b); DPort (16b); SPort (16b); TCP Flags (12b)
Exception Bits (16b): allow for directing of packets based on defined exceptions; can be wildcarded
Auxiliary Filter Lookup Result (92b):
Unicast Fields (7b) (no Multicast fields):
Plugin/Port Selection Bit (1b): 0: send packet to the port indicated by the Unicast Output Port field; 1: send packet to the plugin indicated by the Unicast Output Plugin field; Unicast Output Port, QID, Stats Index, and NH fields also get sent to the plugin
Unicast Output Port (3b): Port or XScale; 0: Port0, 1: Port1, 2: Port2, 3: Port3, 4: Port4
Unicast Output Plugin (3b): 0: Plugin0, 1: Plugin1, 2: Plugin2, 3: Plugin3, 4: Plugin4, 5: XScale
QID (16b)
Stats Index (16b)
NH IP(32b)/MAC(48b) (48b): at most one of NH_IP or NH_MAC should be valid
Valid Bits (3b): at most one of the following three bits should be set: NH IP Valid (1b), NH MAC Valid (1b), IP_MCast Valid (1b, should always be 0 for an AF Result)
Sampling bits (2b), for Aux Filters only: 00: "Sample All"; 01: use Random Number generator 1; 10: use Random Number generator 2; 11: use Random Number generator 3

42 TCAM Operations for Lookups
Five TCAM Operations of interest:
Lookup (Direct): 1 DB, 1 Result
Multi-Hit Lookup (MHL) (Direct): 1 DB, <= 8 Results
Simultaneous Multi-Database Lookup (SMDL) (Direct): 2 DBs, 1 Result each. The DBs must be consecutive! Care must be given when assigning segments to DBs that use this operation: there must be a clean separation of even and odd DBs and segments.
Multi-Database Lookup (MDL) (Indirect): <= 8 DBs, 1 Result each
Simultaneous Multi-Database Lookup (SMDL) (Indirect): functionally the same as the Direct version, but key presentation and DB selection are different; the DBs need not be consecutive.

43 Lookups: Proposed Design
Use SRAM Bank 0 (2 MB per NPU) for all Results
B0 Byte Address Range: 0x000000 - 0x3FFFFF: 22 bits
B0 Word Address Range: 0x000000 - 0x3FFFFC: 20 bits (two trailing 0's)
Use the 32-bit Associated Data SRAM result for the Address of the actual Result:
Done: 1b
Hit: 1b
MHit: 1b
Priority: 8b (present for Primary Filters; for RL and Aux Filters, should be 0)
SRAM B0 Word Address: 21b
1 spare bit
Use Multi-Database Lookup (MDL) Indirect for searching all 3 DBs; the order of fields in the Key is important. Each thread will need one TCAM context.
Route DB: Lookup Size: 68b (3 32b words transferred across QDR intf); Core Size: 72b; AD Result Size: 32b; SRAM B0 Result Size: 78b (3 Words)
Primary DB: Lookup Size: 136b (5 32b words transferred across QDR intf); Core Size: 144b; SRAM B0 Result Size: 82b (3 Words); Priority not included in the SRAM B0 result because it is in the AD result
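A hypothetical decode of the 32-bit Associated Data result; the slide gives only field widths, so the exact bit positions (most-significant first, in the order listed) are an assumption:

```python
# Assumed layout of the 32b TCAM associated-data word:
# Done(31) | Hit(30) | MHit(29) | Priority(28:21) | SRAM B0 word address(20:0).
def decode_ad(ad):
    done = (ad >> 31) & 0x1
    hit = (ad >> 30) & 0x1
    mhit = (ad >> 29) & 0x1
    priority = (ad >> 21) & 0xFF     # 0 for route and aux-filter entries
    word_addr = ad & 0x1FFFFF        # word address into SRAM Bank 0
    byte_addr = word_addr << 2       # two trailing zero bits give the byte address
    return done, hit, mhit, priority, byte_addr

# A done+hit entry with priority 5 whose result sits at word 0x1000 (byte 0x4000).
assert decode_ad((1 << 31) | (1 << 30) | (5 << 21) | 0x1000) == (1, 1, 0, 5, 0x4000)
```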

44 Block Interfaces The next set of slides show the block interfaces
These slides are still very much a work in progress

45 ONL NP Router
(Block diagram as on slide 20.)

46 ONL NP Router
(Block diagram as on slide 20.) Rx -> Mux ring data: Buf Handle (32b); InPort (4b) | Reserved (12b) | Eth. Frame Len (16b)

47 ONL NP Router
(Block diagram as on slide 20.)
Ring data fields: Rsv (4b) | Out Port (4b) | Buffer Handle (24b); L3 (IP, ARP, ...) Pkt Length (16b) | QID (16b); Stats Index (16b) | Flags (8b) | In Port (3b) | Plugin Tag (5b)
Flags: Src (2b): 00: Rx, 01: XScale, 10: Plugin, 11: Undefined; PT (1b): PassThrough (1) / Classify (0); Reserved (5b)

48 ONL NP Router
(Block diagram as on slide 20.) QM will not do any Stats Operations, so it does not need the Stats Index.
Ring data fields: Rsv (4b) | Out Port | Buffer Handle (24b); L3 (IP, ARP, ...) Pkt Length (16b) | QID (16b)

49 ONL NP Router
(Block diagram as on slide 20.) Ring data: Buffer Handle (24b) | Rsv (3b) | Port (4b) | V (1b)

50 ONL NP Router
(Block diagram as on slide 20.) Ring data: Buffer Handle (24b) | Rsv (3b) | Port (4b) | V (1b); Ethernet DA[47-16] (32b); Ethernet DA[15-0] (16b); Ethernet SA[31-0] (32b); Ethernet SA[47-32] (16b); Ethernet Type (16b); Reserved (16b)

51 ONL NP Router
[Block diagram: Rx → Mux → Parse/Lookup/Copy → QM → HdrFmt → Tx, plus XScale, Plugins, Stats, FreeList Mgr]

Flags (8b): why pkt is being sent to XScale
  TTL (1b): TTL expired
  Options (1b): IP Options present
  NoRoute (1b): no matching route or filter
  NonIP (1b): non-IP packet received
  ARP_Needed (1b): NH_IP valid, but no MAC
  NH_Invalid (1b): NH_IP AND NH_MAC both invalid
  Reserved (2b): currently unused
Bit layout: Reserved (2b) | NR | TTL | Opt | NI | ARP | NH INV

Ring entry format (to XScale):
Word 0: Rsv (8b) | Buffer Handle (24b)
Word 1: L3 (IP, ARP, …) Pkt Length (16b) | QID (16b)
Word 2: Plugin Tag (5b) | In Port (3b) | Flags (8b) | Stats Index (16b)
Word 3: NH MAC DA[47:16] (32b)
Word 4: NH MAC DA[15:0] (16b) | EtherType (16b)
Word 5: Reserved (16b) | Unicast/MCast Bits (16b)

Ring entry format (with MC bit):
Word 0: MC (1b) | Rsv (3b) | Out Port (4b) | Buffer Handle (24b)
Word 1: L3 (IP, ARP, …) Pkt Length (16b) | QID (16b)
Word 2: Plugin Tag (5b) | In Port (3b) | Rsv (8b) | Stats Index (16b)
Word 3: NH MAC DA[47:16] (32b)
Word 4: NH MAC DA[15:0] (16b) | EtherType (16b)
Word 5: Reserved (16b) | Unicast/MCast bits (16b)

MC: 1: multiple copies of this pkt exist in the system; 0: this is the only copy of the pkt.

52 ONL NP Router
[Block diagram: Rx → Mux → Parse/Lookup/Copy → QM → HdrFmt → Tx, plus XScale, Plugins, Stats, FreeList Mgr]

Flags (8b):
  PT (1b): PassThrough(1)/Classify(0)
  Reserved (7b)

Ring entry format (3 words):
Word 0: Buffer Handle (24b) | Rsv (4b) | Out Port (4b)
Word 1: L3 (IP, ARP, …) Pkt Length (16b) | QID (16b)
Word 2: Plugin Tag (5b) | In Port (3b) | Flags (8b) | Stats Index (16b)

53 ONL NP Router
[Block diagram: Rx → Mux → Parse/Lookup/Copy → QM → HdrFmt → Tx, plus XScale, Plugins, Stats, FreeList Mgr]

Flags (8b): why pkt is being sent to XScale
  TTL (1b): TTL expired
  Options (1b): IP Options present
  NoRoute (1b): no matching route or filter
  NonIP (1b): non-IP packet received
  ARP_Needed (1b): NH_IP valid, but no MAC
  NH_Invalid (1b): NH_IP AND NH_MAC both invalid
  Reserved (2b): currently unused
Bit layout: Reserved (2b) | NR | TTL | Opt | NI | ARP | NH INV

Ring entry format:
Word 0: Rsv (8b) | Buffer Handle (24b)
Word 1: L3 (IP, ARP, …) Pkt Length (16b) | QID (16b)
Word 2: Plugin Tag (5b) | In Port (3b) | Flags (8b) | Stats Index (16b)
Word 3: NH MAC DA[47:16] (32b)
Word 4: NH MAC DA[15:0] (16b) | EtherType (16b)
Word 5: Reserved (16b) | Unicast/MCast Bits (16b)

54 ONL NP Router
[Block diagram: Rx → Mux → Parse/Lookup/Copy → QM → HdrFmt → Tx, plus XScale, Plugins, Stats, FreeList Mgr]

Flags (8b):
  PassThrough/Classify (1b)
  Reserved (7b)

Ring entry format (3 words):
Word 0: Buffer Handle (24b) | Rsv (4b) | Out Port (4b)
Word 1: L3 (IP, ARP, …) Pkt Length (16b) | QID (16b)
Word 2: Plugin Tag (5b) | In Port (3b) | Flags (8b) | Stats Index (16b)

55 ONL NP Router
[Block diagram: Rx → Mux → Parse/Lookup/Copy → QM → HdrFmt → Tx, plus XScale, Plugins, Stats, FreeList Mgr]

Stats ring entry format (1 word):
Opcode (4b) | Data (12b) | Stats Index (16b)

56 ONL NP Router
[Block diagram: Rx → Mux → Parse/Lookup/Copy → QM → HdrFmt → Tx, plus XScale, Plugins, Stats, FreeList Mgr]

FreeList Mgr ring entry format (1 word):
Buffer Handle (24b) | Reserved (8b)

57 Extra Slides Everything after this is either OLD or is just extra support data for me to use.

58 ONL NP Router
[Block diagram: xScale, Parse → Lookup → Copy → QM, TCAM with Assoc. Data ZBT-SRAM, SRAM, Plugins]

Input Data:
  Buffer Handle
  In Plugin
  In Port
  Out Port
  Flags:
    Source (3b): Rx/XScale/Plugin
    PassThrough/Classify (1b)
    Reserved (4b)
  QID
  Frame Length
  Stats Index
  Exception Bits (16b): TTL Expired, IP Options present, No Route

Control Flags:
  PassThrough/Reclassify

Key (136b):
  Port/Plugin (4b): 0-4: Port, 5-9: Plugin, 15: XScale
  DAddr (32b)
  SAddr (32b)
  Protocol (8b)
  DPort (16b)
  SPort (16b)
  TCP Flags (12b)
  Exception Bits (16b)

Primary Result:
  Valid (1b)
  CopyVector (10b)
  NH IP/MAC (48b)
  QID (16b)
  LD (1b): send to XScale
  Drop (1b): drop pkt
  Valid Bits (3b): NH IP Valid (1b), NH MAC Valid (1b), IP_MCast Valid (1b)

Auxiliary Result:
  Valid (1b)
  CopyVector (10b)
  NH IP/MAC (48b)
  QID (16b)
  LD (1b): send to XScale
  Drop (1b): drop pkt
  NH IP Valid (1b), NH MAC Valid (1b), IP_MCast Valid (1b)
  Sampling bits (2b)

59 Lookup Results
Results of a lookup could be:

1 PF/RL Result:
  IP Unicast: 1 packet sent to a port
  Plugin Unicast: 1 packet sent to a plugin
  Unicast with Plugin Copies: 0 or 1 packet sent to a port, 1-5 copies sent to plugin(s)
  IP Multicast: 0-10 copies sent, 1 to each of 5 ports and 1 to each of 5 plugins

1 Aux Filter Result:
  0 or 1 copy sent to a port
  1-5 copies sent to plugins

60 PLC
Main() {
    if (PassThrough) {
        Copy()
    } else {
        Parse()
        if (!Drop) {
            Lookup()
            Copy()
        }
    }
}

61 PLC
Lookup() {
    write KEY to TCAM
    use timestamp delay to wait appropriate time
    while (!DoneBit) {
        // DONE Bit BUG fix requires reading just the first word
        read 1 word from Results Mailbox
        check DoneBit
    }
    // done
    read words 2 and 3 from Results Mailbox
    if (PrimaryFilter and RouteLookup results HIT) {
        compare priorities
        PrimaryResult.Valid = TRUE
        store higher-priority result as Primary Result (read result from SRAM Bank0)
    } else if (PrimaryFilter result HIT) {
        PrimaryResult.* = PrimaryFilter.*  (read result from SRAM Bank0)
    } else if (RouteLookup result HIT) {
        PrimaryResult.* = RouteLookup.*  (read result from SRAM Bank0)
    }
    if (AuxiliaryFilter result HIT) {
        store result as Auxiliary Result (read result from SRAM Bank0)
        mark Auxiliary Result VALID
    }
}

62 PLC
Copy() {
    currentRefCnt = Read(Buffer Descriptor Ref Cnt)
    copyCount = 0
    outputData.bufferHandle = inputData.bufferHandle
    outputData.QID = inputData.QID
    outputData.frameLength = inputData.frameLength
    outputData.statsIndex = inputData.statsIndex
    if (PassThrough) {
        // It came from either XScale or Plugin; process inputData
        copyCount = 1
        if (inputData.outPort == XScale) {
            // Do we need to include any additional flags when sending to XScale?
            outputData.outPort = inputData.outPort
            outputData.Flags = inputData.Flags
            outputData.inPort = inputData.inPort
            outputData.Plugin = inputData.Plugin
            // Packets to XScale do not (we think) need an additional Header buf desc.
            sendToXScale()
        }
        if (inputData.outPort == {Port}) {
            // Pass-through pkt should already have MAC DAddr in buffer desc.
            // Pass-through pkt should not need any additional Header buf desc.
            sendToQM()
        }
        if (inputData.outPort == {Plugin}) {
            // Packets to Plugins do not need an additional Header buf desc.
            sendToPlugin(Plugin#)
        }
        return
    }

63 PLC
    else {  // Process Lookup Results
        // PrimaryResult is either Primary Filter or Route Lookup, depending on priority
        if (PrimaryResult.Valid == TRUE) {
            if (PrimaryResult.IP_MCastValid == TRUE) {
                IP_MCast_DAddr = read DRAM
                MacDAddr = calculateMCast(IP_MCast_DAddr)
            } else {  // Unicast
                if (countPorts(PrimaryResult.copyVector) > 1) { ILLEGAL }
                if (PrimaryResult.NH_Mac_Valid == TRUE) {
                    MacDAddr = PrimaryResult.NH_Address
                }
            }
            copyCount = copyCount + countOnes(PrimaryResult.copyVector)
        }
        if (AuxiliaryResult.Valid == TRUE) {
            if (countPorts(AuxiliaryResult.copyVector) > 1) { ILLEGAL }
            copyCount = copyCount + countOnes(AuxiliaryResult.copyVector)
        }
        update reference count in pkt buffer descriptor
        for each copy {
            if ((copy is going to QM) and ((copyCount + currentRefCnt) > 1)) {
                add header SRAM buffer descriptor and header DRAM buffer
                sendCopy(header Buffer Descriptor)
            } else {
                sendCopy(Pkt Buffer Descriptor)
            }
        }
    }
}
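The countOnes() and countPorts() helpers used in the Copy() pseudocode can be sketched in C. The bit assignment below (bits 0-4 for output ports, bits 5-9 for plugins) is an assumption; the slides only give the copy vector's width.

```c
#include <stdint.h>

/* Hypothetical layout: bits 0-4 = output ports, bits 5-9 = plugins. */
#define PORT_MASK 0x01Fu

/* Count all copies requested by a 10-bit copy vector. */
static int countOnes(uint16_t copyVector) {
    int n = 0;
    while (copyVector) {
        copyVector &= copyVector - 1;  /* clear lowest set bit */
        n++;
    }
    return n;
}

/* Count only the copies destined for output ports; a unicast
 * result with more than one port bit set is illegal. */
static int countPorts(uint16_t copyVector) {
    return countOnes(copyVector & PORT_MASK);
}
```

Clearing the lowest set bit per iteration keeps the loop proportional to the number of copies rather than the vector width.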

64 ONL NP Router
[Block diagram: Rx → Mux → Parse/Lookup/Copy → QM → HdrFmt → Tx, plus XScale, Plugins, Stats, FreeList Mgr; each block annotated with its porting status: Mostly Unchanged, New, Needs Some Mod., or Needs A Lot Of Mod.]

65 ONL NP Router
[Block diagram: Rx (2 ME) → Mux (1 ME) → Parse, Lookup, Copy (3 MEs) → Queue Manager (1 ME) → HdrFmt (1 ME) → Tx (2 ME), with TCAM and SRAM]

66 ONL NP Router
[Block diagram: Rx (2 ME) → Mux (1 ME) → Parse, Lookup, Copy (3 MEs) → Queue Manager (1 ME) → HdrFmt (1 ME) → Tx (2 ME), with TCAM and SRAM]

Ring entry formats shown on this slide:
  Buf Handle (24b) | Frm Offset (16b) | Frm Length (16b) | Port (8b)
  Frame Length (16b) | Buffer Handle (32b) | Stats Index (16b) | QID (20b) | Rsv (4b) | Port
  Buf Handle (32b) | Port (8b) | Reserved
  Eth. Frame Len (16b)
  Buffer Handle (24b) | Rsv (3b) | Port (4b) | V (1b)

67 ONL NP Router
Parse:
  Do IP Router checks
  Extract lookup key
Lookup:
  Perform lookups, potentially three: Route Lookup, Primary Filter lookup, Auxiliary Filter lookup
Copy:
  Port: identifies source MAC addr
    Write it to the buffer descriptor, or let HF determine it via port?
  Unicast:
    Valid MAC: write MAC addr to buffer descriptor and queue pkt
    No valid MAC: prepare pkt to be sent to XScale for ARP processing
  Multicast:
    Calculate Ethernet multicast Dst MAC addr = Fct(IP Multicast Dst Addr)
    Write Dst MAC addr to buf desc (same for all copies!)
    For each bit set in the copy bit vector: queue a packet to the port represented by that bit
    Reference count in buffer desc.
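The Fct() above is presumably the standard IPv4-multicast-to-Ethernet mapping (RFC 1112): the fixed OUI 01:00:5E followed by the low 23 bits of the group address. A sketch in C:

```c
#include <stdint.h>

/* Map an IPv4 multicast destination address to its Ethernet
 * multicast DA per RFC 1112: 01:00:5E + low 23 bits of the group.
 * The top 5 variable bits of the group address are discarded, so
 * 32 groups share each Ethernet multicast MAC. */
static void mcast_ip_to_mac(uint32_t ip_daddr, uint8_t mac[6]) {
    mac[0] = 0x01;
    mac[1] = 0x00;
    mac[2] = 0x5E;
    mac[3] = (ip_daddr >> 16) & 0x7F;  /* only 23 bits carry over */
    mac[4] = (ip_daddr >> 8) & 0xFF;
    mac[5] = ip_daddr & 0xFF;
}
```

This is also why the slide on addressing rules says the IP DA must conform: the translation is purely arithmetic, with no lookup involved.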

68 Notes
Need a reference count for multicast (in the buffer descriptor).
How to handle freeing the buffer for a multicast packet?
Drops can take place in the following blocks: Parse, QM, Plugin, Tx, Mux → Parse
Reclassify bit: for traffic that does not get reclassified after coming from a Plugin or the XScale, we need all the data that the QM will need: QID, Stats Index, Output Port.
If a packet matches an Aux filter AND it needs ARP processing, the ARP processing takes precedence and we do not process the Aux filter result.
Does anything other than ARP-related traffic go to the XScale? IP exceptions like expired TTL?
Can users direct traffic for delivery to the XScale and add processing there? Probably not, if we are viewing the XScale as being like our CPs in the NSP implementation.

69 Notes
Combining Parse/Lookup/Copy:
  Dispatch loop
  Build settings
  TCAM mailboxes (there are 128 contexts), so with 24 threads we can have up to 5 TCAM contexts per thread
  Rewrite Lookup in C
  Input and output on scratch rings
Configurable priorities on Mux inputs: XScale, Plugins, Rx
Should we allow plugins to write directly to the QM input scratch ring for packets that do not need reclassification?
  If we allow this, is there any reason for a plugin to send a packet back through Parse/Lookup/Copy if it wants it to NOT be reclassified?
We can give plugins the capability to use NN rings between themselves to chain plugins.

70 ONL NP Router
[Block diagram: Rx (2 ME) → Mux (1 ME) → Parse, Lookup, Copy (4 MEs) → Queue Manager (1 ME) → HdrFmt (1 ME) → Tx (1 ME), with Stats (1 ME), Plugins, XScale, TCAM with Assoc. Data ZBT-SRAM; large SRAM rings on the plugin paths]

Add configurable per-port delay (up to 150 ms total delay); add a large SRAM ring.
Each output has a common set of QiDs.
Multicast copies use the same QiD for all outputs.
QiD is ignored for plugin copies.
Plugin write access to the QM scratch ring.

71 ONL NP Router
[Block diagram: Rx (2 ME) → Mux (1 ME) → Parse, Lookup, Copy (4 MEs) → Queue Manager (1 ME) → HdrFmt (1 ME) → Tx (1 ME), with Stats (1 ME), Plugins 0-4 on NN rings, XScale, TCAM]

Each output has a common set of QiDs.
Multicast copies use the same QiD for all outputs.
QiD is ignored for plugin copies.

72 Lookup Results
Results of a lookup could be:

A. 1 PF/RL Result:
  A1. IP Unicast: 1 packet sent to a port
  A2. Plugin Unicast: 1 packet sent to a plugin
  A3. Unicast with Plugin Copies: 0 or 1 packet sent to a port, 1-5 copies sent to plugin(s)
  A4. IP Multicast: 0-10 copies sent, 1 to each of 5 ports and 1 to each of 5 plugins

B. 1 Aux Filter Result:
  0 or 1 copy sent to a port
  1-5 copies sent to plugins
  (B1-B4 presumably mirror cases A1-A4)

Valid combinations of the above:
  (A1 or A3) and (B1 or B3): potentially two different unicast MAC DAddresses needed
  (A1 or A3) and B2
  A1 and (B1 or B3)
  A2 and B2
  A4 and B4: potentially 1 unicast MAC DAddr and 1 multicast MAC DAddr needed

73 PLC
Input Data:
  Buffer Handle
  In Plugin
  In Port
  Out Port
  Flags:
    Source (3b): Rx/XScale/Plugin
    PassThrough/Classify (1b)
    Reserved (4b)
  QID
  Frame Length
  Stats Index

Control Flags:
  PassThrough/Reclassify

Key (136b):
  Port/Plugin (4b): 0-4: Port, 5-9: Plugin, 15: XScale
  DAddr (32b)
  SAddr (32b)
  Protocol (8b)
  DPort (16b)
  SPort (16b)
  TCP Flags (12b)
  Exception Bits (16b): TTL Expired, IP Options present, No Route

Primary Result:
  Valid (1b)
  CopyVector (10b)
  NH IP/MAC (48b)
  QID (16b)
  LD (1b): send to XScale
  Drop (1b): drop pkt
  Valid Bits (3b): NH IP Valid (1b), NH MAC Valid (1b), IP_MCast Valid (1b)

Auxiliary Result:
  (same fields as Primary Result)
  Sampling bits (2b)

Output Data:
  Buffer Handle
  Plugin (to XScale only)
  In Port (to XScale only)
  Out Port (to XScale or QM only)
  Flags (to XScale only)
  QID
  Frame Length
  Stats Index

74 Notes from 3/23/07 ONL Control Mtg
Using the same QID for all copies of a multicast does not work: the QM does not partition QIDs across ports.
Do we need to support datagram queues? Yes, we will support 64 datagram queues per port.
  We will use the same hash function as in the NSP router.
  For testing purposes, can users assign the datagram queues to filters/routes?
Proposed partitioning of QIDs:
  QID[15:13]: port number 0-4
  QID[12]: reserved by RLI vs. XScale (0: RLI reserved, 1: XScale reserved)
  QID[11:0]: per-port queues
    4096 RLI reserved queues per port
    4032 XScale reserved queues per port
    64 datagram queues per port
    yyy xx xxxx: datagram queues for port <yyy>
IDT XScale software kernel memory issues still need to be resolved.
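The proposed QID partitioning packs cleanly into the 16-bit field; a C sketch with the field positions listed above (function names are mine):

```c
#include <stdint.h>

/* Proposed QID layout: [15:13] port number, [12] owner
 * (0 = RLI reserved, 1 = XScale reserved), [11:0] per-port queue. */
static uint16_t qid_pack(unsigned port, unsigned xscale_owned, unsigned queue) {
    return (uint16_t)(((port & 0x7u) << 13) |
                      ((xscale_owned & 0x1u) << 12) |
                      (queue & 0xFFFu));
}

static unsigned qid_port(uint16_t qid)  { return (qid >> 13) & 0x7u; }
static unsigned qid_owner(uint16_t qid) { return (qid >> 12) & 0x1u; }
static unsigned qid_queue(uint16_t qid) { return qid & 0xFFFu; }
```

Note the earlier 3/27 notes propose ports 1-5 in the top bits so a datagram QID of 0 never yields a full QID of 0; the sketch follows this slide's 0-4 numbering.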

75 Notes from 3/13/07
Ethertype needs to be written to the buffer descriptor so HF can get it.
Who tags non-IP pkts for being sent to the XScale: Parse?
We will not be supporting Ethernet headers with VLANs or LLC/SNAP encapsulation.
Add In Plugin to data going to a Plugin:
  In Plugin: tells the last plugin that had the packet
  Plugins can write to other plugins' SRAM rings
Support for XScale participation in an IP multicast: for use with control protocols?
Add In Port values for Plugin- and XScale-generated packets.
Include both In Port and In Plugin in the lookup key?
Should flag bits also go to plugins?
For users to use our IP MCast support they must abide by the IP multicast addressing rules, i.e. Copy will do the translation of IP MCast DAddr to Ethernet MCast DAddr, so if the IP DA does not conform it can't do it.

76 Issues and Questions
Upgrade to IXA SDK 4.3.1: Techx/Development/IXP_SDK_4.3/{cd1,cd2,4-3-1_update}
Which Rx to use? Intel Rx from the IXA SDK is our base for further work.
Which Tx to use? Three options:
  Our current Tx (Intel IXA SDK 4.0, Radisys modifications, WU modifications)
    Among other changes, we removed some code that supported buffer chaining.
  Radisys Tx based on SDK 4.0 (we would need to re-do our modifications)
    This would get the buffer chaining code back if we need/want it.
  Intel IXA SDK Tx (no Radisys modifications; we would need to re-do our modifications)
How will we write L3 headers?
  When there are >1 copies:
    For a copy going to the QM, Copy allocates a buffer and buffer descriptor for the L3 header.
    Copy writes the DAddr into the buffer descriptor.
    Options:
      HF writes the full L3 header to the DRAM buffer and Tx initiates the transfer from DRAM to TBUF
        Unicast: to the packet DRAM buffer
        Multicast: to the prepended header DRAM buffer
      HF writes the L3 header to a scratch ring, Tx reads it and writes it directly to the TBUF
  When there is only one copy of the packet:
    No extra buffer and buffer descriptor are allocated.
    The L3 header is given to Tx in the same way as in the >1 copy case.
How should exceptions be handled? TTL expired, IP Options present, No Route
C vs. uc: probably any new blocks should be written in C; existing code (Rx, Tx, QM, Stats) can remain as uc. Freelist Mgr?
Continued on next slide…

77 Issues and Questions
Need to add global counters (see ONLStats.ppt):
  Per-port Rx and Tx: pkt and byte counters
  Drop counters:
    Rx (out of buffers)
    Parse (malformed IP header/pkt)
    QM (queue overflow)
    Plugin
    XScale
    Copy (lookup result has Drop bit set; lookup MISS?)
    Tx (internal buffer overflow)
What is our performance target? 5-port router, full link rates.
  How should SRAM banks be allocated?
  How many packets should be able to be resident in the system at any given time?
  How many queues do we need to support? Etc.
How will lookups be structured?
  One operation across multiple DBs vs. multiple operations each on one DB
  Will results be stored in Associated Data SRAM or in one of our SRAM banks?
  Can we use SRAM Bank0 and still get the throughput we want?
Multicast:
  Are we defining how an ONL user should implement multicast, or just trying to provide mechanisms to let ONL users experiment with multicast?
  Do we need to allow a unicast lookup with one copy going out and one copy going to a plugin?
    If so, this would use the NH_MAC field and the copy vector field.
Continued on next slide…

78 Issues and Questions
Plugins:
  Can they send pkts directly to the QM instead of always going back through Parse/Lookup/Copy?
  Use of NN rings between plugins to do plugin chaining
  Plugins should also be able to write to the Stats module ring, to utilize stats counters as they want.
XScale:
  Can it send pkts directly to the QM instead of always going through the Parse/Lookup/Copy path?
  ARP request and reply? What else will it do besides handling ARP?
  Do we need to guarantee in-order delivery of packets for a flow that triggers an ARP operation?
    A re-injected packet may be behind a recently arrived packet for the same flow.
What is the format of our buffer descriptor?
  Add Reference Count (4 bits)
  Add MAC DAddr (48 bits)
  Does the Packet Size or Offset ever change once written? Yes, plugins can change the packet size and offset.
  Other?
Continued on next slide…

79 Issues and Questions
How will we manage the free list?
  Support for multicast (ref count in buf desc) makes reclaiming buffers a little trickier.
  Scratch ring to a separate ME.
  Do we want it to batch requests?
    Read 5 or 10 entries from the scratch ring at once, compare the buffer handles, and accumulate.
    Depending on the queue, copies of packets will go out close in time to one another…
    But the vast majority of packets will be unicast, so no accumulation will be possible.
    Or, use the CAM to accumulate 16 buffer handles: evict unicast or completed multicast entries from the CAM and actually free the descriptor.
  Do we want to put the Freelist Mgr ME just ahead of Rx and use an NN ring into Rx to feed buffer descriptors when we can?
  We might be able to have Mux and Freelist Mgr share an ME (4 threads each, or something).
Modify dl_buf_drop():
  Performance assumptions of blocks that do drops may have to change if we add an SRAM operation to a drop.
  It will also add a context swap: the drop code will need to do a test_and_decr, wait for the result (i.e. context swap), and then, depending on the result, perhaps do the drop.
  Note: the test_and_decr SRAM atomic operation returns the pre-modified value.
Usage scenarios:
  It would be good to document some typical ONL usage examples. This might just be extracting some material from existing ONL documentation and class projects. Ken?
  It might also be good to document a JST dream sequence for an ONL experiment. Oh my, what have I done now…
Do we need to worry about balancing MEs across the two clusters?
  QM and Lookup are probably the heaviest SRAM users.
  Rx and Tx are probably the heaviest DRAM users.
  Plugins need to be in neighboring MEs.
  QM and HF need to be in neighboring MEs.
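The dl_buf_drop() change described above might look like the following sketch. Here atomic_fetch_sub stands in for the IXP's atomic SRAM test_and_decr (which, as noted, returns the pre-modified value); the function name dl_buf_drop is from the slide, but its signature and return convention here are assumptions.

```c
#include <stdint.h>
#include <stdatomic.h>

/* Stand-in for the SRAM atomic test_and_decr: atomically decrement
 * the counter and return its PRE-modified value. On real hardware
 * waiting for this result costs a context swap. */
static uint32_t test_and_decr(atomic_uint *ref_cnt) {
    return atomic_fetch_sub(ref_cnt, 1);
}

/* Hypothetical drop path: only return the buffer descriptor to the
 * free list when this drop releases the last reference. */
static int dl_buf_drop(atomic_uint *ref_cnt) {
    uint32_t before = test_and_decr(ref_cnt);
    if (before == 1) {
        return 1;  /* last copy: caller enqueues handle to Freelist Mgr ring */
    }
    return 0;      /* other copies still live; nothing to free */
}
```

Because the pre-modified value is returned, the "last copy" test compares against 1, not 0.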

80 SRAM Buffer Descriptor
Problem:
  With the use of filters, plugins, and recycling back around for reclassification, we can end up with an arbitrary number of copies of one packet in the system at a time.
  Each copy of a packet could end up going to an output port and need a different MAC DAddr from all the other copies.
  Having one buffer descriptor per packet, regardless of the number of copies, will not be sufficient.
Solution:
  When there are multiple copies of the packet in the system, each copy will need a separate Header buffer descriptor, which will contain the MAC DAddr for that copy.
  SRAM buffer descriptors are the scarce resource and we want to optimize their use. Therefore, we do NOT want to always prepend a Header buffer descriptor:
    When the Copy block gets a packet for which it only needs to send one copy to the QM, it will read the current reference count; if this copy is the ONLY copy in the system, it will not prepend the Header buffer descriptor.
    Otherwise, Copy will prepend a Header buffer descriptor to each copy going to the QM.
  Copy does not need to prepend a Header buffer descriptor to copies going to plugins.
  We have to think some more about the case of copies going to the XScale.
  The Header buffer descriptors will come from the same pool (freelist 0) as the Packet/Payload buffer descriptors.
    There is no advantage to associating these Header buffer descriptors with small DRAM buffers: DRAM is not the scarce resource; SRAM buffer descriptors are.
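The prepend decision above reduces to the predicate already used in the Copy() pseudocode (slide 63). A sketch; the function name is mine, the rule is the slide's:

```c
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a copy headed to the QM needs its own prepended
 * Header buffer descriptor. copy_count is the number of copies Copy
 * is generating now; current_ref_cnt is the count of copies already
 * live in the system (read from the packet buffer descriptor). */
static bool needs_header_desc(uint32_t copy_count, uint32_t current_ref_cnt) {
    /* Rule from the Copy() pseudocode: prepend a Header descriptor
     * unless this is the only copy of the packet in the system. */
    return (copy_count + current_ref_cnt) > 1;
}
```

The point of the predicate is to keep the common single-copy unicast path from consuming a second scarce SRAM descriptor.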

81 MR Buffer Descriptor
LW0: Buffer_Next (32b)
LW1: Buffer_Size (16b) | Offset (16b)
LW2: Packet_Size (16b) | Free_list = 0000 (4b) | Reserved (4b) | Reserved (8b)
LW3: Reserved (16b) | Stats Index (16b)
LW4: Reserved (16b) | Reserved (8b) | Reserved (4b) | Reserved (4b)
LW5: Reserved (4b) | Reserved (4b) | Reserved (32b)
LW6: Reserved (16b) | Reserved (16b)
LW7: Packet_Next (32b)

82 Intel Buffer Descriptor
LW0: Buffer_Next (32b)
LW1: Buffer_Size (16b) | Offset (16b)
LW2: Packet_Size (16b) | Free_list (4b) | Rx_stat (4b) | Hdr_Type (8b)
LW3: Input_Port (16b) | Output_Port (16b)
LW4: Next_Hop_ID (16b) | Fabric_Port (8b) | Reserved (4b) | NHID type (4b)
LW5: ColorID (4b) | Reserved (4b) | FlowID (32b)
LW6: Class_ID (16b) | Reserved (16b)
LW7: Packet_Next (32b)
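As a C sketch, the 8-longword descriptor can be kept as raw words with shift/mask accessors (microengine code treats it as longwords anyway; bitfield structs would be compiler-dependent). The LW2 accessors below assume MSB-first packing as drawn on the slide:

```c
#include <stdint.h>

/* Eight-longword Intel-style buffer descriptor kept as raw words. */
typedef struct {
    uint32_t lw[8];
} buf_desc_t;

/* LW2 = Packet_Size (16b) | Free_list (4b) | Rx_stat (4b) | Hdr_Type (8b) */
static uint32_t bd_packet_size(const buf_desc_t *d) { return d->lw[2] >> 16; }
static uint32_t bd_free_list(const buf_desc_t *d)   { return (d->lw[2] >> 12) & 0xFu; }
static uint32_t bd_rx_stat(const buf_desc_t *d)     { return (d->lw[2] >> 8) & 0xFu; }
static uint32_t bd_hdr_type(const buf_desc_t *d)    { return d->lw[2] & 0xFFu; }

static void bd_set_lw2(buf_desc_t *d, uint32_t pkt_size, uint32_t freelist,
                       uint32_t rx_stat, uint32_t hdr_type) {
    d->lw[2] = (pkt_size << 16) | ((freelist & 0xFu) << 12) |
               ((rx_stat & 0xFu) << 8) | (hdr_type & 0xFFu);
}
```

Packing whole longwords also matches the per-word SRAM access counts tallied on the following slides.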

83 SRAM Accesses Per Packet
To support 8.22 M pkts/sec we can have 24 reads and 24 writes per pkt (200M/8.22M).
Rx:
  SRAM dequeue (1 word): to retrieve a buffer descriptor from the free list
  Write buffer desc (2 words)
Parse, Lookup:
  TCAM operations
  Reading results
Copy:
  Write buffer desc (3 words): Ref_cnt, MAC DAddr, Stats Index
  Pre-Q stats increments: read 2 words, write 2 words
HF: should not need to read or write any of the buffer descriptor
Tx:
  Read buffer desc (4 words)
Freelist Mgr:
  SRAM enqueue: write 1 word, to return the buffer descriptor to the free list
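The per-packet budget quoted above is just the SRAM command rate divided by the packet rate. A quick check, assuming 200M accesses/sec per bus direction and 8.22 Mpkt/s as on the slide:

```c
#include <stdint.h>

/* SRAM access budget per packet: a 200 MHz QDR SRAM performs one
 * read and one write per cycle, so each direction offers 200M
 * accesses/s. At 8.22 Mpkt/s, floor(200 / 8.22) = 24 accesses per
 * packet in each direction. */
static unsigned accesses_per_pkt(unsigned sram_mops, double mpkts) {
    return (unsigned)(sram_mops / mpkts);  /* truncates toward zero */
}
```

The block-by-block tally that follows must fit inside this 24-read/24-write envelope.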

84 QM SRAM Accesses Per Packet
QM (worst-case analysis):
Enqueue (assume queue is idle and not loaded in the Q-Array):
  Write Q-Desc (4 words): eviction of the least-recently-used queue
    Write Q-Params? When we evict a queue, do we need to write its params back?
    The Q-Length is the only thing the QM is changing. It looks like it writes it back every time it enqueues or dequeues AND when it evicts (we can probably remove the write on evict).
  Read Q-Desc (4 words)
  Read Q-Params (3 words): Q-Length, Threshold, Quantum
  Write Q-Length (1 word)
  SRAM enqueue: write (1 word)
  Scheduling structure accesses? They are done once every 5 pkts (when running at full rate).
Dequeue (assume queue is not loaded in the Q-Array):
  See notes in the enqueue section.
  SRAM dequeue: read (1 word)
Post-Q stats increments: 2 reads, 2 writes

85 QM SRAM Accesses Per Packet
QM (worst-case analysis) total per-pkt accesses:
  Queue descriptors and buffer enq/deq: write 9 words, read 9 words
  Queue params: write 2 words, read 6 words
Scheduling structure accesses per iteration (batch of 5 packets):
  Advance head: read 11 words
  Write tail: write 11 words
  Update freelist: read 2 words OR write 5 words

86 TCAM Core Lookup Performance
Routes and filters: lookup/core size of 72 or 144 bits, Freq = 200 MHz.
The CAM core can support 100M searches per second.
For 1 router on each of NPUA and NPUB:
  8.22 Mpkt/s per router
  3 searches per pkt (Primary Filter, Aux Filter, Route Lookup)
  Total per router: 24.66 M searches per second
  TCAM total: 49.32 M searches per second
So the CAM core can keep up. Now let's look at the LA-1 interfaces…

87 TCAM LA-1 Interface Lookup Performance
Routes and filters: lookup/core size of 144 bits (ignore for now that the route size is smaller).
Each LA-1 interface can support 40M searches per second.
For 1 router on each of NPUA and NPUB (each NPU uses a separate LA-1 interface):
  8.22 Mpkt/s per router
  Maximum of 3 searches per pkt (Primary Filter, Aux Filter, Route Lookup)
    Max of 3 assumes they are each done as a separate operation
  Total per interface: 24.66 M searches per second
So the LA-1 interfaces can keep up. Now let's look at the AD SRAM results…

88 TCAM Assoc. Data SRAM Results Performance
For 8.22M 72b or 144b lookups:
  32b results consume 1/12 of the AD SRAM bandwidth
  64b results consume 1/6
  128b results consume 1/3
Routes and filters: lookup/core size of 72 or 144 bits, Freq = 200 MHz, SRAM result size of 128 bits.
Associated SRAM can support up to 25M searches per second.
For 1 router on each of NPUA and NPUB:
  8.22 Mpkt/s per router
  3 searches per pkt (Primary Filter, Aux Filter, Route Lookup)
  Total per router: 24.66 M searches per second
  TCAM total: 49.32 M searches per second
So the Associated Data SRAM can NOT keep up.
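The keep-up checks on the last three slides are simple rate comparisons, using the slides' constants (8.22 Mpkt/s, 3 searches per packet, two routers):

```c
#include <stdbool.h>

/* Required search rate in Msearches/s: packet rate times searches
 * per packet times number of routers sharing the resource. */
static double searches_per_sec(double mpkts, int searches_per_pkt, int routers) {
    return mpkts * searches_per_pkt * routers;
}

static bool keeps_up(double capacity_msps, double demand_msps) {
    return capacity_msps >= demand_msps;
}
```

One router needs 8.22 x 3 = 24.66 Msearches/s: the CAM core (100M, shared) and each LA-1 interface (40M, per NPU) keep up, but the Associated Data SRAM (25M, shared) falls short of the two-router total of 49.32M.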

89 Lookups: Latency
Three searches in one MDL Indirect operation.
Latencies for the operation:
  QDR xfer time: 6 clock cycles (1 for the MDL Indirect subinstruction, 5 for the 144-bit key transferred across the QDR bus)
  Instruction FIFO: 2 clock cycles
  Synchronizer: 3 clock cycles
  Execution latency: search dependent
  Re-synchronizer: 1 clock cycle
  Total: 12 clock cycles (excluding execution latency)

90 Lookups: Latency
144-bit DB, 32 bits of AD (two of these):
  Instruction latency: 30
  Core blocking delay: 2
  Backend latency: 8
72-bit DB, 32 bits of AD:
  Core blocking delay: 2
Latency of first search (144-bit DB): 41 clock cycles
Latency of subsequent searches: (previous search latency) - (backend latency of previous search) + (core blocking delay of previous search) + (backend latency of this search)
  Latency of second 144-bit search: 41 - 8 + 2 + 8 = 43
  Latency of third search (72-bit): 43 - 8 + 2 + 8 = 45 clock cycles
45 QDR clock cycles (200 MHz clock) ≈ 315 IXP clock cycles (1400 MHz clock).
This is JUST the TCAM operation; we also need to read SRAM:
  One SRAM read to retrieve the TCAM Results Mailbox (3 words, one per search)
  TWO SRAM reads to then retrieve the full results (3 words each) from SRAM Bank 0, but we don't have to wait for one to complete before issuing the second
  About 150 IXP cycles per SRAM read: 315 + 150 + 150 = 615 IXP clock cycles
Let's estimate 650 IXP clock cycles for issuing, performing, and retrieving the results of a lookup (multi-word, two reads, …).
This does not include any Lookup block processing.

91 Lookups: SRAM Bandwidth
Analysis is PER LA-1 QDR interface; that is, each of NPUA and NPUB can do the following.
16-bit QDR SRAM at 200 MHz:
  Separate read and write bus
  Operations on rising and falling edge of each clock
  32 bits of read AND 32 bits of write per clock tick
QDR write bus: 6 32-bit cycles per instruction
  Cycle 0: write address bus carries the TCAM Indirect instruction; write data bus carries the TCAM Indirect MDL sub-instruction
  Cycles 1-5: write data bus carries the 5 words of the lookup key
  Write bus can support 200M/6 = 33.3 M searches/sec
QDR read bus:
  Retrieval of Results Mailbox: 3 32-bit cycles per instruction
  Retrieval of two full results from QDR SRAM Bank 0: total of 9 32-bit cycles per instruction
  Read bus can support 200M/9 = 22.2 M searches/sec
Conclusion: plenty of SRAM bandwidth to support TCAM operations AND SRAM Bank 0 accesses to perform all aspects of lookups at over 8.22 M searches/sec.

92 Lookups
Route Lookup:
Key (72b):
  Port (4b): can be a wildcard (for unicast, probably not for multicast)
    Value of 1111b in the Port field denotes coming from the XScale
    Ports numbered 0-4
  Plugin (4b): can be a wildcard (for unicast, probably not for multicast)
    Plugins numbered 0-4
  DAddr (32b): prefix match for unicast, exact match for multicast
  SAddr (32b): unicast entries always have this and its mask set to 0; prefix match for multicast
Result (99b):
  CopyVector (11b): one bit for each of the 5 ports and 5 plugins, and one bit for the XScale
  PluginOutputPortVector (5b) (under consideration)
    This would allow users to send packets to a plugin which could then send them along to output port(s).
    The copy vector is not useful for this, since bits set in the copy vector would cause the Copy block to send out multiple copies to different places.
  QID (16b)
  Stats Index (16b)
  NH_IP/NH_MAC (48b): at most one of NH_IP or NH_MAC should be valid
  Valid Bits (3b): at most one of the following three bits should be set
    IP_MCast Valid (1b)
    NH_IP_Valid (1b)
    NH_MAC_Valid (1b)

93 Lookups
Filter Lookup:
Key (140b):
  Port (4b): can be a wildcard (for unicast, probably not for multicast)
    Value of 1111b in the Port field denotes coming from the XScale
    Ports numbered 0-4
  Plugin (4b): can be a wildcard (for unicast, probably not for multicast)
    Plugins numbered 0-4
  DAddr (32b)
  SAddr (32b)
  Protocol (8b)
  DPort (16b)
  SPort (16b)
  TCP Flags (12b)
  Exception Bits (16b): allow for directing of packets based on defined exceptions
Result (109b):
  CopyVector (11b): one bit for each of the 5 ports and 5 plugins, and one bit for the XScale
  PluginOutputPortVector (5b) (under consideration)
    This would allow users to send packets to a plugin which could then send them along to output port(s).
    The copy vector is not useful for this, since bits set in the copy vector would cause the Copy block to send out multiple copies to different places.
  NH IP (32b)/MAC (48b) field (48b): at most one of NH_IP or NH_MAC should be valid
  QID (16b)
  Stats Index (16b)
  Valid Bits (3b): at most one of the following three bits should be set
    NH IP Valid (1b)
    NH MAC Valid (1b)
    IP_MCast Valid (1b)
  Sampling bits (2b): for Aux Filters only; 00: "Sample All"
  Priority (8b): for Primary Filters only

94 Filters, Ports/Plugins, Unicast/Multicast, Etc.
The following slides are my thoughts on the current problems we have been having with filters, plugins, etc., and my proposal for how to address them.
We have perhaps over-generalized our model/design:
  We have a copy vector in all of our filter and route lookup results, allowing users to copy any packet to at most 1 port and 0 or more plugins, even when the filter/route is "unicast".
  We have also generalized the design to allow plugins to send a packet directly to the QM, either by putting it directly into the QM input ring or by sending it back to the MUX block and then the PLC block with a flag bit indicating the packet is NOT to be reclassified.
This has led to difficulties in supporting what we think plugins will want to do with packets when they receive them:
  There is only one set of output information (Port, QID, NH Address).
  There is no easy way to indicate to a plugin, through the filter or route lookup result, where it should send the packet next.
We have also generalized route lookups so that they could easily be implemented as a Primary Filter. This can be somewhat confusing to someone who may ask why we have both, and which one should be used when.

95 Filters, Ports/Plugins, Unicast/Multicast, Etc
I propose we return to a simpler model:
- A Unicast Primary Filter has a result that can send ONE copy to either ONE port or ONE plugin.
- A Unicast Route Lookup has a result that can send ONE copy to either ONE port or ONE plugin.
- A Unicast Auxiliary Filter can be used to send ONE copy to either ONE port or ONE plugin.
- Plugins are allowed to make copies of packets and send them where they want; the Plugin Framework will support the making of copies.
- Packets with an IP Multicast destination address can match:
  - Primary Filters and Route Lookups: result in 0 to 10 copies. Each copy can go to 1 of the 5 output ports or 1 of the 5 plugins. Two copies can NOT go directly to the same port or to the same plugin; however, a plugin can redirect a packet anywhere after it processes it, which can cause two or more copies to end up going to the same port or plugin.
  - Auxiliary Filters: result in 1 copy going to either ONE port or ONE plugin. An Auxiliary Filter matching an IP Multicast destination address should provide the NH IP or NH MAC address if it is directed to an output port (i.e., Aux Filters will not use the IP multicast address to Ethernet multicast address translation).
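Since each of the 11 copy-vector bits names a distinct destination (5 ports, 5 plugins, the XScale), the number of copies the Copy block makes is just the population count of the vector, and no destination can receive two copies directly. A minimal sketch, with the bit assignment assumed:

```c
#include <assert.h>
#include <stdint.h>

/* Count the copies implied by an 11-bit copy vector: one bit per
 * destination (5 ports, 5 plugins, XScale), so popcount = copy count. */
static int copy_count(uint16_t copy_vector)
{
    int n = 0;
    for (copy_vector &= 0x7FF; copy_vector; copy_vector >>= 1)
        n += copy_vector & 1;
    return n;
}
```

The "0 to 10 copies" on the slide counts only the port and plugin bits; the eleventh bit is the XScale.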

96 Filters, Ports/Plugins, Unicast/Multicast, Etc
Route Lookup: Result (101b)
- MCast CopyVector (11b): one bit for each of the 5 ports, each of the 5 plugins, and one bit for the XScale
- Plugin/Port Selection Bit (1b):
  - 0: send the packet to the port indicated by the Unicast Output Port field
  - 1: send the packet to the plugin indicated by the Unicast Output Plugin field; the Unicast Output Port, QID, Stats Index, and NH fields also get sent to the plugin
- Unicast Output Port (3b): port or XScale; 0: Port0, 1: Port1, 2: Port2, 3: Port3, 4: Port4
- Unicast Output Plugin (3b): 0: Plugin0, 1: Plugin1, 2: Plugin2, 3: Plugin3, 4: Plugin4, 5: XScale
- QID (16b)
- Stats Index (16b)
- NH_IP/NH_MAC (48b): at most one of NH_IP or NH_MAC should be valid
- Valid Bits (3b): at most one of the following three bits should be set
  - IP_MCast Valid (1b)
  - NH_IP Valid (1b)
  - NH_MAC Valid (1b)
We can probably be clever and overload the MCast CopyVector field, using it for the Plugin/Port Selection Bit, Unicast Output Port, and Unicast Output Plugin fields as well: if the IP_MCast Valid bit is set, the field is an MCast CopyVector; if it is not set, the field is used as the Unicast fields.
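The overloading idea above can be sketched as a decode step: the same field is read as an MCast CopyVector when the IP_MCast Valid bit is set, and as the packed Unicast fields otherwise. The bit positions within the field (and the valid-bit encoding) are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding of the 3 valid bits. */
enum { NH_IP_VALID = 1, NH_MAC_VALID = 2, IP_MCAST_VALID = 4 };

typedef struct {
    int      is_mcast;
    uint16_t copy_vector;  /* mcast: 11 bits, ports/plugins/XScale */
    int      to_plugin;    /* unicast: Plugin/Port Selection Bit   */
    uint8_t  out_port;     /* unicast: 3 bits, 0-4                 */
    uint8_t  out_plugin;   /* unicast: 3 bits, 0-4, 5 = XScale     */
} rl_dest;

/* Decode the overloaded destination field of a Route Lookup result. */
static rl_dest decode_dest(uint16_t field, unsigned valid_bits)
{
    rl_dest d = {0, 0, 0, 0, 0};
    if (valid_bits & IP_MCAST_VALID) {
        d.is_mcast    = 1;
        d.copy_vector = field & 0x7FF;       /* whole field is the vector */
    } else {
        d.to_plugin   = (field >> 6) & 0x1;  /* assumed bit positions */
        d.out_port    = (field >> 3) & 0x7;
        d.out_plugin  = field & 0x7;
    }
    return d;
}
```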

97 Filters, Ports/Plugins, Unicast/Multicast, Etc
Primary Filter Lookup: Result (119b)
- Plugin/Port Selection Bit (1b):
  - 0: Unicast: send the packet to the port indicated by the Unicast Output Port field. MCast: send the pkt to both Port and Plugin (does it get the MCast CopyVector?)
  - 1: Unicast: send the packet to the plugin indicated by the Unicast Output Plugin field; the Unicast Output Port, QID, Stats Index, and NH fields also get sent to the plugin. MCast: send the pkt to all plugins whose bits are set, including the MCast CopyVector in the data going to the plugins
- MCast CopyVector (11b): one bit for each of the 5 ports, each of the 5 plugins, and one bit for the XScale
- Unicast Output Port (3b): port or XScale; 0: Port0, 1: Port1, 2: Port2, 3: Port3, 4: Port4
- Unicast Output Plugin (3b): 0: Plugin0, 1: Plugin1, 2: Plugin2, 3: Plugin3, 4: Plugin4, 5: XScale
- NH IP (32b) / NH MAC (48b) (48b): at most one of NH_IP or NH_MAC should be valid
- QID (16b)
- Stats Index (16b)
- Valid Bits (3b): at most one of the following three bits should be set
  - NH IP Valid (1b)
  - NH MAC Valid (1b)
  - IP_MCast Valid (1b)
- Priority (8b)
We can probably be clever and overload the MCast CopyVector field, using it for the Plugin/Port Selection Bit, Unicast Output Port, and Unicast Output Plugin fields as well: if the IP_MCast Valid bit is set, the field is an MCast CopyVector; if it is not set, the field is used as the Unicast fields.

98 Filters, Ports/Plugins, Unicast/Multicast, Etc
Auxiliary Filter Lookup: Result (92b)
- Plugin/Port Selection Bit (1b):
  - 0: send the packet to the port indicated by the Unicast Output Port field; ignore the Unicast Output Plugin field
  - 1: send the packet to the plugin indicated by the Unicast Output Plugin field; the Unicast Output Port, QID, Stats Index, and NH fields also get sent to the plugin
- Unicast Output Port (3b): port or XScale; 0: Port0, 1: Port1, 2: Port2, 3: Port3, 4: Port4, 5: XScale
- Unicast Output Plugin (3b): 0: Plugin0, 1: Plugin1, 2: Plugin2, 3: Plugin3, 4: Plugin4
- NH IP (32b) / NH MAC (48b) (48b): at most one of NH_IP or NH_MAC should be valid
- QID (16b)
- Stats Index (16b)
- Valid Bits (3b): at most one of the following three bits should be set
  - NH IP Valid (1b)
  - NH MAC Valid (1b)
  - IP_MCast Valid (1b)
- Sampling bits (2b), for Aux Filters only: 00 = "Sample All"

99 Filters, Ports/Plugins, Unicast/Multicast, Etc
A packet could match either a Primary Filter or a Route Lookup, but only one result will be applied; if it matches both, one of them will be selected based on priority. In addition to a Primary Filter or Route Lookup, a packet could also match an Auxiliary Filter. Extra copies of Unicast flows could be made by using Auxiliary Filters and Plugins and sending packets back through PLC for reclassification. With this model, I believe a Plugin would have the information it needs to process a packet and send it on to the specified output port.
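The priority-based selection between a Primary Filter match and a Route Lookup match could look like the sketch below. The slides only define an 8-bit Priority field on Primary Filters; the route-side priority and the "higher value wins, filter wins ties" rule here are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int     valid;      /* did this lookup produce a result?        */
    uint8_t priority;   /* 8b priority (assumed for route lookups)  */
} match_t;

typedef enum { USE_NONE, USE_FILTER, USE_ROUTE } winner_t;

/* Pick which result to apply when a packet may have matched both a
 * Primary Filter and a Route Lookup. */
static winner_t select_result(match_t filter, match_t route)
{
    if (filter.valid && route.valid)
        return (filter.priority >= route.priority) ? USE_FILTER : USE_ROUTE;
    if (filter.valid)
        return USE_FILTER;
    if (route.valid)
        return USE_ROUTE;
    return USE_NONE;
}
```

An Auxiliary Filter match is handled independently of this choice, since it produces an additional copy rather than competing with the primary result.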

100 Filters, Ports/Plugins, Unicast/Multicast, Etc
This is the end of the slides on Filters, Ports/Plugins, Unicast/Multicast, Etc.

101 ONL NP Router
[Block diagram: Rx (2 ME) → Mux (1 ME) → Parse, Lookup, Copy (3 MEs), with inputs from the xScale and the plugins, and 64KW rings between blocks (SRAM Ring, Scratch Ring, NN Ring). Ring entries carry: Buffer Handle (24b or 32b), Out Port (4b), QID (16b), L3 (IP, ARP, …) Pkt Length (16b), Stats Index (16b), In Port (4b), In Plugin (4b), and Flags (8b: Source Rx/XScale/Plugin (3b), PassThrough/Classify (1b), Reserved (4b)); the Rx-to-Mux entry carries Buf Handle (32b), InPort (4b), Reserved (12b), and Eth. Frame Len (16b).]
Notes:
- What is the priority for servicing the input rings?
- In Port: used as part of the lookup key.
- In Plugin: used as part of the lookup key.
- Out Port: used to tell the QM, HF, and Tx which physical interface the pkt is destined for.

102 ONL NP Router
[Block diagram: Parse → Lookup → Copy, with the Lookup block accessing the TCAM and its associated-data ZBT-SRAM; Copy feeds the QM, the xScale, and the Plugins via SRAM rings. Ring entries carry: Buffer Handle (24b), Out Port (4b), QID (16b), L3 (IP, ARP, …) Pkt Length (16b), Stats Index (16b), In Port (4b), In Plugin (4b), and Flags/Reserved (8b).]

