Block Design Review: ONL NP Router Multiplexer (MUX)
Mart Haitjema
2 - Mart Haitjema - 3/11/2016
Revision History
  5/1/07 (MAH):
    » Released
ONL NP Router
  [Block diagram (slide modified from ONL_NProuter.ppt). Blocks shown: Rx (2 ME), Mux (1 ME), Parse, Lookup, Copy (3 MEs), HdrFmt (1 ME), QM (1 ME), Tx (1 ME), Stats (1 ME), FreeList Mgr (1 ME), Plugin1 to Plugin5, xScale, TCAM, Assoc. Data ZBT-SRAM, SRAM. Interconnect legend: SRAM Ring, Scratch Ring, NN Ring. Ring-size labels: 64KW, 32KW.]
Contents
  Overview
    » MUX Function
    » Handling RX
    » Configurable Multiplexer Policy
  Design
    » Compute & Latency Budget
    » Design Overview
  Implementation Status
Overview - Function
  Multiplex input from:
    » RX to MUX: 2 words per pkt, 64KW SRAM ring, 64KW/2 = 32K pkts
    » xScale to MUX: 3 words per pkt, 64KW SRAM ring, 64KW/3 = 21.3K pkts
    » Plugins to MUX: 3 words per pkt, 64KW SRAM ring, 64KW/3 = 21.3K pkts
  To Parse-Lookup-Copy (PLC):
    » MUX to PLC: 3 words per pkt, 256-word scratch ring, 256/3 = 85 pkts
  [Diagram: RX, xScale, and Plugins each feed the Mux (1 ME) through a 64KW ring; the Mux feeds PLC.]
Overview - Handling RX
  [Diagram (slide modified from ONL_NProuter.ppt): Rx (2 ME), xScale, Plugin0, and Plugin1 feed the Mux (1 ME) over rings of 64KW each; the Mux feeds Parse, Lookup, Copy (3 MEs). Diagram labels: "Modify Header", "Buffer Descriptor from RX".]
  Input from Rx (field widths as labeled):
    » Buf Handle (32b)
    » InPort (4b), Reserved (12b), Eth. Frame Len (16b)
  Other header fields labeled in the diagram:
    » Buffer Handle (24b), Rsv (8b)
    » L3 (IP, ARP, …) Pkt Length (16b)
    » Stats Index (16b), QID (16b)
    » In Port (3b), Plugin Tag (5b), Flags (8b), Rsv (4b), Out Port (4b)
  Flags (8b):
    » Reserved (5b)
    » Src: Source (2b): 00: Rx, 01: XScale, 10: Plugin, 11: Undefined
    » PT (1b): PassThrough (1) / Classify (0)
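The Flags byte above can be sketched as a pair of pack/unpack helpers. This is a minimal illustration only: the bit positions (Reserved in bits 7..3, Src in bits 2..1, PT in bit 0) are an assumption read off the left-to-right field order on the slide, and every identifier is hypothetical rather than taken from mux.c.

```c
#include <assert.h>
#include <stdint.h>

/* Source codes as listed on the slide (11 is undefined). */
enum mux_src { MUX_SRC_RX = 0, MUX_SRC_XSCALE = 1, MUX_SRC_PLUGIN = 2 };

/* ASSUMED layout: Reserved = bits 7..3, Src = bits 2..1, PT = bit 0. */
#define MUX_FLAG_PT_MASK   0x01u
#define MUX_FLAG_SRC_SHIFT 1
#define MUX_FLAG_SRC_MASK  0x03u

/* Build the 8-bit Flags field; the reserved bits are left at zero. */
static uint8_t mux_flags_pack(enum mux_src src, int pass_through)
{
    assert(src <= MUX_SRC_PLUGIN);
    return (uint8_t)((((unsigned)src & MUX_FLAG_SRC_MASK) << MUX_FLAG_SRC_SHIFT)
                     | (pass_through ? MUX_FLAG_PT_MASK : 0u));
}

/* Extract the 2-bit source code. */
static enum mux_src mux_flags_src(uint8_t flags)
{
    return (enum mux_src)((flags >> MUX_FLAG_SRC_SHIFT) & MUX_FLAG_SRC_MASK);
}

/* Nonzero if the packet is PassThrough, zero if it is to be classified. */
static int mux_flags_is_pass_through(uint8_t flags)
{
    return (flags & MUX_FLAG_PT_MASK) != 0;
}
```

Under the assumed layout, a PassThrough packet from a plugin carries Flags = 0x05 and a packet from Rx that needs classification carries 0x00.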
Overview - Handling RX
  Mux Block writes:
    » Buffer_size: (frame length from Rx) - 14
    » Packet_size: (frame length from Rx) - 14
    » Offset: 0x18E
    » Freelist: 0
    » Ref_cnt: 1
  (Slide from ONL_NProuter.ppt)
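The five writes listed above can be sketched as one helper that derives the values from the Rx frame length. The struct is a hypothetical container for illustration and does not reflect the hardware buffer-descriptor layout. The constant 14 is assumed here to be the Ethernet header length; the slide gives only the number.

```c
#include <assert.h>
#include <stdint.h>

#define ETH_HDR_LEN     14u     /* ASSUMED meaning of the "-14" on the slide */
#define MUX_DATA_OFFSET 0x18Eu  /* Offset value from the slide */

/* Hypothetical container for the five fields; NOT the real descriptor layout. */
struct mux_buf_desc {
    uint16_t buffer_size;
    uint16_t packet_size;
    uint16_t offset;
    uint8_t  freelist;
    uint8_t  ref_cnt;
};

/* Fill in the values the Mux writes for a packet arriving from Rx. */
static struct mux_buf_desc mux_make_rx_desc(uint16_t eth_frame_len)
{
    struct mux_buf_desc d;

    assert(eth_frame_len >= ETH_HDR_LEN);
    d.buffer_size = (uint16_t)(eth_frame_len - ETH_HDR_LEN);
    d.packet_size = (uint16_t)(eth_frame_len - ETH_HDR_LEN);
    d.offset      = MUX_DATA_OFFSET;
    d.freelist    = 0;
    d.ref_cnt     = 1;
    return d;
}
```

For a 64-byte frame this gives Buffer_size = Packet_size = 50.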
Overview - Multiplexer Policy
  MUX should service input queues based on a configurable policy.
  Round-Robin Policy
    » Queues are serviced in round-robin fashion.
    » Each input queue is assigned a quantum, which specifies the number of packets (0 to 255) to be serviced from that queue (if available) before moving on to the next queue.
    » A quantum value of 0 means skip the queue unless all other queues are empty.
    » Quantum values are stored as 3 contiguous bytes in scratch memory.
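The round-robin policy above can be sketched as a small selection routine. This is an illustration under assumptions, not the MUX implementation. The queue indexing (for example 0 = RX, 1 = xScale, 2 = Plugins), the state struct, and all function names are hypothetical. The caller is assumed to supply the occupancy counts and to dequeue the packet itself. How to break ties between several non-empty zero-quantum queues is not specified on the slide; this sketch simply takes the first one found.

```c
#include <assert.h>
#include <stdint.h>

#define MUX_NUM_QUEUES 3

/* Hypothetical scheduler state: queue being serviced and credit left in it. */
struct mux_rr_state {
    unsigned current;
    unsigned remaining;
};

static void mux_rr_init(struct mux_rr_state *s,
                        const uint8_t quantum[MUX_NUM_QUEUES])
{
    s->current = 0;
    s->remaining = quantum[0];
}

/* Return the queue to service next, or -1 if every queue is empty. */
static int mux_rr_select(struct mux_rr_state *s,
                         const uint8_t quantum[MUX_NUM_QUEUES],
                         const uint32_t occupancy[MUX_NUM_QUEUES])
{
    unsigned i;

    assert(s->current < MUX_NUM_QUEUES);

    /* Pass 1: current queue first, then each queue once with a fresh quantum
     * (N + 1 checks so the starting queue is revisited after a refill). */
    for (i = 0; i <= MUX_NUM_QUEUES; i++) {
        if (s->remaining > 0 && occupancy[s->current] > 0) {
            s->remaining--;
            return (int)s->current;
        }
        s->current = (s->current + 1) % MUX_NUM_QUEUES;
        s->remaining = quantum[s->current];
    }

    /* Pass 2: every queue with a nonzero quantum is empty, so a
     * zero-quantum queue may now be serviced. */
    for (i = 0; i < MUX_NUM_QUEUES; i++) {
        unsigned q = (s->current + i) % MUX_NUM_QUEUES;
        if (occupancy[q] > 0)
            return (int)q;
    }
    return -1;
}
```

With quanta {2, 1, 0} and all three queues backlogged, repeated calls select queues 0, 0, 1, 0, 0, 1, and so on; queue 2 is selected only once queues 0 and 1 are both empty.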
Compute & Latency Budget
  What is our performance target?
    » To hit 5 Gb/sec rate:
        - Minimum Ethernet frame: 76B (64B frame + 12B InterFrame Spacing)
        - 5 Gb/sec * 1B/8b * packet/76B = 8.22 Mpkt/sec
    » IXP ME processing:
        - 1.4 GHz clock rate
        - 1.4 Gcycle/sec * 1 sec/8.22 Mpkt = 170 cycles per packet
    » Compute budget: 1 ME, thus 170 cycles per packet
    » Latency budget: (threads * 170)
        - 1 ME, 1 thread: 170 cycles
        - 1 ME, 4 threads: 680 cycles
        - 1 ME, 8 threads: 1360 cycles
  (Slide modified from ONL_NProuter.ppt)
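The budget arithmetic above can be reproduced with a few helper functions, which makes it easy to re-check the numbers for a different line rate or clock. The function names are hypothetical; the inputs used in the note below are the slide's own figures.

```c
#include <assert.h>

/* Packets per second at a given line rate and per-packet wire footprint. */
static double min_frame_pkt_rate(double line_rate_bps, unsigned wire_bytes_per_pkt)
{
    assert(wire_bytes_per_pkt > 0);
    return line_rate_bps / 8.0 / (double)wire_bytes_per_pkt;
}

/* ME cycles available per packet at a given clock and packet rate. */
static double cycles_per_packet(double clock_hz, double pkts_per_sec)
{
    assert(pkts_per_sec > 0.0);
    return clock_hz / pkts_per_sec;
}

/* Latency budget: each of the threads gets one per-packet compute slot. */
static unsigned latency_budget_cycles(unsigned threads, unsigned cycles_per_pkt)
{
    return threads * cycles_per_pkt;
}
```

With 5 Gb/sec, 76 bytes per packet, and a 1.4 GHz clock, this gives about 8.22 Mpkt/sec and about 170 cycles per packet; eight threads then yield the 1360-cycle latency budget.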
Design Overview
  [Flow diagram. Steps as labeled:]
    » Read Quantum Values
    » Read All Occupancy Counters
    » Select Queue
    » Service RX: Read RX Input Ring, Write RX Occupancy Counter
    » Service xScale: Read xScale Input Ring, Write xScale Occupancy Counter
    » Service Plugins: Read Plugins Input Ring, Write Plugins Occupancy Counter
    » Format & Write Buffer Descriptor
    » Update Stats Counter
    » Write PLC Output Ring (dl_sink)
  Thread-ordering labels: Wait for prev. sig_start, Signal next_start, Swap
  Latency labels: 300 cycles, 60 cycles, 150 cycles; Latency Total: ~420
Implementation Status
  MUX Assembly Stub:
    » Currently reads only from RX
    » Performs most of the functionality for RX
  Need to Implement:
    » Thread ordering
    » Quantum policy
    » Conditional block to process from Plugins and xScale
    » Read and write occupancy counters
File locations (in …/ONL_Router/)
  Code
    » src/mux/ONL/mux.c
  Includes
    » src/dispatch_loop/ONL/dl_source.[h,c]
        - dl_source() and dl_sink() functions