Design of a Diversified Router: IPv4 MR (Dedicated NP)


1 Design of a Diversified Router: IPv4 MR (Dedicated NP)
John DeHart

2 Revision History
5/22/06 (JDD): Created. Buffer descriptor stuff probably needs updating.
6/1/06 (JDD): Updating data going between blocks, still in progress.
6/2/06 (JDD): More cleanup of data going between blocks. Buffer descriptor details still need updating.

3 IPv4 MR (Dedicated) Functional Blocks
Blocks: Rx, DeMux, Parse, Lookup, Header Format, QM, Tx
Buffer Descriptor fields: Buffer_Next, Buffer_Size, Offset, Packet_Size, Free_List, MR_ID, TxMI, VLAN, Packet_Next
Let's look at:
  What data passes from block to block
  What blocks touch the Buffer Descriptor

4 IPv4 MR (Dedicated) Functional Blocks
Rx
Input: packets from RBUF
Output to DeMux: Buf Handle (32b), CRC (32b)
Function:
  Coordinate transfer of packets from RBUF to DRAM
Notes:
  We'll pass the Buffer Handle, which contains the SRAM address of the buffer descriptor.
  From the SRAM address of the descriptor we can calculate the DRAM address of the buffer data.
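The descriptor-to-buffer address calculation in the Rx notes can be sketched as below. This is only an illustration: the slides do not give base addresses or sizes, so `DESC_BASE`, `DESC_SIZE`, `BUF_BASE`, and `BUF_SIZE` are hypothetical values, and the sketch assumes descriptors and buffers are stored as two parallel arrays indexed the same way.

```python
# Hypothetical memory layout; none of these constants come from the slides.
DESC_BASE = 0x0001_0000   # SRAM address of descriptor 0 (assumed)
DESC_SIZE = 32            # bytes per descriptor, i.e. 8 longwords (assumed)
BUF_BASE = 0x0200_0000    # DRAM address of buffer 0 (assumed)
BUF_SIZE = 2048           # bytes per packet buffer (assumed)


def dram_addr_from_desc(sram_addr):
    """Map a descriptor's SRAM address to its buffer's DRAM address."""
    offset = sram_addr - DESC_BASE
    if offset < 0 or offset % DESC_SIZE != 0:
        raise ValueError("not a descriptor address")
    index = offset // DESC_SIZE          # which descriptor this is
    return BUF_BASE + index * BUF_SIZE   # same index into the buffer array
```

With this layout, descriptor 3 at `DESC_BASE + 96` maps to the buffer at `BUF_BASE + 6144`, so only the handle has to travel between blocks.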

5 IPv4 MR (Dedicated) Functional Blocks
DeMux
Input from Rx: Buf Handle (32b), CRC (32b)
Output to Parse: Rx MI (16b), Buf Handle (32b), MR ID (VLAN) (16b), IP Pkt Offset (16b), Reserved (16b)
Function:
  Read CRC from end of pkt in DRAM and check against CRC from Rx.
  Read Pkt Header from DRAM
  Extract MI and VLAN from Pkt Header and pass to Parse
  Calculate offset into buffer of start of IP Pkt Header and pass to Parse
Notes:
  This DeMux block will become the basis for the shared NP DeMux block.
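A minimal sketch of the VLAN extraction and IP offset calculation follows, assuming the packet header is a standard Ethernet frame carrying a single 802.1Q tag. The function name `demux_fields` and the `buf_offset` parameter are hypothetical, and the MI is left out because the slides do not say how it is encoded in the header.

```python
import struct

ETH_HDR_LEN = 14      # destination MAC, source MAC, EtherType
VLAN_TAG_LEN = 4      # 802.1Q tag: TPID + TCI
TPID_8021Q = 0x8100   # EtherType value that marks an 802.1Q tag


def demux_fields(frame, buf_offset=0):
    """Return (vlan_id, ip_pkt_offset) for a VLAN-tagged Ethernet frame.

    buf_offset is where the frame starts inside the packet buffer.
    """
    if len(frame) < ETH_HDR_LEN + VLAN_TAG_LEN:
        raise ValueError("frame too short")
    # The TPID sits where the EtherType normally is; the TCI follows it.
    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != TPID_8021Q:
        raise ValueError("no 802.1Q tag present")
    vlan_id = tci & 0x0FFF   # low 12 bits of the TCI
    return vlan_id, buf_offset + ETH_HDR_LEN + VLAN_TAG_LEN
```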

6 IPv4 MR (Dedicated) Functional Blocks
Parse
Input from DeMux: Rx MI (16b), Buf Handle (32b), MR ID (VLAN) (16b), IP Pkt Offset (16b), Reserved (16b)
Output to Lookup: Buf Handle (32b), Rx MI (16b), IP Pkt Length (16b), IP Pkt Offset (16b), Lookup Key[143:128] (16b), Lookup Key[127:96] (32b), Lookup Key[95:64] (32b), Lookup Key[63:32] (32b), Lookup Key[31:0] (32b), Exception Bits (16b), Reserved (16b)
Function:
  MR-specific header processing
    Handles IPv4 header validation
    Decrements TTL and recalculates Hdr Checksum.
  Generate MR-specific lookup key (144 bits) from packet
  Generate Exception bits to be passed on to Hdr Format (via Lookup) so Hdr Format can create shim fields for slow path packets going to Control Processor.
Notes:
  Can Parse adjust the buffer/packet size and offset?
  Can Parse do something like terminate a tunnel and strip off an outer header?
  MR ID and Rx MI are included in MR Lookup Key also.
  Rx MI needs to be passed to Header Format (through Lookup) so that Header Format can include it in the shim of packets that end up on the slow path. This will allow the Control Processor to know what interface the exception packets arrived on.
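The "decrement TTL and recalculate header checksum" step can be illustrated as follows. This sketch recomputes the standard IPv4 one's-complement checksum over the whole header rather than updating it incrementally; the function names are hypothetical, and raising an error when the TTL would expire is an assumption standing in for whatever exception bit Parse would actually set.

```python
import struct


def ipv4_checksum(header):
    """One's-complement of the one's-complement sum of the 16-bit words."""
    if len(header) % 2:
        raise ValueError("header length must be even")
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def decrement_ttl(header):
    """Return a copy of an IPv4 header with TTL - 1 and a fresh checksum.

    `header` must be exactly the IPv4 header (IHL * 4 bytes).
    """
    hdr = bytearray(header)
    if hdr[8] <= 1:                          # TTL is byte 8 of the header
        raise ValueError("TTL expired")      # assumed exception-path trigger
    hdr[8] -= 1
    hdr[10:12] = b"\x00\x00"                 # zero the checksum field first
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)
```

A header with a valid checksum sums to 0xFFFF, so `ipv4_checksum` over the returned header (checksum field included) yields 0, which is also how the validation step can check incoming packets.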

7 IPv4 MR (Dedicated) Functional Blocks
Lookup
Input from Parse: Buf Handle (32b), Rx MI (16b), IP Pkt Length (16b), IP Pkt Offset (16b), Lookup Key[143:128] (16b), Lookup Key[127:96] (32b), Lookup Key[95:64] (32b), Lookup Key[63:32] (32b), Lookup Key[31:0] (32b), Exception Bits (16b), Reserved (16b)
Output to Header Format: Buf Handle (32b), Exception Bits (16b), Rx MI (16b), IP Pkt Length (16b), IP Pkt Offset (16b), TxMI (16b), DA (8b), Port (8b), H (1b), D (1b), Rsv (2b), MrBits[39:32] (8b), QID (20b), MrBits[31:0] (32b)
Function:
  Perform lookup in TCAM based on MR Id and lookup key
  Increment counters based on Stats Index in result
  Priority resolution of results from multiple databases, if needed.
Output:
  Buf Handle
  Exception Bits: For Parse to communicate to Header Format info about exception packets
  Rx MI
  IP Pkt Length: Length of just the IP Pkt
  IP Pkt Offset: Offset from start of buffer to the start of IP Pkt header
  Tx MI
  QID
  H: Hit, D: Drop, Rsv: Reserved
  MR Bits: For MR-specific usage
Notes:
  MR ID and Input MI are included in MR Lookup Key also.
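The 32-bit result word holding H, D, Rsv, MrBits[39:32], and QID could be unpacked as in this sketch. The slides give only the field widths, so the bit positions are an assumption (fields taken in the listed order starting from the most significant bit), and `LookupFlags` and `unpack_result_word` are hypothetical names.

```python
from typing import NamedTuple


class LookupFlags(NamedTuple):
    hit: bool          # H (1b)
    drop: bool         # D (1b)
    mr_bits_hi: int    # MrBits[39:32] (8b)
    qid: int           # QID (20b)


def unpack_result_word(word):
    """Split one 32-bit lookup result word into its fields.

    Assumed layout, MSB first: H(1) D(1) Rsv(2) MrBits[39:32](8) QID(20).
    """
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit word")
    return LookupFlags(
        hit=bool((word >> 31) & 0x1),
        drop=bool((word >> 30) & 0x1),
        mr_bits_hi=(word >> 20) & 0xFF,
        qid=word & 0xFFFFF,
    )
```

Header Format would then act on `hit` and `drop` before using `qid`, which matches the "Drop and Miss bits" processing listed for that block.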

8 IPv4 MR (Dedicated) Functional Blocks
Header Format
Input from Lookup: Buf Handle (32b), Exception Bits (16b), Rx MI (16b), IP Pkt Length (16b), IP Pkt Offset (16b), TxMI (16b), DA (8b), Port (8b), H (1b), D (1b), Rsv (2b), MrBits[39:32] (8b), QID (20b), MrBits[31:0] (32b)
  HD: Hit, Drop
Output to QM: Buffer Handle (32b), QID (20b), Rsv (4b), Port (8b), IP Pkt Length (16b), Reserved (16b)
Function:
  MR specific packet header formatting
  MR specific Lookup Result processing
    Drop and Miss bits

9 IPv4 MR (Dedicated) Functional Blocks
QM
Input from Header Format: Buffer Handle (32b), QID (20b), Rsv (4b), Port (8b), IP Pkt Length (16b), Reserved (16b)
Output to Tx: Buffer Handle (32b), Port (8b), Reserved (24b)
Function:
  CRF queue management for Meta Interface queues
  For performance reasons, QM may actually be implemented as multiple instances
    Each instance on a separate ME would support a separate set of Meta Interfaces.
    See next slide for more details…

10 QM/Scheduler on Multiple MEs
Blocks: Header Format (MR-1 . . . MR-n), Input Hlpr (1 ME), QM/Schd (1 ME each), Tx (one per QM/Schd)
Rings: NN Ring, NN/Scratch Rings
Request format: Buffer Handle (32b), QID (20b), Rsv (4b), Port (8b), IP Pkt Length (16b), Reserved (16b)
Output format to Tx: Buffer Handle (32b), Port (8b), Reserved (24b)
QID(32b): Reserved (8b), QM ID (3b), QID (17b): 1M queues per QM
Input Hlpr would use QM ID to select Scratch ring on which to put request.
QM/Schd then sends on its output NN/scratch ring to its associated Tx
With 64 entries in Q-Array and 16 entries in CAM, max number of QM/Schds is probably 4 (2 bits).
We'll set aside 3 bits to give us flexibility in the future.
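The Input Hlpr's ring selection can be sketched as below, using the 3-bit QM ID and 17-bit queue number split given on this slide. Placing the QM ID in the upper bits of the 20-bit QID is an assumption, and `split_qid`, `enqueue_request`, and the list-of-lists stand-in for scratch rings are hypothetical. Note that 17 bits address 128K queues per QM; the "1M queues" figure corresponds to a 20-bit queue number.

```python
QM_ID_BITS = 3     # selects the QM/Schd instance (from the slide)
QUEUE_BITS = 17    # queue number within that instance (from the slide)


def split_qid(qid):
    """Split a 20-bit QID into (qm_id, queue); QM ID in the top bits (assumed)."""
    if not 0 <= qid < (1 << (QM_ID_BITS + QUEUE_BITS)):
        raise ValueError("QID does not fit in 20 bits")
    return qid >> QUEUE_BITS, qid & ((1 << QUEUE_BITS) - 1)


def enqueue_request(rings, buf_handle, qid):
    """Put a request on the scratch ring of the QM/Schd that owns the queue.

    `rings` is a list of Python lists standing in for scratch rings.
    """
    qm_id, queue = split_qid(qid)
    rings[qm_id].append((buf_handle, queue))
    return qm_id
```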

11 IPv4 MR (Dedicated) Functional Blocks
Tx
Input from QM: Buffer Handle (32b), Port (8b), Reserved (24b)
Output: packets to TBUF
Function:
  Coordinate transfer of packets from DRAM to TBUF

12 Extra
The next set of slides is for templates or extra information if needed.

13 Text Slide Template

14 Image Slide Template

15 Packet Buffer Descriptor Tradeoffs
Why use a Buffer Descriptor at all?
  QM needs something to link packets/buffers in queues
  ME-to-ME communications costs vs. SRAM access costs
Specific to Radisys, from dl_meta.u

16 Packet Buffer Descriptor def
Meta Data structure of Packet Buffers (LSB to MSB)
buffer_next (32 bits): Next Buffer Pointer (in a chain of buffers)
offset (width not given): Offset to start of data in bytes
BufferSize (16 bits): Length of data in the current buffer in bytes
header_type (width not given): Type of header at offset bytes into the buffer
rx_stat (width not given): Receive status flags
free_list (width not given): Freelist ID
packet_size (16 bits): Total packet size across multiple buffers
output_port (16 bits): Output Port on the egress processor
input_port (16 bits): Input Port on the ingress processor
nhid_type (4 bits): Nexthop ID type
reserved (4 bits): Reserved
fabric_port (width not given): Output port for fabric indicating blade ID
nexthop_id (16 bits): NextHop IP ID
color (8 bits): QoS Color
flow_id (24 bits): QoS flow ID or MPLS label/flow ID
reserved (16 bits): Reserved
class_id (16 bits): Class ID
packet_next (32 bits): Pointer to next packet (unused in cell mode)
Specific to Radisys, from dl_meta.u

17 Packet Buffer Descriptor Gets
buffer_next: tx
offset: rx, tx, fwd
BufferSize: tx, fwd
header_type: tx, fwd
rx_stat: NONE
free_list:
packet_size: NONE
output_port: qm(?), tx
input_port: rx, fwd
nhid_type: NONE
fabric_port: qm(?), tx
nexthop_id:
color:
flow_id:
class_id:
packet_next:
Specific to Radisys, from dl_meta.u

18 Meta Data Caching
Meta Data can be cached in one of three places:
  SRAM Xfer Registers
  DRAM Xfer Registers
  GPR Registers
Size of Meta Data Cache is controlled by #define META_CACHE_SIZE
Macro dl_meta_load_cache[] loads meta data cache
  buffer_handle: buffer handle for which meta data is to be fetched
  dl_meta: read transfer register prefix
    Xbuf_alloc[] should be used to allocate the needed registers
  signal_number:
  START_LW: starting long word for fetch
  NUM_LW: number of long words to fetch
Each microengine (microblock?) can use Meta Data Caching differently.
Specific to Radisys, from dl_meta.u

19 Meta Data Caching
In the ipv4_v6_forwarder sample app:
dl_meta_load_cache() used in:
  Egress
    ethernet_arp.uc
    pkt_tx_16p.uc
    statistics_util.uc
    tx_helper.uc
  Ingress
dl_meta_get_*[] used in:
  Ether.uc
  Ipv4_fwder.uc
  Ipv4_fwder_util.uc
  Ipv6_fwder.uc
  V6v4_tunnel_decap.uc
  V6v4_tunnel_encap.uc
dl_meta_set_*[] used in:
  pkt_rx_init.uc
  pkt_rx_two_me_util.uc
Specific to Radisys, from dl_meta.u

20 Buffer Handle

21 Buffer Descriptor Usage
Is there a different Buffer Descriptor definition for LC and PE?
Will we support Multi-Buffer Packets?
  If not, we do not need buffer_next(32b) or buffer_size(16b)
QM uses packet_next for its packet chaining in qarray.
Output Port and Input Port probably translate to TxMI and RxMI
Next Hop fields (nhid_type(4b) and nexthop_id(16b)) probably can go away.
QOS fields (color(8b) and flow_id(24b)) probably can go away.
Two reserved fields 4b and 16b can go away.
class_id(16b) (virtual queue id?) can probably go away.
fabric_port can probably go away.

22 Buffer Descriptor Usage
PE Buffer Descriptor: MR_ID (16b), TxMI (16b), VLAN (16b)
buffer_next (32 bits): Next Buffer Pointer (in a chain of buffers)
offset (width not given): Offset to start of data in bytes
BufferSize (16 bits): Length of data in the current buffer in bytes
header_type (width not given): Type of header at offset bytes into the buffer
rx_stat (width not given): Receive status flags
free_list (width not given): Freelist ID
packet_size (16 bits): Total packet size across multiple buffers
output_port (16 bits): Output Port on the egress processor
input_port (16 bits): Input Port on the ingress processor
nhid_type (4 bits): Nexthop ID type
reserved (4 bits): Reserved
fabric_port (width not given): Output port for fabric indicating blade ID
nexthop_id (16 bits): NextHop IP ID
color (8 bits): QoS Color
flow_id (24 bits): QoS flow ID or MPLS label/flow ID
reserved (16 bits): Reserved
class_id (16 bits): Class ID
packet_next (32 bits): Pointer to next packet (unused in cell mode)

23 Buffer Descriptor Usage
PE Buffer Descriptor:
LW0: buffer_next (32 bits): Next Buffer Pointer (in a chain of buffers)
LW1: offset (width not given): Offset to start of data in bytes
LW1: BufferSize (16 bits): Length of data in the current buffer in bytes
LW2: reserved (width not given): reserved/unused
LW2: reserved (width not given): reserved/unused
LW2: free_list (width not given): Freelist ID
LW2: packet_size (16 bits): Total packet size across multiple buffers
LW3: MR_ID (16 bits): Meta Router ID
LW3: TxMI (16 bits): Transmit Meta Interface
LW4: VLAN (16 bits): VLAN
LW4: reserved (16 bits): reserved/unused
LW5: reserved (32 bits): reserved/unused
LW6: reserved (32 bits): reserved/unused
LW7: packet_next (32 bits): Pointer to next packet (unused in cell mode)
Leave multi-buffer fields there as a template for the dedicated blade implementation of a jumbo-frame MR. This also reduces changes to Rx, Tx, and QM and reduces potential problems.
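The eight-longword layout above can be packed as in the following sketch. Several details are assumptions, not taken from the slides: the 16-bit width of `offset`, the 8/4/4-bit split of the two reserved fields and `free_list` in LW2, the placement of the first-listed field in the low bits of each longword (following the "LSB to MSB" remark on the earlier descriptor slide), and big-endian longwords. The function name `pack_pe_descriptor` is hypothetical.

```python
import struct


def pack_pe_descriptor(*, buffer_next=0, offset=0, buffer_size=0, free_list=0,
                       packet_size=0, mr_id=0, tx_mi=0, vlan=0, packet_next=0):
    """Pack the PE buffer descriptor into 8 longwords (32 bytes)."""

    def field(value, bits, shift):
        # Range-check a field, then move it to its assumed bit position.
        if not 0 <= value < (1 << bits):
            raise ValueError("field value out of range")
        return value << shift

    lw = [0] * 8
    lw[0] = field(buffer_next, 32, 0)
    lw[1] = field(offset, 16, 0) | field(buffer_size, 16, 16)
    # LW2 bits 0..11 are the two reserved fields (assumed 8 + 4 bits).
    lw[2] = field(free_list, 4, 12) | field(packet_size, 16, 16)
    lw[3] = field(mr_id, 16, 0) | field(tx_mi, 16, 16)
    lw[4] = field(vlan, 16, 0)            # upper 16 bits reserved
    # LW5 and LW6 are reserved and stay zero.
    lw[7] = field(packet_next, 32, 0)
    return struct.pack("!8I", *lw)
```

Keeping `buffer_next` and `buffer_size` in place, as the slide recommends, means a single-buffer packet simply carries `buffer_next=0` and `buffer_size == packet_size`.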

24 Multicast Alternatives
At least Three Options
  Force MRs that need Multicast to be Dedicated Blade MRs and do their own Multicast
    For our short term goals this is probably sufficient and the best course.
    Perhaps longer term we can look at adding it to the CRF
  Treat as exception and send to XScale
  Provide support in CRF for Multicast
    Use Multi-Hit Lookup capability of the TCAM
    MI Bit mask defined in Lookup Result
      Will put a bound on the number of MIs that can be supported on an MR because of the size of the lookup result.
      Has issues of mapping bits in the bit mask to actual MIs.
    Lookup Result contains an index into a table containing MI bit masks
    Allow but do not force MRs to provide code to interpret Lookup Result.
      This would also allow other possible extensions on an MR-specific basis
      This carries with it the problem of bounding the execution time of the MR-specific code in the Lookup block.
      For general multicast, this could be a serious issue.
      There are also issues with generating a QID based on an MI when the QID is not included in the Lookup Result.
Other options?
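To make the MI bit mask option concrete, here is a small sketch of expanding a mask from a Lookup Result into the list of output MIs. The per-MR table `mi_table`, which maps bit position to actual MI, is a hypothetical structure introduced to illustrate the "mapping bits to actual MIs" issue; the slides do not define one.

```python
def expand_mi_mask(mask, mi_table):
    """Return the MIs selected by `mask`; bit i selects mi_table[i].

    `mi_table` is a hypothetical per-MR list of MI numbers. Its length
    bounds how many MIs an MR can address through the mask.
    """
    if mask < 0 or mask >> len(mi_table):
        raise ValueError("mask has bits set beyond the MI table")
    return [mi for bit, mi in enumerate(mi_table) if (mask >> bit) & 1]
```

For example, with `mi_table = [10, 11, 12, 13]` the mask `0b1011` selects MIs 10, 11, and 13, and a QID for each copy would still have to be derived separately.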

25 CRF Support for Multicast
Blocks: Parse (MR-1 . . . MR-n), Lookup, MR Interp, Post Process, Header Format (MR-1 . . . MR-n)
Paths: Default/Unicast path, MR-Specific Path
Data sets shown between blocks:
  DRAM Buf Ptr, MR Id, Input MI, MR Ctrl Blk Ptr, MR Mem Ptr
  DRAM Buf Ptr, MR Id, MR Lookup Key, MR Ctrl Blk Ptr, MR Mem Ptr
  DRAM Buf Ptr, MR Id, Output MI, QID, MR Specific Lookup Result, Stats Index, MR Ctrl Blk Ptr, MR Mem Ptr
  DRAM Buf Ptr, MR Id, Output MI, Buffer Offset, QID

26 CRF Support for Multicast
Blocks: Lookup, MR Interp, Post Process
Paths: Default path, MR-Specific Path
Data sets shown between blocks:
  DRAM Buf Ptr, MR Id, MR Lookup Key, MR Ctrl Blk Ptr, MR Mem Ptr
  DRAM Buf Ptr, MR Id, Output MI, QID, MR Specific Lookup Result, Stats Index, MR Ctrl Blk Ptr, MR Mem Ptr, Copy Cnt
  DRAM Buf Ptr, MR Id, Output MI, QID, MR Specific Lookup Result, Stats Index, MR Ctrl Blk Ptr, MR Mem Ptr, Copy Cnt=1
We will need some kind of copy count, or a multicast bit and last copy bit, to let TX know when it can release the DRAM buffer that holds the packet.

27 CRF Support for Multicast
Blocks: Lookup, MR Interp, Post Process
Paths: Default path, MR-Specific Path
Data sets shown between blocks:
  DRAM Buf Ptr, MR Id, MR Lookup Key, MR Ctrl Blk Ptr, MR Mem Ptr
  DRAM Buf Ptr, MR Id, Output MI, QID, MR Specific Lookup Result, Stats Index, MR Ctrl Blk Ptr, MR Mem Ptr, Copy Cnt (shown three times)
  DRAM Buf Ptr, MR Lookup Key, MR Specific Lookup Result, MR Ctrl Blk Ptr, MR Mem Ptr, with three (Output MI, Copy Cnt) pairs
We will need some kind of copy count, or a multicast bit and last copy bit, to let TX know when it can release the DRAM buffer that holds the packet.
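The copy-count idea can be sketched as a per-buffer counter that Tx decrements after sending each copy, releasing the DRAM buffer only when it reaches zero. The class and function names are hypothetical, and this is one of the two alternatives the slide mentions (the other being multicast and last-copy bits). On real hardware with several Tx threads the decrement would need to be atomic, which this single-threaded sketch does not model.

```python
class PacketBuffer:
    """Stand-in for a DRAM packet buffer with a multicast copy count."""

    def __init__(self, copy_cnt=1):
        if copy_cnt < 1:
            raise ValueError("copy count must be at least 1")
        self.copy_cnt = copy_cnt   # 1 for unicast, N for N multicast copies
        self.released = False


def tx_done(buf):
    """Called after Tx sends one copy; returns True once the buffer is freed."""
    if buf.released:
        raise RuntimeError("buffer already released")
    buf.copy_cnt -= 1
    if buf.copy_cnt == 0:
        buf.released = True        # last copy sent: return buffer to free list
    return buf.released
```

A unicast packet (Copy Cnt=1) is released on its first and only transmit, so the default path needs no special handling.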

28 OLD
The rest of these are old slides that should be deleted at some point.

29 Common Router Framework (CRF) Functional Blocks
Blocks: Rx, DeMux, Parse (MR-1 . . . MR-n), Lookup, Header Format (MR-1 . . . MR-n), QM, Tx
Buffer Descriptor fields: Buffer_Next, Buffer_Size, Offset, Packet_Size, Free_List, MR_ID, TxMI, VLAN, Packet_Next
Rx
Input: packets from RBUF
Output: Buf Handle (32b)
Function:
  Coordinate transfer of packets from RBUF to DRAM
Notes:
  We'll pass the Buffer Handle, which contains the SRAM address of the buffer descriptor.
  From the SRAM address of the descriptor we can calculate the DRAM address of the buffer data.

30 Common Router Framework (CRF) Functional Blocks
DeMux
Input: Buf Handle (32b)
Output: MR Id (16b), Input MI (16b), MR Mem Ptr (32b), Buf Handle (32b), DRAM Buf Ptr (32b), Buffer Offset (16b)
Function:
  Read Pkt Header from DRAM
  Use VLAN from Ethernet header to determine destination MR in order to locate:
    MR Parse code
    MR specific memory pointers
  Write MR Id to Buffer Descriptor
  Write VLAN to Buffer Descriptor

31 Common Router Framework (CRF) Functional Blocks
Parse
Input: MR Id (16b), Input MI (16b), MR Mem Ptr (32b), Buf Handle (32b), DRAM Buf Ptr (32b), Buffer Offset (16b)
Output: Buf Handle (32b), DRAM Buf Ptr (32b), Buffer Offset (16b), MR Id (16b), Input MI (16b), MR Mem Ptr (32b), MR Lookup Key (16B)
Function:
  MR-specific header processing
  Generate MR-specific lookup key (16 Bytes) from packet
  Need CRF functionality to manage multiple MRs in shared PE.
Notes:
  Can Parse adjust the buffer/packet size and offset?
  Can Parse do something like terminate a tunnel and strip off an outer header?

32 CRF Wrapper Around Parse
Blocks: MR Selector, MR-1 . . . MR-n
Wrapper input: MR Id, Input MI, MR Mem Ptr, Buf Handle (32b), DRAM Buf Ptr, Buffer Offset
Wrapper output: MR Lookup Key, MR Id, Input MI, MR Mem Ptr, Buf Handle (32b), DRAM Buf Ptr, Buffer Offset
MR-specific Parse input: DRAM Buf Ptr, Input MI, MR Mem Ptr, Buffer Offset
MR-specific Parse output: MR Lookup Key, Buffer Offset

33 Common Router Framework (CRF) Functional Blocks
Lookup
Input: MR Lookup Key (16B), MR Id (16b), Input MI (16b), MR Mem Ptr (32b), Buf Handle (32b), DRAM Buf Ptr (32b), Buffer Offset (16b)
Output: Buffer Handle (32b), MR Id (16b), Lookup Result (Nb), MR Mem Ptr (32b), DRAM Buf Ptr (32b), Buffer Offset (16b)
Function:
  Perform lookup in TCAM based on MR Id and lookup key
  Result:
    Output MI
    QID
    Stats index
    MR-specific Lookup Result (flags, etc. ?)
      How wide can/should this be?

34 Common Router Framework (CRF) Functional Blocks
Header Format
Input: Buffer Handle (32b), DRAM Buf Ptr (32b), Buffer Offset (16b), MR Id (16b), MR Mem Ptr (32b), Lookup Result (Nb)
  Lookup Result includes drop and miss bits
Output: Buffer Handle (32b), QID (16b), Size (16b), Port (8b)
Function:
  MR specific packet header formatting
  MR specific Lookup Result processing
    Drop and Miss bits
  Need CRF functionality to manage multiple MRs in shared PE.
    Pulls out QID, Length and Port from MR Result, etc.
    Checks for Drop and Miss bits and deals with those actions.

35 CRF Wrapper Around Header Format
Blocks: MR Selector, MR-1 . . . MR-n
Wrapper input: Buffer Handle, DRAM Buf Ptr (32b), Buffer Offset, MR Id, MR Mem Ptr, Lookup Result
Wrapper output: Buffer Handle, QID, Size, Port
MR-specific Header Format input: DRAM Buf Ptr, Output MI, MR Specific Lookup Result, MR Mem Ptr, Buffer Offset
MR-specific Header Format output: Buffer Offset
  Gets written to Buffer Descriptor
  May also cause size(s) in Descriptor to be updated. (What about trimming data? What if it is a buffer's worth, which would change the chaining? Can they add/trim at either end?)

36 Common Router Framework (CRF) Functional Blocks
QM
Input: Buffer Handle (32b), QID (16b), Size (16b), Port (8b)
Output: Buf Handle (32b)
Function:
  CRF queue management for Meta Interface queues
  For performance reasons, QM may actually be implemented as multiple instances
    Each instance on a separate ME would support a separate set of Meta Interfaces.
    See next slide for more details…

37 QM/Scheduler on Multiple MEs
Blocks: Header Format (MR-1 . . . MR-n), Input Hlpr (1 ME), QM/Schd (1 ME each), Output Hlpr (1 ME), Tx
Rings: Scratch Rings, NN Ring, NN Ring
Request format: Buffer Handle (32b), QID (32b), Size (16b), Port (8b)
Output format: Buf Handle (32b)
QID(32b): Reserved (8b), QM ID (4b), QID (20b): 1M queues per QM
Input Hlpr would use QM ID to select Scratch ring on which to put request.
Output Hlpr would process all Scratch rings coming from QM/Schd MEs and multiplex onto one NN ring to TX
With 64 entries in Q-Array and 16 entries in CAM, max number of QM/Schds is probably 4 (2 bits).
We'll set aside 4 bits to give us flexibility in the future.

38 Common Router Framework (CRF) Functional Blocks
Tx
Input: Buffer Handle (32b)
Output: packets to TBUF
Function:
  Coordinate transfer of packets from DRAM to TBUF

