Design of a Diversified Router: Dedicated CRF for IPv4 Metarouter


Design of a Diversified Router: Dedicated CRF for IPv4 Metarouter John DeHart, Brandon Heller jdd@arl.wustl.edu, bdh4@cec.wustl.edu http://arl.wustl.edu/projects/techX/

Revision History
5/22/06 (JDD): Created. Buffer descriptor stuff probably needs updating.
6/1/06 (JDD): Updating data going between blocks, still in progress.
6/2/06 (JDD): More cleanup of data going between blocks. Buffer descriptor details still need updating.
6/5/06 (JDD): Slight change to format for Lookup Key and defining what goes in each word in the NN ring. Add IP Pkt Length to data Demux passes to Parse.
6/6/06 (JDD): Reorganized the Lookup Result given to Hdr Format to distinguish between MR portion and Substrate portion. Clean up labeling of data to Parse (MN vs. IP Pkt). Output from Parse is still IP Pkt Offset and Length. Data from Parse to Lookup needs update to reflect case where lookup is just for Substrate mapping of MI to LC.
6/7/06 (JDD): Updated notes about Parse block's input/output and functionality.
6/15/06 (JDD): Removed CRC from Rx to Demux data. MSF does not pass us a CRC like we thought, so we will skip the CRC checking. Updated data going from Demux to Parse, Parse to Lookup, and Lookup to Hdr Format.

Revision History
6/19/06 (BDH): Split Header Format into MR Header Format and Substrate Encap. Demux is now Substrate Decap. Reorganization of all slides into logical and physical formats, coloring scheme. IPv4 MR now has own section, integrated JL's internal format slides.
6/21/06 (BDH): H Flags nuked. MN Pkt Length into Lookup is now substrate-defined. Logical communication added from Lookup to Substrate Encap. Port fields are all 4 bits now.
6/26/06 (BDH): Substrate Decap to Parse format changed. Changed block diagram to better show that substrate encloses the MR-specific portions. Added details on Substrate Decap. Moved IPv4 slides to techX\bdh4\techx\IPv4_MR_shared. These slides should be done by mid-July.
6/29/06 (BDH): Updated format of Tx input.
6/30/06 (BDH): Updated format of Hdr Format to Substrate Encap data, only handles IPv4 NH_MN_ADDRs now.

Dedicated CRF Slide Organization
[Block diagram. Blocks: Rx, Substr Decap, Parse, Lookup, Header Format, Substr Encap, QM, Tx. Layers: L1, L2, L3. Legend: Substrate block, Metarouter (MR) block, Input Data, Output Data]
In the "at-a-glance" format, all blocks are logical:
  Logical inputs and outputs
  High-level overview of processing
  Each logical block is like an Intel microblock, not necessarily an ME
In the detailed format, all blocks are physical:
  Physical inputs and outputs
  Specific functionality and implementation notes
Color scheme:
  Blue = Substrate, should not change!
  Green = Metarouter, different for each MR

Logical Formats

Receive (Rx)
Input: RBUF
Output: Buffer Handle, Ethernet Frame Len, Port
Function: Coordinate transfer of packets from RBUF to DRAM

Substrate Decap
Input: Buffer Handle, Ethernet Frame Len, Port
Output: Buffer Handle, Destination MPE, Source ID, MN Frame Length, MN Frame Offset
Functions:
  Read and validate Ethernet header from DRAM
  Read and validate substrate header from DRAM
  Extract Source ID
  Calculate MN frame length and offset

Parse
Input: Buffer Handle, Destination MPE, Source ID, MN Frame Length, MN Frame Offset
Output to Lookup: Buffer Handle, Lookup Flags, Lookup Key, Source ID, MN Pkt Length
Output to MR Hdr Format: MR Data
Functions:
  Substrate matches the destination MPE
  Read and align MN header (includes IPv4 Hdr) from DRAM
  MR-specific:
    Consume internal header (if packet from other MPE of MR)
    Header validation
    Header modification
    Exception checks
    Extract lookup key and set lookup flags
  Write aligned modified IPv4 header back to DRAM

Lookup
Input: Buffer Handle, Lookup Input Flags, Lookup Key, Source ID, MN Pkt Length
Output to MR Header Format: Buffer Handle, Lookup Result Flags, MR Lookup Result
Output to Substrate Encap: Dest Addr, Output Port, QID
Functions:
  Perform lookup in TCAM
  Increment counters based on Stats Index
  Priority resolution of results from multiple databases, if needed

Header Format (MR Hdr Format)
Input from Lookup: Buffer Handle, Lookup Result Flags, MR Lookup Result
Input from MR Parse: MR Data
Output: Buffer Handle, MN Frame Length, MN Frame Offset, Substrate Type, Substr. Type-dep. Data
Functions:
  Process Lookup result
  For exceptions, generate internal header
  Decide substrate type

Substrate Encap
Input from MR Header Format: Buffer Handle, MN Frame Length, MN Frame Offset, Substrate Type, Substr. Type-dep. Data
Input from Lookup: Dest Addr, Output Port, QID
Output: Buffer Handle, Output Port, QID, MN Frame Length
Function: Write substrate and Ethernet headers

Queue Manager (QM)
Input: Buffer Handle, Output Port, QID, MN Frame Length
Output: Buffer Handle, Output Port, Valid
Functions:
  CRF queue management for Meta Interface queues
  WRR?

Transmit (Tx)
Input: Buffer Handle, Output Port, Valid
Output: TBUF
Functions:
  Coordinate transfer of packets from DRAM to TBUFs
  Recycle buffer handle

Physical Formats

Receive
Input: RBUF (RBUF format details here)
Output (Buf Handle details here):
  currently: Buf Handle(32b), Size (16b), Offset (16b), Hdr Type (8b), Free list (4b), Port (16b), Rx status
  will be: Buf Handle(32b), Port (8b), Reserved, Eth. Frame Len (16b)
Buffer Descriptor fields: Buffer_Next, Buffer_Size, Offset, Free_List, Packet_Size, MR_ID, TxMI, VLAN, Packet_Next
Notes: We'll pass the Buffer Handle, which contains the SRAM address of the buffer descriptor. From the SRAM address of the descriptor we can calculate the DRAM address of the buffer data.
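The note on this slide says the DRAM address of the buffer data can be calculated from the SRAM address of the descriptor, but it does not give the formula. A minimal sketch of one common scheme, assuming descriptors and buffers sit in parallel fixed-size arrays; every constant below, and the idea that the low 24 bits of the handle hold a longword address, is an assumption for illustration and not taken from the slides:

```python
# Hypothetical memory layout; none of these constants come from the slides.
SRAM_DESC_BASE = 0x1000        # byte address of descriptor 0 in SRAM (assumed)
DESC_SIZE = 32                 # bytes per buffer descriptor (assumed: 8 longwords)
DRAM_BUF_BASE = 0x100000       # byte address of buffer 0 in DRAM (assumed)
BUF_SIZE = 2048                # bytes per packet buffer (assumed)
HANDLE_ADDR_MASK = 0x00FFFFFF  # assume the low 24 bits hold the SRAM longword address


def descriptor_sram_addr(buf_handle: int) -> int:
    """Byte address in SRAM of the descriptor named by a buffer handle."""
    return (buf_handle & HANDLE_ADDR_MASK) * 4


def buffer_dram_addr(buf_handle: int) -> int:
    """Byte address in DRAM of the packet buffer paired with that descriptor."""
    index = (descriptor_sram_addr(buf_handle) - SRAM_DESC_BASE) // DESC_SIZE
    return DRAM_BUF_BASE + index * BUF_SIZE
```

With this layout only the 32-bit handle has to travel between blocks; each block can derive both addresses locally.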

Substrate Decap
Input:
  currently: Buf Handle(32b), Size (16b), Offset (16b), Hdr Type (8b), Free list (4b), Port (16b), Rx status
  will be: Buf Handle(32b), Port (8b), Reserved, Eth. Frame Len (16b)
Output: Source ID(16b), Buf Handle(32b), MN Frm Offset (16b), MN Frm Length(16b), Dest MPE (16b)
SourceID: specifies RxMI or MPE (each 15-bit)
Buffer Descriptor fields: Buffer_Next, Buffer_Size, Offset, Free_List, Packet_Size, MR_ID, TxMI, VLAN, Packet_Next

Substrate Decap Functions
Read Ethernet VLAN and Substrate header from DRAM
Validate Ethernet VLAN packet:
  Valid Length?
  Known protocol (VLAN)?
  Broadcast/Multicast source?
  Multicast destination?
  Broadcast destination?
  Local Dest?
Validate Substrate header:
  Known substrate header type (Internal or Ingress)?
  Substrate-reported MN frm len == Enet-deduced MN frm len?
Fill NN ring fields
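The validation questions on this slide can be sketched as a small checker. This is an illustration only: the slides do not say which outcomes cause a drop, so the function just reports which conditions hold. The function names, the `local_macs` parameter, and the `substrate_hdr_len` parameter are hypothetical; the 0x8100 VLAN tag identifier and the group bit in the first address octet are standard Ethernet facts.

```python
import struct

TPID_VLAN = 0x8100        # IEEE 802.1Q tag protocol identifier
ETH_VLAN_HDR_LEN = 18     # 6 dst + 6 src + 4 VLAN tag + 2 EtherType
BROADCAST = b"\xff" * 6


def check_eth_vlan(frame: bytes, local_macs: set) -> list:
    """Return the names of the checks this frame trips (empty list = clean)."""
    if len(frame) < ETH_VLAN_HDR_LEN:
        return ["invalid length"]
    findings = []
    dst, src = frame[0:6], frame[6:12]
    (tpid,) = struct.unpack_from("!H", frame, 12)
    if tpid != TPID_VLAN:
        findings.append("unknown protocol")
    if src[0] & 0x01:                 # group bit set in the source address
        findings.append("multicast/broadcast source")
    if dst == BROADCAST:
        findings.append("broadcast destination")
    elif dst[0] & 0x01:
        findings.append("multicast destination")
    elif dst not in local_macs:
        findings.append("not local destination")
    return findings


def mn_len_matches(substrate_mn_len: int, eth_frame_len: int,
                   substrate_hdr_len: int) -> bool:
    """Compare the substrate-reported MN frame length with the Ethernet-deduced one.

    substrate_hdr_len is a hypothetical parameter; the slides do not give the
    substrate header size.
    """
    return substrate_mn_len == eth_frame_len - ETH_VLAN_HDR_LEN - substrate_hdr_len
```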

Substrate Decap Implementation
8 threads, ordered thread execution
121 cycles per thread per packet, common case
~670 cycles of latency, within 1360 cycle limit for 8 threads
Resource use:
  SRAM refs: 1 per counter to increment (disabled currently)
  DRAM refs:
    3 8B reads: Enet and Substrate header
    2 8B reads: Enet checksum
Optimizations could reduce cycle count further (projected: 80-100 cycles):
  combined initial error-check to remove branch mispredicts
  remove/combine DRAM read signals
  remove volatile keywords
  single-critical-section ordered threading
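The cycle figures above can be cross-checked with a little arithmetic. The sketch below assumes one reading of the "1360 cycle limit for 8 threads": the microengine must accept a new packet every 1360 / 8 = 170 cycles, and each of the 8 threads may hold its packet for up to 1360 cycles of latency. That interpretation and the helper names are assumptions; the numbers themselves are from the slide.

```python
THREADS = 8
CYCLES_PER_PACKET = 121   # per thread per packet, common case (from the slide)
LATENCY_CYCLES = 670      # approximate latency (from the slide)
LATENCY_LIMIT = 1360      # cycle limit for 8 threads (from the slide)


def per_packet_budget(limit: int = LATENCY_LIMIT, threads: int = THREADS) -> float:
    """Compute cycles available per packet if the threads share the limit evenly."""
    return limit / threads


def fits(cycles: int, latency: int) -> bool:
    """True if both the compute budget and the latency limit are met."""
    return cycles <= per_packet_budget() and latency <= LATENCY_LIMIT


if __name__ == "__main__":
    print("budget per packet:", per_packet_budget())
    print("headroom:", per_packet_budget() - CYCLES_PER_PACKET)
    print("current design fits:", fits(CYCLES_PER_PACKET, LATENCY_CYCLES))
```

Under this reading the current 121 cycles leave 49 cycles of headroom per packet, and the projected 80-100 cycles would widen that margin.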

Parse
Input: Source ID (16b), Buf Handle(32b), MN Frm Offset (16b), MN Frm Length(16b), Dest MPE (16b)
Output:
  Buf Handle(32b)
  L Flags (4b), MR Data (28b)
  MN Pkt Len (16b), MR Data (16b)
  Lookup Key[143-112]: MR/MI (32b)
  Lookup Key[111-80] (32b)
  Lookup Key[79-48] (32b)
  Lookup Key[47-16] (32b)
  Lookup Key[15-0] (16b), Reserved (16b)
Buffer Descriptor fields: Buffer_Next, Buffer_Size, Offset, Free_List, Packet_Size, MR_ID, TxMI, VLAN, Packet_Next
Open questions: Can Parse adjust the buffer/packet size and offset? Can Parse do something like terminate a tunnel and strip off an outer header?

Lookup
Input:
  Buf Handle(32b)
  L Flags (4b), MR Data (28b)
  MN Pkt Len (16b), MR Data (16b)
  Lookup Key[143-112]: MR/MI (32b)
  Lookup Key[111-80] (32b)
  Lookup Key[79-48] (32b)
  Lookup Key[47-16] (32b)
  Lookup Key[15-0] (16b), Reserved (16b)
Output:
  Buf Handle(32b)
  Rsv (4b), MR Data (28b)
  MN Pkt Len (16b), MR Data (16b)
  MR Lookup Result (32b)
  MR Lookup Result (32b)
  DA(8b), Port (4b), QID(20b)
Buffer Descriptor fields: Buffer_Next, Buffer_Size, Offset, Free_List, Packet_Size, MR_ID, TxMI, VLAN, Packet_Next
L Flags:
  bit 0: 0 = Normal, 1 = Substrate Lookup
  bit 1: 0 = Normal, 1 = NH MN Address present in Key Word[1] (Key Word[0] = MR/MI)
  Bit 1 should never be set without bit 0 also being set.
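The L Flags rule and the DA/Port/QID result word lend themselves to a short sketch. The flag meanings and field widths are from the slide; the bit order inside the 32-bit word (DA in the top 8 bits, then Port, then QID) and all function and constant names are assumptions made for illustration.

```python
L_FLAG_SUBSTRATE_LOOKUP = 0x1   # bit 0: 1 = Substrate Lookup
L_FLAG_NH_MN_ADDR = 0x2         # bit 1: 1 = NH MN Address present in Key Word[1]


def l_flags_valid(flags: int) -> bool:
    """Check the 4-bit L Flags field, including the bit 1 / bit 0 dependency."""
    if flags & ~0xF:
        return False
    if (flags & L_FLAG_NH_MN_ADDR) and not (flags & L_FLAG_SUBSTRATE_LOOKUP):
        return False            # bit 1 must never be set without bit 0
    return True


def pack_substrate_result(da: int, port: int, qid: int) -> int:
    """Pack DA(8b), Port(4b), QID(20b) into one word (assumed bit order)."""
    if not (0 <= da < 1 << 8 and 0 <= port < 1 << 4 and 0 <= qid < 1 << 20):
        raise ValueError("field out of range")
    return (da << 24) | (port << 20) | qid


def unpack_substrate_result(word: int) -> tuple:
    """Inverse of pack_substrate_result: returns (da, port, qid)."""
    return (word >> 24) & 0xFF, (word >> 20) & 0xF, word & 0xFFFFF
```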

Header Format
Input:
  Buf Handle(32b)
  Rsv (4b), MR Data (28b)
  MN Pkt Len (16b), MR Data (16b)
  MR Lookup Result (32b)
  MR Lookup Result (32b)
  DA(8b), Port (4b), QID(20b)
Output:
  Buffer Handle(32b)
  MN Pkt Length (16b), MN Pkt Offset (16b)
  Source ID(16b), Dest ID(16b)
  NH MN IPv4 Addr / MAC Lo (32b)
  MAC Hi (16b), Rsv (8b), SH Type
  DA(8b), Port (4b), QID(20b)
Buffer Descriptor fields: Buffer_Next, Buffer_Size, Offset, Free_List, Packet_Size, MR_ID, TxMI, VLAN, Packet_Next
Notes:
  Egress Simple and Internal formats use just the Dest ID, Source ID, and SH Type
  MAC fields used for MAC_ADDR Egress format
  NH MN Addr field used for NH_MN_Addr format

Substrate Encapsulation
Input:
  Buffer Handle(32b)
  MN Pkt Length (16b), MN Pkt Offset (16b)
  Source ID(16b), Dest ID(16b)
  NH MN IPv4 Addr / MAC Lo (32b)
  MAC Hi (16b), Rsv (8b), SH Type
  DA(8b), Port (4b), QID(20b)
Output:
  Buffer Handle(32b)
  DA(8b), Port (4b), QID(20b)
  Reserved (16b), MN Pkt Length (16b)
Buffer Descriptor fields: Buffer_Next, Buffer_Size, Offset, Free_List, Packet_Size, MR_ID, TxMI, VLAN, Packet_Next
Substrate header types/formats here?

Queue Manager Text QID(20b) Buffer Buffer Handle(32b) Descriptor Port (4b) Rsv (3b) V (1b) VLAN Packet_Next MR_ID TxMI Free_List Packet_Size Buffer_Next Offset Buffer Descriptor Buffer_Size DA(8b) Port (4b) QID(20b) Reserved (16b) MN Pkt Length (16b) Text

Transmit
Input: Buffer Handle (24b), Port (4b), Rsv (3b), V (1b)
Output: TBUF
Buffer Descriptor fields: Buffer_Next, Buffer_Size, Offset, Free_List, Packet_Size, MR_ID, TxMI, VLAN, Packet_Next

IPv4 Metarouter techx\bdh4\techx\IPv4_MR_shared Look at: … for Metarouter-specific IPv4 slides

Extra The next set of slides are for templates or extra information if needed

Text Slide Template

Image Slide Template

At-a-glance Block Template Lookup Rx Tx QM MR Parse MR Hdr Format Substr Decap Encap L1 L2 L3 Block Buffer Handle RBUF Ethernet Frame Len Port Text

Detailed Block Template VLAN Packet_Next MR_ID TxMI Free_List Packet_Size Buffer_Next Offset Buffer Descriptor Buffer_Size RBUF Buf Handle(32b) Port (8b) Reserved Eth. Frame Len (16b) Text

QM/Scheduler on Multiple MEs
[Block diagram. Blocks: Header Format (MR-1 ... MR-n), Input Hlpr (1 ME), QM/Schd (1 ME) instances, Tx instances, NN/Scratch Rings, NN Ring]
NN/Scratch ring entry: Buffer Handle(32b); Rsv (4b), Port(8b), QID(20b); Reserved (16b), IP Pkt Length (16b)
NN ring entry: Buffer Handle(32b); Port(8b), Reserved (24b)
QID(32b): Reserved (8b), QM ID (3b), QID(17b): 1M queues per QM
Input Hlpr would use QM ID to select Scratch ring on which to put request.
QM/Sched then sends on its output NN/scratch ring to its associated Tx.
With 64 entries in Q-Array and 16 entries in CAM, max number of QM/Schds is probably 4 (2 bits).
We'll set aside 3 bits to give us flexibility in the future.
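How the Input Hlpr might use the QM ID to pick a scratch ring can be sketched as follows. It assumes the 20-bit QID carried on the ring splits into the 3-bit QM ID (upper bits) and a 17-bit queue number (lower bits); the placement of the QM ID in the upper bits, the list-of-rings model, and the function names are all hypothetical.

```python
QM_ID_BITS = 3     # from the slide
QUEUE_BITS = 17    # from the slide


def split_qid(qid: int) -> tuple:
    """Split a QID into (qm_id, queue_number), assuming QM ID is in the top bits."""
    if not 0 <= qid < 1 << (QM_ID_BITS + QUEUE_BITS):
        raise ValueError("QID out of range")
    return qid >> QUEUE_BITS, qid & ((1 << QUEUE_BITS) - 1)


def select_scratch_ring(qid: int, rings: list) -> list:
    """Return the scratch ring of the QM/Schd instance that owns this QID."""
    qm_id, _ = split_qid(qid)
    if qm_id >= len(rings):
        raise ValueError("no QM/Schd instance for QM ID %d" % qm_id)
    return rings[qm_id]


def enqueue_request(qid: int, buf_handle: int, rings: list) -> None:
    """Input Hlpr step: put the request on the ring selected by the QM ID."""
    select_scratch_ring(qid, rings).append((buf_handle, qid))
```

With four QM/Schd instances only QM IDs 0 through 3 are populated, which matches the slide's remark that 2 bits would suffice today while 3 bits leave room to grow.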

Packet Buffer Descriptor Tradeoffs Why use a Buffer Descriptor at all? QM needs something to link packets/buffers in queues ME-to-ME communications costs vs. SRAM access costs Specific to Radisys, from dl_meta.u

Packet Buffer Descriptor
Meta Data structure of Packet Buffers (LSB to MSB). Specific to Radisys, from dl_meta.u
  buffer_next   32 bits  Next Buffer Pointer (in a chain of buffers)
  offset        16 bits  Offset to start of data in bytes
  BufferSize    16 bits  Length of data in the current buffer in bytes
  header_type    8 bits  Type of header at offset bytes into the buffer
  rx_stat        4 bits  Receive status flags
  free_list      4 bits  Freelist ID
  packet_size   16 bits  Total packet size across multiple buffers
  output_port   16 bits  Output Port on the egress processor
  input_port    16 bits  Input Port on the ingress processor
  nhid_type      4 bits  Nexthop ID type
  reserved       4 bits  Reserved
  fabric_port    8 bits  Output port for fabric indicating blade ID
  nexthop_id    16 bits  NextHop IP ID
  color          8 bits  QoS Color
  flow_id       24 bits  QoS flow ID or MPLS label/flow ID
  reserved      16 bits  Reserved
  class_id      16 bits  Class ID
  packet_next   32 bits  Pointer to next packet (unused in cell mode)

Packet Buffer Descriptor Gets
Specific to Radisys, from dl_meta.u
  buffer_next: tx
  offset: rx, tx, fwd
  BufferSize: tx, fwd
  header_type: tx, fwd
  rx_stat: NONE
  free_list
  packet_size: NONE
  output_port: qm(?), tx
  input_port: rx, fwd
  nhid_type: NONE
  fabric_port: qm(?), tx
  nexthop_id
  color
  flow_id
  class_id
  packet_next

Meta Data Caching
Meta Data can be cached in one of three places:
  SRAM Xfer Registers
  DRAM Xfer Registers
  GPR Registers
Size of Meta Data Cache is controlled by #define META_CACHE_SIZE
Macro dl_meta_load_cache[] loads meta data cache:
  buffer_handle: buffer handle for which meta data is to be fetched
  dl_meta: read transfer register prefix (Xbuf_alloc[] should be used to allocate the needed registers)
  signal_number:
  START_LW: starting long word for fetch
  NUM_LW: number of long words to fetch
Each microengine (microblock?) can use Meta Data Caching differently.
Specific to Radisys, from dl_meta.u

Meta Data Caching Specific to Radisys, from dl_meta.u In the ipv4_v6_forwarder sample app, dl_meta_load_cache() used in: Egress ethernet_arp.uc pkt_tx_16p.uc statistics_util.uc tx_helper.uc Ingress dl_meta_get_*[] used in: Ether.uc Ipv4_fwder.uc Ipv4_fwder_util.uc Ipv6_fwder.uc V6v4_tunnel_decap.uc V6v4_tunnel_encap.uc dl_meta_set_*[] used in: pkt_rx_init.uc pkt_rx_two_me_util.uc Specific to Radisys, from dl_meta.u

Buffer Handle

Buffer Descriptor Usage
Is there a different Buffer Descriptor defn for LC and PE?
Will we support Multi-Buffer Packets?
  If not, we do not need buffer_next(32b) or buffer_size(16b)
QM uses packet_next for its packet chaining in qarray.
Output Port and Input Port probably translate to TxMI and RxMI.
Next Hop fields (nhid_type(4b) and nexthop_id(16b)) probably can go away.
QOS fields (color(8b) and flow_id(24b)) probably can go away.
Two reserved fields, 4b and 16b, can go away.
class_id(16b) (virtual queue id?) can probably go away.
fabric_port can probably go away.

Buffer Descriptor Usage
PE Buffer Descriptor: MR_ID (16b), TxMI (16b), VLAN (16b)
  buffer_next   32 bits  Next Buffer Pointer (in a chain of buffers)
  offset        16 bits  Offset to start of data in bytes
  BufferSize    16 bits  Length of data in the current buffer in bytes
  header_type    8 bits  Type of header at offset bytes into the buffer
  rx_stat        4 bits  Receive status flags
  free_list      4 bits  Freelist ID
  packet_size   16 bits  Total packet size across multiple buffers
  output_port   16 bits  Output Port on the egress processor
  input_port    16 bits  Input Port on the ingress processor
  nhid_type      4 bits  Nexthop ID type
  reserved       4 bits  Reserved
  fabric_port    8 bits  Output port for fabric indicating blade ID
  nexthop_id    16 bits  NextHop IP ID
  color          8 bits  QoS Color
  flow_id       24 bits  QoS flow ID or MPLS label/flow ID
  reserved      16 bits  Reserved
  class_id      16 bits  Class ID
  packet_next   32 bits  Pointer to next packet (unused in cell mode)

Buffer Descriptor Usage
PE Buffer Descriptor:
  LW0: buffer_next  32 bits  Next Buffer Pointer (in a chain of buffers)
  LW1: offset       16 bits  Offset to start of data in bytes
  LW1: BufferSize   16 bits  Length of data in the current buffer in bytes
  LW2: reserved      8 bits  reserved/unused
  LW2: reserved      4 bits  reserved/unused
  LW2: free_list     4 bits  Freelist ID
  LW2: packet_size  16 bits  Total packet size across multiple buffers
  LW3: MR_ID        16 bits  Meta Router ID
  LW3: TxMI         16 bits  Transmit Meta Interface
  LW4: VLAN         16 bits  VLAN
  LW4: reserved     16 bits  reserved/unused
  LW5: reserved     32 bits  reserved/unused
  LW6: reserved     32 bits  reserved/unused
  LW7: packet_next  32 bits  Pointer to next packet (unused in cell mode)
Leave multi-buffer fields there as a template for the dedicated blade implementation of a jumbo-frame MR. Also reduces changes to Rx, Tx, and QM and reduces potential problems.
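The eight-longword PE Buffer Descriptor above can be modelled as a 32-byte record. The sketch below is illustrative only: it assumes fields within a longword are placed LSB first in the order listed (following the "LSB to MSB" remark on the original descriptor slide) and that longwords are stored big-endian. Both are assumptions, as is the class name.

```python
import struct
from dataclasses import dataclass


@dataclass
class PEBufferDescriptor:
    buffer_next: int = 0   # LW0
    offset: int = 0        # LW1, low 16 bits (assumed)
    buffer_size: int = 0   # LW1, high 16 bits (assumed)
    free_list: int = 0     # LW2, bits 12-15 (assumed, after 8b + 4b reserved)
    packet_size: int = 0   # LW2, high 16 bits (assumed)
    mr_id: int = 0         # LW3, low 16 bits (assumed)
    tx_mi: int = 0         # LW3, high 16 bits (assumed)
    vlan: int = 0          # LW4, low 16 bits (assumed)
    packet_next: int = 0   # LW7

    def pack(self) -> bytes:
        """Serialize to 8 big-endian longwords; reserved fields are zero."""
        lw1 = (self.buffer_size & 0xFFFF) << 16 | (self.offset & 0xFFFF)
        lw2 = (self.packet_size & 0xFFFF) << 16 | (self.free_list & 0xF) << 12
        lw3 = (self.tx_mi & 0xFFFF) << 16 | (self.mr_id & 0xFFFF)
        lw4 = self.vlan & 0xFFFF
        return struct.pack(">8I", self.buffer_next & 0xFFFFFFFF, lw1, lw2,
                           lw3, lw4, 0, 0, self.packet_next & 0xFFFFFFFF)

    @classmethod
    def unpack(cls, raw: bytes) -> "PEBufferDescriptor":
        """Parse 32 bytes produced by pack()."""
        lw = struct.unpack(">8I", raw)
        return cls(buffer_next=lw[0],
                   offset=lw[1] & 0xFFFF, buffer_size=lw[1] >> 16,
                   free_list=(lw[2] >> 12) & 0xF, packet_size=lw[2] >> 16,
                   mr_id=lw[3] & 0xFFFF, tx_mi=lw[3] >> 16,
                   vlan=lw[4] & 0xFFFF, packet_next=lw[7])
```

Keeping `buffer_next` and `buffer_size` in the record, even when every packet fits one buffer, mirrors the slide's choice to leave the multi-buffer fields in place.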

Multicast Alternatives
At least three options:
  Force MRs that need Multicast to be Dedicated Blade MRs and do their own Multicast
    For our short term goals this is probably sufficient and the best course.
    Perhaps longer term we can look at adding it to the CRF.
  Treat as exception and send to XScale
  Provide support in CRF for Multicast
    Use Multi-Hit Lookup capability of the TCAM
    MI Bit mask defined in Lookup Result
      Will put a bound on the number of MIs that can be supported on an MR because of the size of the lookup result.
      Has issues of mapping bits in the bit mask to actual MIs.
    Lookup Result contains an index into a table containing MI bit masks
    Allow but do not force MRs to provide code to interpret Lookup Result.
      This would also allow other possible extensions on an MR-specific basis.
      This carries with it the problem of bounding the execution time of the MR-specific code in the Lookup block. For general multicast, this could be a serious issue.
      There are also issues with generating a QID based on an MI when the QID is not included in the Lookup Result.
Other options?
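Two of the CRF options above, an MI bit mask carried in the Lookup Result and an index into a table of MI bit masks, can be sketched in a few lines. The mask width of 16, the convention that bit i maps to MI i, and the function names are assumptions; the slide itself notes that mapping mask bits to actual MIs is an open issue.

```python
def mis_from_mask(mask: int, num_mis: int = 16) -> list:
    """Expand an MI bit mask into the list of output MIs (bit i -> MI i, assumed)."""
    if mask < 0 or mask >> num_mis:
        raise ValueError("mask wider than the supported number of MIs")
    return [mi for mi in range(num_mis) if (mask >> mi) & 1]


def mis_from_index(index: int, mask_table: list, num_mis: int = 16) -> list:
    """Variant where the Lookup Result holds an index into a table of MI masks."""
    return mis_from_mask(mask_table[index], num_mis)
```

The `num_mis` bound reflects the slide's point that an in-result mask caps the number of MIs per MR; the table variant moves that cost from the lookup result width to an extra memory read.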

CRF Support for Multicast Default/Unicast path MR Interp Parse Header Format MR-Specific Path Post Process Lookup MR-1 MR-1 MR-n . . . . . . MR-n DRAM Buf Ptr MR Id Input MI MR Ctrl Blk Ptr MR Mem Ptr DRAM Buf Ptr MR Id MR Lookup Key MR Ctrl Blk Ptr MR Mem Ptr DRAM Buf Ptr MR Id Output MI QID MR Specific Lookup Result Stats Index MR Ctrl Blk Ptr MR Mem Ptr DRAM Buf Ptr MR Id Output MI Buffer Offset QID

CRF Support for Multicast
[Block diagram. Blocks: Lookup, Default path, MR-Specific Path (MR Interp), Post Process]
Fields shown: DRAM Buf Ptr, MR Id, MR Lookup Key, Output MI, QID, MR Specific Lookup Result, Stats Index, MR Ctrl Blk Ptr, MR Mem Ptr, Copy Cnt, Copy Cnt=1
We will need some kind of copy count or multicast bit and last copy bit to let TX know when it can release the DRAM buffer that holds the packet.

CRF Support for Multicast
[Block diagram. Blocks: Lookup, Default path, MR-Specific Path (MR Interp), Post Process]
Fields shown: DRAM Buf Ptr, MR Id, MR Lookup Key, Output MI, QID, MR Specific Lookup Result, Stats Index, MR Ctrl Blk Ptr, MR Mem Ptr, Copy Cnt; Output MI and Copy Cnt appear once per copy
We will need some kind of copy count or multicast bit and last copy bit to let TX know when it can release the DRAM buffer that holds the packet.
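The copy-count idea in the note above amounts to reference counting the DRAM buffer. A minimal sketch, assuming Lookup records how many copies were generated and Tx decrements the count after each transmit; the class and method names are hypothetical, and a real implementation would keep the count in the buffer descriptor or pass it on the ring rather than in a dictionary.

```python
class CopyTracker:
    """Tracks outstanding copies per buffer so Tx frees a buffer exactly once."""

    def __init__(self):
        self.remaining = {}    # buffer handle -> copies still to transmit
        self.free_list = []    # handles released back to the freelist

    def lookup_done(self, buf_handle: int, copy_cnt: int) -> None:
        """Record the copy count chosen at lookup time (1 for unicast)."""
        if copy_cnt < 1:
            raise ValueError("copy count must be at least 1")
        self.remaining[buf_handle] = copy_cnt

    def tx_done(self, buf_handle: int) -> bool:
        """Called after each copy is sent; returns True when the buffer is freed."""
        self.remaining[buf_handle] -= 1
        if self.remaining[buf_handle] == 0:
            del self.remaining[buf_handle]
            self.free_list.append(buf_handle)
            return True
        return False
```

The alternative the slide mentions, a multicast bit plus a last-copy bit, avoids the shared counter but requires copies to be transmitted in a known order.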

OLD The rest of these are old slides that should be deleted at some point.

Common Router Framework (CRF) Functional Blocks Parse Header Format Rx DeMux Lookup QM Tx MR-1 . . . MR-1 MR-n . . . MR-n VLAN Packet_Next MR_ID TxMI Free_List Packet_Size Buffer_Next Offset Buffer Descriptor Buffer_Size RBUF Buf Handle(32b) Rx: Function Coordinate transfer of packets from RBUF to DRAM Notes: We’ll pass the Buffer Handle which contains the SRAM address of the buffer descriptor. From the SRAM address of the descriptor we can calculate the DRAM address of the buffer data.

Common Router Framework (CRF) Functional Blocks Parse Header Format Rx DeMux Lookup QM Tx MR-1 . . . MR-1 MR-n . . . MR-n Buf Handle(32b) MR Id(16b) Input MI(16b) MR Mem Ptr(32b) Buf Handle(32b) DRAM Buf Ptr(32b) Buffer Offset(16b) VLAN Packet_Next MR_ID TxMI Free_List Packet_Size Buffer_Next Offset Buffer Descriptor Buffer_Size DeMux Function Read Pkt Header from DRAM Use VLAN from Ethernet header to determine destination MR in order to locate: MR Parse code MR specific memory pointers Write MR Id to Buffer Descriptor Write VLAN to Buffer Descriptor

Common Router Framework (CRF) Functional Blocks
[Block diagram. Blocks: Rx, DeMux, Parse (MR-1 ... MR-n), Lookup, Header Format (MR-1 ... MR-n), QM, Tx]
Input: Buf Handle(32b), MR Id(16b), Input MI(16b), MR Mem Ptr(32b), DRAM Buf Ptr(32b), Buffer Offset(16b)
Output: Buf Handle(32b), MR Id(16b), Input MI(16b), MR Mem Ptr(32b), DRAM Buf Ptr(32b), Buffer Offset(16b), MR Lookup Key(16B)
Buffer Descriptor fields: Buffer_Next, Buffer_Size, Offset, Free_List, Packet_Size, MR_ID, TxMI, VLAN, Packet_Next
Parse Function:
  MR-specific header processing
  Generate MR-specific lookup key (16 Bytes) from packet
  Need CRF functionality to manage multiple MRs in shared PE.
Notes: Can Parse adjust the buffer/packet size and offset? Can Parse do something like terminate a tunnel and strip off an outer header?

CRF Wrapper Around Parse MR-1 MR-n . . . MR Selector MR Id Input MI MR Mem Ptr Buf Handle(32b) DRAM Buf Ptr Buffer Offset MR Lookup Key MR Id Input MI MR Mem Ptr Buf Handle(32b) DRAM Buf Ptr Buffer Offset DRAM Buf Ptr Input MI MR Mem Ptr Buffer Offset MR Lookup Key Buffer Offset

Common Router Framework (CRF) Functional Blocks Parse Header Format Rx DeMux Lookup QM Tx MR-1 . . . MR-1 MR-n . . . MR-n MR Lookup Key(16B) MR Id(16b) Input MI(16b) MR Mem Ptr(32b) Buf Handle(32b) DRAM Buf Ptr(32b) Buffer Offset(16b) Buffer Handle(32b) MR Id(16b) Lookup Result(Nb) MR Mem Ptr(32b) DRAM Buf Ptr(32b) Buffer Offset(16b) VLAN Packet_Next MR_ID TxMI Free_List Packet_Size Buffer_Next Offset Buffer Descriptor Buffer_Size Lookup Function Perform lookup in TCAM based on MR Id and lookup key Result: Output MI QID Stats index MR-specific Lookup Result (flags, etc. ?) How wide can/should this be?

Common Router Framework (CRF) Functional Blocks
[Block diagram. Blocks: Rx, DeMux, Parse (MR-1 ... MR-n), Lookup, Header Format (MR-1 ... MR-n), QM, Tx]
Input: Buffer Handle(32b), MR Id(16b), MR Mem Ptr(32b), DRAM Buf Ptr(32b), Buffer Offset(16b), Lookup Result(Nb) (includes drop and miss bits)
Output: Buffer Handle(32b), QID(16b), Size (16b), Port(8b)
Buffer Descriptor fields: Buffer_Next, Buffer_Size, Offset, Free_List, Packet_Size, MR_ID, TxMI, VLAN, Packet_Next
Header Format Function:
  MR-specific packet header formatting
  MR-specific Lookup Result processing
    Drop and Miss bits
  Need CRF functionality to manage multiple MRs in shared PE.
    Pulls out QID, Length, and Port from MR Result, etc.
    Checks for Drop and Miss bits and deals with those actions.

CRF Wrapper Around Header Format MR-1 MR-n . . . Buffer Handle MR Selector Buffer Handle QID Size Port DRAM Buf Ptr(32b) Buffer Offset MR Id DRAM Buf Ptr Output MI MR Specific Lookup Result MR Mem Ptr Buffer Offset MR Mem Ptr Buffer Offset Gets written to Buffer Descriptor May also cause size(s) in Descriptor to be updated. (what about trimming data, What if it is a buffer’s worth Which would change the chaining, Can they add/trim at either end? Lookup Result

Common Router Framework (CRF) Functional Blocks Parse Header Format Rx DeMux Lookup QM Tx MR-1 . . . MR-1 MR-n . . . MR-n Buffer Handle(32b) Buf Handle(32b) QM Function CRF queue management for Meta Interface queues For performance reasons, QM may actually be implemented as multiple instances Each instance on a separate ME would support a separate set of Meta Interfaces. See next slide for more details… QID(16b) VLAN Packet_Next MR_ID TxMI Free_List Packet_Size Buffer_Next Offset Buffer Descriptor Buffer_Size Size (16b) Port(8b)

QM/Scheduler on Multiple MEs Header Format Input Hlpr (1 ME) QM/Schd (1 ME) Output Hlpr (1 ME) Tx MR-1 MR-n . . . . . . QM/Schd (1 ME) Buffer Handle(32b) QID(32b) Buf Handle(32b) Scratch Rings Size (16b) NN Ring NN Ring Port(8b) QID(32b): Reserved (8b) QM ID (4b) QID(20b): 1M queues per QM Input Hlpr would use QM ID to select Scratch ring on which to put request. Output Hlpr would process all Scratch rings coming from QM/Schd MEs and multiplex onto one NN ring to TX With 64 entries in Q-Array and 16 entries in CAM, max number of QM/Schds is probably 4 (2 bits). We’ll set aside 4 bits to give us flexibility in the future.

Common Router Framework (CRF) Functional Blocks Parse Header Format Rx DeMux Lookup QM Tx MR-1 . . . MR-1 MR-n . . . MR-n Buffer Handle(32b) TBUF VLAN Packet_Next MR_ID TxMI Free_List Packet_Size Buffer_Next Offset Buffer Descriptor Buffer_Size Tx Function Coordinate transfer of packets from DRAM to TBUF

Old Template Tx Function Rx DeMux Parse Lookup Header Format QM Tx Buffer Handle(32b) Port(8b) Reserved (24b) TBUF VLAN Packet_Next MR_ID TxMI Free_List Packet_Size Buffer_Next Offset Buffer Descriptor Buffer_Size Tx Function Coordinate transfer of packets from DRAM to TBUF

Old Rejected Overly Busy Slide Lookup Rx Tx QM MR Parse MR Hdr Format Substr Decap Encap L1 L2 L3 Block  Logical interface Data passing between layers Notes here  Physical format Actual Format of data Shows type of communication Scratch Ring NN Ring SRAM Rings  Buf Descriptor shows fields read/written Input Data Output Data Input Data Output Data VLAN Packet_Next MR_ID TxMI Free_List Packet_Size Buffer_Next Offset Buffer Descriptor Buffer_Size