1
IXP Based Router for ONL: Architecture
John DeHart
2
Overview
These slides are a first pass at informational and design slides for the ONL IXP-based Router.
3
Hardware
Promentum™ ATCA-7010 (NP Blade):
  Two Intel IXP2850 NPs
    1.4 GHz core
    700 MHz XScale
  Each NPU has:
    3 x 256MB RDRAM, 533 MHz
      3 channels
      Address space is striped across all three.
    4 QDR II SRAM channels
      Channels 1, 2 and 3 populated with 8MB each, running at 200 MHz
    16KB of Scratch Memory
    16 Microengines
      Instruction Store: 8K 40-bit wide instructions
      Local Memory: bit words
  TCAM: Network Search Engine (NSE) on SRAM channel 0
    Each NPU has a separate LA-1 interface
    Part Number: IDT75K72234
    18Mb TCAM
  Rear Transition Module (RTM)
    Connects via ATCA Zone 3
    10 1GE physical interfaces
    Supports fiber or copper interfaces using SFP modules.
4
Hardware
[Figure: photos labeled ATCA Chassis, NP Blade and RTM.]
5
NP Blades
6
Router Block Diagram
[Figure: block diagram of the two NPUs (NPUA and NPUB). Labels: Control, TCAM, XScale, Rx, Parse, Lookup, Header Format, QM, Tx.]
7
NP-Based Router Design
[Figure: pipeline of RX, Parse & Key Ext, Lookup, Hdr Fmt, enq/deq and TX blocks plus plugin MEs, annotated with ME estimates (1 ME or 1-2 MEs per block) and ring types (NN rings, Scratch or SRAM rings).]
My re-drawing of JST's design:
  Add Parse and Key Extract block (Parse & Key Ext)
  Add Header Format block (Hdr Fmt)
  Plugin return path goes to Parse and Key Extract instead of Lookup.
  Add ME estimates
    Rx and Tx may be 1 or 2 MEs each. If we are only targeting 5 ports then they may be 1 each.
  Add designations of Next Neighbor (NN), Scratch and SRAM rings
    Use NN along all of the non-plugin path.
    My understanding is that multiple MEs can read from or write to the same Scratch or SRAM ring.
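The note about several MEs sharing one Scratch or SRAM ring can be pictured with a small software model. This is only an illustrative sketch: the `Ring` class, its capacity, and the producer names `me_a` and `me_b` are hypothetical, and it models none of the IXP hardware ring semantics beyond "bounded FIFO with more than one writer".

```python
from collections import deque


class Ring:
    """Bounded FIFO standing in for a Scratch or SRAM ring (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = deque()

    def put(self, item):
        """Append an item; return False if the ring is full."""
        if len(self._items) >= self.capacity:
            return False
        self._items.append(item)
        return True

    def get(self):
        """Remove and return the oldest item, or None if the ring is empty."""
        return self._items.popleft() if self._items else None


def demo():
    ring = Ring(capacity=4)
    # Two hypothetical producer MEs write to the same ring.
    accepted = [
        ring.put(("me_a", 1)),
        ring.put(("me_b", 2)),
        ring.put(("me_a", 3)),
    ]
    # A single consumer drains the ring in arrival order.
    drained = []
    while True:
        item = ring.get()
        if item is None:
            break
        drained.append(item)
    return accepted, drained
```

Next Neighbor rings, by contrast, connect one ME to the next, which is why the slide reserves them for the non-plugin path.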
8
ONL Router Functional Blocks
[Figure: block pipeline Rx, Parse, Lookup, Hdr Format, QM, Tx, with the Buffer Descriptor fields VLAN, Packet_Next, MR_ID, TxMI, Free_List, Packet_Size, Buffer_Next, Offset, Buffer_Size.]
Let's look at:
  What data passes from block to block
  What blocks touch the Buffer Descriptor
9
ONL Router Functional Blocks
[Figure: block pipeline Rx, Parse, Lookup, Hdr Format, QM, Tx, with the Buffer Descriptor fields VLAN, Packet_Next, MR_ID, TxMI, Free_List, Packet_Size, Buffer_Next, Offset, Buffer_Size.]
Rx
  Input: RBUF
  Output: Buf Handle (32b), InPort (8b)
  Function:
    Coordinate transfer of packets from RBUF to DRAM
  Notes:
    This should be almost, if not exactly, the same version as in the techX implementation.
    We'll pass the Buffer Handle, which contains the SRAM address of the buffer descriptor.
    From the SRAM address of the descriptor we can calculate the DRAM address of the buffer data.
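The last note, deriving the DRAM buffer address from the descriptor's SRAM address, might look like the sketch below. It assumes a one-to-one, index-based mapping between descriptors and buffers, with the 32-byte descriptor and 2KB buffer sizes quoted in the memory-usage notes later in the deck. The base addresses and the function name are hypothetical.

```python
DESC_SIZE = 32     # bytes per buffer descriptor (from the memory-usage notes)
BUF_SIZE = 2048    # bytes per DRAM packet buffer (2KB, same notes)

# Hypothetical base addresses, chosen only for illustration.
SRAM_DESC_BASE = 0x00000000
DRAM_BUF_BASE = 0x00000000


def dram_addr_from_desc(sram_desc_addr):
    """Map a descriptor's SRAM address to its packet buffer's DRAM address.

    Assumes descriptor i lives at SRAM_DESC_BASE + i * DESC_SIZE and its
    buffer at DRAM_BUF_BASE + i * BUF_SIZE.
    """
    offset = sram_desc_addr - SRAM_DESC_BASE
    if offset < 0 or offset % DESC_SIZE != 0:
        raise ValueError("not a descriptor-aligned SRAM address")
    index = offset // DESC_SIZE
    return DRAM_BUF_BASE + index * BUF_SIZE
```

With both bases at zero the mapping reduces to multiplying the SRAM address by 64 (2048 / 32), which on hardware would be a single shift.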
10
ONL Router Functional Blocks
[Figure: block pipeline Rx, Parse, Lookup, Hdr Format, QM, Tx.]
Parse
  Input: Buf Handle (32b), Port (8b)
  Output: Buf Handle (32b), Buffer Offset (16b), Lookup Key (148b), InPort (8b)
  Lookup Key (148b) fields:
    DAddr (32b), SAddr (32b), SPort (16b), TCP_Flags (12b), Protocol (8b), DPort (16b), MI/Port (16b), MR ID (16b)
  Function:
    IPv4 header processing
    Generate IPv4 lookup key from packet
  Notes:
    This should be almost, if not exactly, the version we use in the IPv4 MR.
    Can Parse adjust the buffer/packet size and offset?
    Can Parse do something like terminate a tunnel and strip off an outer header?
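The field widths on this slide do add up to the stated 148 bits (32 + 32 + 16 + 12 + 8 + 16 + 16 + 16). A minimal sketch of packing them into one key follows. The field order (first listed field in the most significant bits) and the Python names are assumptions, since the slide gives only names and widths.

```python
# (name, width in bits); order is assumed, widths are from the slide.
KEY_FIELDS = [
    ("daddr", 32),
    ("saddr", 32),
    ("sport", 16),
    ("tcp_flags", 12),
    ("protocol", 8),
    ("dport", 16),
    ("mi_port", 16),
    ("mr_id", 16),
]
KEY_BITS = sum(width for _, width in KEY_FIELDS)  # 148


def pack_key(**values):
    """Concatenate the fields into a single integer, first field most significant."""
    key = 0
    for name, width in KEY_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        key = (key << width) | value
    return key
```

A caller would pass every field by name, for example `pack_key(daddr=0x0A000001, saddr=0x0A000002, sport=1234, tcp_flags=0x002, protocol=6, dport=80, mi_port=0, mr_id=1)`.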
11
ONL Router Functional Blocks
[Figure: block pipeline Rx, Parse, Lookup, Hdr Format, QM, Tx.]
Lookup
  Input: Buf Handle (32b), Buffer Offset (16b), Lookup Key (148b), InPort (8b)
  Output: Buf Handle (32b), Buffer Offset (16b), Lookup Result (53b)
  Function:
    Perform lookup in TCAM based on lookup key
    Result (53b):
      QID (20b), Priority (8b), Drop (1b), Output MI (16b), OutPort (8b)
  Notes:
    This should be almost, if not exactly, the version we use in the IPv4 MR.
    Needs to handle Primary/Secondary filters
      Primary == Exclusive
      Secondary == Non-exclusive
    How do sample apps that do multicast handle multiple copies?
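The result widths sum to the stated 53 bits (20 + 8 + 1 + 16 + 8). The sketch below splits a 53-bit result back into its fields. As with any such layout guess, the bit order (first listed field most significant) and the Python names are assumptions; the slide specifies only names and widths.

```python
# (name, width in bits); order is assumed, widths are from the slide.
RESULT_FIELDS = [
    ("qid", 20),
    ("priority", 8),
    ("drop", 1),
    ("output_mi", 16),
    ("out_port", 8),
]
RESULT_BITS = sum(width for _, width in RESULT_FIELDS)  # 53


def unpack_result(result):
    """Split a 53-bit lookup result into a dict of named fields."""
    if not 0 <= result < (1 << RESULT_BITS):
        raise ValueError("result does not fit in 53 bits")
    fields = {}
    shift = RESULT_BITS
    for name, width in RESULT_FIELDS:
        shift -= width
        fields[name] = (result >> shift) & ((1 << width) - 1)
    return fields
```

A downstream block such as Header Format could then test `fields["drop"]` before reading the QID and output port.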
12
ONL Router Functional Blocks
[Figure: block pipeline Rx, Parse, Lookup, Hdr Format, QM, Tx, with the Buffer Descriptor fields VLAN, Packet_Next, MR_ID, TxMI, Free_List, Packet_Size, Buffer_Next, Offset, Buffer_Size.]
Header Format
  Input: Buf Handle (32b), Buffer Offset (16b), Lookup Result (53b)
  Output: Buffer Handle (32b), QID (16b), Size (16b), OutPort (8b)
  Function:
    IPv4 packet header formatting
    IPv4 Lookup Result processing
      Drop and Miss bits
      Extract QID and Port
  Notes:
    This should be almost, if not exactly, the version we use in the IPv4 MR.
13
ONL Router Functional Blocks
[Figure: block pipeline Rx, Parse, Lookup, Hdr Format, QM, Tx, with the Buffer Descriptor fields VLAN, Packet_Next, MR_ID, TxMI, Free_List, Packet_Size, Buffer_Next, Offset, Buffer_Size.]
QM
  Input: Buffer Handle (32b), QID (16b), Size (16b), OutPort (8b)
  Output: Buf Handle (32b)
  Function:
    Queue management
  Notes:
    This should be almost, if not exactly, the same version as in the techX implementation.
14
ONL Router Functional Blocks
[Figure: block pipeline Rx, Parse, Lookup, Hdr Format, QM, Tx, with the Buffer Descriptor fields VLAN, Packet_Next, MR_ID, TxMI, Free_List, Packet_Size, Buffer_Next, Offset, Buffer_Size.]
Tx
  Input: Buffer Handle (32b)
  Output: TBUF
  Function:
    Coordinate transfer of packets from DRAM to TBUF
  Notes:
    This should be almost, if not exactly, the same version as in the techX implementation.
15
JST: Network Configuration Switches
[Figure: five 48-port GE configuration switches interconnecting NPUs, WUGS-20s and GE host clusters; labels include x10 and x4, with per-link port counts of 3 and 13.]
No blocking possible (I think).
20 spare ports per configuration switch
  Enough capacity for twice as many NPUs plus hosts
$1500/switch, so we can buy 6 for $9K, giving us a spare
16
JDD:Network Configuration Switches
[Figure: two Level-1 Interconnect 48-port GE switches (#1 and #2) connected to Edge 48-port GE switches, NPUs, WUGS-20s and 3-host GE clusters, annotated with per-link port counts.]
17
JDD:Network Configuration Switches
8 NP Blades, 4 WUGS NSPs, 8 48-pt GE Switches
128 hosts
[Figure: two Interconnect 48-port GE switches and six Edge 48-port GE switches connecting NP Blades, WUGS routers and GE host clusters, annotated with per-link port counts.]
18
Network Configuration Switches
5 NP Blades, 4 WUGS NSPs, 7 48-pt GE Switches
94 hosts
[Figure: two Interconnect 48-port GE switches and five Edge 48-port GE switches connecting NP Blades, WUGS routers and GE host clusters, annotated with per-link port counts.]
19
Notes on JDD: Configuration Switches
Routers would be assigned hosts only from those connected to their own Edge switch.
  This limits the availability of the 3-host GE clusters to the WUGS-20 routers only.
  If we have extra hosts we might want to put some 3-host GE clusters on the NPU side.
1 NPU Blade can replace 2 WUGS-20 NSPs.
If we need to grow beyond this configuration:
  We can add Level-1 Interconnection switches 3 and 4 and then connect all 4 Level-1 Interconnection switches through a 96-port Level-2 Interconnection switch, with 24 ports to each of the Level-1 Interconnection switches.
  We can add a chassis-based switch, which can be expanded as needed.
20
Extra
The next set of slides is for templates or extra information if needed.
21
Text Slide Template
22
Image Slide Template
23
OLD The rest of these are old slides that should be deleted at some point.
24
Notes on Memory Usage
3 SRAM channels of 8MB each per NP
3 RDRAM channels of 256MB each: 768MB per NP
  XScale uses the RDRAM. There is no separate DRAM for the XScale!
bit words of Local Memory per MicroEngine
Parameters:
  N: Max number of packets in the system at any given time
  M: Max number of queues that need to be supported
  BatchSize: Number of slots in Scheduling Data Structure Segments (8 for now)
Data structures stored in SRAM:
  Buffer Descriptors
    32 Bytes each
    Number needed: N
  IXP Queue Descriptors
    16 Bytes each
    Number needed: M
  QM Queue Data (QLen, Weight, Threshold)
    12 Bytes each
  Scheduling Data Structure Segments
    BatchSize*8 + 4B (address) + 4B (pointer to next) + 1 Bytes each
    Number needed: (M/8) + x
      where x is the number of extra/spare segments needed to operate the algorithm
      x <= 10 is probably sufficient
Data stored in DRAM:
  Packet Buffers
    2KB each
[Figure: Scheduling Data Structure segment: SRAM Adr (32b), Next Ptr (32b), followed by QID (20b) / Credit (32b) slots.]
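One way to picture a Scheduling Data Structure segment is as a fixed binary record. The sketch below is an assumption-laden illustration: it stores each slot as two 32-bit words (a 20-bit QID padded to 32 bits, then a 32-bit credit), which matches the BatchSize*8 term, and it uses little-endian packing purely for convenience. It yields the 72-byte segment size used in the SRAM budget on the next slide and does not account for the extra "+ 1" in the size formula above. The function and format names are hypothetical.

```python
import struct

BATCH_SIZE = 8  # slots per segment ("8 for now")

# SRAM address (32b), next pointer (32b), then BATCH_SIZE x (QID word, credit word).
SEGMENT_FMT = "<II" + "II" * BATCH_SIZE
SEGMENT_BYTES = struct.calcsize(SEGMENT_FMT)  # 4 + 4 + 8 * 8 bytes


def pack_segment(sram_addr, next_ptr, slots):
    """Pack one segment; slots is a list of BATCH_SIZE (qid, credit) pairs."""
    if len(slots) != BATCH_SIZE:
        raise ValueError("a segment holds exactly BATCH_SIZE slots")
    words = []
    for qid, credit in slots:
        if not 0 <= qid < (1 << 20):
            raise ValueError("QID must fit in 20 bits")
        words.extend((qid, credit))
    return struct.pack(SEGMENT_FMT, sram_addr, next_ptr, *words)
```

Padding the QID to a full word wastes 12 bits per slot but keeps every field word-aligned, which is the usual trade-off for SRAM-resident structures.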
25
Notes on Memory Usage
1 SRAM channel for IXP Queue Descs, QM Queue Data and Scheduling Data Structure:
  16*M + 12*M + ((M/8)+10)*72 <= 8MB (0x800000 = 8388608)
  28*M + 9*M <= 8MB (approximately)
  37*M <= 8MB, so M <= roughly 226K
  So, let's say we will support 128K queues (131071)
  17 bits of QID gives us a range of 0 – 131071
1 SRAM channel for Buffer Descriptors
  32*N <= 8MB (0x800000 = 8388608)
  N <= 262144 (0x40000)
  Max of 256K packets in the system
1 SRAM channel still free
  On NPE this would be used for:
    MR-specific data region
    MR configuration data
    Etc.
DRAM usage for 256K packets:
  256K * 2K per buffer = 512MB
  Out of 768MB available.
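The budget on this slide can be checked with a few lines of arithmetic. The sketch below uses the sizes from the slides (16-byte queue descriptors, 12-byte QM queue data, 72-byte segments with up to 10 spares, 32-byte buffer descriptors, 2KB buffers). The function names are hypothetical, and the queue bound keeps the constant term for the spare segments, which the slide's simplified inequality drops.

```python
SRAM_CHANNEL_BYTES = 8 * 1024 * 1024   # 8MB = 0x800000
SEGMENT_BYTES = 72                     # BatchSize*8 + 4 + 4 with BatchSize = 8
SPARE_SEGMENTS = 10                    # "x <= 10 is probably sufficient"


def max_queues():
    """Largest M with 16*M + 12*M + ((M/8) + 10)*72 <= 8MB, i.e. 37*M + 720 <= 8MB."""
    per_queue = 16 + 12 + SEGMENT_BYTES // 8          # 37 bytes per queue
    fixed = SPARE_SEGMENTS * SEGMENT_BYTES            # 720 bytes of spare segments
    return (SRAM_CHANNEL_BYTES - fixed) // per_queue


def max_packets():
    """Largest N with 32*N <= 8MB."""
    return SRAM_CHANNEL_BYTES // 32


def dram_bytes(packets, buf_size=2048):
    """DRAM needed for the given number of 2KB packet buffers."""
    return packets * buf_size
```

Under these assumptions the queue channel has room for somewhat more than 226,000 queues, so the chosen 128K queues fit with headroom, and 256K packet buffers use 512MB of the 768MB of RDRAM.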
26
Core Components (sample App)
Xscale MicroEngines