IXP Based Router for ONL: Architecture

IXP Based Router for ONL: Architecture John DeHart jdd@arl.wustl.edu http://www.arl.wustl.edu/arl

Overview These slides are a start at building up some informational and design slides for the ONL IXP-based Router.

Hardware
Promentum™ ATCA-7010 (NP Blade):
  Two Intel IXP2850 NPs: 1.4 GHz core, 700 MHz XScale
  Each NPU has:
    3 x 256MB RDRAM, 533 MHz, 3 channels; address space is striped across all three
    4 QDR II SRAM channels; channels 1, 2 and 3 populated with 8MB each, running at 200 MHz
    16KB of Scratch Memory
    16 Microengines; instruction store: 8K 40-bit wide instructions; local memory: 640 32-bit words
  TCAM: Network Search Engine (NSE) on SRAM channel 0
    Each NPU has a separate LA-1 interface
    Part number: IDT75K72234, 18Mb TCAM
  Rear Transition Module (RTM): connects via ATCA Zone 3; 10 1GE physical interfaces; supports fiber or copper interfaces using SFP modules
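The slides say the RDRAM address space is striped across all three channels but do not give the interleave granularity. As a toy illustration only, the sketch below maps a linear address to a (channel, offset) pair assuming a 128-byte stripe; the stripe size and mapping are assumptions, not the IXP2850's documented scheme.

```python
# Toy illustration of striping a linear address space across the three
# RDRAM channels. The 128B stripe size is an assumption -- the slides
# only say the address space is striped across all three channels.

STRIPE = 128                  # assumed stripe (interleave) size in bytes
CHANNELS = 3                  # three 256MB RDRAM channels per NPU

def rdram_location(addr):
    """Map a linear DRAM address to (channel, offset-within-channel)."""
    stripe_index = addr // STRIPE
    channel = stripe_index % CHANNELS
    offset = (stripe_index // CHANNELS) * STRIPE + addr % STRIPE
    return channel, offset

# Consecutive stripes land on consecutive channels, spreading accesses.
assert rdram_location(0) == (0, 0)
assert rdram_location(STRIPE)[0] == 1
assert rdram_location(2 * STRIPE)[0] == 2
assert rdram_location(3 * STRIPE) == (0, STRIPE)
```

The point of striping is that a burst of sequential accesses is spread evenly over the three memory controllers instead of hammering one.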

Hardware ATCA Chassis NP Blade RTM

NP Blades

Router Block Diagram (NPUA and NPUB): XScale control and TCAM attached to a datapath of Rx, Parse, Lookup, Header Format, QM and Tx blocks.

NP-Based Router Design (my re-drawing of JST's design)
Pipeline: Rx (1-2 ME) -> Parse & Key Ext (1 ME) -> Lookup (1 ME) -> Hdr Fmt (1 ME) -> QM enq/deq (1-2 ME) -> Tx (1-2 ME), with plugin MEs reached over SRAM rings.
  Add Parse and Key Extract block (Parse & Key Ext)
  Add Header Format block (Hdr Fmt)
  Plugin return path goes to Parse and Key Extract instead of Lookup
  Add ME estimates: Rx and Tx may be 1 or 2 MEs each
  Add designations of Next Neighbor (NN), Scratch and SRAM rings; use NN rings along the entire non-plugin path
  My understanding is that multiple MEs can read from or write to the same Scratch or SRAM ring

ONL Router Functional Blocks
Blocks: Rx, Parse, Lookup, Hdr Format, QM, Tx
Buffer Descriptor fields: VLAN, Packet_Next, MR_ID, TxMI, Free_List, Packet_Size, Buffer_Next, Offset, Buffer_Size
Let's look at:
  What data passes from block to block
  What blocks touch the Buffer Descriptor

ONL Router Functional Blocks: Rx
Input: RBUF
Output: Buf Handle (32b), InPort (8b)
Function: coordinate transfer of packets from RBUF to DRAM
Notes:
  This should be almost if not exactly the same version as in the techX implementation.
  We'll pass the Buffer Handle, which contains the SRAM address of the buffer descriptor. From the SRAM address of the descriptor we can calculate the DRAM address of the buffer data.
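The descriptor-to-buffer address calculation mentioned in the Rx notes can be sketched as follows. The 32B descriptor and 2KB buffer sizes come from the memory-usage slides later in this deck; the base addresses and the one-to-one index mapping are assumptions for illustration, not the actual ONL memory map.

```python
# Sketch of deriving a buffer's DRAM address from its SRAM descriptor
# address. Descriptor i lives at SRAM_DESC_BASE + i*32; its packet
# buffer lives at DRAM_BUF_BASE + i*2048. Base addresses are assumed.

DESC_SIZE = 32            # buffer descriptor size in SRAM (bytes)
BUF_SIZE = 2048           # packet buffer size in DRAM (bytes)
SRAM_DESC_BASE = 0x0      # assumed base of the descriptor array
DRAM_BUF_BASE = 0x0       # assumed base of the buffer area

def buffer_dram_addr(desc_sram_addr):
    """Compute the DRAM buffer address for a given descriptor address."""
    index = (desc_sram_addr - SRAM_DESC_BASE) // DESC_SIZE
    return DRAM_BUF_BASE + index * BUF_SIZE

assert buffer_dram_addr(SRAM_DESC_BASE) == DRAM_BUF_BASE
assert buffer_dram_addr(SRAM_DESC_BASE + 3 * DESC_SIZE) == DRAM_BUF_BASE + 3 * BUF_SIZE
```

This is why the Buffer Handle alone is enough to hand off between blocks: the descriptor address implies the data address.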

ONL Router Functional Blocks: Parse
Input: Buf Handle (32b), InPort (8b)
Output: Buf Handle (32b), Buffer Offset (16b), Lookup Key (148b)
Lookup Key fields: DAddr (32b), SAddr (32b), Sport (16b), DPort (16b), Protocol (8b), TCP_Flags (12b), MI/Port (16b), MR ID (16b)
Function:
  IPv4 header processing
  Generate IPv4 lookup key from packet
Notes:
  This should be almost if not exactly the version we use in the IPv4 MR.
  Can Parse adjust the buffer/packet size and offset?
  Can Parse do something like terminate a tunnel and strip off an outer header?
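The Parse output fields above sum to exactly 148 bits (32+32+16+16+8+12+16+16), matching the Lookup Key (148b) width. A sketch of packing them into a single key value; the field ordering here is an assumption, since the slides do not specify it.

```python
# Pack the Parse output fields into one 148-bit lookup key.
# Field order is an assumption; widths are from the slide.

FIELDS = [  # (name, width in bits)
    ("daddr", 32), ("saddr", 32), ("sport", 16), ("dport", 16),
    ("protocol", 8), ("tcp_flags", 12), ("mi_port", 16), ("mr_id", 16),
]

assert sum(w for _, w in FIELDS) == 148   # matches Lookup Key(148b)

def pack_key(**values):
    key = 0
    for name, width in FIELDS:
        v = values[name]
        assert 0 <= v < (1 << width), f"{name} out of range"
        key = (key << width) | v
    return key

key = pack_key(daddr=0x0A000001, saddr=0x0A000002, sport=1234,
               dport=80, protocol=6, tcp_flags=0x012, mi_port=3, mr_id=7)
assert key < (1 << 148)
assert key & 0xFFFF == 7          # mr_id occupies the low 16 bits here
```

On the real hardware this key is what gets presented to the TCAM; the exact bit layout would be dictated by the TCAM database definition.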

ONL Router Functional Blocks: Lookup
Input: Buf Handle (32b), Buffer Offset (16b), Lookup Key (148b), InPort (8b)
Output: Buf Handle (32b), Buffer Offset (16b), Lookup Result (53b)
Lookup Result fields: QID (20b), Priority (8b), Drop (1b), Output MI (16b), OutPort (8b)
Function: perform lookup in TCAM based on lookup key
Notes: this should be almost if not exactly the version we use in the IPv4 MR.
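The result fields listed above (QID 20b, Priority 8b, Drop 1b, Output MI 16b, OutPort 8b) sum to exactly 53 bits, matching the Lookup Result (53b) width. A sketch of unpacking such a result; the bit ordering is again an assumption for illustration.

```python
# Unpack a 53-bit lookup result into its fields. Assumed layout,
# high bits to low: qid | priority | drop | out_mi | out_port.

RESULT_FIELDS = [("qid", 20), ("priority", 8), ("drop", 1),
                 ("out_mi", 16), ("out_port", 8)]

assert sum(w for _, w in RESULT_FIELDS) == 53   # matches Lookup Result(53b)

def unpack_result(result):
    fields = {}
    for name, width in reversed(RESULT_FIELDS):  # low bits hold later fields
        fields[name] = result & ((1 << width) - 1)
        result >>= width
    return fields

# Round-trip check against a hand-packed value.
packed = (5 << 33) | (2 << 25) | (0 << 24) | (9 << 8) | 4
r = unpack_result(packed)
assert (r["qid"], r["priority"], r["drop"], r["out_mi"], r["out_port"]) \
       == (5, 2, 0, 9, 4)
```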

ONL Router Functional Blocks: Header Format
Input: Buf Handle (32b), Buffer Offset (16b), Lookup Result (53b)
Output: Buffer Handle (32b), QID (16b), Size (16b), OutPort (8b)
Function:
  IPv4 packet header formatting
  IPv4 Lookup Result processing: Drop and Miss bits; extract QID and Port
Notes: this should be almost if not exactly the version we use in the IPv4 MR. Touches the Buffer Descriptor.

ONL Router Functional Blocks: QM
Input: Buffer Handle (32b), QID (16b), Size (16b), OutPort (8b)
Output: Buffer Handle (32b)
Function: queue management
Notes: this should be almost if not exactly the same version as in the techX implementation. Touches the Buffer Descriptor.
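The QM slide does not spell out the scheduling discipline, but the per-queue data on the memory-usage slides (QLen, Weight, Threshold, and per-queue credits in the scheduling segments) suggests weighted, credit-based dequeuing. The sketch below is a deficit-round-robin style illustration under that assumption; it is not the actual ONL QM algorithm.

```python
# Hedged sketch of weighted, credit-based queue management.
# (QLen, Weight, Threshold) mirror the QM Queue Data fields; the DRR
# discipline itself is an assumption for illustration.
from collections import deque

class Queue:
    def __init__(self, weight, threshold):
        self.pkts = deque()         # (buffer handle, size) entries
        self.weight = weight        # credit added per scheduling round
        self.threshold = threshold  # max queue length before tail drop
        self.credit = 0

    def enqueue(self, handle, size):
        if len(self.pkts) >= self.threshold:
            return False            # tail drop when over threshold
        self.pkts.append((handle, size))
        return True

def dequeue_round(queues):
    """One round: each backlogged queue gains credit and sends what fits."""
    sent = []
    for q in queues:
        if not q.pkts:
            continue
        q.credit += q.weight
        while q.pkts and q.pkts[0][1] <= q.credit:
            handle, size = q.pkts.popleft()
            q.credit -= size
            sent.append(handle)
        if not q.pkts:
            q.credit = 0            # don't bank credit while idle
    return sent

# A heavier-weighted queue drains proportionally faster.
qa, qb = Queue(weight=1500, threshold=10), Queue(weight=500, threshold=10)
qa.enqueue("a1", 1000); qa.enqueue("a2", 1000); qb.enqueue("b1", 400)
assert dequeue_round([qa, qb]) == ["a1", "b1"]
assert dequeue_round([qa, qb]) == ["a2"]
```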

ONL Router Functional Blocks: Tx
Input: Buffer Handle (32b)
Output: TBUF
Function: coordinate transfer of packets from DRAM to TBUF
Notes: this should be almost if not exactly the same version as in the techX implementation. Touches the Buffer Descriptor.

JST: Network Configuration Switches
Diagram: five 48-port GE configuration switches connecting the NPUs (x10), WUGS-20s (x4) and hosts.
No blocking possible (I think). 20 spare ports per configuration switch: enough capacity for twice as many NPUs plus hosts. At $1500/switch we can buy 6 for $9K, giving us a spare.

JDD: Network Configuration Switches
Diagram: Edge 48 GE switches (with NPUs, WUGS-20s and 3-Host GE clusters attached) connected through two Level-1 Interconnect 48 GE switches.

JDD: Network Configuration Switches
8 NP Blades, 4 WUGS NSPs, 8 48-port GE switches, 128 hosts.
Diagram: six Edge 48 GE switches and two Interconnect 48 GE switches, with NP Blades, WUGS routers and 3-Host GE clusters attached to the edge.

Network Configuration Switches
5 NP Blades, 4 WUGS NSPs, 7 48-port GE switches, 94 hosts.
Diagram: five Edge 48 GE switches and two Interconnect 48 GE switches, with NP Blades, WUGS routers and 3-Host GE clusters attached to the edge.

Notes on JDD: Configuration Switches
  Routers would be assigned hosts only from those connected to their Edge switch.
  This limits the 3-Host GE clusters to the WUGS-20 routers only. If we have extra hosts, we might want to put some 3-Host GE clusters on the NPU side.
  1 NPU Blade can replace 2 WUGS-20 NSPs.
  If we need to grow beyond this configuration, we can add Level-1 Interconnect switches 3 and 4 and then connect all four Level-1 switches through a 96-port Level-2 Interconnect switch, with 24 ports to each Level-1 switch. Alternatively, we can add a chassis-based switch that can be expanded as needed.

Extra The next set of slides are for templates or extra information if needed

Text Slide Template

Image Slide Template

OLD The rest of these are old slides that should be deleted at some point.

Notes on Memory Usage
Resources per NP:
  3 SRAM channels of 8MB each
  3 RDRAM channels of 256MB each = 768MB. The XScale uses the RDRAM; there is no separate DRAM for the XScale!
  640 32-bit words of Local Memory per MicroEngine
Parameters:
  N: max number of packets in the system at any given time
  M: max number of queues that need to be supported
  BatchSize: number of slots in Scheduling Data Structure segments (8 for now)
Data structures stored in SRAM:
  Buffer Descriptors: 32 bytes each; number needed: N
  IXP Queue Descriptors: 16 bytes each; number needed: M
  QM Queue Data (QLen, Weight, Threshold): 12 bytes each; number needed: M
  Scheduling Data Structure segments: BatchSize*8B + 4B (address) + 4B (pointer to next) = 72 bytes each; number needed: (M/8) + x, where x is the number of extra/spare segments needed to operate the algorithm (x <= 10 is probably sufficient)
Data stored in DRAM:
  Packet buffers: 2KB each
Scheduling Data Structure segment layout: SRAM Adr (32b), Next Ptr (32b), then BatchSize slots of (QID (20b), Credit (32b)).

Notes on Memory Usage
One SRAM channel for IXP Queue Descriptors, QM Queue Data and the Scheduling Data Structure:
  16*M + 12*M + ((M/8) + 10)*72 <= 8MB (0x800000 = 8388608)
  28*M + 9*M + 720 <= 8388608
  37*M <= 8387888
  M <= 226699
  So let's say we will support 128K queues (131072); 17 bits of QID gives us a range of 0 - 131071.
One SRAM channel for Buffer Descriptors:
  32*N <= 8MB (0x800000 = 8388608)
  N <= 262144 (0x40000): max of 256K packets in the system
One SRAM channel still free. On the NPE this would be used for:
  MR-specific data region
  MR configuration data
  Etc.
DRAM usage for 256K packets: 256K * 2KB per buffer = 512MB out of the 768MB available.

Core Components (sample App) Xscale MicroEngines