Jon Turner (and a cast of thousands) Washington University Design of a High Performance Active Router Active Nets PI Meeting - 12/01.

1 Jon Turner (and a cast of thousands) Washington University jst@cs.wustl.edu Design of a High Performance Active Router Active Nets PI Meeting - 12/01

2 2 - Jonathan Turner - December 5, 2001 Washington University Active Router
[Diagram: switch fabric with six ports, each with IPP, OPP, SPC, and TI, plus a Control Processor. Labels: ATM Switch Core, Transmission Interfaces, Embedded Processors, Control Processor. Smart Port Card detail: Sys. FPGA, 64 MB memory, Pentium, Cache, North Bridge, APIC.]
Control Processor
»global coordination & control
»routing protocols
»build routing tables and other information needed by SPCs
»active plugin code server

3 SPC Software Architecture
[Diagram, two halves.]
Input Side Processing: Gen. Filters, Flow & Route Lookup, virtual output queues, Plugin Control, plugins, Distributed Queueing
Output Side Processing: reassembly queues, Gen. Filters, Flow Lookup, output queues, Plugin Control, plugins, Rate Control

4 SPC Throughput - Packets Per Second

5 Comparison with SPC 2

6 SPC Throughput - Mb/s

7 SPC Throughput vs. Packet Length

8 Distributed Queueing
[Diagram: switch fabric with six ports, each with a TI, input (I) and output (O) sides, and Routing and Sched. blocks, plus a Control Processor.]
»queue per output
»periodic queue length reports
»Scheduler paces each queue according to backlog share

9 Distributed Queueing Algorithm
Goal: avoid switch congestion and output queue underflow.
Let hi(i,j) be input i’s share of input-side backlog to output j.
»can avoid switch congestion by sending from input i to output j at rate ≤ L × S × hi(i,j)
»where L is external link rate and S is switch speedup
Let lo(i,j) be input i’s share of total backlog for output j.
»can avoid underflow of queue at output j by sending from input i to output j at rate ≥ L × lo(i,j)
»this works if L × (lo(i,1)+···+lo(i,n)) ≤ L × S for all i
Let wt(i,j) be the ratio of lo(i,j) to lo(i,1)+···+lo(i,n).
Let rate(i,j) = L × S × min{wt(i,j), hi(i,j)}.
Note: algorithm avoids congestion, and avoids underflow for large enough S.
»what is the smallest value of S for which underflow cannot occur?
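
The rate rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPC implementation: the function name `dq_rates` and its arguments are hypothetical, "total backlog for output j" is taken to mean the input-side backlog for j plus the backlog already queued at output j, and the two shares are assumed to be combined with a minimum, so that each rate stays within L × S × hi(i,j) and each input's rates sum to at most L × S.

```python
def dq_rates(backlog_in, backlog_out, L, S):
    """Sketch of the per-queue pacing rates (names are hypothetical).

    backlog_in[i][j]: backlog at input i destined for output j
    backlog_out[j]:   backlog already queued at output j (assumed part of
                      the "total backlog" for output j)
    L: external link rate, S: switch speedup
    Returns rate[i][j].
    """
    n = len(backlog_in)
    # Input-side backlog per output, summed over all inputs.
    in_total = [sum(backlog_in[i][j] for i in range(n)) for j in range(n)]

    def share(x, total):
        return x / total if total else 0.0

    # hi(i,j): input i's share of the input-side backlog to output j.
    hi = [[share(backlog_in[i][j], in_total[j]) for j in range(n)]
          for i in range(n)]
    # lo(i,j): input i's share of the total backlog for output j.
    lo = [[share(backlog_in[i][j], in_total[j] + backlog_out[j])
           for j in range(n)] for i in range(n)]

    rate = []
    for i in range(n):
        lo_sum = sum(lo[i])
        # wt(i,j): lo(i,j) normalized over all outputs of input i.
        wt = [share(lo[i][j], lo_sum) for j in range(n)]
        rate.append([L * S * min(wt[j], hi[i][j]) for j in range(n)])
    return rate


# Two ports, link rate 1.0, speedup 2.0 (made-up numbers for illustration).
example = dq_rates([[100, 0], [100, 200]], [0, 200], L=1.0, S=2.0)
```

In this made-up example no output receives more than L × S in total, and every nonempty queue is sent at no less than L × lo(i,j), which matches the two conditions stated on the slide.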

10 Stress Test

11 Stress Test Simulation - Min Rates

12 Stress Test Simulation - Actual Rates

13 Stress Test Simulation - Backlog

14 Stress Test Measurement Results

15 Reconfigurable Hardware Extension
[Diagram: switch fabric with six ports, each with IPP, OPP, FPX, SPC, and TI, plus a Control Processor.]
Field Programmable Port Extenders
Field Programmable Port Ext. detail: Network Interface Device, Reprogrammable Application Device, SDRAM 128 MB, SRAM 4 MB

16 Active Packet Processing
[Diagram: switch fabric with six ports, each with IPP, OPP, FPX, SPC, and TI, plus a Control Processor. Smart Port Card detail: Sys. FPGA, 32-64 MB memory, Pentium, Cache, North Bridge, APIC.]

17 Logical Port Architecture
[Diagram, two halves, each split between SPC and FPX.]
Output Side Processing: Gen. Filters, Flow Lookup, active flow queues, return queues, output queues, PCU, plugins
Input Side Processing: Gen. Filters, Flow & Route Lookup, active flow queues, return queues, virtual output queues, PCU, plugins

18 Fast IP Lookup (Eatherton & Dittia)
Multibit trie with clever data encoding.
»small memory requirements (4-6 bytes per prefix typical)
»small memory bandwidth, simple lookup yields fast lookup rates
»updates have negligible impact on lookup performance
Avoid impact of external memory latency on throughput by interleaving several concurrent lookups.
»8 lookup engine config. uses about 10% of Virtex 1000E logic cells
[Figure: example trie lookup for address 101 100 101 000, showing the internal bit vector and external bit vector of each node.]
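
To make the multibit-trie idea concrete, here is a minimal Python sketch of longest-prefix matching that consumes the address three bits at a time, the same grouping as the example address on the slide. It is not the Eatherton & Dittia encoding: instead of compact internal and external bit vectors it uses plain dictionaries and prefix expansion inside each node, so it shows the lookup walk but not the 4-6 bytes per prefix memory footprint. The names `Node`, `insert`, and `lookup` are hypothetical, and addresses are assumed to be bit strings whose length is a multiple of the stride.

```python
STRIDE = 3  # bits consumed per trie level (assumption, matches the slide's grouping)


class Node:
    def __init__(self):
        self.children = {}  # STRIDE-bit chunk -> child Node
        self.hops = {}      # STRIDE-bit chunk -> (bits matched in this node, next hop)


def insert(root, prefix, next_hop):
    """Insert a prefix given as a bit string, e.g. '1011'."""
    node = root
    # Walk down one full chunk at a time until at most STRIDE bits remain.
    while len(prefix) > STRIDE:
        node = node.children.setdefault(prefix[:STRIDE], Node())
        prefix = prefix[STRIDE:]
    # Expand the remaining bits to every STRIDE-bit chunk they cover.
    pad = STRIDE - len(prefix)
    for k in range(2 ** pad):
        suffix = format(k, "b").zfill(pad) if pad else ""
        chunk = prefix + suffix
        old = node.hops.get(chunk)
        # Keep the longest prefix that covers this chunk.
        if old is None or old[0] <= len(prefix):
            node.hops[chunk] = (len(prefix), next_hop)


def lookup(root, addr):
    """Return the next hop of the longest matching prefix, or None."""
    node, best = root, None
    for pos in range(0, len(addr) - STRIDE + 1, STRIDE):
        chunk = addr[pos:pos + STRIDE]
        if chunk in node.hops:
            best = node.hops[chunk][1]  # deeper levels hold longer prefixes
        node = node.children.get(chunk)
        if node is None:
            break
    return best


root = Node()
for pfx, hop in [("", "default"), ("101", "A"), ("1011", "B"), ("101100101", "C")]:
    insert(root, pfx, hop)
```

Each step of `lookup` touches one node, so a 12-bit address needs at most four node visits here. In hardware those visits are memory reads, which is why the slide interleaves several concurrent lookups to hide external memory latency.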

19 Lookup Throughput & Latency
»linear throughput gain
»negligible latency increase

20 Update Performance
»reasonable update rates have little impact
»1 update every 10 μs

21 Performance of Combined Traffic

22 Summary and Status
Latest version of SPC software nearly complete.
»additional testing of distributed queueing
»testing of new output queueing subsystem - QSDRR
»porting active applications to new plugin environment
SPC2 almost ready for production.
»finalizing details of PC board schematic and layout
»overload performance testing on development system
Completion of FPX design & integration with SPC.
»low level debugging of FPX interface circuit
»distributed queueing implementation in FPX
»FIPL extension for flow classification
»enhance active flow, output queueing subsystems

