1
Analyzing Single Buffered Routers
Sundar Iyer, Rui Zhang, Nick McKeown
(sundaes, rzhang, nickm)@stanford.edu
High Performance Networking Group
Departments of Electrical Engineering & Computer Science, Stanford University
2
What is an Ideal Router?
Output Queued (OQ) switches are ideal but not practical
– They minimize the delay faced by a packet
– They can give QoS guarantees
– The bandwidth to each output is NR; the total bandwidth is N²R
– The cost and power consumption are prohibitive
[Figure: Output Queued Switch. Packets arrive on ports 1..N at rate R, cross the interconnect into memory (NR per output, total BW N²R), and depart on ports 1..N at rate R.]
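To make the scaling concrete, the following Python sketch evaluates the two bandwidth figures quoted on this slide: NR per output and N²R in total. The function name and the sample values (32 ports at 10 Gb/s) are illustrative assumptions, not figures from the talk.

```python
def oq_memory_bandwidth(num_ports: int, line_rate: float) -> tuple[float, float]:
    """Return (per-output bandwidth N*R, total bandwidth N^2*R) for an OQ switch."""
    per_output = num_ports * line_rate   # all N inputs may write to one output at once
    total = num_ports * per_output       # N outputs, each provisioned for N*R
    return per_output, total


# Hypothetical example: N = 32 ports, R = 10 (in Gb/s).
per_output, total = oq_memory_bandwidth(32, 10.0)
print(per_output, total)
```

The quadratic growth of the second value is what makes the ideal router impractical at large port counts.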
3
CIOQ Models
CIOQ switches are better but still not practical
– They can emulate OQ switches
– They need a bandwidth of only 2NR
– They have high computational complexity
– The model does not capture many different architectures
[Figure: an Input Queued Switch and a Combined Input-Output Queued Switch, each with ports 1..N at rate R for arriving and departing packets, N memories and an arbiter. Bandwidth labels in the figure: R, 2R, BW: NR, BW: 2NR.]
4
The Single Buffered Router Model
Single Buffered Routers buffer packets only once
The interconnects
– may be physically separate or merged
– one of the interconnects may be optional
The memory can be
– centralized or distributed
– one or many
– reserved or shared amongst all ports
[Figure: packets arrive on ports 1..N at rate R, pass through an interconnect into memory, and depart through an interconnect on ports 1..N at rate R.]
5
Why a New Model for Routers?
SB Routers comprise a broader class of routers
– They replace the CIOQ model
– They also include other interesting router architectures, such as shared memory routers and parallel packet switches
With this model we can compare these routers to an ideal router and answer
– Does a router give me quality of service?
– Can a router guarantee me 100% throughput?
6
How to Compare Routers?
[Figure: an OQ switch (memory BW NR per output, N²R in total) shown alongside any SB switch, both with ports 1..N at rate R for arriving and departing packets. Question posed: can the SB switch emulate the OQ switch? Yes / No.]
7
A Modified Pigeonhole Principle
Consider the following
– Only one pigeon can enter or leave a hole at a given time
– A pigeon decides when it wants to leave
– A pigeonhole may contain many pigeons over time
How many pigeonholes do we need so that departing pigeons are guaranteed to be able to leave, and arriving pigeons are guaranteed a pigeonhole?
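The question can be read as a counting argument: an arriving pigeon must avoid the holes that are busy now and the holes that will be busy when it wants to leave. The sketch below illustrates that reading in Python. The function name and the representation of busy holes as sets are assumptions made for illustration.

```python
def find_free_hole(num_holes, *constraint_sets):
    """Return the lowest-numbered hole outside every constraint set, or None."""
    blocked = set().union(*constraint_sets)
    for hole in range(num_holes):
        if hole not in blocked:
            return hole
    return None


# Hypothetical example: holes 0-1 are busy now, holes 2-3 will be busy at departure.
print(find_free_hole(5, {0, 1}, {2, 3}))
print(find_free_hole(4, {0, 1}, {2, 3}))
```

Whenever the constraint sets together contain fewer holes than exist, a free hole is guaranteed, which is the form of the argument applied to memories and crossbar ports in the rest of the talk.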
8
The Constraint Set Technique
A technique to analyze single buffered routers
1. Determine each packet’s departure time
2. Define the constraints on the system for both inputs and outputs (if applicable)
   – buffer, fabrics, speedup, etc.
3. Apply the Pigeonhole Principle
Constraint Sets can be used to analyze
– Parallel Shared Memory Switches
– Distributed Shared Memory Switches (bus-based or crossbar-based)
– Input Queued Switches
– Parallel Packet Switches
– ... and, we expect, any Single Buffered router in general
9
Examples of Constraints
Physical Constraints: these are limitations imposed by the hardware
– Memory (e.g. Parallel Packet Switch): can’t access a memory more than a certain number of times in a time period
– Bus (e.g. Centralized Shared Memory): can’t use the same bus simultaneously for more than a certain number of packets
– Crossbar (e.g. Distributed Shared Memory): each input and output may be busy only once in a scheduling period
Logical Constraints: these are requirements imposed on the switch
– Time (e.g. Input Queued Router): a packet must face a delay of no more than "p" time slots with respect to an ideal switch
10
An Example: Parallel Shared Memory (PSM) Router
[Figure: packets arriving on ports 1..N at rate R are written to, and departing packets are read from, a DRAM consisting of k memories (read access time = T, write access time = T) under the control of an arbiter. Bandwidth label in the figure: 2NR.]
Interconnect
– Fabric: Bus
– Number: One/Two
– Implementation: Separated/Merged
Memory
– Physical Location: Centralized
– Number: One/Many
– Sharing Allowed: Yes

Number Memories | BW per Memory | Total BW | Emulate?
k               | 3NR/k         | 3NR      | FIFO
k               | 4NR/k         | 4NR      | QoS
11
Question: Can a PSM Router emulate an OQ Router?
– Let a cell arrive at input "i" at time "t" and be destined to depart from output port "j" at time "DT"
– Such a cell must not be written to memories which
1. Are used to write the other N-1 cells arriving at t (at most N-1 memories)
2. Are used to read the N cells departing at t (at most N memories)
3. Will be used to read the other N-1 cells departing at DT (at most N-1 memories)
– There are three constraint sets, which together exclude at most 3N-2 memories
– By the pigeonhole principle, 3N memories at rate R, or a memory bandwidth of 3NR, is sufficient
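The three constraint sets can be exercised with a small simulation. The Python sketch below is illustrative only and rests on several assumptions that are not in the slides: each memory performs one access per time slot, departure times are assigned first-come-first-served per output with one departure per output per slot, a cell departs no earlier than the slot after it arrives, and traffic is uniformly random. All names are hypothetical.

```python
import random
from collections import defaultdict


def simulate_psm(num_ports, num_memories, num_slots, load=1.0, seed=0):
    """Return True if every arriving cell finds a conflict-free memory."""
    rng = random.Random(seed)
    reads_at = defaultdict(set)        # time slot -> memories read in that slot
    next_departure = [0] * num_ports   # next unused departure slot per output

    for t in range(num_slots):
        writes_now = set()             # memories already written in slot t
        for _input_port in range(num_ports):
            if rng.random() >= load:
                continue               # no arrival on this input in slot t
            output = rng.randrange(num_ports)
            # Assumed FIFO departure time: one cell per output per slot.
            dt = max(t + 1, next_departure[output])
            next_departure[output] = dt + 1
            # Constraint sets: writes at t, reads at t, reads at DT.
            blocked = writes_now | reads_at[t] | reads_at[dt]
            free = [m for m in range(num_memories) if m not in blocked]
            if not free:
                return False           # no memory satisfies all constraints
            memory = free[0]
            writes_now.add(memory)
            reads_at[dt].add(memory)
        reads_at.pop(t, None)          # slot t is finished
    return True


print(simulate_psm(num_ports=4, num_memories=3 * 4 - 1, num_slots=1000))
```

Under these assumptions the three sets hold at most (N-1) + N + (N-1) = 3N-2 memories, so a run with 3N-1 or more memories should never fail, while a run with too few memories fails as soon as two constraints collide.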
12
Distributed Shared Memory Router
Interconnect
– Fabric: Bus/Crossbar
– Number: One/Two
– Implementation: Separated/Merged
Memory
– Physical Location: Distributed
– Number: Many
– Sharing Allowed: Yes
[Figure: packets arrive and depart on ports 1..N at rate R; an arbiter controls a fabric connecting the ports to N memories. Link labels in the figure: S1·R with BW: S1·NR, and S2·R with BW: S2·NR.]

No. Mem. | BW per Mem. | Total BW | Xbar speed | Emulate?
N        | 4R          | 4NR      |            | FIFO
N        | 6R          | 6NR      |            | QoS
13
Question: Can a DSM Router emulate an OQ Router?
– Let a cell arrive at input "i" at time "t" and be destined to depart from output port "j" at time "DT"
– The cell can be written to any intermediate port "x" such that
1. The edge (i,x) is available at time t. Since no more than N-1 other cells contend to write at time t, at most (N-1)/s1 vertices are unavailable.
2. The edge (x,j) is available at time DT. Since no more than N-1 other cells contend to leave at time DT, at most (N-1)/s2 vertices are unavailable.
– There are two constraint sets
– By the pigeonhole principle, it suffices that (N-1)/s1 + (N-1)/s2 < N. Hence s1 = s2 = 2, i.e. s = s1 + s2 = 4, is enough.
– A bandwidth of 4NR is sufficient
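The speedup condition can be checked numerically. The sketch below reads the two constraint sets as the intermediate ports that are blocked at time t and at time DT, at most (N-1)/s1 and (N-1)/s2 ports respectively, so that a free port exists whenever the two counts sum to less than N. That reading, the function name, and the restriction to positive integer speedups are assumptions made for illustration.

```python
def dsm_speedup_suffices(num_ports, s1, s2):
    """Check (N-1)/s1 + (N-1)/s2 < N for positive integer speedups s1, s2."""
    n = num_ports
    # Cross-multiplied by s1*s2 so the comparison stays in exact integers.
    return (n - 1) * s2 + (n - 1) * s1 < n * s1 * s2


# Hypothetical example with N = 8 ports.
print(dsm_speedup_suffices(8, 2, 2))
print(dsm_speedup_suffices(8, 1, 1))
```

With s1 = s2 = 2 the left-hand side is N-1, which is below N for every port count, matching the slide's conclusion that a total speedup of 4 is enough.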
14
How Complex is the Arbiter?
– For each packet, the arbiter needs to check k memory addresses for potential conflicts
– It needs to maintain the bitmap for scheduled departures from memories
– Scheduling is done sequentially, O(N)
– Communication from linecards is minimal
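One way to picture the per-packet check is with one bit per memory. The Python sketch below is a hypothetical layout, not the arbiter described in the talk: each argument is an integer bitmap in which bit m set means memory m is busy, and the three bitmaps stand for writes in the current slot, reads in the current slot, and reads already scheduled for the packet's departure slot.

```python
def pick_memory(k, write_busy, read_busy_now, read_busy_at_dt):
    """Return the index of a conflict-free memory among k, or None."""
    busy = write_busy | read_busy_now | read_busy_at_dt
    free = ~busy & ((1 << k) - 1)           # keep only the k valid memory bits
    if free == 0:
        return None
    return (free & -free).bit_length() - 1  # index of the lowest set bit


# Hypothetical example: memories 0, 1 and 2 are each blocked by one constraint.
print(pick_memory(4, 0b0001, 0b0010, 0b0100))
```

An arbiter built this way would call the function once per arriving packet and then set the chosen bit in the current write bitmap and in the departure-slot read bitmap, which is why the scheduling is sequential across the N arrivals of a slot.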
15
Summary: New Results, Previous Architectures, Comparison

#  | Type                | Fabric | Num. Mem. | BW of Mem. | Total BW | Xbar BW | Arbiter       | Emulate (QoS)
0  | OQ                  | Bus    | N         | (N+1)R     | N(N+1)R  | -       | None          | Yes
1  | Shared Mem.         | Bus    | 1         |            | 2NR      | -       | Simple        | -
2  | IQ                  | Xbar   | N         | 2R         | 2NR      | NR      | Max. Matching | No
3  | IQ* (with speedup)  | Xbar   | 2N        | 3R         | 6NR      | 3NR     | Simple        | Yes, for FIFO, Leaky Bucket Traffic
4  | CIOQ                | Xbar   | 2N        | 3R         | 6NR      | 2NR     | Complex       | Yes
5  | PSM                 | Bus    | k         | 4NR/k      | 4NR      | -       | Simple        | Yes
6a | DSM-I               | Xbar   | N         | 4R         | 4NR      | 5NR     | Complex       | Yes
6b | DSM-II              | Xbar   | N         | 4R         | 4NR      | 8NR     | Simple        | Yes
6c | DSM-III             | Xbar   | N         | 6R         | 6NR      |         | Simple        | Yes
7a | PPS-OQ              | Clos   | Nk        | 3R(N+1)/k  | 3N(N+1)R | -       | Simple        | Yes
7b | PPS - Shared Memory | Clos   | Nk        | 6NR/k      | 6NR      | -       | Simple        | Yes
7c | PPS - Shared Memory | Clos   | Nk        | 4NR/k      | 4NR      | -       | Simple        | Yes (FIFO)
16
Backups
17
DSM Router Variants (Trading Arbiter Complexity with Memory Speed)

No. Mem. | BW per Mem. | Total BW | Xbar speed | Emulate? | Arbiter
N        | 3R          | 3NR      | 4NR        | FIFO     | Complex
N        | 3R          | 3NR      | 6NR        | FIFO     | Simple
N        | 4R          | 4NR      |            | FIFO     | Simple
N        | 4R          | 4NR      | 5NR        | QoS      | Complex
N        | 4R          | 4NR      | 8NR        | QoS      | Simple
N        | 6R          | 6NR      |            | QoS      | Simple
18
Input Queued Router
[Figure: packets arrive on ports 1..N at rate R into N memories, and an arbiter schedules them across the fabric to departing ports 1..N at rate R.]
Interconnect
– Fabric: Crossbar
– Number: One
– Implementation: Merged
Memory
– Physical Location: Distributed
– Number: Many
– Sharing Allowed: No

Number Memories | BW per Memory | Total BW | Emulate?
N               | 3R            | 3NR      | FIFO, Leaky Bucket Traffic
N               | 3R            | 3NR      | QoS, Leaky Bucket Cons.
19
Parallel Packet Switch
Interconnect
– Fabric: Clos Network
– Number: Two
– Implementation: Separated
Memory
– Physical Location: Distributed
– Number: Many
– Sharing Allowed: Yes
[Figure: a parallel packet switch with ports 1..4 at rate R. A demultiplexor at each input spreads arriving packets over k = 3 OQ switches on links of rate R/k, and a multiplexor at each output recombines the departing packets.]

No. Mem. | BW per Memory | Total BW | Xbar speed | Emulate?
Nk       | 2R(N+1)/k     | 2N(N+1)R | -          | FIFO
Nk       | 3R(N+1)/k     | 3N(N+1)R | -          | QoS
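The totals in this slide's table follow from multiplying the per-memory bandwidth by the Nk memories. The Python sketch below reproduces that arithmetic exactly with fractions. The function name, the parameter `factor` (2 for the FIFO row, 3 for the QoS row), the reading of k as the number of parallel layers, and the use of an integer line rate in arbitrary units are assumptions made for illustration.

```python
from fractions import Fraction


def pps_bandwidth(num_ports, num_layers, line_rate, factor):
    """Return (per-memory BW, total BW) for the PPS rows of the table.

    factor is 2 for the FIFO row and 3 for the QoS row; line_rate must be an int.
    """
    n, k = num_ports, num_layers
    per_memory = Fraction(factor * line_rate * (n + 1), k)  # factor * R(N+1)/k
    total = n * k * per_memory                              # Nk memories in all
    return per_memory, total


# Hypothetical example matching the figure's N = 4 ports and k = 3, with R = 1.
print(pps_bandwidth(4, 3, 1, factor=2))
print(pps_bandwidth(4, 3, 1, factor=3))
```

The k in the per-memory term cancels against the k in the memory count, so the total depends only on N and R, as the table's Total BW column shows.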
20
Comparing DSM to CIOQ Routers
DSM routers are less complex than CIOQ routers
– Lower requirements on memories
– Simpler scheduling algorithm
– Slightly higher crossbar bandwidth
Two problems:
– Departure times must be determined centrally
– Scheduler is sequential

               | CIOQ     | DSM
Num. Mem.      | 2N       | N
Total Mem. BW  | 6NR      | 4NR
Xbar BW        | 4R       | 5R
Buffer Size    | NR x RTT | << NR x RTT