CIST560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues: Maximal Matching Algorithms (Part II)


1 Packet Scheduling/Arbitration in Virtual Output Queues: Maximal Matching Algorithms (Part II)

2 Pointer Desynchronization
– Performance: RRM < iSlip < FIRM.
– The algorithms differ only in how they update their pointers.
– Observation: iSlip and FIRM can effectively desynchronize their output pointers.
– The best effect of pointer desynchronization is achieved when it is forced.

3 Static Round Robin Matching (SRR): To Achieve FULL Desynchronization
Initialization: the input pointers are all set to 0. The output pointers are set to an initial pattern in which no two pointers hold the same value.
The 3 steps of one iteration are:
– Request. Each input sends a request to every output for which it has a queued cell.
– Grant. If an output receives any requests, it chooses the one that appears next in a fixed round-robin schedule, starting from the highest-priority element. The output notifies each input whether or not its request was granted. The pointer to the highest-priority element of the round-robin schedule is always incremented by one (modulo N), whether or not there is a grant.

4 SRR (Cont’d)
– Accept. If an input receives a grant, it accepts the one that appears next in a fixed round-robin schedule, starting from the highest-priority element. The pointer to the highest-priority element of the round-robin schedule is incremented (modulo N) to one location beyond the accepted one.
In DSRR (an improved version of SRR), the input pointers are also desynchronized.
Rotating DSRR (RDSRR):
– Addresses unfairness among inputs under special traffic models.
– Outputs search in clockwise and anti-clockwise directions alternately to decide grants.
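
The three SRR steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `rr_pick` and `srr_iteration` are hypothetical, `voq[i][j]` is assumed to hold the number of cells queued at input i for output j, and a single iteration per time slot is assumed.

```python
def rr_pick(candidates, start, n):
    """Return the first index in round-robin order from `start` that is in `candidates`."""
    for k in range(n):
        idx = (start + k) % n
        if idx in candidates:
            return idx
    return None


def srr_iteration(voq, in_ptr, out_ptr):
    """One SRR iteration; returns {input: output} and updates the pointers in place."""
    n = len(voq)
    # Request: input i requests every output j for which it has a queued cell.
    requests = [{i for i in range(n) if voq[i][j] > 0} for j in range(n)]
    # Grant: each output picks the next requesting input from its pointer.
    grants = [set() for _ in range(n)]  # grants[i] = outputs that granted input i
    for j in range(n):
        winner = rr_pick(requests[j], out_ptr[j], n)
        if winner is not None:
            grants[winner].add(j)
        # SRR: the output pointer always advances by one, grant or no grant.
        out_ptr[j] = (out_ptr[j] + 1) % n
    # Accept: each input picks the next granting output from its pointer.
    match = {}
    for i in range(n):
        accepted = rr_pick(grants[i], in_ptr[i], n)
        if accepted is not None:
            match[i] = accepted
            in_ptr[i] = (accepted + 1) % n  # one location beyond the accepted one
    return match
```

With every VOQ backlogged and output pointers initialized to distinct values (for example 0, 1, 2), each output grants a different input, so every grant is accepted and the match is full in every slot.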

5 Simulation Results

6 Simulation Results

7 Simulation Results

8 Simulation Results

9 Stability Property
– A VOQ switch is considered stable if it approaches a steady state in which the expected length of each VOQ is bounded.
– If the switch is stable, 100% throughput can be achieved under any admissible traffic pattern.
– RDSRR is more stable than iSlip and FIRM under various traffic patterns.

10 Stability Property (Cont’d)

11 3-Phase & 2-Phase Algorithms
– iSlip and FIRM are 3-phase algorithms: Request-Grant-Accept.
– DRRM is a 2-phase algorithm: Grant-Accept.
  – Each input sends one grant.
  – Each output sends one accept.
– 2-FIRM is the 2-phase version of FIRM.

12 DRRM (Dual Round Robin Matching)

13 3-Phase & 2-Phase Algorithms

14 3-Phase & 2-Phase Algorithms

15 3-Phase & 2-Phase Algorithms
– In the general case, the traffic model changes from time to time.
– When the temporary non-uniformity is on the input side, the 3-phase scheme performs better.
– When the temporary non-uniformity is on the output side, the 2-phase scheme performs better.

16 2-stage Maximum Size Matching Algorithm: Description
The 2-stage algorithm works as follows:
1. The pointers on both the input and output sides are kept fully desynchronized.
2. Each iteration has 3 steps:
   Step 1: Each input sends a request to every output for which it has a queued cell.
   Step 2: Each input selects one VOQ, the one that appears next starting from its highest-priority output, and sends a grant to that output. Each output selects one of the requests received in step 1, the one that appears next starting from its highest-priority input, and sends a grant to that input.
   OutputCount = number of outputs receiving grants from inputs.
   InputCount = number of inputs receiving grants from outputs.

17 2-stage Maximum Size Matching Algorithm: Description
   Step 3: If OutputCount ≥ InputCount, each output selects one of the grants received in step 2, the one that appears next starting from its highest-priority input, and sends an accept. Otherwise, each input selects one of the grants received in step 2, the one that appears next starting from its highest-priority output, and sends an accept.
In simple terms, in each time slot the algorithm decides whether to use the 2-phase or the 3-phase scheme, based on which one can make more matches.
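
The selection between the two schemes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, `voq[i][j]` is assumed to be the queue length at input i for output j, ties between the two counts are assumed to go to the output side, and pointer updates are left out because the slides only state that the pointers are kept desynchronized.

```python
def rr_pick(candidates, start, n):
    """Return the first index in round-robin order from `start` that is in `candidates`."""
    for k in range(n):
        idx = (start + k) % n
        if idx in candidates:
            return idx
    return None


def two_stage_slot(voq, in_ptr, out_ptr):
    """Return {input: output} for one time slot of the 2-stage scheme."""
    n = len(voq)
    # Step 2, input side: each input grants one of its non-empty VOQs.
    to_output = [set() for _ in range(n)]  # to_output[j] = inputs granting output j
    for i in range(n):
        j = rr_pick({j for j in range(n) if voq[i][j] > 0}, in_ptr[i], n)
        if j is not None:
            to_output[j].add(i)
    # Step 2, output side: each output grants one of the requesting inputs.
    to_input = [set() for _ in range(n)]  # to_input[i] = outputs granting input i
    for j in range(n):
        i = rr_pick({i for i in range(n) if voq[i][j] > 0}, out_ptr[j], n)
        if i is not None:
            to_input[i].add(j)
    output_count = sum(1 for s in to_output if s)
    input_count = sum(1 for s in to_input if s)
    # Step 3: accept on whichever side yields more matches (tie rule assumed).
    if output_count >= input_count:
        return {rr_pick(to_output[j], out_ptr[j], n): j
                for j in range(n) if to_output[j]}
    return {i: rr_pick(to_input[i], in_ptr[i], n)
            for i in range(n) if to_input[i]}
```

Each non-empty set produces exactly one accept, so the size of the returned match equals the larger of OutputCount and InputCount.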

18 2-stage Maximum Size Matching Algorithm: Hardware Implementation

19 Performance Evaluation: Simulation Study, Uniform Traffic

20 Performance Evaluation: Simulation Study

2-stage over iSlip
Load                                 0.5    0.6    0.7    0.8    0.9    0.95   0.99
Improvement Percentage               67%    196%   81%    58%    60%    84%    43%
Normalized Improvement Percentage    40%    66%    45%    37%           46%    30%
Improvement Factor                   1.67   2.96   1.81   1.58   1.60   1.84   1.43

SRR over iSlip
Load                                 0.5    0.6    0.7    0.8    0.9    0.95   0.99
Improvement Percentage               7%     75%    92%    54%    59%    83%    43%
Normalized Improvement Percentage    7%     43%    48%    35%    37%    45%    30%
Improvement Factor                   1.07   1.75   1.92   1.54   1.59   1.83   1.43

21 Performance Evaluation: Simulation Study, Bursty Traffic

22 Performance Evaluation: Simulation Study

2-stage over iSlip
Load                                 0.63   0.7    0.75   0.8    0.85   0.9
Improvement Percentage               213%   96%    70%    46%    28%    16%
Normalized Improvement Percentage    68%    49%    41%    31%    22%    14%
Improvement Factor                   3.13   1.96   1.70   1.46   1.28   1.16

SRR over iSlip
Load                                 0.63   0.7    0.75   0.8    0.85   0.9
Improvement Percentage               89%    56%    46%    33%    22%    14%
Normalized Improvement Percentage    47%    36%    32%    25%    18%    12%
Improvement Factor                   1.89   1.56   1.46   1.33   1.22   1.14

23 Performance Evaluation: Simulation Study, Hotspot Traffic

24 Performance Evaluation: Simulation Study

2-stage over iSlip
Load                                 0.31   0.38   0.43      0.46      0.50
Improvement Percentage               26%    56%    101626%   160469%   81633%
Normalized Improvement Percentage    21%    36%    100%
Improvement Factor                   1.26   1.56   1017.26   1605.69   817.33

SRR over iSlip
Load                                 0.31   0.38   0.43      0.46      0.50
Improvement Percentage               5%     9%     56177%    74631%    19618%
Normalized Improvement Percentage    5%     8%     99%       100%      99%
Improvement Factor                   1.05   1.09   562.77    747.31    197.18

25 Performance Evaluation: Simulation Study, Unbalanced Traffic

26 Performance Evaluation: Simulation Study

2-stage over iSlip
Load                                 0.5    0.6    0.7    0.8    0.9    0.95     0.99
Improvement Percentage               12%    39%    53%    142%   552%   8040%    3351%
Normalized Improvement Percentage    11%    28%    35%    59%    85%    99%      97%
Improvement Factor                   1.12   1.39   1.53   2.42   6.52   81.40    34.51

SRR over iSlip
Load                                 0.5    0.6    0.7    0.8    0.9    0.95     0.99
Improvement Percentage               4%     35%    74%    225%   843%   11494%   3499%
Normalized Improvement Percentage    4%     26%    43%    69%    89%    99%      97%
Improvement Factor                   1.04   1.35   1.74   3.25   9.43   115.94   35.99

27 A New Algorithm: RDESRR
Real Desynchronized Round Robin Model (RDESRR)
– Based on the 2-phase RRM model (Request and Grant).
– Adds a small shared memory that every output can read and write (called the Share Bits).
– The size of the memory is 1 bit per input.
– If the bit is set, the corresponding input has already been granted by an output.
– If the bit is not set, the output may grant to the corresponding input port.

28 RDESRR Conceptual Model
[figure: inputs 0-3, outputs 0-3, and the Share Bits]

29 RDESRR Model
2 phases only:
– Request. Each input sends a request to every output for which it has a queued cell.
– Grant. If an output receives any requests, it chooses the one that appears next in a fixed round-robin schedule, starting from the highest-priority element. The output checks whether the corresponding share bit is set. If it is not set, the output sets the bit and notifies the input that its request was granted. Otherwise, the output looks at the next request, until all requests have been examined. The pointer g_i to the highest-priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input. If no request is received, the pointer stays unchanged.
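
A minimal sketch of the RDESRR grant phase with share bits follows. The function name `rdesrr_slot` is hypothetical, and one detail is an assumption: the slides say the share bits are first come, first served but do not say in which order outputs reach the memory, so the sketch simply visits outputs in index order.

```python
def rdesrr_slot(voq, out_ptr):
    """One RDESRR time slot; returns {input: output} and updates out_ptr in place."""
    n = len(voq)
    share_bits = [False] * n  # one bit per input, reset at the start of every slot
    match = {}
    for j in range(n):  # assumed access order: output 0 first
        # Request phase, seen from output j.
        requests = {i for i in range(n) if voq[i][j] > 0}
        if not requests:
            continue  # no request received: the pointer stays unchanged
        # Grant phase: scan requests in round-robin order, skipping taken inputs.
        for k in range(n):
            i = (out_ptr[j] + k) % n
            if i in requests and not share_bits[i]:
                share_bits[i] = True          # claim the input
                match[i] = j
                out_ptr[j] = (i + 1) % n      # one location beyond the granted input
                break
    return match
```

Because an output never grants an input whose bit is already set, no input receives two grants, which is why no accept phase is needed. In the sketch, even fully synchronized pointers yield a full match when every VOQ is backlogged.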

30 RDESRR Demo: Request
Step 1: Request [figure: inputs 0-3 and outputs 0-3]

31 RDESRR Demo: Add a Shared Memory at the Outputs
Step 2: Grant [figure: inputs 0-3, outputs 0-3, Share Bits]
Add a small shared memory that every output can read and write (called the Share Bits).

32 RDESRR Demo: Outputs Check the Share Bits
Step 2: Grant [figure: inputs 0-3, outputs 0-3, Share Bits]
Each output checks whether the corresponding bit is set.

33 RDESRR Demo: When a Share Bit Is Occupied
Step 2: Grant [figure: inputs 0-3, outputs 0-3, Share Bits]
If the bit is not set, the output sets it and notifies the input that its request was granted. The share bits are allocated first come, first served.

34 RDESRR Demo: Output Looks for the Next Request
Step 2: Grant [figure: inputs 0-3, outputs 0-3, Share Bits]
If the bit is set, the output looks at the next request, until all requests have been examined.

35 RDESRR Demo: All Share Bits Are Allocated
Step 2: Grant [figure: inputs 0-3, outputs 0-3, Share Bits]
When all share bits are allocated, every input has had a request granted.

36 RDESRR Demo: Pointer Update / Share Bit Reset
[figure: inputs 0-3, outputs 0-3, Share Bits]
The pointer g_i to the highest-priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input. If no request is received, the pointer stays unchanged. The share bits are also reset.

37 SIM Results
Run the test for a 32x32-port switch in SIM using – l 1000000

38 Input Queueing: Longest Queue First or Oldest Cell First
[figure: 4 x 4 bipartite graph with edge weights; maximum weight matching, with Weight = {Waiting Time, Queue Length}, gives 100% throughput]

39 Input Queueing
Why is serving long/old queues better than serving the maximum number of queues?
– When traffic is uniformly distributed, servicing the maximum number of queues leads to 100% throughput.
– When traffic is non-uniform, some queues become longer than others.
– A good algorithm keeps the queue lengths matched and services a large number of queues.
[figure: average occupancy per VOQ under uniform traffic and under non-uniform traffic]

40 Maximum/Maximal Weight Matching
100% throughput for admissible traffic (uniform or non-uniform)
Maximum Weight Matching
– OCF (Oldest Cell First): w = cell waiting time
– LQF (Longest Queue First): w = input queue occupancy
– LPF (Longest Port First): w = QL of the source port + sum of QL from the source port to the destination port
Maximal Weight Matching (practical algorithms)
– iOCF
– iLQF
– iLPF (the comparators in the critical path of iLQF are removed)

41 Maximal Weight Matching Algorithms: iLQF
– Request. Each unmatched input sends a request word to each output for which it has a queued cell, indicating the number of cells that it has queued for that output.
– Grant. If an unmatched output receives any requests, it chooses the largest-valued request. Ties are broken randomly.
– Accept. If an unmatched input receives one or more grants, it accepts the one to which it made the largest-valued request. Ties are broken randomly.
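
The request-grant-accept loop of iLQF can be sketched as below. This is an illustrative reading of the three steps, not a reference implementation: the function name `ilqf` and its parameters are hypothetical, and `voq[i][j]` is assumed to be the number of cells queued at input i for output j.

```python
import random


def ilqf(voq, iterations, rng=random):
    """Iterative longest-queue-first matching; returns {input: output}."""
    n = len(voq)
    match = {}            # matched inputs -> their outputs
    matched_out = set()   # outputs already matched
    for _ in range(iterations):
        # Request + Grant: each unmatched output grants its largest request.
        grants = {}       # input -> list of outputs that granted it
        for j in range(n):
            if j in matched_out:
                continue
            reqs = [i for i in range(n) if i not in match and voq[i][j] > 0]
            if not reqs:
                continue
            best = max(voq[i][j] for i in reqs)
            winner = rng.choice([i for i in reqs if voq[i][j] == best])
            grants.setdefault(winner, []).append(j)
        if not grants:
            break  # nothing left to add: the matching is maximal
        # Accept: each granted input accepts the grant for its longest queue.
        for i, outs in grants.items():
            best = max(voq[i][j] for j in outs)
            j = rng.choice([j for j in outs if voq[i][j] == best])
            match[i] = j
            matched_out.add(j)
    return match
```

In the first iteration the globally longest VOQ always wins both its grant and its accept (up to ties), so it is served regardless of how many iterations run.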

42 Maximal Weight Matching Algorithms: iLQF
The i-LQF algorithm has the following properties:
– Property 1. Independent of the number of iterations, the longest input queue is always served.
– Property 2. As with i-SLIP, the algorithm converges in at most log N iterations.
– Property 3. For an inadmissible offered load, an input queue may be starved.

43 Maximal Weight Matching Algorithms: iOCF
The i-OCF algorithm works in a similar fashion to iLQF, and has the following properties:
– Property 1. Independent of the number of iterations, the cell that has been waiting the longest time in the input queues (it must be at the head of its queue) is always served.
– Property 2. As with i-LQF, the algorithm converges in at most log N iterations.
– Property 3. No input queue can be starved indefinitely.
– Property 4. It is difficult to keep time stamps on the cells.

44 iLQF - Implementation

45 iLPF - Implementation
Complicated hardware

46 Other Research Efforts
– Packet-based arbitration
– Exhaustive-based arbitration
– Numerous other efforts

47 Packet Scheduling/Arbitration in Virtual Output Queues: Randomized Algorithms and Others

48 Input-Queued Packet Switch
[figure: inputs 1..N and outputs 1..N connected by a crossbar under a scheduler; VOQ occupancies X_{1,1}, ..., X_{i,j}, ..., X_{N,N}]
Arrival rates λij are admissible: Σ_i λij < 1 and Σ_j λij < 1.

49 Bipartite Graph and Matrix
[figure: bipartite graph between inputs 1-3 and outputs 1-3]
Matrix:
0 1 1
1 1 1
0 0 1

50 Stability of Scheduling
Definition: Let X_{i,j}(t) be the number of packets queued at input i for output j at time slot t. Then an algorithm is stable iff:

51 Maximum Matching in VOQ Architecture
[figure: a 4 x 4 request graph with edge weights, shown with its maximum size matching and its maximum weight matching]

52 Complexity of Maximum Matchings
Maximum Size/Cardinality Matchings:
– Not a stable algorithm
– Algorithm by Dinic: O(N^{5/2})
Maximum Weight Matchings:
– Algorithm by Kuhn: O(N^3 log N)
– A stable algorithm
In general:
– Hard to implement in hardware (the difficulty is that they do not lend themselves to a simple hardware implementation, not their serial time complexity)
– Slooooow.

53 Maximal Matching Algorithms
– Maximal matching algorithms are heuristic algorithms that try to approximate MSM or MWM.
– In general, maximal matching is much simpler to implement (and not because of its time complexity) and has a much faster running time.
– A maximal size matching is at least half the size of a maximum size matching.
– A maximal weight matching has at least half the weight of a maximum weight matching.
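
The factor-of-two bound for size can be seen on a tiny example. The sketch below uses hypothetical helper names: it builds a maximal matching greedily (add edges until none can be added) and compares it with a brute-force maximum over all permutations, which is only practical for very small N.

```python
from itertools import permutations


def greedy_maximal(edges):
    """Scan edges in the given order; keep an edge if both endpoints are free."""
    used_in, used_out, matching = set(), set(), []
    for i, j in edges:
        if i not in used_in and j not in used_out:
            matching.append((i, j))
            used_in.add(i)
            used_out.add(j)
    return matching


def maximum_size(edges, n):
    """Brute-force maximum matching size for an n x n switch (small n only)."""
    edge_set = set(edges)
    return max(sum((i, p[i]) in edge_set for i in range(n))
               for p in permutations(range(n)))


# Input 0 has cells for outputs 0 and 1; input 1 has cells only for output 0.
edges = [(0, 0), (0, 1), (1, 0)]
maximal = greedy_maximal(edges)   # picks (0, 0), which blocks both other edges
best = maximum_size(edges, 2)     # (0, 1) and (1, 0) can be served together
```

Here the greedy result has one edge while the maximum has two, which is exactly the worst case the bound allows: every edge of the maximum matching shares an endpoint with some edge of the maximal one, and each maximal edge can block at most two of them.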

54 Maximal Size Matching Algorithms: Performance and Properties
– They can achieve 100% throughput under uniform traffic.
– They converge to a maximal size matching in log N iterations.
– Their performance can be quite good (close to that of an ideal Output Queued switch) with multiple iterations.
– The best iterative maximal size matching algorithm takes O(N^2 log N) serial or O(log N) parallel time steps.
– If the number of iterations is constant, the algorithm can be implemented in constant time (which is why it is practical).

55 Implementation of the Parallel Maximal Matching Algorithms
[figure: State of Input Queues (N^2 bits), Grant Arbiters 1..N, Request Arbiters 1..N, Decision Register]

56 Small Differences (in Implementation) between RRM, iSlip & FIRM, but a Large Difference in Performance
Input pointer:
– No grant: unchanged (RRM, iSlip, FIRM)
– Granted: one location beyond the accepted one (RRM, iSlip, FIRM)
Output pointer:
– No request: unchanged (RRM, iSlip, FIRM)
– Grant accepted: one location beyond the granted one (RRM, iSlip, FIRM)
– Grant not accepted: RRM: one location beyond the previously granted one; iSlip: unchanged; FIRM: the granted one

57 Maximum/Maximal Weight Matching
100% throughput for admissible traffic (uniform or non-uniform)
Maximum Weight Matching
– OCF (Oldest Cell First): w = cell waiting time
– LQF (Longest Queue First): w = input queue occupancy
– LPF (Longest Port First): w = QL of the source port + sum of QL from the source port to the destination port
Maximal Weight Matching (practical iterative algorithms): make these maximal weight matching algorithms operate like iSLIP
– iOCF
– iLQF
– iLPF

58 Maximal Weight Matching Algorithms: iLQF
– Request. Each unmatched input sends a request word to each output for which it has a queued cell, indicating the number of cells that it has queued for that output.
– Grant. If an unmatched output receives any requests, it chooses the largest-valued request (the one with the longest queue). Ties are broken randomly.
– Accept. If an unmatched input receives one or more grants, it accepts the one to which it made the largest-valued request (the one with the longest queue). Ties are broken randomly.

59 Maximal Weight Matching Algorithms: iLQF
The i-LQF algorithm has the following properties:
– Property 1. Independent of the number of iterations, the longest input queue is always served.
– Property 2. As with i-SLIP, the algorithm converges in at most log N iterations.
– Property 3. For an inadmissible offered load, an input queue may be starved.
– Property 4. It is a stable algorithm.

60 Maximal Weight Matching Algorithms: iOCF
The i-OCF algorithm works in a similar fashion to iLQF, and has the following properties:
– Property 1. Independent of the number of iterations, the cell that has been waiting the longest time in the input queues (it must be at the head of its queue) is always served.
– Property 2. As with i-LQF, the algorithm converges in at most log N iterations.
– Property 3. No input queue can be starved indefinitely.
– Property 4. It is difficult to keep time stamps on the cells.

61 Can we do better than maximal matchings using randomized algorithms?

62 Motivation
Networking problems suffer from the “curse of dimensionality”:
– Algorithmic solutions do not scale well.
Typical causes:
– Size: large number of users or large number of I/O
– Time: very high speeds of operation
A good deterministic algorithm exists (Max Flow), but:
– it requires too large a data structure
– it needs state information, and the “state” is too big
– it “starts from scratch” in each iteration

63 Randomization
Randomized algorithms have frequently been used in situations where the state space is very large (e.g., the N! different connection patterns between inputs and outputs).
Randomized algorithms:
– are a powerful way of approximating
– can often be obtained by randomizing deterministic algorithms
– simplify the implementation while retaining a (surprisingly) high level of performance
The main idea is:
– to simplify the decision-making process
– by basing decisions upon a small, randomly chosen sample of the state
– rather than upon the complete state

64 An Illustrative Example
Find the largest element of a set S of size 1 billion.
– Deterministic algorithm: linear search, which has a complexity of 1 billion.
– Randomized version: find the largest of 10 randomly chosen samples, which has a complexity of 10 (note: this ignores the complexity of choosing the 10 random samples).
Performance:
– Linear search will find the absolute largest element.
– If R is the element found by the randomized algorithm, we can make statements of the form P(R is at least the 100 millionth largest element) = p.
– Thus, we can say that the performance of the randomized algorithm is very good with a high probability.
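
The probability in the example can be worked out directly. The sketch below assumes the samples are drawn independently and uniformly (with replacement); the helper name is hypothetical. The best sample misses the top fraction f of the set only if every one of the k samples misses it, which happens with probability (1 - f)^k.

```python
def prob_best_sample_in_top(fraction_top, samples):
    """P(the largest of `samples` uniform draws lies in the top `fraction_top`)."""
    return 1.0 - (1.0 - fraction_top) ** samples


# Top 100 million out of 1 billion is the top 10% of the set.
p_10 = prob_best_sample_in_top(0.1, 10)    # about 0.65
p_100 = prob_best_sample_in_top(0.1, 100)  # about 0.99997
```

Under these assumptions, 10 samples land in the top 10% roughly two times out of three, and the guarantee sharpens quickly as the sample count grows, while the cost stays independent of the size of S.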

65 Randomizing Iterative Schemes (e.g., iSLIP)
Often, we want to perform some operation iteratively. Example: find the heaviest matching in a switch in every time slot.
Since, in each time slot,
– at most one packet can arrive at each input
– and at most one packet can depart from each output
it follows that
– the size of the queues, or the “state” of the switch, does not change by much between successive time slots
– so a matching that was heavy at time t will quite likely continue to be heavy at time t+1
This suggests that
– knowing a heavy matching at time t should help in determining a heavy matching at time t+1
– there is no need to start from scratch in each time slot

66 Summarizing Randomized Algorithms
– Randomized algorithms can help simplify the implementation by reducing the amount of work in each iteration.
– If the state of the system does not change by much between iterations, we can reduce the work even further by carrying information between iterations.
– The big pay-off is that, even though it is an approximation, the performance of a randomized scheme can be surprisingly good.

67 Randomized Scheduling Algorithms: Example
Consider a 3 x 3 input-queued switch:
– Input traffic is Bernoulli IID with λij = α/3 for all i, j, and α < 1.
– This traffic is admissible.
– Note: there are a total of 6 (= 3!) possible service matrices.

68 Random Scheduling Algorithms
In time slot n, let S(n) be one of the 6 possible matchings, chosen independently and uniformly at random.
Stability of Random:
– Consider L11(n), the number of packets in VOQ11.
– Arrivals to VOQ11 occur according to A11(n), which is Bernoulli IID with input rate λ11 = α/3.
– This queue is served whenever the service matrix connects input 1 to output 1.
– There are 2 service matrices that connect input 1 to output 1.
– Since Random chooses service matrices u.a.r., input 1 is connected to output 1 for a fraction of time 2/6 = 1/3, which is the service rate between input 1 and output 1.
– E(L11(n)) < ∞ iff λ11 < 1/3, i.e., α < 1.
The Random algorithm is stable for this traffic.
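
The 2/6 = 1/3 service rate can be checked by enumerating the service matrices. A small sketch, with a hypothetical helper name and 0-based port indices (so VOQ11 is the pair (0, 0)):

```python
from fractions import Fraction
from itertools import permutations


def service_rate(n, i, j):
    """Fraction of the n! matchings that connect input i to output j."""
    perms = list(permutations(range(n)))
    hits = sum(1 for p in perms if p[i] == j)
    return Fraction(hits, len(perms))


rate = service_rate(3, 0, 0)               # 2 of the 6 matchings -> 1/3

# Uniform traffic: arrival rate alpha/3 stays below 1/3 for any alpha < 1.
uniform_ok = Fraction(9, 10) / 3 < rate    # alpha = 0.9

# Diagonal traffic (all of an input's load on one VOQ): arrival rate is alpha.
diagonal_ok = Fraction(1, 2) < rate        # alpha = 0.5 already exceeds 1/3
```

The same 1/N service rate holds for every VOQ under Random, which is why the scheme copes with uniform traffic but not with traffic concentrated on a few VOQs.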

69 Random Scheduling Algorithms
Instability of Random:
– Now suppose λii = α for all i and λij = 0 for i ≠ j.
– Clearly, this is admissible traffic for all α < 1.
– But under Random, the service rate at VOQ11 is 1/3 at best.
– Hence VOQ11, and the switch, will be unstable as soon as α exceeds 1/3.
Stability (or 100% throughput) means being stable under all admissible traffic!

70 Simulation Scenario
– Switch size: 32 x 32
– Input traffic (shown for a 4 x 4 switch): diagonal load matrix with normalized load = x + y < 1 and x = 2y
– It is a good test case.

71 Obvious Randomized Schemes
– Choose a matching at random and use it as the schedule: this does not give 100% throughput (already shown).
– Choose 2 matchings at random and use the heavier one as the schedule.
– Choose N matchings at random and use the heaviest one as the schedule.
None of these can give 100% throughput!

73 Bounds on Maximum Throughput

74 Iterative Randomized Scheme (Tassiulas)
– Say M is the matching used at time t.
– Let R be a new matching chosen uniformly at random (u.a.r.) among the N! different matchings.
– At time t+1, use the heavier of M and R.
– Complexity is very low: O(1) iterations.
– This gives 100% throughput! Note that the boost in throughput is due to memory (saving previous matchings).
– But delays are very large.
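
One step of this scheme is short enough to sketch. The names below are hypothetical, a matching is represented as a list where `matching[i]` is the output connected to input i, and the weight of a matching is assumed to be the sum of the queue lengths it serves.

```python
import random


def weight(matching, voq):
    """Total queue length served by a matching (matching[i] = output of input i)."""
    return sum(voq[i][j] for i, j in enumerate(matching))


def tassiulas_step(prev, voq, rng):
    """Draw one matching uniformly at random and keep the heavier of it and prev."""
    candidate = list(range(len(voq)))
    rng.shuffle(candidate)  # a uniformly random permutation of the outputs
    if weight(candidate, voq) > weight(prev, voq):
        return candidate
    return prev


rng = random.Random(1)
voq = [[5, 0, 1], [0, 7, 2], [3, 0, 9]]
schedule = [1, 2, 0]                      # a light starting matching (weight 5)
for _ in range(50):
    schedule = tassiulas_step(schedule, voq, rng)
```

Measured against the same queue state, the schedule's weight never decreases from one step to the next; that retained memory is what separates this scheme from drawing a fresh random matching every slot.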

76 Observations for Improvement
– Most of the weight of a matching is carried by a small number of edges.
– Hence, remember edges, not matchings.
– We can have 100% throughput under all admissible traffic.

78 Finer Observations
– Let M be the schedule used at time t.
– Choose a “good” random matching R.
– M’ = Merge(M, R); M’ includes the best edges from M and R.
– Use M’ as the schedule at time t+1.
– The above procedure yields the algorithm called LAURA.
– There are many other small variations of this algorithm.

79 Merging Procedure
[figure: matchings X (W(X) = 12) and R (W(R) = 10) are merged into M (W(M) = 13); annotations: 3-1+2-2 = 2 and 2-1+2-4 = -1]

81 Can we avoid having schedulers altogether?

82 Recap: Two Successive Scaling Problems
OQ routers:
+ work-conserving (QoS)
- memory bandwidth = (N+1)R
IQ routers:
+ memory bandwidth = 2R
- arbitration complexity (bipartite matching)

83 IQ Arbitration Complexity
Today: 64 ports at 10 Gbps, 64-byte cells.
– Arbitration time = 64 bytes / 10 Gbps = 51.2 ns
– Request/grant communication BW = 17.5 Gbps
Scaling to 160 Gbps:
– Arbitration time = 3.2 ns
– Request/grant communication BW = 280 Gbps
Two main alternatives for scaling:
1. Increase cell size
2. Eliminate arbitration
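
The two arbitration times follow from dividing the cell size in bits by the line rate, since the scheduler must produce one matching per cell time. A quick check (the helper name is hypothetical):

```python
def cell_time_ns(cell_bytes, line_rate_gbps):
    """Time to transmit one cell, in nanoseconds (bits / Gbps = ns)."""
    return cell_bytes * 8 / line_rate_gbps


today = cell_time_ns(64, 10)     # 512 bits at 10 Gbps  -> 51.2 ns
scaled = cell_time_ns(64, 160)   # 512 bits at 160 Gbps -> 3.2 ns
```

The 16x increase in line rate shrinks the arbitration window by the same factor, which matches the 16x growth in request/grant bandwidth on the slide (17.5 Gbps to 280 Gbps). Increasing the cell size widens the window again, which is the first of the two scaling alternatives.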

84 Desirable Characteristics for Router Architecture
Ideal: OQ
– 100% throughput
– Minimum delay
– Maintains packet order
Necessary: able to regularly connect any input to any output
What if the world was perfect? Assume Bernoulli iid uniform arrival traffic...

85 Round-Robin Scheduling
– Uniform & non-bursty traffic => 100% throughput
– Problem: traffic is non-uniform & bursty

86 Two-Stage Switch (I)
[figure: external inputs 1..N, first round-robin stage, internal inputs 1..N, second round-robin stage, external outputs 1..N]

87 Two-Stage Switch (I)
[figure: same as slide 86, with a Load Balancing label]

88 Two-Stage Switch Characteristics
– 100% throughput
– Problem: unbounded mis-sequencing
[figure: external inputs 1..N, cyclic shift, internal inputs 1..N, external outputs 1..N]

89 Two-Stage Switch (II)
New: N^3 instead of N^2

90 Expanding VOQ Structure
Solution: expand the VOQ structure by distinguishing among switch inputs
[figure]

91 What is being done in practice (Cisco, for example)
– They want schedulers that achieve 100% throughput and very low delay (like MWM).
– They want them to be as simple as iSLIP in terms of hardware implementation.
– Is there any solution to this?

92 Typical Performance of iSLIP-like Algorithms
PIM with 4 iterations

93 What is being done in practice (Cisco, for example)

Company    Switching Capacity        Switch Architecture    Fabric Overspeed
Agere      40 Gbit/s-2.5 Tbit/s      Arbitrated crossbar    2x
AMCC       20-160 Gbit/s             Shared memory          1.0x
AMCC       40 Gbit/s-1.2 Tbit/s      Arbitrated crossbar    1-2x
Broadcom   40-640 Gbit/s             Buffered crossbar      1-4x
Cisco      40-320 Gbit/s             Arbitrated crossbar    2x

