RapidIO FT Research Update: Adaptive Routing
David Bueno
February 27, 2006
HCS Research Laboratory
Dept. of Electrical and Computer Engineering
University of Florida
Overview
- RapidIO switches traditionally handle routing using routing tables of destID/output-port pairs.
- A packet's route is NOT specified at the source; instead it is determined by the switches as the packet travels through the network (a minimal sketch of this hop-by-hop lookup follows this list).
- Routing tables are generally fixed, with one output port per destID.
- We want to explore the capabilities of adaptive routing in RapidIO switches for performance (load balancing) and fault tolerance.
- Many of our FT network designs provide the option of "over-provisioning" the backplane with an extra switch.
- Initial experiments with high-bandwidth corner turns found it best to leave the extra switch inactive, as no benefits were gained by using it in active mode with a fixed-routed application.
- Early GMTI experiments tested adaptive round-robin routing with GMTI corner turns and found that a fixed-routed version performed better.
- Lesson learned: if an application CAN be effectively statically routed and the network provides enough bandwidth, use fixed routing.
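To make the table-driven, hop-by-hop model concrete, here is a minimal Python sketch. The switch names, table entries, and links are hypothetical (loosely patterned on the FTC topology used later), not taken from the RapidIO specification or our simulation models.

```python
# Hypothetical fixed routing tables: each switch knows only its own
# destID -> output-port mapping; no source routing is involved.
ROUTE_TABLES = {
    "SW0":  {4: 8, 16: 8},    # source first-stage switch
    "CORE": {4: 1, 16: 4},    # core switch forwards toward the destination's first-stage switch
    "SW1":  {4: 0},           # P4's first-stage switch
    "SW4":  {16: 0},          # P16's first-stage switch
}

# (switch, output port) -> next hop
LINKS = {
    ("SW0", 8): "CORE",
    ("CORE", 1): "SW1", ("CORE", 4): "SW4",
    ("SW1", 0): "P4",   ("SW4", 0): "P16",
}

def forward(dest_id, switch="SW0"):
    """Follow a packet hop by hop; each switch consults only its local table."""
    hops = [switch]
    while switch in ROUTE_TABLES:                     # stop once an endpoint is reached
        out_port = ROUTE_TABLES[switch][dest_id]      # fixed: one port per destID
        switch = LINKS[(switch, out_port)]
        hops.append(switch)
    return hops

print(forward(4))    # ['SW0', 'CORE', 'SW1', 'P4']
print(forward(16))   # ['SW0', 'CORE', 'SW4', 'P16']
```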
Background
- Several relevant papers on this topic were uncovered in previous literature searches; [1] and [2] are of most interest since they deal with extending an existing protocol (InfiniBand) to support adaptive routing.
- Unlike IBA, the RapidIO spec does not forbid adaptive routing; it leaves the implementation up to the developer.
- One important issue in implementing adaptive routing in a RapidIO system is in-order delivery.
  - For traffic flows requiring in-order delivery, [2] suggests assigning multiple destIDs to each node that may be the recipient of an in-order flow; all switch routing tables then provide only a single output port for that destID.
- Example: assign destIDs 5 and 6 to physical processing element P, using destID 5 for adaptive traffic and destID 6 for in-order traffic. A sample routing table entry for a RapidIO switch could then look like (see the sketch after this list):
    destID 5 -> ports 1, 2, 3
    destID 6 -> port 3
- Packets for destID 5 can leave through port 1, 2, or 3, but packets for destID 6 must leave through port 3.
- All packets for destIDs 5 and 6 end up at the same destination, processing element P.
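A minimal sketch of this dual-destID scheme, assuming a simple dictionary-based table; the port numbers and the random port pick are illustrative, not a RapidIO-mandated mechanism.

```python
import random

# Hypothetical switch routing table: each destID maps to its legal output ports.
# destIDs 5 and 6 both reach processing element P, but destID 6 is pinned to a
# single port so its packets can never be reordered by path diversity.
routing_table = {
    5: [1, 2, 3],   # adaptive traffic for node P: any of three ports
    6: [3],         # in-order traffic for node P: exactly one port
}

dest_id_to_node = {5: "P", 6: "P"}   # both logical IDs resolve to the same node

def select_output_port(dest_id):
    """Pick an output port; a single-entry list degenerates to fixed routing."""
    return random.choice(routing_table[dest_id])

print(select_output_port(5))  # 1, 2, or 3
print(select_output_port(6))  # always 3
```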
Initial Model Improvements (2-10-06)
- The models already supported adaptive routing assumed to be similar to the Honeywell RIOS "aggregate" capability: round-robin selection of output ports from a list, as in the previous example.
- Expanded the simulation models to allow selection of the output port with the smallest number of packets outstanding to be sent and accepted (shortest buffer).
- Expanded the models to allow random selection of the output port.
- Port selection takes place prior to the decision to accept or reject a packet based on buffer space, priority, etc. (the three selection policies are sketched below).
- Created additional 32-node benchmarks to test the usefulness of adaptive routing for traffic that cannot be statically scheduled:
  - Random reads: each processing element issues 1000 read requests of 256 B to random destinations. Request N+1 is not issued until request N is filled, so generally ~32 packets are in flight in the network at any one time.
  - Random sends 256: each processing element issues 1000 message-passing packets (256 B) to random destinations. There is a large delay after each packet is sent so that each iteration is not subject to contention from the previous one (i.e., everyone sends a packet, waits awhile, then everyone sends again at the same time, a total of 1000 times).
  - Random sends 4096: each processing element issues 1000 full RapidIO messages (4096 B) to random destinations, again with a large delay after each message so that each iteration is not subject to contention before starting.
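The three port-selection policies could be sketched as follows; the classes and the per-port outstanding-packet counters are illustrative assumptions, not the actual simulation models.

```python
import random
from itertools import cycle

class RoundRobinSelector:
    """Cycle through the candidate ports for a destID in a fixed order."""
    def __init__(self, ports):
        self._ports = cycle(ports)

    def select(self, outstanding):
        return next(self._ports)

class ShortestBufferSelector:
    """Choose the port with the fewest packets outstanding (sent + accepted)."""
    def __init__(self, ports):
        self._ports = ports

    def select(self, outstanding):
        return min(self._ports, key=lambda p: outstanding[p])

class RandomSelector:
    """Choose uniformly at random among the candidate ports."""
    def __init__(self, ports):
        self._ports = ports

    def select(self, outstanding):
        return random.choice(self._ports)

# Example: backplane ports 7, 8, 9 with per-port outstanding-packet counts.
outstanding = {7: 4, 8: 1, 9: 4}
for selector in (RoundRobinSelector([7, 8, 9]),
                 ShortestBufferSelector([7, 8, 9]),
                 RandomSelector([7, 8, 9])):
    print(type(selector).__name__, selector.select(outstanding))
```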
Experiments Overview (1)
- All experiments use the Fault-Tolerant Clos (FTC) network architecture.
  - Results generally hold for any of our FT architectures with a 5-switch core stage if routing is configured identically.
- Adaptive routing is only possible in the FIRST stage if a shortest-hop path is to be taken to the destination (see the path-enumeration sketch after this list).
  - A first-stage switch may choose between any active core switch (up to 5 active switches), assuming the packet is destined for a node NOT connected to the same first-stage switch.
- Most paths traverse three switches to get from one node to another; some paths require only one switch, when both source and destination are connected to the same switch.
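The sketch below enumerates the shortest-hop paths in a two-level Clos-style fabric to show why the only routing freedom sits at the first stage: once a core switch is chosen, the rest of the path is forced. The assignment of 4 nodes per first-stage switch is an assumption consistent with the Switch 0 routing table shown later, not a definitive description of the FTC backplane.

```python
NODES_PER_EDGE_SWITCH = 4      # assumed: 32 nodes spread across 8 first-stage switches
ACTIVE_CORE_SWITCHES = 5       # 5-switch core case

def edge_switch(node):
    return node // NODES_PER_EDGE_SWITCH

def shortest_hop_paths(src, dst):
    """Enumerate all shortest-hop switch sequences from node src to node dst."""
    if edge_switch(src) == edge_switch(dst):
        # Same first-stage switch: a single one-switch path, no adaptive choice.
        return [[f"edge{edge_switch(src)}"]]
    # Otherwise every path is edge -> core -> edge; only the core hop is free.
    return [[f"edge{edge_switch(src)}", f"core{c}", f"edge{edge_switch(dst)}"]
            for c in range(ACTIVE_CORE_SWITCHES)]

print(shortest_hop_paths(0, 2))   # one 1-switch path (same first-stage switch)
print(shortest_hop_paths(0, 17))  # five 3-switch paths, one per active core switch
```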
Experiments Overview (2)
- For all experiments, the 5-switch core assumes all 5 switches are active.
- The 4-switch core may represent either of two cases: 4 active switches with a 5th switch unpowered as a spare, or 4 active switches after the 5th switch has previously failed.
- The 3-switch core should be interpreted similarly.
- Note that, based on the number of nodes and the network bandwidth, 5 switches is over-provisioned, 4 switches is "correct" provisioning, and 3 switches is under-provisioned.
New Model/Experiment Revisions (2-23-06)
- Updated models are now used for collection of the fixed-routing results.
  - The old fixed-routed models had been based on the switch model from before the summer '05 internship. That older switch model treated central switch memory as a single pool of buffer space and made the decision to accept or reject packets based on priority and total free switch memory (a set of 4 thresholds, 1 per priority).
  - The model was revised during the internship to treat each output port individually, much like our understanding of Honeywell RIOS. The decision to accept a packet is based on output-port-dependent factors (sketched below): the priority and the number of packets of that priority currently buffered for the destination output port, the total number of packets currently buffered for that output port, and the total amount of free switch memory (i.e., whether another packet can fit in the switch at all).
  - The previous comparison between fixed-routed and adaptive-routed systems was therefore not "apples to apples," because only the adaptive systems were based on the new switch model.
- For the shortest-buffer tactic, added the capability to choose a random buffer from the set of shortest buffers rather than choosing the first one the simulation finds.
- Additional experiments:
  - Changed the sequence of random destinations generated: insignificant effect on all results (<1%). We are already performing enough repetitions to fairly gather latency results for the random sends experiments.
  - Changed the initialization of the round-robin sequence to a random port rather than the first port in the list: again, an insignificant effect on all results (<1%). Random traffic quickly ensures that the port lists of the switches are not "synchronized" with respect to each other, so a fair load balance is achieved regardless of the starting point of each list.
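A minimal sketch of the revised, per-output-port acceptance check described above. The threshold values, counters, and function names are illustrative assumptions, not the actual simulation model or the Honeywell RIOS implementation.

```python
# Hypothetical per-port thresholds (indexed by RapidIO priority 0-3).
PER_PORT_PRIORITY_LIMIT = {0: 4, 1: 6, 2: 8, 3: 10}
PER_PORT_TOTAL_LIMIT = 12
SWITCH_MEMORY_PACKETS = 64

def accept_packet(priority, out_port, port_counts, total_buffered):
    """Accept/reject based on the destination output port, not a single global pool."""
    by_priority, total_for_port = port_counts[out_port]
    if total_buffered >= SWITCH_MEMORY_PACKETS:
        return False   # no room anywhere in the switch
    if total_for_port >= PER_PORT_TOTAL_LIMIT:
        return False   # this output port's share of buffering is exhausted
    if by_priority[priority] >= PER_PORT_PRIORITY_LIMIT[priority]:
        return False   # too many packets of this priority queued for this port
    return True

# Example: port 8 already holds 3 priority-0 packets and 5 packets in total;
# the switch as a whole holds 20 packets.
port_counts = {8: ({0: 3, 1: 1, 2: 1, 3: 0}, 5)}
print(accept_packet(priority=0, out_port=8,
                    port_counts=port_counts, total_buffered=20))  # True here
```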
Random Reads: Revised
- Shortest buffer with random selection among the shortest buffers now slightly outperforms round robin in all cases.
  - The old shortest-buffer tactic most often did not make use of all available backplane switch resources, causing unbalanced network load and a performance penalty.
  - The new tactic slightly improves upon round robin in most cases, but round robin is simpler and still does a very good job of balancing the load of random traffic.
- Fixed-routing performance remained mostly the same, except slightly worse in the 3-switch case.
  - Note this does NOT indicate that separate buffer management is a worse scheme; it is simply a fairer, more correct comparison.
  - The switches could be configured to allow more packets of this priority (0), which would change results across the board.
Random Sends (256 B): Revised
- For the 5-switch and 4-switch cases, light traffic still lends itself to fixed mapping.
- Round robin and random shortest buffer are now very similar in all cases; random shortest buffer will behave similarly to an "out-of-order" round robin in many cases.
- Fixed performance is again slightly degraded in the 3-switch case for reasons already discussed.
- Results further emphasize the effectiveness of adaptive routing when the network is under-provisioned.
Random Sends (4096 B): Revised
- The new fixed setup performs worse in all cases due to the more restrictive buffer management; again, the previous comparison was not fair.
  - The current configuration could be optimized, which would affect all results, not just fixed routing.
- Fixed routing and random adaptive routing are the two worst options in all cases.
  - The old fixed results were actually aided by the unfair buffer management scheme, as explained earlier.
  - This experiment was the most dramatically affected by the change, due to its high contention and high number of retries issued.
- The new fixed results suffer in all cases: fixed routing for the under-provisioned 3-switch case is now even worse than before! The explanation for this poor performance is on the following slide.
Fixed Routing Problems
- Fixed routing for the under-provisioned 3-switch case is now even worse than before. Explanation for the poor performance in both cases:
  - Imagine P0 wants to send a 4096 B message to P4, and P1 simultaneously wants to send a 4096 B message to P16.
  - Both messages must travel through Switch 0, whose (partial) balanced, fixed routing table looks like:

    destID  Port
    0       0
    1       1
    2       2
    3       3
    4       8
    5       9
    6       7
    7       8
    8       9
    9       7
    10      8
    11      9
    12      7
    13      8
    14      9
    15      7
    16      8

  [Figure: P0 and P1 attach to first-stage Switch 0, which connects through the 2nd-level active switches toward P4 and P16.]

  - Both messages are entirely serialized through Switch 0 port 8.
- With only 3 backplane switches, this scenario becomes very likely, though a similar scenario may occur (with less frequency) in the 4- and 5-switch cases.
- Any form of adaptive routing that uses ports 7, 8, and 9 for this traffic will do better; this is why even random per-packet selection outperforms fixed routing in the 3-switch case (see the sketch after this list).
- Fixed routing is also the worst method in the 4-switch case, by a smaller margin.
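To illustrate the serialization, the sketch below tallies how many packets of the two simultaneous 4096 B messages land on each backplane port under the fixed table above versus a round-robin adaptive spread. Splitting each message into sixteen 256 B packets is a simplifying assumption for the tally.

```python
from collections import Counter
from itertools import cycle

FIXED_TABLE = {4: 8, 16: 8}          # Switch 0 entries for the two destinations
BACKPLANE_PORTS = [7, 8, 9]          # 3-switch core
PACKETS_PER_MESSAGE = 4096 // 256    # assumed 16 packets per 4096 B message

def load_fixed(dest_ids):
    """Every packet of a message follows the single fixed port for its destID."""
    load = Counter()
    for dest in dest_ids:
        load[FIXED_TABLE[dest]] += PACKETS_PER_MESSAGE
    return load

def load_round_robin(dest_ids):
    """Adaptive round robin spreads the same packets across all backplane ports."""
    load, ports = Counter(), cycle(BACKPLANE_PORTS)
    for _ in range(len(dest_ids) * PACKETS_PER_MESSAGE):
        load[next(ports)] += 1
    return load

print(load_fixed([4, 16]))        # Counter({8: 32}): everything serialized on port 8
print(load_round_robin([4, 16]))  # roughly 10-11 packets on each of ports 7, 8, 9
```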
Conclusions
- The optimal routing strategy is highly dependent on the algorithm and its communication patterns.
  - Adaptive routing is not very useful when heavy traffic (such as corner turns) can be adequately balanced statically; previous experiments have shown it can do more harm than good.
  - These experiments show adaptive routing is most useful in cases of heavy network contention when large transactions cannot be statically scheduled.
- In general, round robin and random shortest buffer appear to be the most effective adaptive routing strategies for Clos-based RapidIO networks.
  - Results may vary widely for other network configurations, but Clos networks are the focus here due to their FT properties and high performance.
- Random shortest buffer improved upon the initial shortest-buffer routing but still may not be worth the cost of the extra logic required to make decisions based on buffer status.
  - Its effectiveness is limited in a Clos network because the choice can only be made at the first-stage switch: even if the buffer at the first-stage switch is empty, the packet could be headed to a highly congested second-stage switch. We do NOT want to concern switches with the status of OTHER switches in the network.
  - It may be more useful in applications specifically tailored to this routing strategy, but a similar queue "bypass" could be handled using the RapidIO priority mechanism already present in the protocol.
- Adaptive routing improved upon fixed routing in almost all experiments, even when the selection of the output port was completely random.
  - The exception was the random sends (256 B) case, where traffic was so light that fixed routing was relatively efficient and balanced.
  - The best case for adaptive routing was random sends (4096 B), where large messages cause problems when statically scheduled onto the same output port.
- An extra core switch is helpful in ALL cases when traffic is random, even without adaptive routing; adaptive routing enhances the usefulness of an active 5th core switch.
References
[1] J. M. Montanana, J. Flich, A. Robles, P. Lopez, and J. Duato, "A Transition-Based Fault-Tolerant Routing Methodology for InfiniBand Networks," in Proceedings of the 18th International Parallel and Distributed Processing Symposium, Santa Fe, New Mexico, April 2004.
[2] J. C. Martinez, J. Flich, A. Robles, P. Lopez, and J. Duato, "Supporting Adaptive Routing in InfiniBand Networks," in Proceedings of the 11th Euromicro Conference on Parallel, Distributed, and Network-Based Processing, pp. 165-172, February 2003.