1 LSA Flooding Optimization Algorithms and Their Simulation Study
(draft-choudhury-manral-flooding-simulation-00.txt)
Gagan Choudhury, AT&T
Vishwas Manral, NetPlane Systems
2 The Basic Issue
Flooding Over All Interfaces is Highly Reliable, But in Large Networks it May Cause Sustained CPU Congestion (Often Memory Congestion as Well) During LSA Storms Triggered By
–Link/Node Failures
–Synchronization of Refreshes
–Software Bugs or Procedural Errors
Congestion is Reinforced by a Positive Feedback Loop due to
–LSA Retransmissions, Possible Packet Drops, Possible Link Failures due to Missed Hellos, and Eventual Recoveries, All Leading to More LSAs
On Rare Occasions the Congestion Spreads to Many Nodes and Causes Significant Failures (Observed in Operational Networks)
We Show a Simulation Study on How Stability/Scalability of Networks May Be Improved with Restrictive Flooding Algorithms and Propose that a Subset of These Schemes be Pursued Further
3 Flooding Algorithms
Algorithm 1: Flood over All Interfaces (Existing Algorithm)
Algorithm 2: Full Flooding, But Flood over Only One of Many Parallel Links Between Neighbors (Zinin/Shand ID, Moy ID, Used in PNNI)
Algorithm 3: Algorithm 2 + Full Flooding Only at Multipoint Relays Chosen by Each Node Independently
Algorithm 4: Algorithm 2 + Flooding Only Along a Minimum Spanning Tree (If a Link Along the Tree Fails, the MST Needs to be Re-computed): Not Robust Under Failures
Algorithm 5: Algorithm 2 for LSAs Carrying Intra-Area Topology (Router, Network), Algorithm 4 for Other LSAs (ASE, TE, Summary)
Modified Algorithm 5: Flooding Links Survivable Under Single Link and Single Node Failures (Results Not Reported)
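As a rough illustration of how Algorithms 1, 2, 4 and 5 differ in the set of links they flood over, the sketch below computes each flooding-link set on a toy topology. Everything in it is an assumption made for illustration: the topology, the link costs, the tie-break (lowest cost, then lowest link id), the use of Kruskal's algorithm for the spanning tree, and the function names. The slides do not specify how the tree weights or the surviving parallel link are chosen.

```python
# Hypothetical toy topology: (link_id, node_a, node_b, cost).
LINKS = [
    ("l1", "A", "B", 1),
    ("l2", "A", "B", 1),  # parallel to l1
    ("l3", "B", "C", 2),
    ("l4", "A", "C", 5),
    ("l5", "C", "D", 1),
]

# LSA types treated as intra-area topology in Algorithm 5.
TOPOLOGY_LSAS = {"router", "network"}


def flood_all(links):
    """Algorithm 1: every link is a flooding link."""
    return {lid for lid, _, _, _ in links}


def flood_one_parallel(links):
    """Algorithm 2: keep one link per neighbor pair.

    Assumed tie-break: lowest cost, then lowest link id.
    """
    best = {}
    for lid, a, b, cost in links:
        pair = frozenset((a, b))
        if pair not in best or (cost, lid) < best[pair]:
            best[pair] = (cost, lid)
    return {lid for _, lid in best.values()}


def flood_mst(links):
    """Algorithm 4: flood only along a minimum spanning tree (Kruskal)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = set()
    for lid, a, b, _cost in sorted(links, key=lambda l: (l[3], l[0])):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree.add(lid)
    return tree


def flood_alg5(links, lsa_type):
    """Algorithm 5: Algorithm 2 for topology LSAs, Algorithm 4 otherwise."""
    if lsa_type in TOPOLOGY_LSAS:
        return flood_one_parallel(links)
    return flood_mst(links)


if __name__ == "__main__":
    print("Alg 1:", sorted(flood_all(LINKS)))
    print("Alg 2:", sorted(flood_one_parallel(LINKS)))
    print("Alg 4:", sorted(flood_mst(LINKS)))
    print("Alg 5 (ase):", sorted(flood_alg5(LINKS, "ase")))
```

On this toy graph the flooding set shrinks from five links (Algorithm 1) to four (Algorithm 2) to three (Algorithm 4). The tree has no redundancy, which is why a single tree-link failure forces a re-computation.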
4 Alternate Simulation Scenarios
Network Scenarios
–Network 1: 100 Nodes, 1200 Links, Max Neighbors 30, Max Node Adjacency 50
–Network 2: 50 Nodes, 600 Links, Max Neighbors 25, Max Node Adjacency 48
LSA Scenarios
–1 Router LSA per Node, 1 TE LSA per Link
–1 Router LSA per Node, 10 ASE LSAs for Every Other Node
LSU Processing Time: ~1 ms, ~0.5 ms, ~2 ms
5 Five Simulation Cases
Case 1: Network 1, Link LSAs, Proc. Time ~1 ms
Case 2: Network 1, ASE LSAs, Proc. Time ~1 ms
Case 3: Network 1, Link LSAs, Proc. Time ~0.5 ms
Case 4: Network 1, Link LSAs, Proc. Time ~2 ms
Case 5: Network 2, Link LSAs, Proc. Time ~1 ms
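To get a feel for the size of each case, the sketch below encodes the five cases as data and counts the LSAs originated in each. It rests on two assumptions that the slides do not state outright: that "Link LSAs" refers to the scenario with 1 Router LSA per node plus 1 TE LSA per link (each link counted once), and that "every other node" means half the nodes. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    nodes: int
    links: int


NETWORKS = {1: Network(nodes=100, links=1200), 2: Network(nodes=50, links=600)}

# (case number, network, LSA scenario, approximate LSU processing time in ms)
CASES = [
    (1, 1, "link", 1.0),
    (2, 1, "ase", 1.0),
    (3, 1, "link", 0.5),
    (4, 1, "link", 2.0),
    (5, 2, "link", 1.0),
]


def lsa_count(net, scenario):
    """Total LSAs originated under the assumed reading of each scenario."""
    router_lsas = net.nodes  # 1 Router LSA per node
    if scenario == "link":
        return router_lsas + net.links  # assumed: 1 TE LSA per link
    if scenario == "ase":
        return router_lsas + 10 * (net.nodes // 2)  # 10 ASE LSAs at half the nodes
    raise ValueError(f"unknown scenario: {scenario}")


if __name__ == "__main__":
    for case, net_id, scenario, proc_ms in CASES:
        total = lsa_count(NETWORKS[net_id], scenario)
        print(f"Case {case}: Network {net_id}, {scenario}, ~{proc_ms} ms, {total} LSAs")
```

Under these assumptions, Network 1 carries 1300 LSAs in the link scenario and 600 in the ASE scenario, and Network 2 carries 650 in the link scenario.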
6 Number of Non-Converged LSAs vs. LSA Storm
–Case 1, Algorithm 1
–LSA Storm Starts Between 20 and 30 Seconds
7 LSA Storm Threshold for Sustained CPU Congestion
8 Observations on Flooding Algorithms
Flooding Over One of Many Parallel Links (Alg 2) May Significantly Improve Scalability over the Current Algorithm (Alg 1)
–Zinin/Shand ID and Moy ID Should be Pursued Further
Further Restriction With MPR (Alg 3) Yields Moderate Improvement
–Neighbors of a High Adjacency Node Tend to Declare it as MPR
Flooding Only Over a Minimum Spanning Tree (Alg 4) Greatly Improves Scalability But is Not Robust Under Failure
Full Flooding for LSAs Carrying Intra-Area Topology and MST Flooding for Others (Alg 5) May Be Almost As Scalable as Alg 4 But Also More Robust
A Modification to Alg 5 with Disjoint MSTs + Other Flooding Links to Ensure Robustness Under Single Link and Single Node Failures Has Been Considered (Not Reported)
–Quite Robust and Significantly More Scalable Compared to Alg 2
Alg 2, Alg 5 and Modified Alg 5 Should Be Pursued Further
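The observation that neighbors of a high adjacency node tend to declare it as MPR can be seen in a small sketch. It assumes a greedy, OLSR-style selection in which each node independently picks the one-hop neighbors that cover the most still-uncovered two-hop neighbors. The slides do not say which MPR heuristic was simulated, and the topology and the name `select_mprs` are hypothetical.

```python
def select_mprs(node, adj):
    """Greedy MPR selection for `node`; `adj` maps each node to its neighbor set."""
    one_hop = set(adj[node])
    two_hop = set()
    for neighbor in one_hop:
        two_hop |= adj[neighbor]
    two_hop -= one_hop | {node}  # strict two-hop neighbors only

    mprs = set()
    uncovered = set(two_hop)
    while uncovered:
        # Pick the unchosen neighbor covering the most uncovered two-hop nodes;
        # sorting first makes ties resolve deterministically by name.
        best = max(sorted(one_hop - mprs), key=lambda n: len(adj[n] & uncovered))
        mprs.add(best)
        uncovered -= adj[best]
    return mprs


# Hypothetical topology: hub H is adjacent to everyone; A-B and C-D are also linked.
ADJ = {
    "H": {"A", "B", "C", "D"},
    "A": {"H", "B"},
    "B": {"H", "A"},
    "C": {"H", "D"},
    "D": {"H", "C"},
}

if __name__ == "__main__":
    for node in sorted(ADJ):
        print(node, "selects", sorted(select_mprs(node, ADJ)))
```

In this topology every low-degree node selects the hub H as its only MPR, so H still re-floods everything it receives. This is one plausible reading of why the MPR restriction relieves the most heavily loaded nodes only moderately.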