IGP Data Plane Convergence
draft-ietf-bmwg-dataplane-conv-meth-14.txt
draft-ietf-bmwg-dataplane-conv-term-14.txt
draft-ietf-bmwg-dataplane-conv-app-14.txt
BMWG, IETF-70 Vancouver, December 2007
Scott Poretsky, NextPoint Networks
Brent Imhoff, Juniper Networks
2 Title Change
-13 Submittals:
  Benchmarking Methodology for IGP Data Plane Route Convergence
  Terminology for Benchmarking IGP Data Plane Route Convergence
  Considerations for Benchmarking IGP Data Plane Route Convergence
-14 Submittals:
  Benchmarking Methodology for Link-State IGP Data Plane Route Convergence
  Terminology for Benchmarking Link-State IGP Data Plane Route Convergence
  Considerations for Benchmarking Link-State IGP Data Plane Route Convergence
Resolves DISCUSS from Dan Romascanu
3 Addition to Section 3 of Considerations
3. Factors for IGP Route Convergence Time...
-Increased Forwarding Delay due to Queueing...
“Routers may have a centralized forwarding architecture, in which one route table is calculated and referenced for all arriving packets, or a distributed forwarding architecture, in which the central route table is calculated and distributed to the interfaces for local look-up as packets arrive. The distributed route tables are typically maintained in hardware.”
4 Addition to Section 4 of Considerations
-13 Submittal:
  4. Network Events that Cause Convergence
  There are different types of network events that can cause IGP convergence. These network events are as follow:
  * administrative link removal
  * unplanned link failure
  * line card failure
  * route changes such as withdrawal, flap, next-hop change, and cost change.
-14 Submittal:
  4. Network Events that Cause Convergence
  There are different types of network events that can cause IGP convergence. These network events are as follow:
  * administrative link removal
  * unplanned link failure
  * line card failure
  * route changes such as withdrawal, flap, next-hop change, and cost change.
  * session loss due to loss of peer or adjacency
  * link recovery
  * link insertion
5 First Prefix Convergence Added to Terminology (Used in Methodology)
New Terms in -14:
  3.9 First Prefix Convergence Instant
  3.15 First Prefix Convergence Time
6 Units of Time Clarified
"Measurement Units:" for all terms with time-based units is now listed as "seconds" so that there is no assumption of magnitude.
3.7 Convergence Event Instant and 3.8 Convergence Recovery Instant were updated to allow measurement to microseconds as follows:
  Measurement Units: hh:mm:ss:nnn:uuu, where 'nnn' is milliseconds and 'uuu' is microseconds.
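A minimal sketch of how a tester might handle the hh:mm:ss:nnn:uuu format, assuming both instants fall on the same day. The function names parse_instant and elapsed_seconds are hypothetical and do not come from the drafts.

```python
def parse_instant(stamp: str) -> int:
    """Convert an 'hh:mm:ss:nnn:uuu' string to microseconds since midnight."""
    parts = stamp.split(":")
    if len(parts) != 5:
        raise ValueError(f"expected hh:mm:ss:nnn:uuu, got {stamp!r}")
    hh, mm, ss, nnn, uuu = (int(p) for p in parts)
    in_range = (
        hh >= 0
        and 0 <= mm < 60
        and 0 <= ss < 60
        and 0 <= nnn < 1000   # milliseconds field
        and 0 <= uuu < 1000   # microseconds field
    )
    if not in_range:
        raise ValueError(f"field out of range in {stamp!r}")
    whole_seconds = (hh * 60 + mm) * 60 + ss
    return whole_seconds * 1_000_000 + nnn * 1_000 + uuu


def elapsed_seconds(event: str, recovery: str) -> float:
    """Seconds between two instants (assumption: no midnight wrap)."""
    return (parse_instant(recovery) - parse_instant(event)) / 1_000_000
```

Reporting the difference in seconds keeps the result consistent with the "seconds" unit now used for the time-based terms, while the integer microsecond representation avoids rounding until the final division.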
7 Reversion Convergence Time
3.14 Restoration Convergence Time
  Definition: The amount of time for the router under test to restore traffic to the original outbound port after recovery from a Convergence Event.
  …
  Measurement Units: seconds or milliseconds
3.16 Reversion Convergence Time
  Definition: The amount of time for the DUT to forward traffic from the Preferred Egress Interface, instead of the Next-Best Egress Interface, upon recovery from a Convergence Event.
  …
  Measurement Units: seconds
8 Updated DISCUSSION to Clarify Terms
3.18 Local Interface …
  Discussion: A failure of the Local Interface indicates that the failure occurred directly on the DUT.
Neighbor Interface …
  Discussion: A failure of a Neighbor Interface indicates that a failure occurred on a neighbor router’s interface that directly links that neighbor router to the DUT.
Remote Interface …
  Discussion: A failure of a Remote Interface indicates that the failure occurred on a neighbor router’s interface that is not directly connected to the DUT.
Stale Forwarding...
  Discussion: Stale Forwarding can be caused by a Convergence Event and can manifest as a "black-hole" or microloop that produces packet loss. Stale Forwarding exists until Network Convergence is achieved. Stale Forwarding cannot be observed with a single DUT.
9 Updated Test Cases in Methodology
INSERTED in -14 for each test case a step to measure the First Prefix Convergence Time
  5. Measure First Prefix Convergence Time [Po07t] as DUT detects link down event and begins to converge IGP routes and traffic over the Next-Best Egress Interface.
ADDED in Convergence Due to Local Administrative Shutdown
  4.8 Convergence Due to ECMP Member Remote Interface Failure
REVISED in -14
  –4.1.3 Convergence Due to Remote Interface Failure
    Results now state “The measured IGP Convergence time is influenced by the link failure indication, LSA/LSP Flood Packet Pacing, LSA/LSP Retransmission Packet Pacing, LSA/LSP Generation time, SPF delay, SPF Hold time, SPF Execution Time, Tree Build Time, and Hardware Update Time [Po07a]. This test case may produce Stale Forwarding [Po07t] due to microloops which may increase the Rate-Derived Convergence Time.”
  –4.7 Convergence Due to ECMP Member Interface Failure
    Procedure now includes measurement of Out-of-Order Packets and Duplicate Packets.
10 Added Summary of Procedures in Methodology
4. Test Cases
It is RECOMMENDED that all applicable test cases be executed for best characterization of the DUT. The test cases follow a generic procedure tailored to the specific DUT configuration and Convergence Event [Po07t]. This generic procedure is as follows:
  1. Establish DUT configuration and install routes.
  2. Send offered load with traffic traversing Preferred Egress Interface [Po07t].
  3. Introduce Convergence Event to force traffic to Next-Best Egress Interface [Po07t].
  4. Measure First Prefix Convergence Time.
  5. Measure Rate-Derived Convergence Time.
  6. Recover from Convergence Event.
  7. Measure Reversion Convergence Time.
11 New Reporting Format in Methodology

-13 Submittal:
3.3 Reporting Format
For each test case, it is recommended that the following reporting format is completed:

  Parameter                             Units
  IGP                                   (ISIS or OSPF)
  Interface Type                        (GigE, POS, ATM, etc.)
  Packet Size offered to DUT            bytes
  IGP Routes advertised to DUT          number of IGP routes
  Packet Sampling Interval on Tester    seconds or milliseconds
  IGP Timer Values configured on DUT
    SONET Failure Indication Delay      seconds or milliseconds
    IGP Hello Timer                     seconds or milliseconds
    IGP Dead-Interval                   seconds or milliseconds
    LSA Generation Delay                seconds or milliseconds
    LSA Flood Packet Pacing             seconds or milliseconds
    LSA Retransmission Packet Pacing    seconds or milliseconds
    SPF Delay                           seconds or milliseconds
  Benchmarks
    Rate-Derived Convergence Time       seconds or milliseconds
    Loss-Derived Convergence Time       seconds or milliseconds
    Restoration Convergence Time        seconds or milliseconds

-14 Submittal:
3.3 Reporting Format
For each test case, it is recommended that the reporting table below is completed and all time values SHOULD be reported with resolution as specified in [Po07t].

  Parameter                             Units
  IGP                                   (ISIS or OSPF)
  Interface Type                        (GigE, POS, ATM, etc.)
  Test Topology                         (1, 2, 3, or 4)
  Packet Size offered to DUT            bytes
  Total Packets Offered to DUT          number of Packets
  Total Packets Routed by DUT           number of Packets
  IGP Routes advertised to DUT          number of IGP routes
  Nodes in emulated network             number of nodes
  Packet Sampling Interval on Tester    milliseconds
  IGP Timer Values configured on DUT
    Interface Failure Indication Delay  seconds
    IGP Hello Timer                     seconds
    IGP Dead-Interval                   seconds
    LSA Generation Delay                seconds
    LSA Flood Packet Pacing             seconds
    LSA Retransmission Packet Pacing    seconds
    SPF Delay                           seconds
  Benchmarks
    First Prefix Convergence Time       seconds
    Rate-Derived Convergence Time       seconds
    Loss-Derived Convergence Time       seconds
    Reversion Convergence Time          seconds
12 Clarifications in Methodology
3.1 Test Topologies
  ADDED to description of Figure 2 – “A Remote Interface [Po07t] failure on router R2 MUST result in convergence of traffic to router R3.”
Convergence Time Metrics
  “The RECOMMENDED value for the Packet Sampling Interval is 10 milliseconds” instead of 100 msec
Interface Types
  ADDED “All interfaces SHOULD be configured as point-to-point.”
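To illustrate why the Packet Sampling Interval matters, here is a rough sketch of estimating a rate-based convergence time from forwarding-rate samples. It assumes the tester records one forwarding-rate sample per Packet Sampling Interval and that convergence spans the first through the last sample below the offered load. That derivation and the name rate_derived_time are assumptions for illustration, not the procedure text of the Methodology draft.

```python
def rate_derived_time(samples_pps, offered_pps, interval_s=0.010):
    """Estimate convergence time from per-interval forwarding-rate samples.

    samples_pps: forwarding rate observed in each sampling interval (packets/s)
    offered_pps: offered load (packets/s)
    interval_s:  Packet Sampling Interval in seconds (10 ms by default)
    """
    below = [i for i, rate in enumerate(samples_pps) if rate < offered_pps]
    if not below:
        return 0.0  # forwarding rate never dropped below the offered load
    first, last = below[0], below[-1]
    if last == len(samples_pps) - 1:
        raise ValueError("forwarding rate had not recovered by the last sample")
    # The estimate is a whole number of intervals, so its resolution is
    # limited to one Packet Sampling Interval.
    return (last - first + 1) * interval_s
```

With a 100 msec interval the same event could only be resolved to the nearest 0.1 seconds, which is one way to read the move to a 10 millisecond recommendation.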
13 Agreed Out-of-Scope NSF Graceful Restart RIP
14 Planned Changes for -15 to Close Out DISCUSS Items (1)
–Add new term 3.5 Partial Convergence
  Route Convergence for one or more route entries in the FIB in which recovery from the Convergence Event is indicated by data-plane traffic for a flow [Po06] matching that route entry(ies) being routed to the Next-Best Egress Interface.
–Add new term 3.15 Partial Convergence Time
  The amount of time it takes for Partial Convergence to be achieved as calculated from the amount of Convergence Packet Loss for a specific flow or group of flows.
–Update Methodology Section 3.0 Test Considerations to provide option to execute test cases to benchmark
  Full Convergence using Rate-Derived Convergence Time
  Partial Convergence using Loss-Derived Convergence Time
–Possible to calculate min, max, average, median First Convergence using Rate-Derived Convergence Time or Loss-Derived Convergence Time
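A small sketch of how per-flow times could be calculated from packet loss and then summarized. It assumes each flow is offered at a constant, known packet rate and that all loss on the flow is caused by the Convergence Event, so time is taken as lost packets divided by offered rate. The function names and the flow labels in the test data are hypothetical.

```python
from statistics import mean, median


def loss_derived_time(packets_lost: int, offered_pps: float) -> float:
    """Seconds of outage implied by packet loss at a constant offered rate."""
    if offered_pps <= 0:
        raise ValueError("offered rate must be positive")
    return packets_lost / offered_pps


def summarize(per_flow_loss: dict, per_flow_pps: float) -> dict:
    """Min, max, average, and median loss-derived time across flows."""
    times = [loss_derived_time(lost, per_flow_pps)
             for lost in per_flow_loss.values()]
    return {
        "min": min(times),
        "max": max(times),
        "average": mean(times),
        "median": median(times),
    }
```

Computing the time per flow rather than over the aggregate is what lets one prefix be reported separately from the rest, which is the distinction the Partial Convergence terms are meant to capture.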
15 Planned Changes for -15 to Close Out DISCUSS Items (2)
–Test Case To be Added in -15:
  4.10 Convergence Due to Link Insertion
–Test Case To be Revised in -15:
  4.5 Convergence Due to Route Withdrawal to be specific to External IGP Routes.
16 Next Steps
Issues? Comments?
-15 Term and Meth to be posted by end of year