Published by Noel Nicholas Barnett. Modified over 9 years ago.
1
Core Router Testing for High Availability
Scott Poretsky, Avici Systems, Inc.
June 3, 2002
Reliable Routing for the Internet (Avici Company Confidential)
2
Architecture for the 21st Century Network
Outline
- IP Network Availability
- Test Coverage for 99.999% Availability
- Commercial Test Equipment Requirements
3
IP Network Availability
4
High Reliability = More Revenue
- Reliability is the single biggest criterion in selecting an ISP, according to the Interactive Week/Telechoice ISP Customer Survey.
- New IP services demand higher levels of network reliability.
[Chart: ISP Customer Survey. Relative importance (axis from 4 to 4.8) of Reliability, Value, Performance, Customer Service, and Provisioning Speed.]
5
High Reliability = More Profit
- Compensation for poor router reliability through redundancy and interconnects can increase network cost by up to 50%.
[Diagram: IP Backbone. Labels: Access Devices, VOIP, DSLAM, L3/4 Switch, CMTS, GGSN, Edge Layer, Aggregation Layer (Hub Router), Core Layer (Backbone Router), Direct Connects, Peering, Service Provider Peer.]
6
Definitions
- Reliable: capable of being dependable (Webster)
- Availability: measure of reliability using router/switch uptime
- Mission Reliability: Mean Time Between Critical Failures (MTBCF), or the average time between hardware or software failures that interrupt service (the mission)
- Maintenance Reliability: Mean Time Between Failures (MTBF), or the average time between hardware failures that require corrective maintenance actions
- Defects Per Million (DPM): measure of downtime equal to (1 - Availability) x 10^6
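The DPM definition above can be turned into a small worked calculation. The following is a minimal sketch, not part of the original slides: the function names are hypothetical, and the downtime figure assumes a 365-day year.

```python
# Assumption: a 365-day year (525,600 minutes); leap years are ignored.
MINUTES_PER_YEAR = 365 * 24 * 60


def defects_per_million(availability: float) -> float:
    """DPM as defined on the slide: (1 - Availability) x 10^6."""
    return (1.0 - availability) * 1e6


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per year implied by an availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for label, a in [("99.9%", 0.999), ("99.999%", 0.99999), ("99.9999%", 0.999999)]:
        print(f"{label}: DPM = {defects_per_million(a):.0f}, "
              f"downtime = {downtime_minutes_per_year(a):.2f} min/year")
```

Under these assumptions, 99.999% availability corresponds to a DPM of 10 and roughly 5.3 minutes of downtime per year, against about 526 minutes per year at 99.9%.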
7
Contributing Factors for Availability
- Mission Reliability: total time to restore a router/switch after a software failure, from "Software Failure Occurs" to "Full Operation Restored". Timeline segments (not to scale): Crash Dump Time, Boot Time, Protocol Convergence Time.
- Maintenance Reliability: total time to restore a module after a hardware failure, from "Hardware Failure Occurs" to "Full Operation Restored". Timeline segments (not to scale): Maintainer Response Time, Removal and Replacement Time, Image Upgrade Time, Boot Time, Protocol Convergence Time.
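To see how restore time feeds into availability, the software-failure segments above can be summed and combined with MTBCF. This is a sketch under stated assumptions: the steady-state formula Availability = MTBCF / (MTBCF + restore time) is the standard reliability relation and is not given on the slide, and the class name and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SoftwareRestore:
    """Timeline segments after a software failure (all values in seconds)."""
    crash_dump_s: float
    boot_s: float
    protocol_convergence_s: float

    def total_s(self) -> float:
        # Total time to restore is the sum of the timeline segments.
        return self.crash_dump_s + self.boot_s + self.protocol_convergence_s


def availability(mtbcf_hours: float, restore_seconds: float) -> float:
    """Assumed steady-state relation: uptime / (uptime + restore time)."""
    restore_hours = restore_seconds / 3600.0
    return mtbcf_hours / (mtbcf_hours + restore_hours)


if __name__ == "__main__":
    # Hypothetical figures chosen only for illustration.
    restore = SoftwareRestore(crash_dump_s=60, boot_s=180, protocol_convergence_s=120)
    print(f"restore time: {restore.total_s():.0f} s")
    print(f"availability: {availability(999.9, restore.total_s()):.6f}")
```

The relation makes the two levers explicit: availability rises either by lengthening the time between critical failures or by shortening any of the restore segments.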
8
The Availability Goal
- The Goal: 99.999% router availability
- The Reality: 99.9% router availability
- Features to achieve 99.999% availability:
  - Non-Stop Routing
  - Graceful Restart
- What if testing could improve Mission Reliability to achieve 99.999% availability in the absence of new features?
- What if the addition of these new features would then achieve 99.9999% availability?
9
Test Coverage
10
Traditional Test Coverage
- Isolated testing of protocols
  - Functionality
  - Conformance
  - Interoperability
  - Scaling
- Forwarding performance in the absence of protocols
- Disadvantages
  - Operational environment is not tested
  - Operational conditions are not tested
  - The router under test is not completely stressed
  - Deployed routers run multiple protocols simultaneously
11
Test Program for 99.999% Availability
- Stress Testing
- Longevity Testing
- Convergence Testing
- Network-Specific Topology Testing
- Automated Regression Testing
12
Stress Testing
- Simultaneous configuration and scaling of multiple protocols
  - BGP, IGP
  - MPLS-TE, LDP (optional)
  - MBGP, PIM-SM, MSDP (optional)
- Traffic Forwarding
  - Line Rate Traffic Forwarding
  - Overutilize links
  - Enable QoS
- Network Instability
  - Repeated Route Flaps
  - Link Loss
  - Tunnel Reroutes (optional)
- Serviceability
  - Repeated SNMP Gets
  - Logging Enabled
  - Debug Enabled
  - Telnet with SHOW commands (stressful and invalid)
13
Stress Configuration
[Diagram labels: Router Under Test; Neighbor Router (x2); Optional Neighbor Router for Tunnel Reroutes; Test Equipment (x3).]
14
Stress Execution Guidelines
- Configure ECMP, Parallel Paths, and Composite Links between routers
- Use Live BGP Feed for Route Table
- Mix traffic types across links (IP Unicast, IP Multicast, MPLS)
- One neighbor router should be from a different vendor to show interoperability under stress
- Run Stress for many days (if the router lasts that long)
- The router should experience more in a couple of days than it likely would in its operational lifetime.
15
Typical Stress Metrics
- Flap 1 million BGP routes per hour
- Forward 10 Terabits of data per hour
- Perform 100,000 SNMP Gets per hour
- Simulate 100 fiber cuts per hour (use every remote interface)
- Along with
  - Full BGP Table
  - Full IGP Table
  - Full Multicast Cache
  - Required MPLS-TE Tunnels (protection optional)
  - Required LDP FECs
  - Enable Logging and Protocol Debug
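The hourly stress metrics above are easier to compare with tester settings when expressed as sustained per-second rates. A minimal sketch of that arithmetic follows; it assumes 1 Terabit = 10^12 bits and that the load is spread evenly across the hour, neither of which the slide states.

```python
SECONDS_PER_HOUR = 3600


def per_second(per_hour: float) -> float:
    """Convert an hourly quantity to an average per-second rate."""
    return per_hour / SECONDS_PER_HOUR


# Hourly targets taken from the slide.
route_flaps_per_s = per_second(1_000_000)      # BGP route flaps per second
traffic_gbps = per_second(10 * 1e12) / 1e9     # assumes 1 Tbit = 10^12 bits
snmp_gets_per_s = per_second(100_000)          # SNMP Gets per second
seconds_between_fiber_cuts = SECONDS_PER_HOUR / 100

if __name__ == "__main__":
    print(f"route flaps: {route_flaps_per_s:.1f}/s")
    print(f"traffic:     {traffic_gbps:.2f} Gbit/s average")
    print(f"SNMP Gets:   {snmp_gets_per_s:.1f}/s")
    print(f"fiber cuts:  one every {seconds_between_fiber_cuts:.0f} s")
```

These are averages only; a real stress run would likely deliver the events in bursts rather than at a uniform rate.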
16
Longevity Testing
Similar to Stress Testing, but with more operational (less stressful) conditions injected over many weeks.
- Simultaneous configuration and scaling of multiple protocols
- Traffic Forwarding
- More realistic Network Instability
- More typical Serviceability actions
- Use Live Internet feed
17
Convergence Terms
- Network Convergence: the point in time at which all nodes in a network have updated their routing tables for a route entry change (new, withdrawal, or modification).
- Protocol Convergence: the point in time at which a single node updates its routing table and advertises the route table change to its peer in a routing protocol advertisement (or update) message.
- Route Convergence: the point in time at which a single node updates its routing table and reroutes traffic out the new interface. Route Convergence is the common router benchmark.
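Since Route Convergence is defined in terms of traffic being rerouted, one common way to estimate it is from packet loss at a constant offered load. The slides do not describe a measurement method, so the following is only a sketch of that loss-based approach; it assumes a constant offered rate and that every lost packet is due to the convergence event. The function name and figures are hypothetical.

```python
def route_convergence_seconds(packets_sent: int,
                              packets_received: int,
                              offered_rate_pps: float) -> float:
    """Estimate convergence time as lost packets divided by offered rate.

    Assumes traffic is offered at a constant rate (packets per second) and
    that all loss occurred while the router was converging.
    """
    if offered_rate_pps <= 0:
        raise ValueError("offered_rate_pps must be positive")
    lost = packets_sent - packets_received
    if lost < 0:
        raise ValueError("received more packets than were sent")
    return lost / offered_rate_pps


if __name__ == "__main__":
    # Hypothetical run: 100,000 pps offered, 5,000 packets lost.
    t = route_convergence_seconds(1_000_000, 995_000, 100_000)
    print(f"estimated route convergence: {t * 1000:.0f} ms")
```

The resolution of such an estimate is bounded by the offered rate: at 100,000 packets per second, one lost packet represents 10 microseconds.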
18
Convergence Test Issues
- Large number of protocols in which convergence is important
- Number of conditions that can impact results
- Technical difficulty in testing convergence of one protocol due to flap or instability of another protocol
19
Convergence Test Conditions
- Interface shutdown
  - on Local Interface
  - on Remote Interface
- Fiber Pull
  - on Local Interface
  - on Remote Interface
- Peer removal via CLI
  - on Local router
  - on Peer router
- Peer node failure
- Route Table changes
  - Route Withdrawal
  - Route Flap
  - Next-Hop Change
  - Metric Change
  - Dynamic Constraint Change
  - Policy Change
- All conditions must be tested because different results can be produced.
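Because every condition must be tested, it can help to enumerate the conditions as an explicit test matrix. The sketch below encodes the conditions listed above and crosses them with a list of protocols; the protocol list and the function name are hypothetical choices for illustration, not taken from the slide.

```python
from itertools import product

# Conditions and their variants, as listed on the slide.
CONDITIONS = {
    "Interface shutdown": ["local interface", "remote interface"],
    "Fiber pull": ["local interface", "remote interface"],
    "Peer removal via CLI": ["local router", "peer router"],
    "Peer node failure": [None],  # no variants listed
    "Route table change": [
        "route withdrawal", "route flap", "next-hop change",
        "metric change", "dynamic constraint change", "policy change",
    ],
}

# Hypothetical protocol list; substitute the protocols actually deployed.
PROTOCOLS = ["BGP", "OSPF", "ISIS"]


def build_matrix(conditions, protocols):
    """Return every (protocol, condition, variant) combination."""
    cases = []
    for protocol, (condition, variants) in product(protocols, conditions.items()):
        for variant in variants:
            cases.append((protocol, condition, variant))
    return cases


if __name__ == "__main__":
    matrix = build_matrix(CONDITIONS, PROTOCOLS)
    print(f"{len(matrix)} convergence test cases")
    for case in matrix[:3]:
        print(case)
```

With the 13 condition variants above and three protocols, the matrix holds 39 cases, which illustrates how quickly coverage grows as protocols are added.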
20
Network-Specific Topology Testing
- Large network with many routers (e.g. 10)
- Use multiple vendors for interoperability/functionality testing
- Multiple protocols configured in deployment scenario
- Run test cases to match deployment scenario
21
Automated Regression Testing
- Addition of bug fixes/new features puts previously working features at risk.
- Regression testing ensures that the previously working features still work.
- As the number of releases with new features grows, it is more difficult to provide complete regression coverage through manual testing (increasingly labor intensive).
- Automated regression testing enables more coverage in less time.
- Automation is typically achieved using TCL scripts.
[Diagram: Configuration. Labels: Router Under Test, Test Equipment.]
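The core of a regression check is a comparison of results between a known-good release and a candidate release. The slide notes that automation is typically done with TCL scripts; the sketch below uses Python only to stay consistent with the other examples here. The test-case names and the rule that a missing result counts as a regression are assumptions.

```python
from typing import Dict, List


def find_regressions(baseline: Dict[str, bool],
                     candidate: Dict[str, bool]) -> List[str]:
    """Return test cases that passed on the baseline but not on the candidate.

    Assumption: a test case missing from the candidate run is treated as a
    regression, since its coverage has been lost.
    """
    return sorted(
        name for name, passed in baseline.items()
        if passed and not candidate.get(name, False)
    )


if __name__ == "__main__":
    # Hypothetical pass/fail results keyed by test-case name.
    baseline = {"bgp_route_flap": True, "ospf_adjacency": True, "pim_sm_join": False}
    candidate = {"bgp_route_flap": True, "ospf_adjacency": False, "pim_sm_join": False}
    print(find_regressions(baseline, candidate))
```

A test that already failed on the baseline is not reported, which keeps the output focused on previously working features.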
22
Commercial Test Equipment Requirements
23
The State of the Union
- Test equipment fails to meet today’s requirements for testing 99.999% availability.
- Router vendors have been forced to develop their own specialized test tools.
- Carriers have been forced to use the router vendor test tools.
- Test equipment vendors must respond to the challenge today.
24
Stress Testing Requirements
- Maintain BGP Sessions and IGP Adjacencies
- Flap BGP Routes
- Signal and maintain RSVP-TE tunnels
- Distribute LDP FECs
- Signal and maintain Multicast Groups
- Perform SNMP GETs and check validity
- Forward Traffic (IP Unicast, IP Multicast, and MPLS)
- Make the network seem much bigger than it really is without having to obtain hundreds of routers.
25
Required Protocol Emulation/Conformance Suites Coverage
- Routing Protocols
  - BGP
  - OSPF, ISIS
  - OSPF-TE, ISIS-TE
- RSVP-TE
  - Fast Reroute
  - Standby Tunnels
  - Ingress, Mid-Point, Egress
- LDP
- RFC 2547 Layer 3 VPNs
- Martini Layer 2 VPNs
- P and PE
- LDP over RSVP
- Multicast
  - MBGP
  - PIM-SM
  - MSDP
26
Protocol Emulation Requirements
- Run any protocols in combination on the same interface
- Forward traffic for emulated protocols
- Protocol emulation on any interface type: GigE, 10GigE, and POS (including 192c)
- Scaling
  - BGP Sessions: >500/system, >100/interface
  - BGP Routes: >3M/system, >500K/session
  - MPLS-TE Tunnels: >10K (Ingress, Mid-Point, Egress)
  - FECs: >10K
- Load external BGP table for advertisement
- Controlled BGP Route Flapping
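The scaling figures above lend themselves to a simple checklist when evaluating a test tool. The sketch below records them as strict lower bounds and reports which ones a given tool does not exceed; the dictionary keys, the function name, and the sample capability numbers are hypothetical.

```python
# Lower bounds from the slide; each requirement is "greater than" the value.
SCALING_MINIMUMS = {
    "bgp_sessions_per_system": 500,
    "bgp_sessions_per_interface": 100,
    "bgp_routes_per_system": 3_000_000,
    "bgp_routes_per_session": 500_000,
    "mpls_te_tunnels": 10_000,
    "fecs": 10_000,
}


def unmet_requirements(capabilities: dict) -> list:
    """Return the names of requirements the tool does not strictly exceed.

    A capability that is not reported at all is treated as unmet.
    """
    return sorted(
        name for name, minimum in SCALING_MINIMUMS.items()
        if capabilities.get(name, 0) <= minimum
    )


if __name__ == "__main__":
    # Hypothetical capabilities of a test tool under evaluation.
    tool = {
        "bgp_sessions_per_system": 1000,
        "bgp_sessions_per_interface": 100,   # equals the bound, so not met
        "bgp_routes_per_system": 4_000_000,
        "bgp_routes_per_session": 600_000,
        "mpls_te_tunnels": 12_000,
    }
    print(unmet_requirements(tool))
```

Keeping the thresholds in one table makes it straightforward to revise them as network scale grows.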
27
Automated Regression Requirements
- Commercial test equipment vendors offer protocol conformance TCL suites.
  - Test case coverage must be improved within each suite
  - Interaction between protocols must be tested
  - Need each script to test multiple interfaces (4 or more)
- Full Protocol Coverage
  - Multicast protocols have been the “forgotten son”
28
System Requirements
- Multiple ports per chassis (>32)
- Automated convergence measurement
- Automated reroute/failover measurement
- Support for ECMP and Composite Links
- System/protocol stability for many days
- Ability to store GUI configuration for repeatability
- Ability to TCL script any GUI test case