1 Root-Cause Network Troubleshooting: Optimizing the Process. Tim Titus, CTO, PathSolutions
2 Sample Problem: VoIP call quality problem. [Diagram: phones x41, x52, x53.] A user complains about missing parts of a conversation between x41 and x53 at 12:04pm.
3 Packet Capture. [Diagram: Wireshark capture of the actual VoIP call; phones x41, x52, x53.] Results: latency 127 ms, jitter 87 ms, packet loss 8.2%. You have confirmation that there is a problem, but no idea which device or link caused the packet loss.
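The jitter and packet loss figures a capture reports can be derived from the RTP headers alone. The following is a minimal sketch, assuming each captured packet has already been reduced to a sequence number, an RTP timestamp, and an arrival time (the `RtpPacket` type and both function names are hypothetical, not part of Wireshark). The jitter estimate uses the running formula from RFC 3550; the loss figure ignores sequence wraparound and reordering.

```python
from dataclasses import dataclass


@dataclass
class RtpPacket:
    seq: int        # RTP sequence number
    rtp_ts: int     # RTP timestamp, in clock units
    arrival: float  # capture arrival time, in seconds


def interarrival_jitter_ms(packets, clock_rate=8000):
    """Running interarrival jitter estimate (RFC 3550), in milliseconds."""
    jitter = 0.0
    prev = None
    for pkt in packets:
        if prev is not None:
            # Difference in relative transit time, in RTP clock units.
            d = (pkt.arrival - prev.arrival) * clock_rate - (pkt.rtp_ts - prev.rtp_ts)
            jitter += (abs(d) - jitter) / 16.0
        prev = pkt
    return jitter / clock_rate * 1000.0


def packet_loss_percent(packets):
    """Loss inferred from sequence gaps (no wraparound or reorder handling)."""
    if not packets:
        return 0.0
    seqs = {p.seq for p in packets}
    expected = max(seqs) - min(seqs) + 1
    return 100.0 * (expected - len(seqs)) / expected
```

Both numbers describe the call as seen at the capture point only, which is why the capture confirms the problem without locating it.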
4 Application Performance Monitoring. [Diagram: synthetic VoIP call; phones x41, x52, x53.] Results: latency 127 ms, jitter 87 ms, packet loss 8.2%. You have knowledge of the experience across the network, but no understanding of the source or cause of the problem.
5 NetFlow Collectors. [Diagram: actual VoIP call; flow record sent to a flow collector; phones x41, x52, x53.] Flow record: "Flow from to at 2:45pm". You have knowledge of a flow across the network, but no awareness of any problem.
6 SNMP Collectors. [Diagram: actual VoIP call; SNMP collector; phones x41, x52, x53.] Collected data: 23% WAN utilization at 12:05pm. You have data about conditions on some parts of the network, but no analysis of the problem or correlation to events.
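A utilization figure like the one on this slide is typically computed from two samples of an interface octet counter. The sketch below assumes 32-bit wrapping counters (such as `ifInOctets` in the standard IF-MIB) and a known link speed; the function names and the sample numbers in the usage note are illustrative only.

```python
def counter_delta(previous, current, bits=32):
    """Difference between two samples of a wrapping SNMP counter."""
    return (current - previous) % (2 ** bits)


def utilization_percent(prev_octets, curr_octets, interval_s, if_speed_bps, bits=32):
    """Average link utilization over one polling interval, in percent."""
    octets = counter_delta(prev_octets, curr_octets, bits)
    return 100.0 * octets * 8 / (interval_s * if_speed_bps)
```

For example, 86,250,000 octets in a 300-second interval on a 10 Mbps link works out to 23% utilization. Because this is an average over the whole interval, a short burst that disrupted a call can be invisible in it.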
7 Finding the Root-Cause. [Diagram: actual VoIP call; phones x41, x52, x53.] Step 1: Locate where the involved endpoints connect to the network.
8 Finding the Root-Cause. [Diagram: phones x41, x52, x53.] Step 2: Identify the full layer-2 path through the network from the first phone to the second phone.
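Once the physical links are known, step 2 amounts to a path search over the topology. This is a minimal sketch using breadth-first search over a list of links; the device names in the usage note are hypothetical, and on a real network the forwarding path follows the spanning tree and MAC address tables, which this sketch only matches when the topology is loop-free.

```python
from collections import deque


def find_path(links, start, goal):
    """Return the list of devices from start to goal, or None if unreachable."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

With links `[("x41", "sw1"), ("sw1", "core"), ("core", "sw2"), ("sw2", "x53")]`, `find_path(links, "x41", "x53")` yields every device whose health and interfaces need checking in steps 3 and 4.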
9 Finding the Root-Cause. [Diagram: phones x41, x52, x53.] Step 3: Investigate involved switch and router health (CPU and memory) for acceptable levels.
10 Finding the Root-Cause. [Diagram: phones x41, x52, x53.] Step 4: Investigate involved interfaces for: VLAN assignment; DiffServ/QoS tagging; queuing configuration; 802.1p priority settings; duplex mismatches; cable faults; half-duplex operation; broadcast storms; incorrect speed settings; over-subscription. TRANSIENT PROBLEM WARNING: If the error condition is no longer occurring when this investigation is performed, you may not catch the problem.
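Several of the checks in step 4 are comparisons between the two ends of a link. The sketch below covers three of them (duplex, speed, and access VLAN), assuming the configuration of each end has already been collected into a simple record; `InterfaceConfig` and `check_link` are hypothetical names, and the remaining checks (QoS tagging, queuing, cable faults, broadcast storms, over-subscription) need counters or device-specific data that this sketch does not model.

```python
from dataclasses import dataclass


@dataclass
class InterfaceConfig:
    name: str        # hypothetical label, e.g. "sw1:port1"
    speed_mbps: int
    duplex: str      # "full" or "half"
    vlan: int        # access VLAN


def check_link(a, b):
    """Return plain-English findings for the two ends of one link."""
    findings = []
    if a.duplex != b.duplex:
        findings.append(
            f"Duplex mismatch: {a.name} is {a.duplex}-duplex, {b.name} is {b.duplex}-duplex"
        )
    if a.speed_mbps != b.speed_mbps:
        findings.append(
            f"Speed mismatch: {a.name} is {a.speed_mbps} Mbps, {b.name} is {b.speed_mbps} Mbps"
        )
    if a.vlan != b.vlan:
        findings.append(f"VLAN mismatch: {a.name} is in VLAN {a.vlan}, {b.name} is in VLAN {b.vlan}")
    for end in (a, b):
        if end.duplex == "half":
            findings.append(f"Half-duplex operation on {end.name}")
    return findings
```

Configuration checks like these still work after a transient fault has cleared; counter-based checks only catch it if the counters were being recorded while the fault was happening.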
11 Optimizing the Methodology. In a perfect world you want: tracking of every switch, router, and link in the entire infrastructure, including all error counters, performance, and configuration info, at any time of the day; automatic layer-1, 2, and 3 mapping from any IP to any IP; problems identified in plain English for rapid remediation. This is what PathSolutions TotalView does.
12 How TotalView Works. [Diagram: phones x41, x52, x53.] Install PathSolutions: all switches and routers are queried for information. Result: one location is able to monitor all devices and links in the entire network for performance and errors.
13 Total Network Visibility®. Broad: all ports on all routers and switches. Continuous: health collected every 5 minutes. Deep: 18 different error counters collected and analyzed. Heuristics Engine provides plain-English prescription of faults: “This interface is dropping 8% of its packets due to a cable fault”.
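To illustrate how periodic counter samples can turn into a plain-English statement, here is a minimal sketch. It is not the TotalView Heuristics Engine, whose rules are not described in these slides; the counter names, the labels, and the `describe_interface` function are all assumptions. It takes two consecutive polls of one interface, computes the share of packets lost to errors, and names the error counter that grew the most.

```python
def delta(previous, current, bits=32):
    """Difference between two samples of a wrapping counter."""
    return (current - previous) % (2 ** bits)


def describe_interface(prev, curr, error_labels):
    """prev, curr: dicts of counter name -> value from two consecutive polls.

    error_labels maps each error counter name to a human-readable label.
    "packets" is assumed to count successfully received packets.
    """
    delivered = delta(prev["packets"], curr["packets"])
    errors = {name: delta(prev[name], curr[name]) for name in error_labels}
    total_errors = sum(errors.values())
    if total_errors == 0:
        return "This interface shows no errors in the last polling interval"
    worst = max(errors, key=errors.get)
    loss = 100.0 * total_errors / (delivered + total_errors)
    return (
        f"This interface is dropping {loss:.0f}% of its packets, "
        f"mostly due to {error_labels[worst]}"
    )
```

Mapping a counter to a physical cause (for example, a cable fault) is the part that needs real heuristics; this sketch only reports which counter dominates.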
14 Results Within 12 Minutes. Establish a baseline of network health. [Diagram: phones x41, x52, x53, with links annotated: 7% loss from cabling fault; 12% loss from alignment errors; 11% loss from collisions; 28% loss from duplex mismatch.]
15 Results Within 12 Minutes. Immediately start fixing problems. [Diagram: phones x41, x52, x53, with links annotated: 7% loss from cabling fault; 12% loss from alignment errors; 11% loss from collisions; 28% loss from duplex mismatch.]
16 Path Analysis Report. Root-cause troubleshoot all elements along a path. [Diagram: phones x41, x52, x53, with one link annotated: 12:02pm, 8% loss from duplex mismatch.]
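A path report combines the per-element findings along one path. As a rough sketch, assuming each element drops packets independently of the others, the end-to-end loss is one minus the product of the per-hop survival rates, and the element to fix first is the one with the highest loss. The function names and the tuple layout are hypothetical.

```python
import math


def end_to_end_loss(hop_losses):
    """Combine per-hop loss fractions (0.0 to 1.0), assuming independent drops."""
    survive = math.prod(1.0 - p for p in hop_losses)
    return 1.0 - survive


def worst_hop(path_report):
    """path_report: list of (element name, loss fraction, cause) tuples."""
    return max(path_report, key=lambda hop: hop[1])
```

A path with a single faulty link at 8% loss and clean links elsewhere gives 8% end to end, so the per-element breakdown points directly at the link to fix.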
17 Don’t turtle your network
18 Total Network Visibility®