Network Monitoring: A Practical Approach. Philip Smith, IT Services, University of Windsor. March 21, 2003. Hello, I am Philip Smith from IT Services. My responsibilities include network management of the devices, which in turn includes monitoring the network. Dr. Aggrawal has asked me to briefly discuss how we monitor the networks on campus and the Internet.
Agenda: Campus Structure; Benchmarking on Campus; Tools on Campus; Benchmarking off Campus; Tools off Campus; Questions and Answers.
Campus Structure: Core router (Nortel Networks Passport 8610). 60+ building subnets (student + faculty). Computer Science and Engineering have their own networks. Two external connections: Internet (Telus) at 15Mb/s plus oversubscription, and CAnet*4 (AT&T) at 155Mb/s. Both connections use ATM.
Campus Structure (Block Diagram)
Campus Structure (Graphical)
Benchmarking on Campus: Benchmarks. FTP (TCP/IP download performance). TTCP (TCP/IP upload performance). PERFORM3 (Novell performance). We need to consider both upload and download because a duplex problem may show up in only one direction.
Benchmarking on Campus: FTP. FTP is a disk-to-disk transfer protocol, so disk speed can, and does, affect the measured performance. We drop the first FTP test to each server because the file is not yet cached. The FTP benchmark is run against 3 servers at or near the network core. Key servers are: Admin1 (administrative server, AIX, IBM UNIX); Pdomain (campus FTP server, IRIX, SGI UNIX); Zeus (Lotus Notes server, AIX). Notice that the servers for FTP & TTCP are similar. We are unable to run FTP on Cronus because that would require IIS to be installed.
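As a rough illustration of the "drop the first test" rule, here is a minimal sketch of how the runs to one server could be summarized. The function name and the sample rates are hypothetical, not taken from our benchmark database.

```python
from statistics import mean

def warm_average(rates):
    """Average the runs to one server, dropping the first (cold-cache) run."""
    if len(rates) < 2:
        raise ValueError("need at least two runs: one to discard, one to keep")
    return mean(rates[1:])

# Hypothetical download rates (KB/s) for three runs against one server.
# The first run is slower because the file had to come from disk.
runs = [410.0, 880.0, 900.0]
print(warm_average(runs))  # 890.0
```

With three runs per server, two remain after the first is discarded, so each server's figure is an average of two warm runs.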
Benchmarking on Campus: TTCP. TTCP is a memory-to-memory transfer tool; disk is NOT involved. The TTCP benchmark is run against 4 servers at or near the network core. Key servers are: Admin1 (administrative server, AIX, IBM UNIX); Cronus (Lotus Notes server, NT); Pdomain (campus FTP server, IRIX, SGI UNIX); Zeus (Lotus Notes server, AIX).
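To show what "memory to memory" means, here is a minimal sketch of a TTCP-style test written in Python. It is not the TTCP program itself: it sends a buffer that lives only in memory over a TCP connection and times the transfer. The loopback address, the 8 MiB transfer size and the 8 KiB block size are assumptions chosen for illustration.

```python
import socket
import threading
import time

def sink(server_sock, result):
    """Accept one connection and count the bytes received, discarding them."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:          # sender closed the connection
                break
            total += len(chunk)
    result["bytes"] = total

def memory_to_memory_test(total_bytes=8 * 1024 * 1024, block=8192):
    """Send an in-memory buffer over TCP; return (bytes_received, Mb/s)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # loopback, ephemeral port (assumption)
    server.listen(1)
    result = {}
    receiver = threading.Thread(target=sink, args=(server, result))
    receiver.start()

    payload = bytes(block)         # buffer is created in memory; no disk I/O
    start = time.perf_counter()
    with socket.create_connection(server.getsockname()) as client:
        for _ in range(total_bytes // block):
            client.sendall(payload)
    receiver.join()                # wait until the receiver has seen end of stream
    elapsed = time.perf_counter() - start
    server.close()
    return result["bytes"], result["bytes"] * 8 / elapsed / 1e6

if __name__ == "__main__":
    received, mbps = memory_to_memory_test()
    print(f"received {received} bytes at {mbps:.1f} Mb/s")
```

Run over loopback, this only exercises the local TCP stack. To measure a real path, the receiving half would have to run on the server under test.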
Benchmarking on Campus: PERFORM3. PERFORM3 is Novell’s benchmark for networks that run at 10Mb/s or more. While Novell is not used very frequently in Computer Science, it is used a great deal elsewhere on campus. At one point (circa 2000) Novell traffic was 2/3 of our network traffic. We modified PERFORM3 to run faster: it is limited to twelve operations at 16K intervals instead of one at each 4K interval. The modified test takes 1-2 minutes compared to 5 minutes. We run the PERFORM3 benchmark against all available Novell servers.
Benchmarks on Campus: Methodology. Using Work Study labour, we annually run all three benchmarks from each subnet in each building using a common laptop. Run 4 TTCP tests against each of the 4 TTCP servers (4*4=16). Run 3 FTP tests against each of the 3 FTP servers (3*3=9); remember that the first test to each server is discarded. Run 2 PERFORM3 tests against each Novell server (2*~9=18). This does not include Computer Science’s or Engineering’s subnets.
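The per-subnet workload can be tallied with a short sketch like the one below. The run and server counts come from the slide. The Novell server count is only approximate there ("~9"), so 9 is an assumption, and the names are illustrative.

```python
# (runs per server, number of servers) for each benchmark, per subnet.
# The Novell server count is approximate in the source ("~9").
plan = {
    "TTCP": (4, 4),
    "FTP": (3, 3),
    "PERFORM3": (2, 9),
}

def runs_per_subnet(plan):
    """Return the number of benchmark runs per subnet for each benchmark."""
    return {name: runs * servers for name, (runs, servers) in plan.items()}

counts = runs_per_subnet(plan)
total = sum(counts.values())
print(counts)  # {'TTCP': 16, 'FTP': 9, 'PERFORM3': 18}
print(total)   # 43
```

That is roughly 43 runs from every subnet, which is why the work is done once a year with Work Study labour.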
Benchmarks on Campus: Summary. Results of the annual building tests are available online. URL: http://www.uwindsor.ca/netperf (click on Benchmark Database in the left-hand menu). The database also contains benchmarks from some faculty and staff who have complained about their performance.
Tools on Campus: Protocol Analyzer; WhatsUp; MRTG; MRTG-UFFE; NMS; other tools that can be used to measure or benchmark performance.
Tools on Campus: Protocol Analyzer. A device that lets you see packets on the wire. Our tool is a Network Associates’ Sniffer. Primarily a troubleshooting tool. However, by capturing the data on a connection (e.g. uplink) over time you can collect key network statistics. Flaw: it only does ONE connection at a time. Protocol Analyzer measures packets. (Slide from NAI’s brochure; their copyright.)
Tools on Campus: WhatsUp. Monitors network devices (e.g. switches & routers), servers & server applications. Uses ICMP (ping) and TCP/IP ports. If the device responds, it is deemed to be up. Flaw: just because the web server accepts a connection on port 80 does not necessarily mean the web server is working properly; it just means that the web server is up. For example, most of the web data could be offline due to a disk failure. WhatsUp measures availability. Uses drill-down method (example to follow).
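The TCP port check can be sketched in a few lines. This is not how WhatsUp is implemented internally (we do not have that information); it only shows the kind of test being described, and the function name and timeout are assumptions.

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:                # refused, timed out, unreachable, bad name
        return False

# Example (hypothetical host name):
#   port_is_open("www.example.edu", 80)
```

Note that the check passes as soon as the connection is accepted. It never requests a page, which is exactly the flaw: the port can be open while the content behind it is unavailable.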
Tools on Campus: WhatsUp. Look: not all the buildings are green. Let’s look at Memorial.
Tools on Campus: WhatsUp. Drilling down into Memorial Hall, there is something wrong with the UPS (top diagram). It looks like the UPS management is down (bottom diagram).
Tools on Campus: MRTG. MRTG = Multi Router Traffic Grapher. Monitors bits in and out of a network device port (e.g. switch port, router port, NIC). Using SNMP, it queries the switch for port activity once every five minutes. Keeps daily, weekly, monthly and yearly statistics on that port. Flaw 1: if there is a lot of usage, then the device(s) attached to the port are running well; if usage is low, you cannot tell whether something is wrong or the port is simply idle. Flaw 2: it monitors the amount of bits, not the number of packets. If you had a Denial of Service attack with a large number of small packets, MRTG would not indicate a problem. MRTG measures bandwidth. Like WhatsUp, MRTG uses the drill-down method.
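The arithmetic behind a five-minute bandwidth figure can be sketched as follows. It assumes the port exposes 32-bit octet counters (such as the standard ifInOctets and ifOutOctets objects) and that at most one counter wrap occurs between polls. The function names and sample values are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32  # assumption: 32-bit octet counters on the port

def bits_per_second(prev_octets, curr_octets, interval_s=300):
    """Average bit rate between two counter samples, allowing one wrap."""
    delta = (curr_octets - prev_octets) % COUNTER32_MODULUS
    return delta * 8 / interval_s

def flood_bits_per_second(packets_per_second, packet_bytes):
    """Bit rate produced by a stream of fixed-size packets."""
    return packets_per_second * packet_bytes * 8

print(bits_per_second(1_000_000, 38_500_000))  # 1000000.0 (1 Mb/s)
print(bits_per_second(2**32 - 1000, 2000))     # 80.0 (counter wrapped)
print(flood_bits_per_second(10_000, 64))       # 5120000 (about 5 Mb/s)
```

The last line illustrates Flaw 2 with made-up numbers: ten thousand 64-byte packets per second is a heavy packet load, yet it adds up to only about 5 Mb/s, which may not stand out on a graph that plots bits.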
MRTG example: Fully drilled down view of Passport to CS SSR Router
Tools on Campus: MRTG-UFFE. MRTG-UFFE = MRTG’s User Friendly Front End. Add-on to MRTG. Homegrown utility that documents the important (special, unusual, busy) connections on campus. Hyperlinks to MRTG. MRTG-UFFE measures connections.
Tools on Campus: NMS. NMS = Network Management System. MRTG only measures bits in (received) and out (transmitted), which is only 2 of the 34 parameters on the switch port. Future project.
Benchmarks off Campus. Mostly a new area of focus. We have been monitoring using the Protocol Analyzer, WhatsUp & MRTG. The size of the Internet pipe is growing yearly by about 2Mb/s. Recently we have also been monitoring using BroadBandReports.com.
Benchmarks off Campus: WhatsUp. Measures availability.
Benchmarks off Campus: MRTG Measures bandwidth off campus to the Internet & CAnet*4 via WEDnet
Benchmarks off Campus: BroadBandReports.com. Six times a day we measure the throughput to BroadBandReports.com’s East Coast site (NJ), West Coast site (San Jose, CA) and Cogeco. This is the aggregate Internet performance.
Tools Off Campus: Protocol Analyzer; WhatsUp; MRTG; BroadBandReports.com; Internet Monitors.
Tools Off Campus: Internet Monitors. Internet Health Report (http://www.internethealthreport.com/): measures latency (TCP open) between major U.S. carriers. Internet Traffic Report (http://www.internettrafficreport.com/): measures latency (ICMP echo) & packet loss between selected routers worldwide. Internet Average (http://average.matrixnetsystems.com/): measures latency, packet loss, and reachability between thousands of servers and routers around the world (most comprehensive). Canadian content? These give you a feel for what is going on with the net.
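"TCP open" latency simply means timing how long it takes to open a TCP connection. The sketch below shows the idea. It is not the method used by any of the sites above, and the function name and timeout are assumptions.

```python
import socket
import time

def tcp_open_latency_ms(host, port, timeout=3.0):
    """Time a TCP connect to host:port; return milliseconds, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:                # refused, timed out, or unreachable
        return None

# Example (hypothetical host name):
#   tcp_open_latency_ms("www.example.edu", 80)
```

Unlike an ICMP echo, this measurement still works where ping is filtered, but it includes the time the remote host takes to accept the connection.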
Questions & Answers. Thanks for your attendance. Philip Smith’s Network Performance site: http://www.uwindsor.ca/netperf Email: Philip@UWindsor.Ca