30 May 2001 Campus Measurement Matt Zekauskas, Internet2 Campus Workshop Atlanta, GA
Campus Measurement, 30 May. Outline: existing measurement tools & projects; a sample performance problem; vision for infrastructure to solve problems; steps campuses can take today.
My Bias: I once ran a (corporate) campus network. My recent focus: measurements. Most recently: end-to-end performance. I have been helping solve some wide-area performance problems. Caveat: I haven’t been doing the fixing. (My view is from the center.)
We Need Your Help: I have probably missed something. Give us the benefit of your experience: fixing problems; operating a network. …Also join a working group!
Measurement Goals: solving performance problems; network operations; network engineering; network research. Operational data: performance, flows, anomalies. Network characterization: how used? load response? SLS?
Measurements from the Center. Active: measurement within Abilene; measurement using the entire Internet2 infrastructure. Passive: SNMP stats (esp. core Abilene links); “IOS” stats (for QoS); characterization of traffic (on the way): NetFlow, OCxMON.
Measurement Projects. Surveyor (one-way delay, loss, routing): on many Internet2 campuses (70 sites); Abilene presence. AMP (round-trip delay, loss, routing): moat.nlanr.net/AMP; at even more Internet2 campuses (120 sites). PMA (passive, packet traces): moat.nlanr.net/PMA; 1 min, 8 times a day, ~13 sites.
Measurement Projects. PingER (round-trip delay, routing): long-term data from a few locations to many; high-energy physics focus. NIMI: designed to be a platform for experiments; undergoing some redesign/revitalization; ~60 sites?
Usefulness: AMP, Surveyor, PingER. If at your campus, a view from your campus. If at the destination, a view of the destination. If at neither, look for a campus connected to the same gigaPoP. [“Phase 0” measurement points for e2eperf.] Useful for routing and congestion problems.
Usefulness: PMA. If at your campus, you can look at traces for anomalies. Not as useful for on-demand debugging (but don’t ignore the ability to take traces).
Surveyor on One Slide: continuous measurement; one-way delay and loss; 1/sec on a Poisson schedule; 12-byte UDP packets; traceroutes at 1 per 600 sec; 72 machines. Report.html -- Java, close to real-time -- static
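A minimal sketch of what "1/sec on a Poisson schedule" can mean in practice: probe send times are drawn with exponentially distributed gaps whose mean is one second, so probes do not synchronize with periodic network behavior. This is an illustration only, not Surveyor's actual code; the function name `poisson_schedule` and the fixed seed are assumptions made for the example.

```python
import random


def poisson_schedule(rate_per_sec, duration_sec, seed=None):
    """Return probe send times (seconds from start) for a Poisson process.

    Gaps between probes are exponentially distributed with mean
    1 / rate_per_sec, which is what makes the schedule Poisson.
    """
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(rate_per_sec)
    while t < duration_sec:
        times.append(t)
        t += rng.expovariate(rate_per_sec)
    return times


if __name__ == "__main__":
    # Ten minutes of probing at an average of one probe per second.
    sends = poisson_schedule(rate_per_sec=1.0, duration_sec=600.0, seed=1)
    gaps = [b - a for a, b in zip(sends, sends[1:])]
    print(len(sends), "probes; mean gap", sum(gaps) / len(gaps), "sec")
```

The average rate is one probe per second, but any individual gap may be much shorter or longer than that.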
AMP: like Surveyor, but round-trip latency instead of one-way (easier to deploy). Working on a more comprehensive set of “alarms”. Potentially more available.
An “Application-Level” Example: Pioneer. Synthesis of existing infrastructure. Focus: video conferencing tests. Goal: use this to tell if video is likely to work.
Abilene: Abilene's goal is to be an exemplar. Measurements open. Tests possible to router nodes. Web-mediated on-demand measurements. Throughput tests routinely through the backbone …as well as existing utilization, etc.
Active within Abilene: each router node has a PC. Now 10 of 11 are OC3-ATM attached (missing: Houston). No GPS; working towards GPS within CDMA solution.
Ad-hoc Active on Abilene: with OC-3, can do moderate throughput testing (e.g., iperf UDP & TCP), ~90 Mbps. Adding on-demand tests in support of performance debugging. Contact me if you want to perform an ad-hoc test.
Passive – Utilization: the Abilene NOC takes packets in/out, bytes in/out, and drops/errors for all interfaces; publishes internal links & peering points (at 5 min intervals); via SNMP polling every 3 sec.
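A rough sketch of how polled byte counters turn into link utilization. It assumes 32-bit octet counters that wrap at 2^32 and at most one wrap between polls; the function names and the sample counter values are hypothetical, and this is not the Abilene NOC's actual tooling.

```python
COUNTER32_MODULUS = 2 ** 32  # assumption: 32-bit SNMP octet counters


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Difference between two counter readings, allowing for one wrap."""
    return (current - previous) % modulus


def utilization(prev_octets, curr_octets, interval_sec, link_bps):
    """Fraction of link capacity used between two polls of an octet counter."""
    bits = counter_delta(prev_octets, curr_octets) * 8
    return bits / (interval_sec * link_bps)


if __name__ == "__main__":
    oc3_bps = 155.52e6  # OC-3 line rate
    # Hypothetical readings taken 3 seconds apart; the counter wrapped once.
    print(utilization(4294000000, 20000000, 3.0, oc3_bps))
    # Time for a 32-bit octet counter to wrap on a fully loaded OC-3.
    print(COUNTER32_MODULUS * 8 / oc3_bps, "sec")
```

On a fully loaded OC-3 a 32-bit octet counter wraps in under four minutes, so polls spaced five minutes apart could miss a wrap entirely. Short polling intervals (or 64-bit counters where available) avoid that ambiguity.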
Passive – Characterization: some sparse via NLANR/MOAT. Starting some NetFlow measurements: QoS; AS-AS information for K-20 & ITN. Intend to do some characterization.
Others via Abilene NOC: BGP peering; MSDP (multicast source discovery) logging. See: -> Operational Status
Multicast-specific. Multicast measurements: not fully understood; debugging is an art. Tools: Mtrace; ‘sdr’ announcements in backbone; Mhealth, Mantra via UCSB.
JPL/Caltech – GSFC. The situation: using Abilene; tuned hosts; things work locally; therefore it MUST be Abilene. Tests show good flows router-to-router. Intermediate tests point towards CA. Bad fiber connection!
Vision I: ongoing monitoring to test major elements, and (some, important) end-to-end paths. Elements: gigaPoP links, peering, … What to monitor: utilization; delay; loss; occasional throughput; multicast connectivity.
Vision II: there are many more end-to-end paths than can be monitored. Diagnostic tools available on demand (with authorization): show routes; perform flow tests (perhaps app tests); parse/debug flows (à la tcpdump or OCxMON with heuristic tools).
For TCP (and Streaming): eliminating loss is the goal. Focus on noncongestive losses. TCP, 100 Mbit Ethernet coast-to-coast, full-size packets… need P loss [Mathis]: less than 1 loss every 83 seconds. GigE/655: 10^-8, 1 loss every 497 seconds.
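The [Mathis] reference is to the well-known TCP throughput bound, BW <= (MSS / RTT) * C / sqrt(p). A small sketch that inverts it to find the largest loss probability p that still allows a target rate is given here. The 70 ms round-trip time, the 1460-byte MSS, and C = sqrt(3/2) are assumptions chosen for illustration; the result depends strongly on these inputs, so it need not match the slide's figures.

```python
import math


def max_loss_probability(target_bps, mss_bytes, rtt_sec, c=math.sqrt(1.5)):
    """Largest loss probability p satisfying BW <= (MSS/RTT) * C / sqrt(p)."""
    return (c * mss_bytes * 8 / (rtt_sec * target_bps)) ** 2


def seconds_between_losses(target_bps, mss_bytes, loss_probability):
    """Average time between lost packets when sending at the target rate."""
    packets_per_sec = target_bps / (mss_bytes * 8)
    return 1.0 / (loss_probability * packets_per_sec)


if __name__ == "__main__":
    # Assumed inputs: 100 Mbit/s target, 1460-byte MSS, 70 ms RTT.
    p = max_loss_probability(target_bps=100e6, mss_bytes=1460, rtt_sec=0.070)
    print("max loss probability:", p)
    print("seconds between losses:", seconds_between_losses(100e6, 1460, p))
```

Because p scales with the inverse square of the target rate, going from 100 Mbit/s to 1 Gbit/s on the same path tightens the loss budget by a factor of 100, which is why noncongestive loss matters so much at higher speeds.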
Enabling Divide & Conquer and Ongoing Monitoring. [Figure: Backbone 1, GigaPoP A, Campus, GigaPoP B, Backbone 2, Wall Jack, with points marked P.]
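One way to picture divide and conquer is as a search over test points along a path: test to a point in the middle, see whether the result is clean, and narrow down from there. The sketch below assumes a single persistent fault, so that every test reaching past the fault looks bad and every test stopping short of it looks clean. The point names, their order, and the `is_path_clean` callback are hypothetical.

```python
def first_bad_segment(points, is_path_clean):
    """Locate the first segment where tests from points[0] turn bad.

    points: measurement point names, ordered outward from the source.
    is_path_clean(i): True if a test from points[0] to points[i] is clean.
    Returns (last_clean_point, first_bad_point), or None if the whole
    path tests clean.
    """
    if len(points) < 2 or is_path_clean(len(points) - 1):
        return None
    lo, hi = 0, len(points) - 1  # path to points[lo] clean, to points[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_path_clean(mid):
            lo = mid
        else:
            hi = mid
    return points[lo], points[hi]


if __name__ == "__main__":
    # Hypothetical path; the ordering is an assumption for illustration.
    path = ["wall jack", "campus edge", "gigaPoP A", "backbone",
            "gigaPoP B", "remote campus edge", "remote host"]
    fault_at = 2  # pretend tests reaching path[2] or beyond show loss
    print(first_bad_segment(path, lambda i: i < fault_at))
```

With test points at every boundary, a fault on a path of n segments can be localized in about log2(n) tests rather than n.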
Some Commercial Tools. Caveat: only a partial list, give me more! Spirent (née Netcom/Adtech): working on a box for ‘end-to-end’ measurements; SmartBits: test at low & high rates, QoS; test components or end-to-end path. NetIQ: Chariot/Pegasus. Ixia (like SmartBits/Spirent); Agilent. Brix Networks (like Surveyor, for ‘QoS’).
Some Noncommercial Tools. Iperf: dast.nlanr.net/Projects/iperf See also FlowScan: SLAC’s traceroute perl script: srv.html One large list:
What You Can Do. Export SNMP data: I can keep an “internet2 list”, would like it to be public [current Measurement WG project]. Monitor loss as well as throughput. Performance test point at campus edge: netperf or iperf, so can be from anywhere; traceroute “looking glass”; commercial (e.g., NetIQ) complements; I’m willing to keep a master list [MWG project]. Portable performance test point.
For TCP Tuning: keep an eye out for Web100: NCNE Tuning Page:
What You Can Do: if you have a Cisco router at your edge, use NetFlow and cflowd + FlowScan to see your traffic characteristics. RTFM / RMON probes. See also Joe St. Sauver’s presentation from the last “Joint Techs” meeting: / sauver1.html
A Summer Project: measurement box at edge. Spend a month or two with a mobile box, checking throughput/loss/.. from every point. Eliminate noncongestive losses. Develop a baseline to get a complete picture of the campus: map the campus networks.
NTP everywhere! If GPS, get good NTP distribution. Allow correlation among campuses.
Plug: Internet2 Measurement Working Group. Activities: measurement architecture; encourage common measurements, tools, parameters, reporting; work with (at least) management, QoS, multicast, and the End-to-end Performance Initiative.
Contact Information: Matt Zekauskas, Measurements Working Group. End-to-end interest list: subscribe e2e-interest
(Some) URLs