Advanced Technology Laboratories page 1 Network Performance Monitoring at Small Time Scales Dina Papagiannaki, Rene Cruz, Christophe Diot
Motivation. Network management for large-scale networks relies almost exclusively on SNMP. SNMP reports aggregate link activity over the polling interval (5 minutes). Operators provision their networks around these values, according to provider-specific acceptable utilization levels.
Questions. Can one infer delay degradation from SNMP link counters? Can one infer delay performance from output link utilization, and if so, at what time scale should these measurements be taken? How do we summarize such high-resolution measurements in a 5-minute counter?
Terminology. A micro-congestion episode is a short-lived period in the lifetime of a link during which packets face increased delays due to cross traffic. Metrics: amplitude, duration, frequency.
Measurement Data: Sampling the Output Queue. Collect packet traces from links attached to the same router (set 1: OC-3, set 2: OC-12). Compute the single-hop delay of each packet from GPS-accurate arrival and departure timestamps. [Figure label: OC-3]
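The single-hop delay computation above could be sketched as follows. This is a minimal illustration, not the authors' tooling: the slides do not say how a packet on an input link is matched to the same packet on the output link, so `packet_key` is a hypothetical identifier (for example a hash of invariant header fields), and the function name and trace layout are assumptions as well.

```python
def single_hop_delays(arrivals, departures):
    """Match packets seen on an input link to the same packets on the output link.

    arrivals, departures: iterables of (packet_key, timestamp_seconds).
    packet_key is a hypothetical identifier; the matching rule is an assumption.
    Returns a list of (departure_time, delay_seconds) for matched packets.
    """
    arrival_time = {}
    for key, ts in arrivals:
        # Keep only the first sighting of each key.
        arrival_time.setdefault(key, ts)

    delays = []
    for key, ts in departures:
        # pop() ensures each arrival is matched to at most one departure.
        t_in = arrival_time.pop(key, None)
        if t_in is not None and ts >= t_in:
            delays.append((ts, ts - t_in))
    return delays


# Illustrative (made-up) timestamps in seconds.
arrivals = [("a", 0.0000), ("b", 0.0010), ("c", 0.0020)]
departures = [("a", 0.0001), ("c", 0.0025), ("x", 0.0030)]
delays = single_hop_delays(arrivals, departures)
```

Packet "b" never appears on the output link and "x" never appeared on the input link, so only "a" and "c" yield delay samples. Both traces must share a common clock for the subtraction to be meaningful, which is why the GPS-accurate timestamps matter.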
Methodology. Compute link throughput over non-overlapping time intervals of 1 ms, 10 ms, 100 ms, and 1 s. Collect all delay samples within each interval. Associate each throughput level with its delay distribution. [Figure labels: d1, d2, Output Link]
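The three methodology steps can be sketched as below. This is an illustration under stated assumptions: timestamps and delays are integer microseconds (to avoid floating-point binning errors), each packet record is `(departure_time_us, size_bytes, delay_us)`, and the width of a "throughput level" (`level_bps`) is a free parameter that the slides do not specify. All function and parameter names are hypothetical.

```python
from collections import defaultdict


def bin_throughput_and_delay(packets, interval_us):
    """Group packets into non-overlapping intervals of interval_us microseconds.

    packets: iterable of (departure_time_us, size_bytes, delay_us).
    Returns {bin_index: (throughput_bps, [delay_us, ...])}.
    """
    bytes_in_bin = defaultdict(int)
    delays_in_bin = defaultdict(list)
    for t_us, size, delay in packets:
        b = t_us // interval_us  # integer bin index
        bytes_in_bin[b] += size
        delays_in_bin[b].append(delay)
    return {
        b: (bytes_in_bin[b] * 8 * 1_000_000 / interval_us, delays_in_bin[b])
        for b in bytes_in_bin
    }


def delays_by_throughput_level(bins, level_bps):
    """Pool the delay samples of all intervals that fall in the same throughput level."""
    grouped = defaultdict(list)
    for throughput, delays in bins.values():
        level = int(throughput // level_bps) * level_bps  # lower edge of the level
        grouped[level].extend(delays)
    return dict(grouped)


# Illustrative (made-up) packets: (departure_time_us, size_bytes, delay_us).
packets = [(100, 1500, 20), (900, 1500, 35), (1200, 500, 15)]
bins = bin_throughput_and_delay(packets, interval_us=1000)   # 1 ms intervals
grouped = delays_by_throughput_level(bins, level_bps=10_000_000)
```

Running the same two functions with `interval_us` set to 1000, 10000, 100000, and 1000000 gives one throughput-to-delay mapping per time scale.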
Delay performance (OC-3)
Instantaneous link utilization and delay. Instantaneous link utilization may be high even when packets do not experience congestion. [Figure annotation: NO QUEUEING DELAY]
5 minutes is too long to capture micro-congestion. [Figure annotation: 5 minutes]
Inference of Duration and Frequency. If a micro-congestion episode persists in time, it should be visible across time scales. For each time scale τ, count the number of intervals whose throughput exceeds a threshold θ. Measure the fraction of such overloaded intervals within each 5-minute interval. [Figure label: Output link]
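The counting step can be sketched as follows, assuming the finest-scale measurement is a list of per-interval byte counts and that coarser time scales are obtained by summing consecutive counts. The function name, the aggregation-by-summing approach, and the example numbers are illustrative assumptions, not taken from the slides.

```python
def overloaded_fraction(byte_counts, base_interval_s, factor, theta_bps):
    """Fraction of intervals at the aggregated time scale whose throughput exceeds theta_bps.

    byte_counts: bytes sent in each base interval of base_interval_s seconds.
    factor: number of base intervals merged into one interval at the coarser scale.
    A trailing partial interval is ignored.
    """
    n = len(byte_counts) // factor
    if n == 0:
        return 0.0
    interval_s = base_interval_s * factor
    over = 0
    for i in range(n):
        total = sum(byte_counts[i * factor:(i + 1) * factor])
        if total * 8 / interval_s > theta_bps:
            over += 1
    return over / n


# Made-up example: ten 1 ms intervals with a 2 ms burst at 120 Mbps, then 8 Mbps.
counts = [15000, 15000] + [1000] * 8
theta = 100e6  # hypothetical threshold of 100 Mbps
fractions = {f: overloaded_fraction(counts, 0.001, f, theta) for f in (1, 2, 10)}
```

In this toy series the burst is visible at 1 ms and 2 ms (fraction 0.2 each) but vanishes at 10 ms (fraction 0.0), which mirrors how a coarse average can hide a short episode. For a real 5-minute reporting interval, `byte_counts` would hold the counts for that 5-minute window only.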
Inference of Duration (cont'd). If the fraction of congested intervals (those exceeding θ) at time scale τ+1 is greater than the fraction of congested intervals at time scale τ, then there are significant fluctuations at time scale τ. [Figure label: Reporting interval]
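The comparison rule can be written down literally as below. The slide does not state whether τ+1 denotes the next finer or the next coarser time scale, so this sketch simply takes the per-scale fractions in the slide's index order; the function name and the example values are hypothetical.

```python
def scales_with_significant_fluctuations(fractions):
    """Return every index tau for which fractions[tau + 1] > fractions[tau].

    fractions: fraction of congested intervals at each time scale, listed in
    the same index order the slide uses (an assumption; the ordering is not
    specified in the source).
    """
    return [
        tau
        for tau in range(len(fractions) - 1)
        if fractions[tau + 1] > fractions[tau]
    ]


# Made-up fractions for four consecutive time-scale indices.
example = [0.00, 0.02, 0.02, 0.10]
flagged = scales_with_significant_fluctuations(example)
```

Here indices 0 and 2 are flagged, because the fraction rises when moving to the next index; index 1 is not, because the fraction is unchanged.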
Duration/Frequency
Summary. 5-minute average utilization measurements can hide micro-congestion episodes. No single time scale captures micro-congestion; its impact needs to be studied at multiple time scales simultaneously. A new metric is introduced to address network performance monitoring at small time scales.
Ongoing Work. We need to identify the impact of link capacity and of the traffic arrival pattern. We have instrumented an entire router and are analyzing busy periods.