A Black-box QoS Measurement Methodology for VoIP End-points
Wenyu Jiang, Henning Schulzrinne
NYMAN Workshop, September 12, 2003

Motivation
- Quality (thus success) of VoIP depends on both the network and the end-points
- Internals of VoIP end-points not always known -> Black-box measurement
- Previous work [Jiang, ICC 2003] studied various VoIP end-point QoS metrics:
  - Mouth-to-ear (M2E) delay
  - Packet loss concealment (PLC) quality
  - Clock skew; Silence detection behavior; etc.
- Goals of this paper:
  - Generalize measurement approach
  - Explore more QoS metrics and more observations

Basics: Measuring M2E Delay
- Capture both original and output audio
- Use adelay program to measure M2E delay
- Automation facilitates long-term delay trend observation
- This method can be generalized for black-box VoIP end-point QoS measurements

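One plausible way to turn the two captured recordings into a delay figure is cross-correlation. The slides do not say how adelay works internally, so the sketch below is only an illustration: the helper name `estimate_delay_ms`, the 8 kHz sample rate, and the synthetic noise signal are all assumptions.

```python
import random

def estimate_delay_ms(original, output, sample_rate_hz, max_lag):
    """Return the lag (in ms) at which `output` best matches `original`.

    Brute-force cross-correlation over lags 0..max_lag (in samples).
    Hypothetical helper for illustration; not the adelay algorithm.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Only compare samples that exist in both recordings at this lag.
        n = min(len(original), len(output) - lag)
        score = sum(original[i] * output[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return 1000.0 * best_lag / sample_rate_hz

# Synthetic check: a noise-like signal delayed by 480 samples (60 ms at 8 kHz)
# and attenuated, standing in for the end-point's output audio.
rng = random.Random(0)
original = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_lag = 480
output = [0.0] * true_lag + [0.8 * s for s in original]

delay_ms = estimate_delay_ms(original, output, sample_rate_hz=8000, max_lag=800)
print(delay_ms)
```

Repeating such an estimate on successive audio segments would yield the delay-versus-time curve that the long-term observations rely on.
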
Generalization #1: WAN behavior
- Extension: add a UDP relay between end-points, then insert loss/delay/jitter (e.g., trace-based)
- Benchmark delay traces
  - Delay spike
    - Tests end-point response to a delay surge
  - Oscillating delays (-> excessive playout delay for Exp-Avg)
    - Tests the intelligence of the end-point’s playout algorithm
- Problem: time collation of trace and M2E delay curve
- Solution: UDP relay should log all RTP packets -> replication of original waveform via RTP payload -> time collation of trace and original analog waveform

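To see why oscillating delays are a hard case for Exp-Avg, the sketch below runs a commonly cited form of the exponential-average playout estimator (mean delay plus four times the mean deviation) over a synthetic oscillating trace. The exact variant, the constants `alpha` and `k`, and the 20/120 ms trace are assumptions chosen for illustration, not values taken from the slides.

```python
def exp_avg_playout(delays_ms, alpha=0.998002, k=4.0):
    """Per-packet playout delay estimate from an exponential average.

    d_hat tracks the mean network delay and v_hat its mean deviation;
    the playout delay is d_hat + k * v_hat. A real receiver would apply
    the estimate only at talkspurt boundaries; this sketch reports it
    for every packet.
    """
    d_hat = delays_ms[0]
    v_hat = 0.0
    playout = []
    for n in delays_ms:
        d_hat = alpha * d_hat + (1 - alpha) * n
        v_hat = alpha * v_hat + (1 - alpha) * abs(d_hat - n)
        playout.append(d_hat + k * v_hat)
    return playout

# Hypothetical oscillating trace: 50 packets at 20 ms, 50 at 120 ms, repeated.
trace = ([20.0] * 50 + [120.0] * 50) * 50
playout = exp_avg_playout(trace)
final_playout = playout[-1]

print(max(trace), round(final_playout))
```

In this run the estimate settles well above the largest delay in the trace, because the deviation term stays large while the delay keeps swinging. A receiver that only needs to cover the worst-case delay would buffer far less.
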
Generalization #2: Playout Delay
- Playout delay: a well-known term but often an unknown figure
- Idea 1: create a situation where the receiver is forced to reduce playout delay to 0, even if only temporarily
  - Cons: doesn’t always work
- Idea 2: use fundamentals of playout delay
  - Use small step increases in delay
  - Watch for waveform distortion

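Idea 2 can be read as a search: inject progressively larger delay steps and note the first one that distorts the output waveform. The sketch below assumes a simplified model in which a step larger than the receiver's current playout delay makes packets miss their playout time. The simulated receiver is a hypothetical stand-in for inspecting the captured audio, and a real adaptive receiver may behave differently.

```python
def probe_playout_delay(causes_distortion, steps_ms):
    """Return (largest clean step, first distorting step) in ms.

    `causes_distortion` is a callable that reports whether a given
    delay step produced visible waveform distortion. The playout delay
    is then bracketed between the two returned values.
    """
    last_clean = 0
    for step in steps_ms:
        if causes_distortion(step):
            return last_clean, step
        last_clean = step
    return last_clean, None  # no step in the sweep caused distortion

def make_simulated_receiver(playout_delay_ms):
    """Hypothetical receiver: packets delayed beyond the buffer arrive late."""
    def causes_distortion(step_ms):
        return step_ms > playout_delay_ms
    return causes_distortion

receiver = make_simulated_receiver(playout_delay_ms=30)
bracket = probe_playout_delay(receiver, range(5, 65, 5))
print(bracket)
```

A finer step size narrows the bracket at the cost of more measurement runs.
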
Case Study #1: Delay Anomaly
- Long-term observation reveals
  - On a test PC, erroneous delay adjustment
  - Opposite to what skew compensation should do
- Repeated measurements indicate dual oscillators on the PC’s soundcard
- Verified that RAT source code assumes only 1 clock/oscillator (Mic) per end-point

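A long-term delay curve can be reduced to a single drift figure by fitting a straight line through the measurements. The sketch below is one assumed way to do that with an ordinary least-squares slope; the function name and the synthetic samples are hypothetical, and the slides do not state how the trend was actually quantified.

```python
def delay_drift_ppm(times_s, delays_ms):
    """Least-squares slope of M2E delay over time, in parts per million.

    A slope of 1 ms per second corresponds to 1000 ppm, so the slope in
    ms/s is multiplied by 1000.
    """
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_d = sum(delays_ms) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(times_s, delays_ms))
    slope_ms_per_s = sxy / sxx
    return slope_ms_per_s * 1000.0

# Synthetic samples: delay grows by 0.05 ms per second from a 50 ms baseline.
times_s = [10.0 * i for i in range(61)]
delays_ms = [50.0 + 0.05 * t for t in times_s]
drift = delay_drift_ppm(times_s, delays_ms)
print(round(drift, 3))
```

Comparing the sign of the fitted drift before and after the end-point adjusts its delay is one way to check whether an adjustment goes in the direction skew compensation would require.
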
Case Study #2: WAN Behavior
- Studied Cisco, 3Com IP phones
- Behaviors are desirable
  - Can quickly adjust to delay spikes
  - Do not overshoot playout delay, i.e., use a playout algorithm other than Exp-Avg

Case Study #3: Playout Delay
- Use first method (high delay increase)
  - Polycom phone: 30-40ms
  - 3Com phone: 9-28ms (mostly ~10ms)
  - Cisco phone: 0-10ms (mostly 0ms)
- Use second method (gradually increase delay steps)
- A new metric D_bare (= D_m2e - D_p) is more informative than playout delay D_p -> helps reveal true delay bottleneck
- Large delay (D_bare) of RAT is likely a soundcard buffer issue

Receiver         D_p    D_bare
Polycom          30ms   60ms
3Com             20ms   35ms
Cisco            15ms   48ms
RAT (Ultra-10)   40ms   200ms

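Rearranging the definition gives D_m2e = D_p + D_bare, so the per-receiver values above imply a total M2E delay for each end-point. The short sketch below does that arithmetic. The derived totals and percentages follow only from the listed numbers and the definition; they are not separately reported measurements.

```python
# (D_p, D_bare) in ms, copied from the per-receiver values in the slide.
receivers = {
    "Polycom": (30, 60),
    "3Com": (20, 35),
    "Cisco": (15, 48),
    "RAT (Ultra-10)": (40, 200),
}

# D_bare = D_m2e - D_p, hence D_m2e = D_p + D_bare.
m2e = {name: d_p + d_bare for name, (d_p, d_bare) in receivers.items()}

for name, (d_p, d_bare) in receivers.items():
    share = 100.0 * d_bare / m2e[name]
    print(f"{name}: D_m2e = {m2e[name]} ms, D_bare share = {share:.0f}%")
```

For every receiver listed, D_bare is the larger of the two components. This is the sense in which it points at the delay bottleneck better than D_p alone.
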
Conclusions
- Presented a general black-box measurement methodology for VoIP end-point QoS evaluation
- Illustrated how to measure several QoS metrics: esp. WAN behaviors and extraction of playout delay
- Evaluated the methodology on these new metrics and made many observations
  - Anomalous delay adjustment <- dual-oscillator
  - IP phones’ WAN behaviors to benchmark traces
  - Playout delay measurement of IP phones
  - Introduction of a new metric called D_bare
- Future work: integrate our general methodology into an automated measurement tool