Assessment of VoIP Service Availability Wenyu Jiang Henning Schulzrinne IRT Lab, Dept. of Computer Science Columbia University December 2002
Overview
(on-going work, preliminary results, still looking for measurement sites, …)
Service availability
Measurement setup
Measurement results
  call success probability
  overall network loss
  network outages
  outage-induced call abortion probability
Service availability
Users do not care about QoS
  at least not about packet loss, jitter, delay
  rather, it’s service availability
  how likely is it that I can place a call and not get interrupted?
availability = MTBF / (MTBF + MTTR)
  MTBF = mean time between failures
  MTTR = mean time to repair
availability = successful calls / first call attempts
equipment availability: 99.999% (“5 nines”), i.e. about 5 minutes of downtime per year
AT&T: 99.98% availability (1997)
IP frame relay SLA: 99.9%
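The two definitions above can be turned into a small calculation. The following is a minimal sketch; the function names and the 365.25-day year are assumptions made for illustration and do not come from the slides.

```python
# Assumption: a year of 365.25 days is used to convert unavailability to minutes.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def availability(mtbf, mttr):
    """availability = MTBF / (MTBF + MTTR); both arguments in the same time unit."""
    return mtbf / (mtbf + mttr)


def downtime_minutes_per_year(avail):
    """Expected downtime per year implied by an availability given as a fraction."""
    return (1.0 - avail) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # The three availability figures quoted on the slide.
    for label, a in [("5 nines", 0.99999), ("AT&T 1997", 0.9998), ("IP frame relay SLA", 0.999)]:
        print(f"{label}: {downtime_minutes_per_year(a):.1f} minutes/year")
```

Under this convention, 99.999% corresponds to a little over 5 minutes of downtime per year, which is the "5 minutes/year" figure on the slide.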
Availability – PSTN metrics
PSTN metrics (Worldbank study):
  fault rate
    “should be less than 0.2 per main line”
  fault clearance (~ MTTR)
    “next business day”
  call completion rate
    during network busy hour
    “varies from about 60% - 75%”
  dial tone delay
Example PSTN statistics Source: Worldbank
Measurement setup

  Node name        Location                            Connectivity   Network
  columbia         Columbia University, NY             >= OC3         I2
  wustl            Washington U., St. Louis
  unm              Univ. of New Mexico
  epfl             EPFL, Lausanne, CH                                 I2+
  hut              Helsinki University of Technology
  rr               NYC                                 cable modem    ISP
  rrqueens         Queens, NY
  njcable          New Jersey
  newport                                              ADSL
  sanjose          San Jose, California
  suna             Kitakyushu, Japan                   3 Mb/s
  sh               Shanghai, China
  Shanghaihome
  Shanghaioffice
Measurement setup
Active measurements
  call duration 3 or 7 minutes
  UDP packets:
    36 bytes alternating with 72 bytes (FEC)
    40 ms spacing
September 10 to December 6, 2002
13,500 call hours
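The probe stream described above can be sketched as a packet schedule. This is only an illustration: it assumes the 36 and 72 byte figures are payload sizes and that packets are sent at exact 40 ms intervals; the function names are hypothetical and this is not the authors' measurement tool.

```python
def probe_schedule(call_seconds, spacing_s=0.040, sizes=(36, 72)):
    """Yield (send_time_s, size_bytes) for one test call.

    Sizes alternate 36, 72, 36, ... (the 72-byte packets carry FEC per the slide).
    """
    n_packets = int(round(call_seconds / spacing_s))
    for i in range(n_packets):
        yield i * spacing_s, sizes[i % len(sizes)]


def mean_payload_rate_bps(spacing_s=0.040, sizes=(36, 72)):
    """Average bit rate of the stream, counting only the assumed payload bytes."""
    return 8 * (sum(sizes) / len(sizes)) / spacing_s


if __name__ == "__main__":
    three_min = list(probe_schedule(3 * 60))
    print(len(three_min), "packets in a 3-minute call")
    print(f"{mean_payload_rate_bps():.0f} bit/s average payload rate")
```

With these assumptions a 3-minute call consists of 4,500 packets and a 7-minute call of 10,500.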
Call success probability
62,027 calls succeeded, 292 failed: 99.53% availability
roughly constant across I2, I2+, commercial ISPs

  All                        99.53%
  Internet2                  99.52%
  Internet2+                 99.56%
  Commercial                 99.51%
  Domestic (US)              99.45%
  International              99.58%
  Domestic commercial        99.39%
  International commercial   99.59%
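The overall figure follows directly from the two call counts, using the earlier definition availability = successful calls / call attempts. A minimal sketch (the function name is an assumption for illustration):

```python
def call_success_probability(succeeded, failed):
    """Fraction of call attempts that succeeded."""
    return succeeded / (succeeded + failed)


if __name__ == "__main__":
    p = call_success_probability(62_027, 292)
    print(f"{100 * p:.2f}%")  # prints 99.53%
```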
Overall network loss
PSTN: once connected, call usually of good quality
  exception: mobile phones
compute periods of time below loss threshold
  5% causes degradation for many codecs
  others acceptable till 20%

  loss      0%      5%      10%     20%
  All       82.3    97.48   99.16   99.75
  ISP       78.6    96.72   99.04   99.74
  I2        97.7    99.67   99.77   99.79
  I2+       86.8    98.41   99.32   99.76
  US        83.6    96.95   99.27
  Int.      81.7    97.73   99.11   99.73
  US ISP    73.6    95.03   98.92
  Int. ISP  81.2    97.60   99.10   99.71
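The "time below loss threshold" metric can be sketched as follows. The slides do not say how loss was windowed, so this assumes loss rates have already been computed over equal-length intervals and counts an interval as acceptable when its loss is at or below the threshold; the function name and sample values are hypothetical.

```python
def percent_time_within(loss_rates, threshold):
    """Percentage of intervals whose loss rate is at or below `threshold`.

    loss_rates: per-interval loss fractions in [0, 1], equal-length intervals assumed.
    """
    if not loss_rates:
        raise ValueError("need at least one interval")
    ok = sum(1 for r in loss_rates if r <= threshold)
    return 100.0 * ok / len(loss_rates)


if __name__ == "__main__":
    # Made-up sample data, not measurement results.
    sample = [0.0, 0.0, 0.03, 0.12, 0.5]
    for thr in (0.0, 0.05, 0.10, 0.20):
        print(f"loss <= {thr:.0%}: {percent_time_within(sample, thr):.1f}% of intervals")
```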
Network outages
sustained packet losses
  arbitrarily defined at 8 packets
  far beyond any recoverable loss (FEC, interpolation)
23% outages
  make up significant part of 0.25% unavailability
symmetric: A→B, B→A
spatially correlated: A→B, A→X
not correlated across networks (e.g., I2 and commercial)
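The outage definition above can be sketched as a run-length scan over a per-packet arrival trace. This assumes "defined at 8 packets" means a run of at least 8 consecutive lost packets; the trace representation and function name are hypothetical.

```python
def find_outages(received, min_run=8):
    """Return (start_index, length) for each run of >= min_run consecutive losses.

    received: sequence of booleans, True if packet i arrived, False if it was lost.
    """
    outages = []
    run_start = None
    for i, ok in enumerate(received):
        if not ok:
            if run_start is None:
                run_start = i  # a new loss run begins here
        elif run_start is not None:
            length = i - run_start
            if length >= min_run:
                outages.append((run_start, length))
            run_start = None
    # A loss run that lasts until the end of the trace.
    if run_start is not None and len(received) - run_start >= min_run:
        outages.append((run_start, len(received) - run_start))
    return outages


if __name__ == "__main__":
    trace = [True] * 5 + [False] * 3 + [True] * 2 + [False] * 10 + [True]
    print(find_outages(trace))  # the 3-packet loss is ignored, the 10-packet run is reported
```

At 40 ms spacing, a run of 8 packets corresponds to 320 ms without any received audio.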
Network outages
Network outages

          no. of     %           duration   duration   total        outages >
          outages    symmetric   (mean)     (median)   (all, h:m)   1000 packets
  all     10,753     30%         145        25         17:20        10:58
  I2      819        14.5%       360                   3:17         2:33
  I2+     2,708      10%         259        26         7:47         5:37
  ISP     8,045      37%         107        24         9:33         4:58
  US      1,777      18%         269        20         5:18         3:53
  Int.    8,976      33%         121                   12:02        6:42
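The slide does not state the unit of the duration columns. One plausible reading, which is an assumption here, is that durations are counted in packets at the 40 ms probe spacing. Under that assumption, number of outages × mean duration × 40 ms can be compared with the reported totals; the function name is hypothetical.

```python
def total_outage_time(count, mean_packets, spacing_s=0.040):
    """Estimate total outage time as (hours, minutes).

    Assumes mean_packets is the mean outage length in packets sent every spacing_s.
    """
    seconds = int(round(count * mean_packets * spacing_s))
    hours, remainder = divmod(seconds, 3600)
    return hours, remainder // 60


if __name__ == "__main__":
    rows = {"all": (10_753, 145), "I2+": (2_708, 259), "ISP": (8_045, 107)}
    for name, (count, mean) in rows.items():
        h, m = total_outage_time(count, mean)
        print(f"{name}: {h}:{m:02d}")
```

For the I2+ row this gives 7:47, matching the reported total; for the "all" row it gives 17:19 against a reported 17:20, a difference that rounding of the mean would explain.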
Outage-induced call abortion probability
Long interruption: user likely to abandon call
from E.855 survey: P[holding] = e^(-t/17.26) (t in seconds)
  half the users will abandon call after 12 s
2,566 calls have at least one outage
946 of 2,566 expected to be dropped: 1.53% of all calls

  all       1.53%
  I2        1.16%
  I2+       1.15%
  ISP       1.82%
  US        0.99%
  Int.      1.78%
  US ISP    0.86%
  Int. ISP  2.30%
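The holding model above can be evaluated directly. The sketch below also estimates the expected number of dropped calls; for that it assumes, purely for illustration, that each affected call is characterized by the duration of its longest outage, which the slides do not specify. The function names are hypothetical.

```python
import math

TAU_S = 17.26  # time constant from the slide's holding-probability model


def p_holding(t_seconds):
    """Probability that a user is still holding after an interruption of t seconds."""
    return math.exp(-t_seconds / TAU_S)


def median_holding_time():
    """Interruption length at which half of the users have abandoned the call."""
    return TAU_S * math.log(2)


def expected_dropped_calls(longest_outage_s):
    """Expected number of abandoned calls, given one outage duration (s) per call."""
    return sum(1.0 - p_holding(t) for t in longest_outage_s)


if __name__ == "__main__":
    print(f"median holding time: {median_holding_time():.1f} s")
    # Made-up outage durations in seconds, not measurement data.
    print(f"expected drops: {expected_dropped_calls([0.32, 1.0, 5.8, 30.0]):.2f}")
```

The median comes out just under 12 s, consistent with the "half the users will abandon call after 12 s" statement.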
Conclusion
Availability in space is (mostly) solved
  availability in time restricts usability for new applications
initial investigation into service availability for VoIP
  need to define metrics for, say, web access
  unify packet loss and “no Internet dial tone”
far less than “5 nines”
working on identifying fault sources and locations
looking for additional measurement sites