2
QoS Measurement and Management for VoIP Wenyu Jiang IRT Lab March 5, 2003
3
Introduction to VoIP & IP Telephony Transport of voice packets over IP networks Cost savings – Consolidates voice and data networks – Avoids leased lines, long-distance toll calls Smart and new services – Call management (filtering, TOD forwarding): CPL – Better than PSTN quality: wide-band codecs Protocols and Standards – Signaling: SIP (IETF), H.323 (ITU-T) – Transport: RTP/RTCP (IETF)
4
Practical Issues in VoIP Quality of Service (QoS) – Internet is a best-effort network: loss, delay and jitter – Users expect at least PSTN quality for VoIP! Ease of deployment – Requires seamless integration with legacy networks (PSTN/PBX) – Security is a must High yardstick of service availability – Can your network achieve 99.999% uptime?
5
Outline QoS measurement – Objective vs. subjective metrics – Automated measurement of subjective quality QoS management: improving your quality – End-to-End: FEC, LBR, PLC – Network provisioning: voice traffic aggregation Reality check – Performance of end-points (IP phones, …) – Deployment issues in VoIP – Evaluation of VoIP service availability through Internet measurement
6
Workings of a VoIP Client Audio is packetized, encoded and transmitted Forward error correction (FEC) may be used to recover lost packets Playout control smooths out jitter to minimize late losses; it is coupled with FEC Packet loss concealment (PLC) – Last line of “defense” after FEC and playout
7
LBR: An Alternative to FEC An (n,k) block FEC code can recover up to n-k lost packets per block Low Bit-rate Redundancy (LBR) – Transmit a lower bit-rate version of original audio – No notion of “blocks” – Not bit-exact recovery
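A minimal sketch of the block FEC idea, using the simplest possible case: a single XOR parity packet, i.e. an (n, k) = (k+1, k) code that can repair one loss per block. The slides do not say which FEC code was used; the function names here are hypothetical and the packets are assumed to have equal length.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_block(data):
    """Append one XOR parity packet to k equal-length data packets (n = k + 1)."""
    parity = reduce(xor_bytes, data)
    return list(data) + [parity]


def decode_block(received):
    """Recover the k data packets from n received packets.

    `received` holds n entries, with None marking a lost packet.
    Returns the k data packets, or None if more than n - k = 1 packet is lost.
    """
    missing = [i for i, pkt in enumerate(received) if pkt is None]
    if len(missing) > 1:
        return None  # beyond the repair capability of this code
    repaired = list(received)
    if missing:
        present = [pkt for pkt in received if pkt is not None]
        # XOR of all surviving packets reproduces the missing one.
        repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired[:-1]  # drop the parity packet
```

Unlike LBR, the recovered packet here is bit-exact, but recovery has to wait until enough of the block has arrived.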
8
Objective QoS Metrics: Loss Internet packet loss is often bursty – May degrade voice quality more than random (Bernoulli) loss Characterization of packet loss – 2-state Markov (Gilbert) model: conditional loss prob. – More detailed models, but more states! Extended Gilbert model, nth-order Markov model, hidden Markov model, Gilbert-Elliott model, inter-loss distance – More states mean a larger test set and loss of the big picture, and adaptive applications can trade off model accuracy for fast feedback Gilbert model provides an acceptable compromise
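The 2-state Gilbert model can be sketched as follows. The parameter names are assumptions: `p_u` is the unconditional loss probability and `p_c` the conditional loss probability (loss given that the previous packet was lost). The transition probability from "received" to "lost" follows from the stationary distribution of the chain.

```python
import random


def gilbert_trace(n, p_u, p_c, seed=0):
    """Generate n loss indicators (True = lost) from a 2-state Markov chain.

    p_u: unconditional (average) loss probability
    p_c: conditional loss probability P(lost | previous packet lost)
    """
    # Stationarity requires p_u = p01 / (p01 + 1 - p_c); solve for p01.
    p01 = p_u * (1 - p_c) / (1 - p_u)
    rng = random.Random(seed)
    lost = rng.random() < p_u  # start from the stationary distribution
    trace = []
    for _ in range(n):
        trace.append(lost)
        lost = rng.random() < (p_c if lost else p01)
    return trace


def loss_stats(trace):
    """Estimate (unconditional, conditional) loss probability from a trace."""
    prev_lost = sum(trace[:-1])
    both_lost = sum(1 for a, b in zip(trace, trace[1:]) if a and b)
    p_u_hat = sum(trace) / len(trace)
    p_c_hat = both_lost / prev_lost if prev_lost else 0.0
    return p_u_hat, p_c_hat
```

Setting `p_c` equal to `p_u` reduces the model to random (Bernoulli) loss, which makes the two loss models easy to compare at the same average loss rate.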
9
Effect of Gilbert Loss Model Loss burst distribution of a packet trace – Roughly, though not exactly exponential Loss burstiness on FEC performance – FEC less efficient under bursty loss [Figure: number of occurrences (log scale) vs. loss burst length, packet trace compared with Gilbert model]
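To compare a packet trace with the Gilbert model as this slide does, one can count loss burst lengths in the trace and set them against the geometric burst-length distribution that a 2-state Markov chain implies. This is a sketch, not the author's tooling; `p_c` denotes the conditional loss probability.

```python
from collections import Counter


def burst_lengths(trace):
    """Count loss bursts by length; trace is a sequence of truthy (lost) / falsy values."""
    counts = Counter()
    run = 0
    for lost in trace:
        if lost:
            run += 1
        elif run:
            counts[run] += 1
            run = 0
    if run:  # trace ended inside a burst
        counts[run] += 1
    return counts


def gilbert_burst_pmf(k, p_c):
    """P(burst length = k) under the Gilbert model: geometric with parameter 1 - p_c."""
    return (1 - p_c) * p_c ** (k - 1)
```

A geometric distribution is a straight line on a log-scale plot, so any curvature in the measured counts shows where the trace departs from the model.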
10
Objective QoS Metrics: Delay Complementary Conditional CDF (C³DF) – More descriptive than auto-correlation function (ACF) – Delay correlation rises rapidly beyond a threshold – Approximates conditional late loss probability
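The slide does not give a formula for the complementary conditional CDF. The sketch below assumes the definition f(t) = P[d(i+lag) > t | d(i) > t], i.e. the probability that a packet exceeds delay t given that an earlier packet did, which matches the stated link to conditional late loss. The function name and the `lag` parameter are hypothetical.

```python
def c3df(delays, threshold, lag=1):
    """Estimate P[d[i+lag] > threshold | d[i] > threshold] from a delay trace.

    Returns NaN when no packet exceeds the threshold.
    """
    conditioned = 0
    hits = 0
    for i in range(len(delays) - lag):
        if delays[i] > threshold:
            conditioned += 1
            if delays[i + lag] > threshold:
                hits += 1
    return hits / conditioned if conditioned else float("nan")
```

If `threshold` is taken as the playout deadline, the result estimates how likely a late packet is to be followed by another late packet.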
11
Subjective QoS Metrics Perceived quality – Mean Opinion Score (MOS), ITU-T P.800/830, obtained via listening tests – MOS variations: DMOS (Degradation), CMOS (Comparison), MOS_c (Conversational): considers delay – A/B preference Pros: more meaningful to end users Cons: time consuming, labor intensive
MOS scale:
Grade      Score
Excellent  5
Good       4
Fair       3
Poor       2
Bad        1
12
Effect of Loss Model on Perceived Quality Codec: G.729 (8 kb/s ITU std) Random (Bernoulli) vs. bursty (Gilbert) loss – Bursty loss yields lower MOS – True even when FEC or LBR is used
13
Going Further: Bridging Objective and Subjective Metrics The E-model (ITU-T G.107/108) – Originally for telephone network planning – Considers various impairments – Reduces to delay and loss impairment when adapted for VoIP Objective quality estimation algorithms – Suitable when network statistics are not available, e.g., phone-to-phone service with IP in between – Speech recognition performance may be used as a quality predictor, by comparing with original text
14
The E-model Map from loss and delay to impairment scores (I_e, I_d) Compute a gross score (R value) and map to MOS_c Limited number of codec loss impairment mappings
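A sketch of the reduced E-model computation: subtract the delay and loss impairments from a base R value, then apply the R-to-MOS conversion from ITU-T G.107. The base value of 93.2 is assumed here as the G.107 default. The impairment values themselves must come from codec- and delay-specific mappings that the slides do not reproduce, so they are plain inputs.

```python
def r_value(i_d, i_e, r0=93.2):
    """Gross E-model score: base R minus delay (I_d) and equipment/loss (I_e) impairment.

    r0 = 93.2 is an assumed default base value, not taken from the slides.
    """
    return r0 - i_d - i_e


def r_to_mos(r):
    """ITU-T G.107 conversion from R value to an estimated MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)


def e_model_mos(i_d, i_e):
    """Estimated conversational MOS for given impairment scores."""
    return r_to_mos(r_value(i_d, i_e))
```

For example, `e_model_mos(i_d=10, i_e=20)` evaluates the mapping for hypothetical impairments; with no impairment at all the estimate is about 4.4.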
15
Using Speech Recognition to Predict MOS Evaluation of automatic speech recognition (ASR) based MOS prediction – IBM ViaVoice Linux version – Codec used: G.729 – Performance metrics: absolute word recognition ratio, relative word recognition ratio
16
Recognition Ratio vs. MOS Both MOS and R_abs decrease as loss probability p increases Then eliminate the intermediate variable p to relate R_abs directly to MOS
17
Speaker Dependency Absolute performance is speaker-dependent, but relative word recognition ratio is not, so it is suitable for MOS prediction
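The slides do not define the two recognition ratios. The sketch below assumes that the absolute ratio is the share of reference words the recognizer gets right, and that the relative ratio normalizes it by the same speaker's ratio at zero loss. Word matching uses Python's `difflib`; the function names are hypothetical.

```python
from difflib import SequenceMatcher


def absolute_ratio(reference, recognized):
    """Share of reference words matched, in order, by the recognizer output."""
    ref = reference.lower().split()
    hyp = recognized.lower().split()
    matcher = SequenceMatcher(None, ref, hyp, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref)


def relative_ratio(r_abs_at_loss, r_abs_no_loss):
    """Normalize by the speaker's own zero-loss ratio (assumed definition)."""
    return r_abs_at_loss / r_abs_no_loss
```

Dividing by the zero-loss ratio cancels how well the recognizer handles a particular voice, which is one plausible reading of why the relative ratio is speaker-independent.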
18
Summary of QoS Measurement Loss burstiness: – Affects (generally worsens) perceived quality as well as FEC performance – May be described with, e.g., a Gilbert model Delay correlation: – Increases rapidly beyond a threshold, revealed through Complementary Conditional CDF (C³DF) – Late losses are also bursty Perceived quality (MOS) estimation – Analytical: the E-model – If network statistics N/A: relative word recognition ratio can provide speaker-independent MOS prediction
19
Outline QoS measurement – Objective vs. subjective metrics – Automated measurement of subjective quality QoS management: improving your quality – End-to-End: FEC, LBR, PLC – Network provisioning: voice traffic aggregation Reality check – Performance of VoIP end-points (IP phones, …) – Deployment issues in VoIP – Evaluation of VoIP service availability through Internet measurement
20
Quality of FEC vs. LBR FEC is substantially and consistently better – At comparable bandwidth overhead – Across all codec configurations tested [Figure panels: G.729 + G.723.1 LBR; AMR LBR]
21
Quality of FEC under Bursty Loss Packet interval T has a stronger effect on MOS with FEC than without FEC
22
FEC MOS Optimization Considering Delay Effect Larger T: FEC efficiency increases, but so does delay Optimizing T with the E-model – Calculate final loss probability after FEC, apply delay impairment of FEC, map to MOS_c Prediction close to FEC MOS test results – Suitable for analytical perceived quality prediction
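The "final loss probability after FEC" step can be sketched for the simplest case. This assumes independent (Bernoulli) loss and an ideal (n, k) block code that repairs any block with at most n - k losses. The slides argue that real loss is bursty, so this is a lower-complexity stand-in rather than the method used in the talk. The delay bound is likewise a simplifying assumption.

```python
from math import comb


def residual_loss(p, n, k):
    """Probability that a data packet is lost and not recoverable by FEC.

    A packet stays lost if it is lost itself and at least n - k of the other
    n - 1 packets in its block are lost too (more than n - k losses in total).
    """
    tail = sum(
        comb(n - 1, j) * p**j * (1 - p) ** (n - 1 - j)
        for j in range(n - k, n)
    )
    return p * tail


def fec_extra_delay_ms(n, packet_interval_ms):
    """Assumed worst-case wait for the rest of the block: (n - 1) packet intervals."""
    return (n - 1) * packet_interval_ms
```

The residual loss and the extra delay can then be turned into impairment scores and a MOS_c estimate with the E-model, which is what makes the choice of T an optimization problem.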
23
Trade-off Analysis between Codec Robustness and FEC 3 loss repair options – FEC, LBR, PLC Loss-resilient codec – Better PLC, e.g., iLBC (IETF) – But higher bit-rate – Better than FEC?
24
Observations and Results When considering delay: – iLBC is usually preferred in low loss conditions – G.729 or G.723.1 + FEC better for high loss Example: max bandwidth 14 kb/s – Consider delay impairment (use MOS_c)
25
Effect of Max Bandwidth on Achievable Quality From 14 to 21 kb/s: significant improvement in MOS_c From 21 to 28 kb/s: marginal change due to increasing delay impairment by FEC
26
Provisioning a VoIP Network Silence detection/suppression – Transmit only during On period, saves bandwidth – Allows traffic aggregation through statistical multiplexing Characteristics of On/Off patterns in VoIP – Traditionally found to be exponentially distributed – Modern silence detectors (G.729B VAD, NeVoT SD) produce different patterns
27
Traffic Aggregation Simulation Token bucket filter with N sources, R: reserved to peak BW ratio CDF model resembles trace model in most cases Exponential (traditional) model – Under-predicts out-of-profile packet probability – Under-prediction ratio increases as token buffer size B increases Similar results for NeVoT SD
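A rough sketch of such a simulation, under stated assumptions: time advances in packet intervals, each source sends one packet per interval while On, On and Off durations are drawn from exponential distributions (the traditional model), and the token rate is R times the aggregate peak rate. All names and the slot-based structure are hypothetical; the original study also used trace-driven and CDF-based On/Off models that are not reproduced here.

```python
import random


def out_of_profile_ratio(n_sources, r_ratio, bucket, slots,
                         mean_on, mean_off, seed=0):
    """Fraction of packets that find no token in a shared token bucket.

    r_ratio: reserved-to-peak bandwidth ratio R (peak = n_sources packets/slot)
    bucket:  token buffer size B, in packets
    mean_on, mean_off: mean On/Off durations, in slots
    """
    rng = random.Random(seed)
    rate = r_ratio * n_sources  # tokens added per slot
    tokens = float(bucket)
    p_on = mean_on / (mean_on + mean_off)
    on = [rng.random() < p_on for _ in range(n_sources)]
    left = [rng.expovariate(1 / (mean_on if s else mean_off)) for s in on]
    sent = dropped = 0
    for _ in range(slots):
        tokens = min(bucket, tokens + rate)
        for i in range(n_sources):
            if on[i]:
                sent += 1
                if tokens >= 1:
                    tokens -= 1
                else:
                    dropped += 1  # out-of-profile packet
            left[i] -= 1
            if left[i] <= 0:  # switch between talk spurt and silence
                on[i] = not on[i]
                left[i] = rng.expovariate(1 / (mean_on if on[i] else mean_off))
    return dropped / sent if sent else 0.0
```

Replacing the two `expovariate` draws with samples from measured On/Off durations would give the trace-based variant whose results the exponential model under-predicts.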
28
Summary of QoS Management End-to-End – FEC is superior in quality to LBR – Codec robustness is better than FEC in low loss conditions – Combining both schemes brings the best of both sides Network provisioning – Observation: new silence detectors (G.729B, NeVoT SD) produce non-exponential voice On/Off patterns – Result: performance of voice traffic aggregation under new On/Off patterns – Important in traffic engineering and Service Level Agreement (SLA) validation
29
Outline QoS measurement – Objective vs. subjective metrics – Automated measurement of subjective quality QoS management: improving your quality – End-to-End: FEC, LBR, PLC – Network provisioning: voice traffic aggregation Reality check – Performance of end-points (IP phones, …) – Deployment issues in VoIP – Assessment of VoIP service availability through Internet measurement
30
Mouth-to-ear Delay of VoIP End-points All receivers can adjust M2E delay adaptively whenever it is too low or too high M2E delay depends mainly on receiver (esp. RAT) HW phones have relatively low delay (~45-90ms)
31
But Adaptiveness ≠ Perfection Symptom of playout buffer underflow – Waveforms are dropped – Occurred at point of delay adjustment Bugs in software? Does a LAN guarantee perfect quality?
32
Major Observations Overall: end-points matter a lot! HW IP phones: 45-90 ms average M2E delay SW clients: – Messenger 2000 lowest (68 ms), XP (96-120 ms); cf. GSM PSTN: 110 ms either direction – NetMeeting very bad (> 400 ms) PLC robustness – Acceptable in all 3 IP phones tested, Cisco phone more robust Silence detection/suppression – Works for speech input – Often fails for non-speech (e.g., music) input: generates many unnatural gaps – Not good for customer support center (on-hold music)! Acoustic echo cancellation (AEC): – Good on most IP phones (Echo Return Loss > 40 dB) – But some do not implement AEC at all
33
Reality Check #2: IP Telephony Deployment Localized deployment at Columbia Univ. [Figure: deployment architecture. Labels shown: SIP proxy, redirect server (sipd); SQL database; conference server; voicemail server; web server (web-based configuration); server status monitoring; core server; IP phones; SIP/PSTN gateway (RTP/SIP, T1/E1); telephone switch/PBX; regular phone]
34
Issues and Lessons Learned PSTN/PBX integration – Requires full understanding of legacy networks Lower layer (e.g., T1 line configuration) – Parameters must match on both PSTN/PBX and gateway! PBX access configurations – To ensure calls go through in both directions Address translation (dial-plan) in both directions – Previous lessons/experiences can help greatly, e.g., second gateway installed in weeks instead of months Security – Issue: SIP/PSTN gateway has no authentication feature – Solution: use gateway’s access control lists to block direct calls; SIP proxy server handles authentication using record-route
35
Reality Check #3: VoIP Service Availability Focus on availability rather than traditional QoS – Delay is a minor issue; FEC recovers most isolated losses – Ability to make a call is vital, especially in emergency Internet measurement sites: – 14 nodes worldwide, not just Internet2 and alike Definitions: – Availability = MTBF / (MTBF + MTTR) – Availability = successful calls / first call attempts Equipment availability: 99.999% (“5 nines”), i.e., about 5 minutes of downtime per year AT&T: 99.98% availability (1997) IP frame relay SLA: 99.9% UK mobile phone survey: 97.1-98.8%
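The link between an availability figure and yearly downtime is simple arithmetic. The sketch below assumes a 365.25-day year; the function names are hypothetical.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60


def availability(mtbf, mttr):
    """Availability from mean time between failures and mean time to repair (same units)."""
    return mtbf / (mtbf + mttr)


def downtime_minutes_per_year(avail):
    """Expected unavailable minutes per year for a given availability (0..1)."""
    return (1 - avail) * MINUTES_PER_YEAR
```

With these definitions, 99.999% corresponds to roughly 5.3 minutes per year, 99.98% to about 105 minutes, and 99.9% to nearly 9 hours.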
36
First Look at Availability Call success probability: – 62,027 calls succeeded, 292 failed: 99.53% availability – Roughly constant across I2, I2+, commercial ISPs: 99.39-99.58% Overall network loss – PSTN: once connected, call usually of good quality (exception: mobile phones) – Compute % of time below loss threshold: 5% loss causes degradation for many codecs, others acceptable till 20% loss
% of time below loss threshold:
Network    0%     5%     10%    20%
All        82.3   97.48  99.16  99.75
ISP        78.6   96.72  99.04  99.74
I2         97.7   99.67  99.77  99.79
I2+        86.8   98.41  99.32  99.76
US         83.6   96.95  99.27  99.79
Int.       81.7   97.73  99.11  99.73
US ISP     73.6   95.03  98.92  99.79
Int. ISP   81.2   97.60  99.10  99.71
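Both metrics on this slide reduce to counting. The sketch below assumes that "below loss threshold" means "at or below", since the table has a 0% column, and that loss is measured per fixed-length interval; neither detail is stated in the slides.

```python
def call_availability(succeeded, failed):
    """Share of call attempts that succeeded."""
    return succeeded / (succeeded + failed)


def fraction_within_threshold(loss_rates, threshold):
    """Share of measurement intervals whose loss rate is at or below threshold.

    loss_rates: per-interval loss rates in the range 0..1 (assumed input format).
    """
    ok = sum(1 for rate in loss_rates if rate <= threshold)
    return ok / len(loss_rates)
```

Applied to the call counts above, `call_availability(62027, 292)` reproduces the 99.53% figure.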
37
Network Outages Sustained packet losses – arbitrarily defined at 8 packets – far beyond recoverable (FEC, interpolation) 23% of packet losses are outages – Make up a significant part of the 0.25% unavailability Symmetric: an outage on A→B tends to imply one on B→A Spatially correlated: an outage on A→B tends to imply one on A→X Not correlated across networks (e.g., I2 and commercial) Mostly short (a few seconds), but some are very long (100’s of seconds) and make up the majority of outage time
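Outage detection under the slide's definition (8 or more consecutive lost packets) can be sketched as a run-length scan over a loss trace. The trace format, a sequence with truthy values for lost packets, is an assumption.

```python
def outage_runs(trace, min_len=8):
    """Return (start_index, length) for every loss run of at least min_len packets."""
    runs = []
    start = None
    for i, lost in enumerate(trace):
        if lost and start is None:
            start = i
        elif not lost and start is not None:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    if start is not None and len(trace) - start >= min_len:
        runs.append((start, len(trace) - start))  # trace ended during an outage
    return runs


def outage_share_of_losses(trace, min_len=8):
    """Share of all lost packets that fall inside outages."""
    total_lost = sum(1 for lost in trace if lost)
    in_outage = sum(length for _, length in outage_runs(trace, min_len))
    return in_outage / total_lost if total_lost else 0.0
```

Multiplying each run length by the packet interval gives outage durations, which is the quantity the short-versus-long comparison on this slide refers to.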
38
Outage-induced Call Abortion Probability Long interruption: user likely to abandon call From E.855 survey: P[holding] = e^(-t/17.26) (t in seconds) – half the users will abandon call after 12 s 2,566 calls have at least one outage; 946 of 2,566 expected to be dropped: 1.53% of all calls
Expected share of calls dropped:
All        1.53%
I2         1.16%
I2+        1.15%
ISP        1.82%
US         0.99%
Int.       1.78%
US ISP     0.86%
Int. ISP   2.30%
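The holding-time model above can be evaluated directly. This sketch assumes each call is judged by a single outage duration, which may be simpler than the bookkeeping behind the 946 figure; the function names are hypothetical.

```python
from math import exp, log

HOLD_TIME_CONSTANT_S = 17.26  # from the E.855-based formula on the slide


def p_holding(t_seconds):
    """Probability that a user is still holding after an interruption of t seconds."""
    return exp(-t_seconds / HOLD_TIME_CONSTANT_S)


def p_abandon(t_seconds):
    """Probability that the user has abandoned the call by t seconds."""
    return 1 - p_holding(t_seconds)


def median_abandon_time():
    """Interruption length at which half the users have abandoned the call."""
    return HOLD_TIME_CONSTANT_S * log(2)


def expected_dropped_calls(outage_durations_s):
    """Expected number of abandoned calls, one outage duration per affected call."""
    return sum(p_abandon(t) for t in outage_durations_s)
```

The median works out to just under 12 seconds, matching the "half the users" figure on the slide.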
39
Summary of Service Availability Through several metrics, one can translate from network loss to VoIP service availability (no Internet dial-tone) Current results show availability far below five 9’s, but comparable to mobile telephony – Outage statistics are similar in research and ISP networks Working on identifying fault sources and locations Additional measurement sites are welcome
40
Conclusions Measuring QoS – Loss burstiness and delay correlation affect (generally worsen) perceived quality – Bridging objective and subjective metrics: the E-model, or speech recognition based MOS prediction – Performance of real products: IP phones and soft clients Ensuring/improving QoS – Network provisioning (voice traffic aggregation): efficient, but may be expensive to deploy and manage – End-to-End (FEC > LBR, PLC): easier to deploy, but must control overhead of FEC Reality Check – Good implementation at the end-point (e.g., IP phones) is vital – VoIP deployment requires PSTN integration and security – Service availability is crucial for VoIP, but still far from 99.999% over the Internet
41
Ongoing and Future Work Sampling Internet performance – Where do the problems reside? Access networks (Cable, DSL), or International paths? – How can we solve these problems? Can adaptive FEC react fast enough to changes in network conditions? Playout delay behaviors of VoIP end-points – How well do they react to jitter, delay spikes?