Tools and Processes for Testing VoIP Chris Bajorek, Director CT Labs www.ct-labs.com.

About the Speaker
Chris Bajorek, Director and Founder, CT Labs
Chris Bajorek is a 25-year veteran of computer telephony and converged communications. Bajorek has led the company to its industry-leading position in testing services, which include real-world performance testing, interoperability verification, and usability and quality analysis. Customers include first-tier enterprise and carrier-grade next-generation network product manufacturers. Prior to founding CT Labs, Bajorek founded Telephone Response Technologies, Inc. (TRT), which developed and sold turnkey voice response and unified messaging products as well as award-winning toolkits for rapid development of voice-based applications. Prior to TRT he worked for Integrated Office Systems and Time and Space Processing, where he performed pioneering work on voicemail and digital voice communications products. Bajorek holds a B.S.E.E. from Cal Poly, San Luis Obispo.

For Today’s Talk: Taking a Developer’s Perspective to VoIP Test
Much of CT Labs’ business is with R&D and QA groups of VoIP product manufacturers
Would like to provide a window into some of our VoIP test experiences, including:
  o Common VoIP test myths
  o Testing tips and suggestions
  o Focus on voice quality testing, a hot area for VoIP

Myths around VoIP Deployment
Voice quality is a given
VoIP is easy to deploy
VoIP is inexpensive to deploy
All VoIP-enabled phones are created equal
Once you have your VoIP network set up, you can leave it alone

VoIP Requires a Lifecycle Approach
Lack of a proper lifecycle will:
  o Drive Costs Up
  o Reduce VoIP Reliability / Availability
  o Risk Complete Failure of Deployment
Should design new VoIP products with this in mind

VoIP Troubleshooting Areas – The Big Picture
Call Processing (i.e. call connectivity, service availability)
Voice Quality
Interoperability / Feature Interaction
Configuration / Registration
Routing
Security
Applications (conferencing, IVR, voicemail, …)

Troubleshooting example
Symptom: sporadic call “failures”
Common causes:
  o Gateway and switch misconfiguration
  o Interoperability issues between equipment
  o Capacity limitations
  o Performance issues and delays triggering timeouts
  o “Feature interaction” issues such as conflicting call-forwarding settings

VoIP Deployment Segments
1. Residential (Voice over Broadband)
2. Enterprise
3. Next-Gen Network Carriers and Service Providers
All three areas are quite active now…

VoIP Products, by Segment (products that “touch” the media stream)
Residential
  o Analog terminal adapters, VoIP softphones, residential routers
Enterprise
  o IP PBXs, IP Contact Centers, VoIP phones & softphones, firewalls/ALGs, media servers (conferencing, voice mail)
Next-Gen Carriers and Service Providers
  o Session border controllers, media servers, media gateways, transcoding/VQ enhancement processors

VoIP Testing Areas of Focus
Service reliability
  o i.e. availability of service, call connectivity
Voice quality
  o Includes measurement of VQ, latency, levels, echo cancellation, etc.
“Phone” features
  o CLASS features, such as call park, transfer, etc.
VoIP access to enhanced services
  o Voice mail, conferencing, IVR, etc.
Each of these areas has its own set of testing challenges, but one thing is clear: all relate to the end-user Quality Experience and must be validated

Active versus Passive VoIP Testing
Active tests
  o Involve driving real 2-way calls through the VoIP network
  o Benefits: more accurate, use mature standards (PESQ, etc.) for automated quality assessment
  o Negatives: consume network resources
Passive tests
  o Involve passive evaluation of call-based packet flows
  o Ignore (or model) VoIP endpoint-specific responses to network conditions

Post-Deployment, Passive Testing is Key
Deployed VoIP networks should:
  o Continuously monitor passive VQ, call completion rates, network packet loss, jitter, & latency
  o Set alarm thresholds for VoIP call performance that degrades below adaptive-corrective levels
Assumption: Pre-deployment tests resulted in…
  o Clean bill of network health
  o Baseline characterization of network during peak and off-peak times
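
The alarm-threshold idea on this slide can be sketched in a few lines of Python. This is only an illustration: the `CallMetrics` fields, the `THRESHOLDS` table, and every numeric limit below are hypothetical placeholders, not values from the talk. In practice the limits would be derived from the pre-deployment baseline.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class CallMetrics:
    """Passively observed metrics for one call (hypothetical structure)."""
    est_mos: float          # estimated MOS from a passive monitor
    packet_loss_pct: float  # percent of packets lost
    jitter_ms: float        # jitter in milliseconds
    latency_ms: float       # latency in milliseconds


# Hypothetical limits for illustration only; derive real ones from the baseline.
# "min" means alarm when the value drops below the limit, "max" when it exceeds it.
THRESHOLDS = {
    "est_mos": ("min", 3.6),
    "packet_loss_pct": ("max", 1.0),
    "jitter_ms": ("max", 30.0),
    "latency_ms": ("max", 150.0),
}


def check_alarms(metrics: CallMetrics, thresholds=THRESHOLDS) -> list[str]:
    """Return the names of all metrics that crossed their alarm limit."""
    alarms = []
    for name, (kind, limit) in thresholds.items():
        value = getattr(metrics, name)
        if (kind == "min" and value < limit) or (kind == "max" and value > limit):
            alarms.append(name)
    return alarms
```

A monitoring loop would call `check_alarms` on each completed call (or each reporting interval) and raise an alert whenever the returned list is non-empty.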

Passive Monitoring “Embedded Components” for Product Developers
Products incorporating these can quickly adapt to changing IP network conditions
  o Real-time access to estimated MOS, round-trip latency
  o Access to level and echo information for estimate of MOS-Conversational Quality
VQMon, from Telchemy (www.telchemy.com)
PsyVoIP, from Psytechnics (www.psytechnics.com)

A few things about Codecs
Waveform codecs
  o Produce a waveform as close as possible to the original (G.711 PCM, G.726 ADPCM)
Source codecs
  o Use a model of how speech is generated
  o Can significantly alter the time-domain waveform while sounding very similar to the input (G.729a/729, G.723.1)

A few things about Codecs
Hybrid codecs
  o Combine techniques from waveform and source codecs
  o Use different modes and bit rates depending on network conditions
  o AMR
      Bit rate: 4.75-12.2 kbps
      MIPS complexity: 15-20
  o AMR-WB / G.722.2 (wideband, 7 kHz signal bandwidth)
      Bit rate: 6.6-23.85 kbps
      MIPS complexity: 38 (incl. VAD and CNG)
Why knowledge of codec method(s) is useful for VQ analysis

Devices that can affect a User’s “VoIP Experience”
IP PBXs
IP Phones & VoIP endpoints
Media Gateways
IVR / Voice portals
SBCs (Border Controllers)
Media Servers
Firewalls/ALGs
Messaging Servers
Conference Bridges

Voice Quality versus Intelligibility
Voice quality: the “acceptability” of speech
Intelligibility: the “clarity” of speech
  o Subjective tests: Diagnostic Rhyme Test, Modified Rhyme Test
  o Higher frequencies are more important for intelligibility, a good benefit of wideband codecs
Lower quality affects intelligibility but not necessarily vice versa

Voice Quality Measurement – A Hot Topic
What is considered the “gold standard” way to measure voice quality?
  o Answer: with humans, and the more of them in a listening session the better the resolution of the resulting quality scores
However, conducting a live-listener test is not as easy or cheap as you may think…

MOS Subjective Testing
It’s a Standard: ITU-T P.800 (1996)
The technique rates quality using the “absolute category rating” (ACR) method
5-grade scale:
  5 = excellent
  4 = good
  3 = fair
  2 = poor
  1 = bad

MOS Subjective Testing
How it’s done
  o Requires use of a group of 32-64 “naive” listeners
  o Standardized male, female, and child phrases are used
  o Calibrating “reference” degraded conditions are intermixed with actual samples
  o The identical speech sample sets are played to all listeners
  o Listeners judge the quality of each phrase using the ACR scale
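
As a rough illustration of how listener votes become a score, the sketch below averages ACR ratings into a MOS and attaches a confidence half-width. The function name `mos_with_ci` and the normal-approximation interval (z = 1.96) are assumptions made for this example; they are not part of the P.800 procedure as described in the talk.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    `ratings` holds one ACR vote per listener, each an integer from 1 to 5.
    The interval uses a normal approximation, which is an assumption here.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    for r in ratings:
        if r not in (1, 2, 3, 4, 5):
            raise ValueError(f"invalid ACR rating: {r!r}")
    mean = statistics.mean(ratings)
    std_dev = statistics.stdev(ratings)  # sample standard deviation
    half_width = z * std_dev / math.sqrt(len(ratings))
    return mean, half_width
```

Because the half-width shrinks roughly with the square root of the number of listeners, larger panels resolve smaller quality differences, which is the reason more listeners give better resolution.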

MOS Subjective Testing
Strengths
  o Provides the definitive answer to “which sounds best?”
Weaknesses
  o High cost, especially when many different test conditions or sample sets must be evaluated
  o Takes time to schedule test and get results

Objective VQ Standards
All automated VQ measurement techniques are designed to estimate the way humans perceive voice quality
PSQM: P.861 (1996)
  o PSQM+ handled higher distortion levels than PSQM
PESQ: P.862 (2001)
  o Solved variable delay (“alignment”) problem of PSQM

What PESQ VQ Testing is designed for
PESQ is a way to quickly and cost-effectively estimate the effects of one-way speech distortion and noise on speech quality
PESQ is “endpoint-agnostic”: it can be used for VoIP-to-VoIP, VoIP-to-PSTN calls, etc.
PESQ can be used for VQ assessment of wideband codecs if your test platform supports it (if not, 3.1 kHz signal bandwidth applies)

PESQ Narrowband vs. Wideband

What PESQ VQ Testing is not designed for
PESQ does not evaluate the effects of loudness loss, fixed latency, sidetone, or echo as related to two-way caller interactions
PESQ cannot safely be used to declare a VQ “winner” when the PESQ score differential is small (e.g. less than 0.25)
  o “Opposite conclusion” errors are very possible, so the bigger the score differential the better
  o Especially true when comparing samples with more than a single changed “variable”
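
The “no winner on a small differential” rule is easy to encode in an automated comparison. The helper below is a hypothetical sketch; only the 0.25 margin comes from the slide, and treating it as a hard cutoff is a simplification.

```python
def compare_pesq(score_a, score_b, min_delta=0.25):
    """Compare two PESQ scores, refusing to pick a winner on a small gap.

    Returns "A", "B", or "inconclusive". The default margin of 0.25 is the
    figure quoted in the talk; the function itself is illustrative only.
    """
    delta = score_a - score_b
    if abs(delta) < min_delta:
        return "inconclusive"
    return "A" if delta > 0 else "B"
```

For example, `compare_pesq(3.9, 3.8)` reports `"inconclusive"` rather than crowning the first sample.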

Objective VQ Testing
Strengths
  o Provides excellent estimate of voice quality
  o Tests can be performed quickly
  o Tests are very repeatable
Weaknesses
  o Not good for reliably resolving small differences in quality scores

Troubleshooting VQ Issues
Measurement is critical for problem resolution
Must look at all the metrics of VoIP calls exactly as transmitted on the network:
  o Packet loss?
  o Jitter?
  o Delay?
  o Voice quality?
[Figure: jitter distribution graph]
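
Two of the metrics named on this slide, packet loss and jitter, can be computed directly from captured packet data. The sketch below uses the running interarrival-jitter estimator defined in RFC 3550 (section 6.4.1) and a simple sequence-number gap count. The function names and list-based inputs are assumptions for illustration, and sequence-number wraparound is deliberately ignored.

```python
def interarrival_jitter(send_times, recv_times):
    """Running jitter estimate per RFC 3550: J += (|D| - J) / 16.

    Both lists must use the same time unit; the result is in that unit.
    """
    if len(send_times) != len(recv_times):
        raise ValueError("send and receive lists must have equal length")
    jitter = 0.0
    prev_transit = None
    for sent, received in zip(send_times, recv_times):
        transit = received - sent
        if prev_transit is not None:
            diff = abs(transit - prev_transit)
            jitter += (diff - jitter) / 16.0
        prev_transit = transit
    return jitter


def packet_loss_pct(seq_numbers):
    """Percent of packets missing between the lowest and highest sequence seen.

    Simplification: assumes the sequence numbers do not wrap around.
    """
    if not seq_numbers:
        raise ValueError("no packets observed")
    expected = max(seq_numbers) - min(seq_numbers) + 1
    received = len(set(seq_numbers))
    return 100.0 * (expected - received) / expected
```

Feeding these functions with timestamps and sequence numbers taken at the receiving endpoint gives figures that reflect what actually crossed the network.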

Tip: How to test end-to-end VQ of VoIP phones
#1: It’s usually not enough to evaluate VQ by just looking at the packet streams (i.e. E-model)
#2: Must evaluate quality all the way to the phone’s earpiece and microphone wires
  o This evaluates the proper operation of the phone’s internal “VoIP gateway”, including automatic gain control (AGC), voice activity detection (VAD), comfort noise generation (CNG), echo cancellation, codecs, jitter buffer management, and packet loss concealment algorithms.
  o In other words, there is much that can go wrong.

Tip: How to test end-to-end VQ of VoIP phones
#3: Must evaluate under expected LAN/WAN impairment conditions
  o Packet loss, jitter, latency
  o Effective bandwidth of IP connection, e.g. broadband versus dialup
#4: Don’t forget interoperability testing against other VoIP devices
  o Verify VQ against other manufacturers’ devices that are expected in the deployment

Testing end-to-end VQ of VoIP phones
The automated VQ test
  o Important for verifying VQ under many conditions
  o Vary one dimension at a time during subsequent test runs
The manual VQ “real user” test
  o Conduct 2-way calls with real users who are familiar with potential echo cancellation and other 2-way effects
  o Include handset and speakerphone test calls
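
The advice to vary one dimension at a time translates naturally into a test-plan generator. The sketch below is a minimal example; the function name, the dictionary layout, and the parameter names used in the usage note are all hypothetical.

```python
def one_factor_at_a_time(baseline, variations):
    """Build test runs that each differ from the baseline in one parameter.

    `baseline` maps parameter name to its default value.
    `variations` maps parameter name to the list of values to try.
    The first run returned is the unmodified baseline.
    """
    runs = [dict(baseline)]
    for name, values in variations.items():
        for value in values:
            if value == baseline[name]:
                continue  # already covered by the baseline run
            run = dict(baseline)
            run[name] = value
            runs.append(run)
    return runs
```

With a baseline of `{"codec": "G.711", "loss_pct": 0.0}` and variations of two codecs and three loss levels, this yields four runs instead of the six a full cross-product would need, and any change in score can be attributed to the single parameter that moved.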

Testing end-to-end VQ of VoIP phones
Test setup examples
  o Softphone to softphone test
  o VoIP Phone to VoIP Phone test (in lab)
  o VoIP Phone to PSTN call test
  o Variations on these themes easily set up
Wideband codecs used? If so, be sure to verify that all test equipment in the audio/media signal path can support 8 kHz.

Testing Softphone-to-Softphone
Media may flow peer-to-peer or through the VoIP network component
PESQ evaluated offline via batch process

Testing VoIP Phone-to-VoIP Phone
Good setup when isolated device performance test is needed.
Phone calls are manually placed with this setup.

Testing VoIP Phone to PSTN calls

Example: WAN Impairment Conditions for VQ Test
Conditions suitable for emulation of overseas Internet dialup conditions
Broadband and Dialup IP bandwidths for each condition below:
  o Packet Loss = 0%, Latency / Jitter = 10/30 ms (uniform distributed latency model)
  o Packet Loss – Random = 2.5%, Latency / Jitter = 10/30 ms
  o Packet Loss – Burst = 5.0%, 1-5 packet burst size, Latency / Jitter = 50/80 ms
  o Packet Loss – Burst = 10.0%, 1-8 packet burst size, Latency / Jitter = 125/250 ms
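
The four impairment conditions can be written down as data and fed to a small simulator for dry runs before a real WAN emulator is involved. The sketch below rests on several assumptions that the slide does not confirm: the “Latency / Jitter” pair is read as the lower and upper bound of a uniformly distributed delay, burst lengths are drawn uniformly between 1 and the stated maximum, and the `Impairment` class and `simulate` function are hypothetical names.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Impairment:
    loss_pct: float      # target long-run packet loss, percent
    burst_max: int       # 0 = no loss, 1 = isolated random loss, >1 = bursts
    delay_min_ms: float  # ASSUMPTION: first number of the "Latency / Jitter" pair
    delay_max_ms: float  # ASSUMPTION: second number of the pair


# The four conditions from the slide, under the reading described above.
PROFILES = [
    Impairment(0.0, 0, 10, 30),
    Impairment(2.5, 1, 10, 30),
    Impairment(5.0, 5, 50, 80),
    Impairment(10.0, 8, 125, 250),
]


def simulate(profile, n_packets, seed=0):
    """Return one entry per packet: a delay in ms, or None if dropped."""
    rng = random.Random(seed)
    p = profile.loss_pct / 100.0
    mean_burst = (1 + profile.burst_max) / 2.0
    # Chance of starting a burst, chosen so the long-run loss rate is about p.
    start_prob = p / (mean_burst * (1.0 - p) + p) if p > 0 else 0.0
    results = []
    to_drop = 0
    for _ in range(n_packets):
        if to_drop == 0 and rng.random() < start_prob:
            to_drop = rng.randint(1, profile.burst_max)
        if to_drop > 0:
            to_drop -= 1
            results.append(None)  # packet lost
        else:
            results.append(rng.uniform(profile.delay_min_ms, profile.delay_max_ms))
    return results
```

Back-to-back bursts can occasionally merge into a run longer than `burst_max`; a stricter model would force at least one delivered packet between bursts.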

Watch out for…
Do not try to compare “MOS” scores derived from different sources or evaluation engines
  o Even the numeric ranges from “worst” to “best” can vary (e.g. “best” = 4.5, not 5.0)
  o Especially, don’t compare passive with active VQ results
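
One concrete case of a scale that tops out at 4.5 is the R-factor to MOS mapping in the ITU-T G.107 E-model. Whether this is the engine the speaker had in mind is an assumption; the sketch below simply shows that a computed “MOS” can have a different ceiling than the 5-point listener scale.

```python
def r_to_mos(r):
    """Map an E-model R-factor to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
```

With the E-model default of R = 93.2 this yields roughly 4.41, so a passive monitor reporting 4.4 may be describing a near-perfect narrowband call, while a 4.4 on a listener panel’s 5-point scale would mean something different.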

Real-World Next-Gen Network Product Testing
www.ct-labs.com
916-577-2100
Chris Bajorek
chris@ct-labs.com
916-577-2110 direct line
