Published by Branden Tucker. Modified over 9 years ago.
1
New Models for Perceived Voice Quality Prediction and their Applications in Playout Buffer Optimization for VoIP Networks
Dr. Lingfen Sun, Prof. Emmanuel Ifeachor
University of Plymouth, United Kingdom
{L.Sun; E.Ifeachor}@plymouth.ac.uk
2
ICC 2004, Paris, France, 20-24 June 2004
Outline
Background
- Speech quality for VoIP networks
- Current status
- Aims of the project
Main Contributions
- Novel non-intrusive voice quality prediction models
- Novel perceptual-based speech quality optimization mechanism (e.g. jitter buffer optimization)
Conclusions and Future Work
3
Background – Speech Quality for VoIP Networks
- VoIP speech quality: end-user perceived quality (MOS), an important metric.
- Affected by IP network impairments and other impairments.
- Voice quality measurement: subjective (MOS) or objective (intrusive or non-intrusive).
[Figure: SCN, gateway and IP network with end-to-end perceived speech quality; intrusive measurement (reference speech, degraded speech) and non-intrusive measurement, each giving MOS. SCN: Switched Comm. Networks (PSTN, ISDN, GSM …)]
4
Current Status and Problems
Lack of an efficient non-intrusive speech quality measurement method
- E-model (a complicated computational model)
- Based on subjective tests to derive models/parameters, which is time-consuming and expensive
- Only limited models exist
Lack of perceptual optimization control methods
- Existing methods are based only on individual network parameters for buffer optimization and QoS control purposes
- They are not perceptual-based optimization control
5
Aims of the Project
[Figure: VoIP path from voice source to voice receiver: sender (encoder, packetizer), IP network, receiver (de-packetizer, jitter buffer, decoder); non-intrusive measurement gives MOS, the end-to-end perceived voice quality]
- To develop novel and efficient methods/models for non-intrusive quality prediction.
- To apply the models to perceptual-based optimization control (e.g. buffer optimization or adaptive sender-bit-rate QoS control).
6
Novel Non-intrusive Voice Quality Prediction
- Uses intrusive quality measurement (e.g. PESQ) to predict voice quality non-intrusively, which avoids subjective tests.
- A generic method which can be applied to audio, image and video.
[Figure: intrusive method: reference speech and degraded speech -> PESQ -> MOS(PESQ), combined with delay in the E-model to give measured MOSc; non-intrusive method: VoIP network parameters (packet loss, delay, codec …) -> new model (regression or ANN models) -> predicted MOSc]
7
New Structure to Obtain MOSc
- PESQ can only predict one-way listening speech quality (expressed as MOS).
- By a new combined PESQ/E-model structure, a conversational speech quality (MOSc) can be obtained as measured MOSc.
[Figure: reference speech and degraded speech -> PESQ -> MOS (PESQ) -> MOS-to-R conversion -> Ie; end-to-end delay -> delay model -> Id; Ie and Id -> E-model -> MOSc]
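The combined PESQ/E-model structure can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the widely used E-model simplification R = 93.2 - Ie - Id, the ITU-T G.107 mapping from R to MOS, a numerical (bisection) inversion for the MOS-to-R step, and a commonly used simplified delay impairment Id = 0.024d + 0.11(d - 177.3) for d > 177.3 ms. The slides do not say which delay model or MOS-to-R conversion was used, and all function names are hypothetical.

```python
R_DEFAULT = 93.2  # assumed E-model rating with all other factors at default values


def r_to_mos(r):
    """ITU-T G.107 mapping from rating factor R to MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


def mos_to_r(mos, lo=6.5, hi=100.0):
    """Invert r_to_mos by bisection (the mapping is increasing on [6.5, 100])."""
    mos = min(max(mos, r_to_mos(lo)), r_to_mos(hi))
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if r_to_mos(mid) < mos:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def delay_impairment(delay_ms):
    """Assumed simplified delay impairment Id (one-way delay in ms)."""
    extra = delay_ms - 177.3
    return 0.024 * delay_ms + (0.11 * extra if extra > 0 else 0.0)


def equipment_impairment(mos_pesq):
    """Measured Ie: the drop in R implied by the PESQ listening-quality MOS."""
    return R_DEFAULT - mos_to_r(mos_pesq)


def conversational_mos(mos_pesq, delay_ms):
    """Combine Ie (from PESQ) and Id (from delay) into MOSc via the E-model."""
    r = R_DEFAULT - equipment_impairment(mos_pesq) - delay_impairment(delay_ms)
    return r_to_mos(r)


if __name__ == "__main__":
    for delay in (0.0, 100.0, 300.0):
        print(delay, round(conversational_mos(3.5, delay), 2))
```

With zero delay the sketch returns the PESQ score unchanged. As delay grows, Id lowers R and therefore MOSc.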
8
Regression based Models (1)
- Nonlinear regression models are derived for Ie based on PESQ/PESQ-LQ.
- Ie is further combined with Id to obtain MOSc.
[Figure (a): reference speech from a speech database -> encoder -> loss model -> decoder -> degraded speech; PESQ/PESQ-LQ -> MOS -> R -> measured Ie; nonlinear regression model (Ie model) -> predicted Ie]
[Figure (b): codec and packet loss -> Ie model -> Ie; delay (d) -> Id model -> Id; Ie and Id -> E-model -> MOSc]
9
Regression based Models (2)
Ie can be modelled by a logarithm fitting function with the form of [equation not captured in the transcript]
Parameters for different codecs (PESQ)

Parameters   AMR(H)   AMR(L)   G.729   G.723.1   iLBC
a            16.68    30.86    21.14   20.06     12.59
b*100        30.11    4.26     12.73   10.24     9.45
c            14.96    31.66    22.45   25.63     20.42
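The fitting equation itself did not survive in the transcript. As an illustration only, the sketch below assumes the logarithmic form Ie = a * ln(1 + b * loss) + c, with packet loss in percent. That form is an assumption, not something the slides confirm, and the function name is hypothetical. The G.729 values (a = 21.14, b*100 = 12.73, c = 22.45) are taken from the parameter table.

```python
import math

# G.729 parameters as read from the table (b is listed there as b*100).
G729 = {"a": 21.14, "b": 12.73 / 100.0, "c": 22.45}


def ie_log_fit(loss_percent, a, b, c):
    """Assumed form: Ie = a * ln(1 + b * loss) + c, loss in percent."""
    return a * math.log(1.0 + b * loss_percent) + c


if __name__ == "__main__":
    for loss in (0, 1, 5, 10):
        print(f"loss={loss:>2}%  Ie={ie_log_fit(loss, **G729):.2f}")
```

Under this assumed form, c is the codec's impairment at zero packet loss, and a and b control how quickly the impairment grows with loss.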
10
Regression Models for AMR (12.2 kb/s)
e.g. for AMR (12.2 kb/s): [equation not captured in the transcript]
The goodness of fit is: SSE = 2.83 and R² = 0.998
[Figure: MOS vs. packet loss and delay]
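For reference, the two goodness-of-fit figures quoted here are computed as in the short Python sketch below. The measured and predicted values are made up purely to exercise the functions. They are not the authors' data and do not reproduce SSE = 2.83 or R² = 0.998.

```python
def sse(measured, predicted):
    """Sum of squared errors between measured and predicted values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted))


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SSE / total sum of squares."""
    mean = sum(measured) / len(measured)
    sst = sum((m - mean) ** 2 for m in measured)
    return 1.0 - sse(measured, predicted) / sst


if __name__ == "__main__":
    # Illustrative numbers only (hypothetical, not from the presentation).
    measured = [15.0, 20.0, 27.0, 35.0]
    predicted = [16.0, 19.0, 28.0, 34.0]
    print(sse(measured, predicted), r_squared(measured, predicted))
```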
11
Perceptual-based Buffer Optimization
Motivation:
Existing buffer algorithms are based only on individual network parameters (e.g. delay or loss), targeting only minimum average delay or minimum late arrival loss, not maximum MOS. There is a need to design a buffer algorithm that achieves optimum perceived speech quality.
Contribution
A perceptual-based optimization jitter buffer algorithm
o Use regression based models for buffer optimization
o Use a minimum impairment criterion instead of the traditional maximum MOS score
o A Weibull delay distribution based on trace analysis
o A perceptual-based optimization of playout buffer algorithm
12
Impairment Function Im
Define: impairment function Im [equation not captured in the transcript]
[Figure labels: playout delay d, Weibull distribution, buffer loss b]
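The definition of Im is missing from the transcript, so the following Python sketch rests on stated assumptions rather than the authors' formula. It assumes that network delay follows a three-parameter Weibull distribution (location `mu`, scale `alpha`, shape `gamma`, all hypothetical names), that buffer loss is the probability of a packet arriving later than the playout delay d, that network loss and buffer loss combine as independent events, and that Im = Ie + Id with an assumed logarithmic Ie model and an assumed simplified delay model.

```python
import math


def buffer_loss(d_ms, mu, alpha, gamma):
    """Late-arrival probability P(delay > d) under an assumed Weibull model."""
    if d_ms <= mu:
        return 1.0
    return math.exp(-(((d_ms - mu) / alpha) ** gamma))


def total_loss(net_loss, buf_loss):
    """Combine network loss and buffer loss, assuming independence."""
    return net_loss + (1.0 - net_loss) * buf_loss


def impairment(d_ms, net_loss, weibull, codec):
    """Assumed Im = Ie(total loss) + Id(playout delay)."""
    mu, alpha, gamma = weibull
    a, b, c = codec
    loss_percent = 100.0 * total_loss(net_loss, buffer_loss(d_ms, mu, alpha, gamma))
    ie = a * math.log(1.0 + b * loss_percent) + c  # assumed log-form Ie model
    extra = d_ms - 177.3
    id_ = 0.024 * d_ms + (0.11 * extra if extra > 0 else 0.0)  # assumed Id model
    return ie + id_


if __name__ == "__main__":
    g729 = (21.14, 0.1273, 22.45)  # a, b, c for G.729 from the parameter table
    for d in (20.0, 40.0, 80.0, 160.0):
        print(d, round(impairment(d, 0.01, (10.0, 15.0, 1.2), g729), 2))
```

A small playout delay keeps Id low but discards many late packets, which raises Ie. A large playout delay does the opposite. That trade-off is why Im has a minimum.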
13
Minimum Impairment Criterion
Define: minimum impairment criterion
Given: network delay d_n, network loss n and codec type
Estimate: an optimized playout delay d_opt
Such that: the minimum Im is reached.
[Figure labels: d_1, d_2, d_3, d_4; minimum Im]
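In practice the criterion amounts to a one-dimensional search over candidate playout delays. The Python sketch below shows one simple way to do it, a grid search. The slides do not state the search method, so this is an assumption, and `toy_impairment` is a made-up stand-in for Im (a delay term that grows with d plus a loss term that shrinks with d), not the authors' function.

```python
import math


def search_playout_delay(impairment, d_min_ms, d_max_ms, step_ms=1.0):
    """Return (d_opt, Im(d_opt)) minimizing impairment over a delay grid."""
    steps = int((d_max_ms - d_min_ms) / step_ms)
    candidates = [d_min_ms + k * step_ms for k in range(steps + 1)]
    best = min(candidates, key=impairment)
    return best, impairment(best)


def toy_impairment(d_ms):
    """Hypothetical stand-in for Im: delay cost rises, late-loss cost falls."""
    return 0.024 * d_ms + 40.0 * math.exp(-d_ms / 30.0)


if __name__ == "__main__":
    d_opt, im_min = search_playout_delay(toy_impairment, 0.0, 400.0)
    print(d_opt, round(im_min, 3))
```

A grid search is cheap here because the search runs once per talkspurt over a few hundred candidate delays at most.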
14
Perceptual-based Optimization Buffer Algorithm
For every packet i received, calculate network delay n_i
If mode == SPIKE then
    if n_i ≤ tail*old_d then mode = NORMAL
elseif n_i > head*d_i then
    mode = SPIKE; old_d = d_i
else
    - update delay records for the past W packets
endif
At the beginning of a talkspurt
If mode == SPIKE then
    d_i = n_i
else
    - obtain ( , , ) for Weibull distribution for the past W packets
    - search playout d which meets minimum Im criterion
endif
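The control flow of the algorithm can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' code. The exit test from SPIKE mode is taken to be n_i <= tail*old_d. `choose_delay` is a hypothetical callable standing in for the Weibull fit plus minimum-Im search. `initial_delay` is added so the spike test has a playout delay to compare against before the first talkspurt. The slides do not give values for `head`, `tail` or `W`, so they are left as parameters.

```python
from collections import deque


class PerceptualPlayoutBuffer:
    """Spike-aware playout delay control, adjusted once per talkspurt."""

    def __init__(self, head, tail, window, initial_delay, choose_delay):
        self.head = head                      # spike entry factor
        self.tail = tail                      # spike exit factor
        self.history = deque(maxlen=window)   # delays of the past W packets
        self.choose_delay = choose_delay      # hypothetical: history -> playout delay
        self.mode = "NORMAL"
        self.old_d = initial_delay
        self.d = initial_delay                # current playout delay d_i

    def on_packet(self, n_i):
        """Update the mode and the delay records for one received packet."""
        if self.mode == "SPIKE":
            if n_i <= self.tail * self.old_d:
                self.mode = "NORMAL"
        elif n_i > self.head * self.d:
            self.mode = "SPIKE"
            self.old_d = self.d
        else:
            self.history.append(n_i)

    def on_talkspurt_start(self, n_i):
        """Pick the playout delay for the talkspurt that is starting."""
        if self.mode == "SPIKE":
            self.d = n_i
        elif self.history:
            self.d = self.choose_delay(list(self.history))
        return self.d


if __name__ == "__main__":
    # max() is only a placeholder policy for the example run.
    buf = PerceptualPlayoutBuffer(4.0, 2.0, 5, 50.0, choose_delay=max)
    for delay in (40.0, 45.0, 50.0):
        buf.on_packet(delay)
    print(buf.on_talkspurt_start(50.0), buf.mode)
```

Passing the delay policy in as a callable keeps the spike state machine separate from the perceptual optimization, so the same skeleton can be tried with different selection rules.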
15
Performance Analysis and Comparison (1)
- Selected five traces from UoP to CU (USA), DUT (Germany), BUPT (China), and NC (China).
- Traces 1 and 3 have high delay variation; traces 2, 4 and 5 have low delay variation.

Trace   Delay (ms)   Jitter (ms)   Loss (%)
1       153          16.2          1.1
2       46           0.8           0.3
3       186          19.5          14.3
4       16           0.7           4.4
5       1500.2 [column boundaries not recoverable]
16
Performance Analysis and Comparison (2)
- The “p-optimum” algorithm achieves the optimum voice quality for all traces.
- The “adaptive” algorithm achieves sub-optimum quality with low complexity.
17
Conclusions and Future Work
Conclusions
- Developed a new methodology and regression models to predict voice quality non-intrusively.
- Demonstrated the application of the new non-intrusive voice quality prediction models to perceptual-based optimization of playout buffer algorithms.
Future Work
- To consider buffer adaptation during a talkspurt in order to achieve the best trade-off between delay, loss and end-to-end jitter.
- To extend the work to improve the performance of multimedia services (e.g. audio/image/video) over IP networks.
18
Contact Details
http://www.tech.plymouth.ac.uk/spmc
Dr. Lingfen Sun, L.Sun@plymouth.ac.uk
Prof. Emmanuel Ifeachor, E.Ifeachor@plymouth.ac.uk
Any questions? Thank you!