Published by Walter Hunt. Modified over 9 years ago.
1
Y(J) Stein VoP4 1 VOPVOP YJS Other Features
2
Echo Cancellation
3
Acoustic Echo
4
Line echo
[Diagram labels: Telephone 1, Telephone 2, hybrid]
5
Subjective reaction to echo
Required suppression (dB)   Round-trip delay (ms)
1.4                         0
11.1                        20
17.7                        40
22.7                        60
27.2                        80
30.9                        100
7
Subjective effect of 15 dB echo return loss
Percent difficulty   Decrease in MOS   Round-trip delay (ms)
0                    0                 0
30                   1.3               300
60                   2.0               600
60                   2.0               1200
8
Echo suppressor
[Diagram labels: comp, switch, inv, 4w]
In practice more is needed: VOX, override, reset, etc.
9
Why not echo suppression?
Echo suppression makes the conversation half duplex:
- Waste of full-duplex infrastructure
- Conversation unnatural
- Hard to break in
- Dead-sounding line
It would be better to cancel the echo: subtract the echo signal while allowing the desired signal through. But that requires DSP.
[Diagram labels: near end, far end, subtraction point]
10
Echo cancellation?
Unfortunately, it’s not so easy: the outgoing signal is delayed, attenuated, and distorted.
Two echo canceller architectures:
- Modem type
- Line echo canceller (LEC)
[Diagrams: each shows the near end, the far end, the echo path, a subtraction point, and the clean output]
11
LEC architecture
[Block diagram labels: near end, far end, hybrid, A/D, D/A, filter H, adapt, doubletalk detector, NLP, subtraction point, signals X and Y]
12
Adaptive Algorithms
How do we find the echo cancelling filter, and keep it correct even if the echo path parameters change?
We need an algorithm that continually changes the filter parameters.
All adaptive algorithms are based on the same idea: lack of correlation between the desired signal and the interference.
Let’s start with a simpler case: adaptive noise cancellation.
13
Noise cancellation
[Diagram labels: signal x, noise n, unknown gain h, correction gain e, output y, subtraction point]
14
Noise cancellation - cont.
Assume that the noise is distorted only by an unknown gain h.
We correct by transmitting e n, so that the audience hears
$y = x + h n - e n = x + (h - e) n$
The energy of this signal is
$E_y = \langle y^2 \rangle = \langle x^2 \rangle + (h - e)^2 \langle n^2 \rangle + 2 (h - e) \langle x n \rangle$
Assume that the signal and the noise are uncorrelated: $C_{xn} = \langle x n \rangle = 0$.
Then we need only set e to minimize $E_y$ (turn the knob until the energy is minimal).
Even if the distortion is a complete filter h, we set the ANC filter e to minimize $E_y$.
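The "turn the knob" idea can be tried numerically. The following is a minimal sketch, not the author's code: the signal and noise are synthetic Gaussian sequences, and the gain `h`, the candidate grid, and the function names are all assumptions made for illustration.

```python
import random

def output_energy(x, n, h, e):
    """Mean energy of y = x + (h - e) * n for a given knob setting e."""
    return sum((xi + (h - e) * ni) ** 2 for xi, ni in zip(x, n)) / len(x)

def best_knob(x, n, h, candidates):
    """Pick the knob setting that minimizes the output energy."""
    return min(candidates, key=lambda e: output_energy(x, n, h, e))

random.seed(0)
N = 5000
x = [random.gauss(0, 1) for _ in range(N)]   # desired signal (synthetic)
n = [random.gauss(0, 1) for _ in range(N)]   # noise, uncorrelated with x
h = 0.7                                      # unknown gain (hypothetical value)
candidates = [k / 100 for k in range(0, 151)]
e_best = best_knob(x, n, h, candidates)
print(e_best)  # lands close to h because x and n are uncorrelated
```

With finite data the sample correlation between x and n is small but not exactly zero, so the minimizing knob setting is near h rather than exactly equal to it.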
15
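The LMS algorithm
Gradient descent on the energy: the correction to H is proportional to the error times the input X
$H \leftarrow H + \lambda \, \epsilon \, X$
where $\epsilon$ is the error (the signal left after subtracting the estimated echo) and $\lambda$ is a small step size.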
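A minimal sketch of an LMS update loop follows, assuming a noise-free synthetic echo path with four taps and no near-end speech. The function name, the `step` parameter, and the echo path values are hypothetical; a practical canceller would also need step-size normalization and doubletalk protection.

```python
import random

def lms_echo_canceller(far, mic, taps, step):
    """Adapt an FIR filter h so that h applied to the far-end signal tracks the echo in mic."""
    h = [0.0] * taps
    buf = [0.0] * taps              # most recent far-end samples, newest first
    residual = []
    for x, y in zip(far, mic):
        buf = [x] + buf[:-1]
        echo_est = sum(hi * bi for hi, bi in zip(h, buf))
        err = y - echo_est          # what is left after cancelling
        # correction is proportional to error times input
        h = [hi + step * err * bi for hi, bi in zip(h, buf)]
        residual.append(err)
    return h, residual

random.seed(1)
echo_path = [0.0, 0.5, -0.3, 0.1]   # hypothetical echo path
far = [random.gauss(0, 1) for _ in range(4000)]
mic = [sum(echo_path[k] * far[i - k] for k in range(len(echo_path)) if i - k >= 0)
       for i in range(len(far))]
h, residual = lms_echo_canceller(far, mic, taps=4, step=0.01)
print([round(c, 3) for c in h])
```

The step size trades convergence speed against stability; the value used here is simply one that is small relative to the input power of this synthetic signal.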
16
Nonlinear processing
Because of finite numeric precision, the (linear) LEC filtering cannot completely remove the echo.
A standard LEC adds center clipping to remove the residual echo.
The clipping threshold needs to be properly set by adaptation.
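One simple form of center clipping zeroes every sample whose magnitude is below a threshold and passes the rest unchanged. The sketch below assumes that variant and a fixed, hypothetical threshold; it does not show how the threshold would be adapted.

```python
def center_clip(samples, threshold):
    """Zero samples with magnitude below threshold; pass larger samples unchanged."""
    return [0.0 if abs(s) < threshold else s for s in samples]

residual = [0.01, -0.02, 0.5, -0.7]      # small residual echo plus louder speech
clipped = center_clip(residual, 0.05)
print(clipped)
```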
17
Doubletalk detection
Adaptation of H should take place only when the far end speaks.
So we freeze adaptation when there is no far-end speech or during double-talk, that is, whenever the near end speaks.
The Geigel algorithm compares the absolute value of the near-end speech to half the maximum absolute value in the X buffer.
If the near-end value exceeds this far-end threshold, we can assume that the near end is speaking.
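The Geigel comparison can be sketched in a few lines. This is an illustrative version only: the class name, the window length, and the sample values are assumptions, and practical detectors usually add a hangover time that is not shown here.

```python
from collections import deque

class GeigelDetector:
    """Flag near-end speech when |near| exceeds a fraction of the recent far-end peak."""

    def __init__(self, window, threshold=0.5):
        self.far_history = deque(maxlen=window)   # recent |far-end| samples (the X buffer)
        self.threshold = threshold                # one half, as in the slide

    def update(self, far_sample, near_sample):
        self.far_history.append(abs(far_sample))
        return abs(near_sample) > self.threshold * max(self.far_history)

det = GeigelDetector(window=4)
echo_only = det.update(1.0, 0.3)     # near-end level is consistent with attenuated echo
doubletalk = det.update(0.8, 0.9)    # near-end level too high to be echo alone
print(echo_only, doubletalk)
```

When `update` returns True, the adaptation of H would be frozen for that sample.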
18
Data Relays
19
The need for relays
- Voice is a relatively forgiving signal (or rather, the ear is)
- Compression techniques are designed to pass voice but may hopelessly distort other signals
- Even simple tones (or DTMF) may not be passed by coders
- We could go back to 64 Kbps G.711 for non-voice signals
- But isn’t that silly? Using 64 Kbps for 64 bps or even 9.6 Kbps data?
The solution is to use a relay.
20
Open Channel
Reasons to use 64 Kbps G.711 (open channel) (32 Kbps ADPCM may work as well):
- Inexpensive
- Simple design
- Robust
Even open channel is not trivial!
- Need a dynamic BW mechanism
- Need to detect the event (fax/modem tone, DTMF, MF, CPT, etc.)
- Need to return to compressed voice (end of session, time-out)
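Detecting a single tone in a block of samples is often done with the Goertzel algorithm, which the slides do not name; it is used here only as one plausible way to illustrate event detection. The function names, the 50 ms block length, and the `ratio` threshold are assumptions. The 2100 Hz test frequency is that of the fax/modem answer tone.

```python
import math

def goertzel_power(samples, freq_hz, fs_hz):
    """Squared magnitude of the DFT of the block evaluated at freq_hz."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / fs_hz)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def tone_present(samples, freq_hz, fs_hz, ratio=0.5):
    """True if most of the block energy sits at freq_hz (ratio is a hypothetical threshold)."""
    total = sum(x * x for x in samples)
    if total == 0.0:
        return False
    # For a pure on-bin tone, goertzel_power is about total * N / 2.
    return goertzel_power(samples, freq_hz, fs_hz) >= ratio * total * len(samples) / 2.0

fs = 8000
block = 400                                   # 50 ms at 8 kHz
tone_2100 = [math.sin(2 * math.pi * 2100 * i / fs) for i in range(block)]
tone_1100 = [math.sin(2 * math.pi * 1100 * i / fs) for i in range(block)]
print(tone_present(tone_2100, 2100, fs), tone_present(tone_1100, 2100, fs))
```

Comparing the tone power to the total block energy is one simple guard against false alarms on speech; real detectors add duration and level checks as well.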
21
Tone / Fax / Modem Relay
[Diagram: Fax, Analog, A/D and D/A, 64 Kbps, Demodulate/Remodulate, PSN, Demodulate/Remodulate, 64 Kbps, A/D and D/A, Analog]
Problems:
- need highly accurate detectors
- need low false alarm rate
- need appropriate protocol
- need accurate timing
- need expensive DSP processing
- delay may be too large
- may need “spoofing”
- can the two sides operate with different parameters?
22
VoP DSP Architecture
[Block diagram labels: Multi Channel Codec, PCM Interface, Serial Port, LEC, Speech Coders, VAD, CNG, Tone Detector, Tone Generator, DISC., Relays, Voice Packet Module, Packet Voice Protocol, Playout Unit, Control, Real Time Operating System, PSN]
23
DSP VoP System Implementation
[Block diagram labels: PSTN, DSP, Voice Packet Module, Telephony Signaling Module, Microprocessor, Packet Protocol Module, Network Management Module, Voice, Signaling, NM info, Voice & Signaling Packets, ATM / FR / IP Network]
24
Quality of Service
25
The meaning of QoS
For general purpose data:
- Every little bit counts
  - only lossless compression
  - best effort delivery
- Real-time not essential
  - dynamic routing and packet reordering allowed
For speech:
- Only subjective quality counts
  - can use lossy compression
  - can drop segments with little effect
- Real-time essential
  - predetermined route preferable (traffic engineering)
26
PSTN QoS
- Virtually all calls (>95%) completed
- Once connected, virtually no disconnects or faults
- Toll quality voice
- Low delay (except satellite calls)
- Full switching, optimized routing
- Call management
- Fax/modem functions
- Wireline and wireless services
27
Paying for QoS
Law of Photonics:
- Price of transmitting a bit drops by half every 9 months
Free Internet telephony:
- Several firms offer free long distance service over the Internet
- Strong compression, significant delay and jitter
We no longer need to pay for service … but we are willing to pay for quality of service
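To get a feel for the halving rule, the relative price after a given time can be computed directly. This is plain arithmetic on the 9-month figure quoted in the slide; the function name is an assumption.

```python
def relative_price(months, halving_period=9.0):
    """Price of transmitting a bit relative to today, if it halves every halving_period months."""
    return 0.5 ** (months / halving_period)

after_three_years = relative_price(36)
print(after_three_years)   # four halvings in 36 months
```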
28
Paying for QoS
[Figure labels: wire service, mobile service, toll]
29
Speech Quality Measurement
30
Why does it sound the way it sounds?
PSTN:
- BW = 0.2-3.8 kHz, SNR > 30 dB
- PCM, ADPCM (BER $10^{-3}$)
- five nines reliability
- line echo cancellation
Voice over packet network:
- speech compression
- delay, delay variation, jitter
- packet loss/corruption/priority
- echo cancellation
31
Subjective Voice Quality
Old measures:
- 5/9
- DRT
- DAM
The modern scale:
- MOS
- DMOS
[Example rhyming word list: meet, neat, seat, feet, Pete, beat, heat]
32
MOS according to ITU P.800
Subjective Determination of Transmission Quality
Annex B: Absolute Category Rating (ACR)
Score   Listening Quality   Listening Effort
5       excellent           relaxed
4       good                attention needed
3       fair                moderate effort
2       poor                considerable effort
1       bad                 no meaning with feasible effort
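The "mean" in mean opinion score is a plain average of the listeners' category ratings. A minimal sketch follows, assuming a hypothetical panel of five listeners; the function name and the ratings are invented for illustration.

```python
ACR_LABELS = {5: "excellent", 4: "good", 3: "fair", 2: "poor", 1: "bad"}

def mean_opinion_score(ratings):
    """Average a list of ACR listening-quality ratings (integers 1 to 5)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for r in ratings:
        if r not in ACR_LABELS:
            raise ValueError(f"rating {r!r} is not on the 1-5 ACR scale")
    return sum(ratings) / len(ratings)

panel = [4, 4, 5, 3, 4]           # hypothetical listener ratings
mos = mean_opinion_score(panel)
print(mos)
```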
33
MOS according to ITU (cont)
Annex D: Degradation Category Rating (DCR)
Annex E: Comparison Category Rating (CCR)
ACR is not good at high quality speech
Score   DCR                 CCR
5       inaudible
4       not annoying
3       slightly annoying   much better
2       annoying            better
1       very annoying       slightly better
0                           the same
-1                          slightly worse
-2                          worse
-3                          much worse
34
Some MOS numbers
Effect of speech compression (from ITU-T Study Group 15):
Condition                                     MOS
Quiet room, 48 kHz 16 bit linear sampling     5.0
PCM (A-law/μ-law) 64 Kb/s                     4.1
G.723.1 @ 6.3 Kb/s                            3.9
G.729 @ 8 Kb/s                                3.9
ADPCM G.726 32 Kb/s                           3.8 (toll quality)
GSM @ 13 Kb/s                                 3.6
VSELP IS54 @ 8 Kb/s                           3.4
35
The Problem(s) with MOS
Accurate MOS tests are the only reliable benchmark, BUT:
- MOS tests are off-line
- MOS tests are slow
- MOS tests are expensive
- Different labs give consistently different results
- Most MOS tests only check one aspect of the system
36
The Problem(s) with SNR
Naive question: isn’t CCR the same as SNR?
SNR does not correlate well with subjective criteria.
Squared difference is not an accurate comparator; it is thrown off by:
- Gain
- Delay
- Phase
- Nonlinear processing
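A small numerical experiment shows why a squared-difference measure is a poor comparator. The sketch below uses a synthetic 1 kHz tone (an assumption; real tests use speech) and compares it with a copy delayed by one sample and a copy with inverted phase, neither of which a listener would hear as degraded.

```python
import math

def snr_db(reference, test):
    """SNR in dB, treating the sample-by-sample difference as noise."""
    signal = sum(r * r for r in reference)
    noise = sum((r - t) ** 2 for r, t in zip(reference, test))
    if noise == 0:
        return math.inf
    return 10 * math.log10(signal / noise)

fs = 8000
ref = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(800)]
delayed = [0.0] + ref[:-1]            # one sample (125 microseconds) late
inverted = [-r for r in ref]          # phase flipped

print(snr_db(ref, ref), snr_db(ref, delayed), snr_db(ref, inverted))
```

The delayed copy scores only a few dB and the inverted copy scores a negative SNR, even though both carry exactly the same waveform shape.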
37
Speech distance measures
Many objective measures have been proposed:
- Segmental SNR
- Itakura-Saito distance
- Euclidean distance in cepstrum space
- Bark spectral distortion
- Coherence function
None correlate well with MOS.
ITU target: find a quality measure that does correlate well.
38
Return to Biology
The standard speech model (LPC), used by most speech processing/compression/recognition systems, is a model of speech production.
Unfortunately, the speech production and perception systems are not matched.
Speech quality measurement idea: use a model of the human auditory system (perception).
- ITU-T P.861 Perceptual Speech Quality Measurement (PSQM)
- ITU-T P.862 Perceptual Evaluation of Speech Quality (PESQ)
- ITU-R BS.1387 Objective Measurements of Perceived Audio Quality
39
Some objective methods
- Perceptual Speech Quality Measurement (PSQM): ITU-T P.861
- Perceptual Analysis Measurement System (PAMS): BT proprietary technique
- Perceptual Evaluation of Speech Quality (PESQ): ITU-T P.862
- Objective Measurement of Perceived Audio Quality (PAQM): ITU-R BS.1387
- E-model: ITU-T G.107, G.108, ETSI ETR-250
40
Objective Quality Strategy
[Diagram labels: speech, channel, QM, QM to MOS, MOS estimate]
41
PSQM philosophy (from P.861)
[Block diagram: two Perceptual model blocks, each producing an Internal Representation; Audible Difference; Cognitive Model]
42
PSQM philosophy (cont)
Perceptual modelling (internal representation):
- Short time Fourier transform
- Frequency warping (telephone-band filtering, Hoth noise)
- Intensity warping
Cognitive modelling:
- Loudness scaling
- Internal cognitive noise
- Asymmetry
- Silent interval processing
PSQM values:
- 0 (no degradation) to 6.5 (maximum degradation)
Conversion to MOS:
- PSQM to MOS calibration using known references
- Equivalent Q values
43
Problems with PSQM
Designed for telephony grade speech codecs.
Doesn’t take network effects into account:
- filtering
- variable time delay
- localized distortions
Draft standard P.862 adds:
- transfer function equalization
- time alignment, delay skipping
- distortion averaging
44
PESQ philosophy (from P.862)
[Block diagram: Time Alignment; two Perceptual model blocks, each producing an Internal Representation; Audible Difference; Cognitive Model]
45
E-model
R factor: mouth-to-ear transmission quality model
$R = R_0 - I_s - I_d - I_e + A$
where
- $R_0$: effect of SNR
- $I_s$: effect of simultaneous impairments
- $I_d$: effect of delayed impairments
- $I_e$: effect of equipment distortion
- $A$: advantage of method (e.g. mobility of cellphone)
Defined in ITU-T G.107, G.108 and ETSI ETR-250
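The R factor is a simple sum of terms, which the sketch below computes directly. The conversion from R to an estimated MOS is not given in the slides; the cubic mapping used here is the one published in ITU-T G.107 and is included as an assumption about how the R factor would typically be reported. All input values are hypothetical.

```python
def r_factor(r0, i_s, i_d, i_e, a=0.0):
    """E-model rating: R = R0 - Is - Id - Ie + A."""
    return r0 - i_s - i_d - i_e + a

def r_to_mos(r):
    """Estimated MOS from R, using the mapping given in ITU-T G.107."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6

# Hypothetical impairment values, chosen only to exercise the formula.
r = r_factor(r0=94.0, i_s=1.0, i_d=3.0, i_e=10.0, a=0.0)
print(r, round(r_to_mos(r), 3))
```

Because the impairments simply subtract, the effect of a codec change or added delay on R can be estimated by adjusting the corresponding term alone.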
46
VQMon
- PSQM and PESQ are intrusive techniques
- PSQM and PESQ require on-line DSP processing
- Given the speech encoder, shouldn’t there be a connection between network parameters (e.g. packet loss, jitter) and speech quality?
- A nonintrusive technique has been developed based on the E-model
- Invented by AD Clark (Telchemy), accepted by ETSI TIPHON