Warped Linear Prediction
Concept: Warp the spectrum to emulate human perception, then perform linear prediction on the result
Approaches to warp the spectrum:
1. Fourier transform, warp, and transform back
2. Bank of overlapping band-pass filters. We saw this in one of the VAD algorithms
3. All-pass time-domain filters: all frequencies pass through with unchanged magnitude, but the phase (and hence the frequency axis) is warped
Why? To model human speech more closely, with smaller residues
Applications: Speech coding, recognition, synthesis
All-pass filter
A pole of an all-pass filter lies inside the unit circle and the matching zero lies outside
The magnitudes of the matching poles and zeros cancel along the unit circle
They lie on the same radius line, so the polar coordinate angle is the same
First-order all-pass filter transfer function:
$H(z) = \frac{B(z)}{A(z)} = \frac{z^{-1} - p_0^*}{1 - p_0 z^{-1}} = \frac{z^{-1} - s e^{-j\varphi}}{1 - s e^{j\varphi} z^{-1}}$, with pole $p_0 = s e^{j\varphi}$
For a real pole $p_0 = \lambda$: $H(z) = \frac{z^{-1} - \lambda}{1 - \lambda z^{-1}}$
Example: if $p_0 = \tfrac{1}{2} + \tfrac{1}{2}i$, then the zero is at $1/p_0^* = 1/(\tfrac{1}{2} - \tfrac{1}{2}i) = (\tfrac{1}{2} + \tfrac{1}{2}i)/\tfrac{1}{2} = 1 + i$
Higher-order all-pass filter:
$H(z) = \frac{a_n + a_{n-1} z^{-1} + a_{n-2} z^{-2} + \dots + a_1 z^{-n+1} + z^{-n}}{1 + a_1 z^{-1} + a_2 z^{-2} + \dots + a_{n-1} z^{-n+1} + a_n z^{-n}}$
Note: $p_0^*$ is the complex conjugate of $p_0$
[Figure: z-plane diagram with labels $p_0$, $p_0^*$, $\varphi$, $s$, $r$]
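The pole/zero example above can be checked numerically. The following is a minimal sketch, not part of the original slides; the helper name `allpass_response` is hypothetical. It evaluates the first-order transfer function on the unit circle and confirms that the magnitude stays at 1 and that the zero shares the pole's angle.

```python
import cmath
import math

def allpass_response(pole, omega):
    """Evaluate H(z) = (z^-1 - conj(p)) / (1 - p z^-1) at z = e^{j omega}."""
    z_inv = cmath.exp(-1j * omega)
    return (z_inv - pole.conjugate()) / (1 - pole * z_inv)

pole = 0.5 + 0.5j                 # example pole from the slide
zero = 1 / pole.conjugate()       # matching zero, outside the unit circle

# Magnitude sampled at eight frequencies around the unit circle
magnitudes = [abs(allpass_response(pole, 2 * math.pi * k / 8)) for k in range(8)]

print("zero:", zero)
print("pole angle:", cmath.phase(pole), "zero angle:", cmath.phase(zero))
print("magnitudes:", magnitudes)
```

Any pole strictly inside the unit circle can be substituted; the magnitudes should remain 1 up to rounding error.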
All pass Filter Visualization
All-pass Filter Phase Response
Real coefficients
– λ controls the location of the pole (p) and the zero (1/p)
– At frequencies 0, π, and 2π there is no phase shift beyond that of a pure signal delay
Complex coefficients
– Similar phase responses
– Coefficients alter the diagonal crossing frequency $f_x$: $f_x = \frac{f_s}{2\pi}\arccos(\lambda)$, where $f_s$ is the sampling rate
– Phase response (expressed as a lag): $\omega + 2\arctan\left(\frac{\lambda\sin\omega}{1 - \lambda\cos\omega}\right)$
[Plot: phase response over 0 to 2π for λ = 0.8]
Note: At the crossover point the frequency resolution is unchanged (no warping); the filter acts only as a delay
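As a rough numerical check of the crossover frequency, the sketch below evaluates the phase lag of the real-coefficient all-pass filter and estimates its slope at $f_x$. The sampling rate of 16 kHz and the function names are assumptions chosen for illustration. A slope of 1 means the filter behaves locally like a one-sample delay, which is the sense in which no warping occurs there.

```python
import math

def phase_lag(omega, lam):
    """Phase lag of D(z) = (z^-1 - lam) / (1 - lam z^-1) at z = e^{j omega}."""
    return omega + 2.0 * math.atan(lam * math.sin(omega) / (1.0 - lam * math.cos(omega)))

def crossover_hz(lam, fs):
    """Crossover frequency f_x = fs / (2 pi) * arccos(lam)."""
    return fs / (2.0 * math.pi) * math.acos(lam)

lam = 0.8          # value used in the slide's plot
fs = 16000.0       # hypothetical sampling rate in Hz

f_x = crossover_hz(lam, fs)
w_x = 2.0 * math.pi * f_x / fs

# Central-difference estimate of d(phase lag)/d(omega), i.e. the group delay in samples
h = 1e-6
slope = (phase_lag(w_x + h, lam) - phase_lag(w_x - h, lam)) / (2.0 * h)

print("crossover frequency (Hz):", f_x)
print("group delay at crossover (samples):", slope)
```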
Frequency Warping
All-pass filter: the magnitude remains constant, but the phase and frequency are warped
Group delay
– Definition: the rate of change of phase with respect to frequency
– Interpretation: Different frequencies pass through a filter at different speeds. Therefore, a frequency warping operation occurs.
– Formula: $\omega' = \arctan\left(\frac{(1-\lambda^2)\sin\omega}{(1+\lambda^2)\cos\omega - 2\lambda}\right)$
  where $\omega$ is the angle of the original frequency, $\omega'$ is the angle of the warped frequency, and λ is the all-pass coefficient
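The warping map can be tabulated directly. This sketch (function names are hypothetical) computes the warped angle from the closed-form arctangent, using `atan2` so the result stays in the range 0 to π, and compares it with the equivalent phase-lag form of the first-order all-pass filter $(z^{-1} - \lambda)/(1 - \lambda z^{-1})$.

```python
import math

def warp_closed_form(omega, lam):
    """Warped angle from tan(w') = (1 - lam^2) sin(w) / ((1 + lam^2) cos(w) - 2 lam)."""
    num = (1.0 - lam * lam) * math.sin(omega)
    den = (1.0 + lam * lam) * math.cos(omega) - 2.0 * lam
    return math.atan2(num, den)   # atan2 keeps the result in the correct quadrant

def warp_phase_form(omega, lam):
    """Same map written as the phase lag of the first-order all-pass filter."""
    return omega + 2.0 * math.atan(lam * math.sin(omega) / (1.0 - lam * math.cos(omega)))

lam = 0.57   # Bark-like value quoted in the slides
grid = [math.pi * k / 16 for k in range(1, 16)]
table = [(w, warp_closed_form(w, lam), warp_phase_form(w, lam)) for w in grid]

for w, closed, phase in table:
    print(f"w = {w:.3f}  closed form = {closed:.3f}  phase form = {phase:.3f}")
```

With a positive λ the low end of the axis is stretched, so low frequencies occupy more of the warped axis.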
Illustration
Application to LPC
Warping to match the human auditory system
– $\lambda = \left(\frac{2}{\pi}\arctan(f_s/100)\right)^{1/2}$
– Significant at higher sampling rates: > 8 kHz
– CELP coding, Degradation Mean Opinion Score (DMOS): 0.3 < λ < 0.4
Best Bark scale match: λ = 0.57
Modified LPC: $x'_n = (d * f)_n$; $y_n \approx \sum_{k=1}^{N} a_k x'_{n-k}$
– Convolve the frame, f, with the all-pass filter, d
– Apply linear prediction to the frequency-warped signal
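One common way to realise warped linear prediction is to replace each unit delay in the autocorrelation computation with a first-order all-pass section and then solve the usual normal equations. The sketch below follows that formulation, which is an assumption and may differ from the exact procedure the slides have in mind. The test frame, the 16 kHz sampling rate, the order, and all function names are hypothetical.

```python
import math
import random

def allpass(x, lam):
    """First-order all-pass D(z) = (z^-1 - lam) / (1 - lam z^-1), zero initial state."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = -lam * xn + x_prev + lam * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y

def warped_autocorr(frame, order, lam):
    """r[k] = sum_n frame[n] * (D^k frame)[n] for k = 0..order."""
    r, delayed = [], list(frame)
    for _ in range(order + 1):
        r.append(sum(a * b for a, b in zip(frame, delayed)))
        delayed = allpass(delayed, lam)
    return r

def levinson(r, order):
    """Levinson-Durbin recursion; returns predictor coefficients a[1..order] and error energy."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err

rng = random.Random(0)                     # fixed seed for a repeatable test frame
fs = 16000.0                               # hypothetical sampling rate
frame = [math.sin(2 * math.pi * 440 * n / fs) + 0.1 * rng.gauss(0, 1) for n in range(256)]

order, lam = 4, 0.57                       # lam = 0 gives ordinary LPC
r = warped_autocorr(frame, order, lam)
coeffs, err = levinson(r, order)
print("warped LPC coefficients:", coeffs)
print("prediction error energy:", err)
```

Setting `lam = 0.0` turns the all-pass section into a plain unit delay, so the same code reduces to the standard autocorrelation method.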
Evaluation
Extra processing is minimal
The LPC estimate is more accurate than when warping is not used
For coding operations
– Save one bit per sample at 48 kHz and 32 kHz
– Save 0.6 bits per sample at 16 kHz
– Save 0.3 bits per sample at 8 kHz
Less peaky residue spectrum than standard methods
Insignificant improvement for more than 30 LPC coefficients
Matlab Toolbox:
Inverse LPC Filter
Transfer function: $Y(z) = H(z) X(z)$
– $X(z)$ is the original signal
– $H(z)$ is the LPC (prediction-error) filter, $H(z) = \left(1 - \sum_{i=1}^{P} a_i z^{-i}\right)/G$
– $Y(z)$ is the filtered signal (residue)
Inverse filter: $Y(z)/H(z) = X(z)$
– $Y(z)$ is the filtered output
– $1/H(z) = G/\left(1 - \sum_{i=1}^{P} a_i z^{-i}\right)$ is the all-pole inverse of the LPC filter
– $X(z)$ is the restored signal
Filter the residue with $1/H(z)$ to restore the original signal
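The round trip from signal to residue and back can be illustrated with a short sketch. It assumes a gain of G = 1, uses the prediction-error form $1 - \sum a_i z^{-i}$ to produce the residue, and uses the matching all-pole form to restore the signal. The coefficient values and function names are hypothetical.

```python
import math

def lpc_residue(x, a):
    """Prediction-error filter: e[n] = x[n] - sum_i a[i] * x[n-1-i]."""
    e = []
    for n in range(len(x)):
        pred = sum(a[i] * x[n - 1 - i] for i in range(len(a)) if n - 1 - i >= 0)
        e.append(x[n] - pred)
    return e

def lpc_restore(e, a):
    """All-pole inverse: x[n] = e[n] + sum_i a[i] * x[n-1-i]."""
    x = []
    for n in range(len(e)):
        pred = sum(a[i] * x[n - 1 - i] for i in range(len(a)) if n - 1 - i >= 0)
        x.append(e[n] + pred)
    return x

a = [1.2, -0.5]                                  # hypothetical stable predictor
signal = [math.sin(0.2 * n) for n in range(50)]

residue = lpc_residue(signal, a)
restored = lpc_restore(residue, a)
max_error = max(abs(s - t) for s, t in zip(signal, restored))
print("largest reconstruction error:", max_error)
```

Because the restoring filter is recursive, it only behaves well when the predictor is stable, that is, when the poles of the all-pole filter lie inside the unit circle.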
Click Detection using WLP
Definition: A click is a short localized discontinuity, typically less than 1 ms, which corrupts a signal
Click detection with both warped and standard linear prediction
– LPC: $y_k = \sum_{n=1}^{P} a_n x_{k-n} + r_k + c_k$
– $r_k$ is the residue and $c_k$ is the energy introduced by clicks
– Looking for spikes ($c_k$) locates the click points
The warped linear prediction coefficient: λ
– A value of 0.0 reverts to standard linear prediction
– Positive values increase low-frequency resolution
– Negative values increase high-frequency resolution
Click Detection Algorithm
Compute the standard deviation (σ) of the audio signal's LPC residue (i.e., the amount of residue that we expect to remain)
FOR each frame
– Perform the linear prediction with various λ values
– Consider a click present in the frame when the residue magnitude exceeds the threshold Kσ, where K is an empirically set gain factor
– Approach 1
  Throw away frames determined to contain clicks
  Disadvantage: some distortion is present
– Approach 2
  Use interpolation to smooth the clicks out of the residue signal
  Restore signal: filter the residue with the inverse LPC filter
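The thresholding and interpolation steps can be sketched as follows. This is a simplified illustration that works sample by sample on a residue that has already been computed. The synthetic residue, the value K = 3, the use of linear interpolation, and the function names are all assumptions.

```python
import statistics

def detect_clicks(residue, k_gain):
    """Return indices where |residue| exceeds the threshold K * sigma."""
    sigma = statistics.pstdev(residue)
    threshold = k_gain * sigma
    return [n for n, r in enumerate(residue) if abs(r) > threshold]

def interpolate_clicks(residue, clicks):
    """Replace flagged samples by linear interpolation between unflagged neighbours."""
    repaired = list(residue)
    flagged = set(clicks)
    size = len(residue)
    for i in clicks:
        left, right = i - 1, i + 1
        while left >= 0 and left in flagged:
            left -= 1
        while right < size and right in flagged:
            right += 1
        if left < 0 and right >= size:
            repaired[i] = 0.0                    # nothing clean to interpolate from
        elif left < 0:
            repaired[i] = residue[right]         # hold the nearest clean sample
        elif right >= size:
            repaired[i] = residue[left]
        else:
            t = (i - left) / (right - left)
            repaired[i] = (1.0 - t) * residue[left] + t * residue[right]
    return repaired

# Synthetic residue: low-level alternating samples with one large spike
residue = [0.1 if n % 2 == 0 else -0.1 for n in range(100)]
residue[50] = 5.0

clicks = detect_clicks(residue, k_gain=3.0)
repaired = interpolate_clicks(residue, clicks)
print("click positions:", clicks)
print("repaired sample:", repaired[50])
```

After repair, the residue would be passed through the inverse LPC filter to restore the signal, as described in Approach 2.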
Does WLP have an effect?
Prediction gain (improvement in signal-to-noise ratio)
– Divide clean signal energy by residue energy
– Note: The residue is computed by applying WLP to the noisy signal
– The higher the result, the better the detection
– $G_p = 10\log_{10}\left(\sum_{n=1}^{N}|x_n|^2 \Big/ \sum_{n=1}^{N}|r_n|^2\right)$
Experiment
– 44 kHz sample rate, 215 frames of 1024 samples, musical signal corrupted with known click points, λ values varied between -0.8 and +0.8
– Result: the choice of λ affects the ratio between clean signal and residue with clicks
[Plot: $G_p$ versus λ]
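The prediction gain formula translates directly into code. The sketch below is a minimal illustration; the function name and the toy input values are hypothetical.

```python
import math

def prediction_gain_db(clean, residue):
    """G_p = 10 log10( sum |x_n|^2 / sum |r_n|^2 ), in decibels."""
    signal_energy = sum(abs(x) ** 2 for x in clean)
    residue_energy = sum(abs(r) ** 2 for r in residue)
    return 10.0 * math.log10(signal_energy / residue_energy)

clean = [2.0, -2.0, 2.0, -2.0]        # toy clean signal
residue = [0.2, -0.2, 0.2, -0.2]      # toy residue, ten times smaller in amplitude

gain = prediction_gain_db(clean, residue)
print("prediction gain (dB):", gain)
```

An amplitude ratio of ten corresponds to an energy ratio of one hundred, so the toy example yields a gain of about 20 dB.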
Experiment
Approach 1: Throw away click frames
Approach 2: Interpolate click frames
Results:
– Both LPC and WLP can detect clicks
– WLP with warping coefficient -0.7 reduces false detects
– LPC and WLP miss approximately the same number of clicks