Warped Linear Prediction
Concept: Warp the spectrum to emulate human perception, then perform linear prediction on the result.
Approaches to warp the spectrum:
1. Fourier transform, warp, and transform back
2. Bank of overlapping band-pass filters (we saw this in one of the VAD algorithms)
3. All-pass time-domain filters: all frequencies pass through at unchanged magnitude, but the phase, and with it the frequency axis, is warped
Why? To model human speech more closely, ideally with smaller residues.
Applications: speech coding, recognition, synthesis

All-pass Filter
A pole of an all-pass filter lies inside the unit circle and the matching zero lies outside it.
– The pole and zero lie on the same radius line, so their polar angle is the same.
– The magnitude contributions of each matching pole and zero cancel along the unit circle, so $|H(e^{jw})| = 1$.
First-order all-pass filter transfer function, with pole $p_0 = s e^{j\varphi}$:
$$H(z) = \frac{B(z)}{A(z)} = \frac{z^{-1} - p_0^*}{1 - p_0 z^{-1}} = \frac{z^{-1} - s e^{-j\varphi}}{1 - s e^{j\varphi} z^{-1}}$$
For a real pole $p_0 = \lambda$ this reduces to
$$H(z) = \frac{z^{-1} - \lambda}{1 - \lambda z^{-1}}$$
Example: if $p_0 = \tfrac12 + \tfrac12 i$, then the zero is at $1/p_0^* = 1/(\tfrac12 - \tfrac12 i) = (\tfrac12 + \tfrac12 i)/\tfrac12 = 1 + i$.
Higher-order all-pass filter:
$$H(z) = \frac{a_n + a_{n-1} z^{-1} + a_{n-2} z^{-2} + \dots + a_1 z^{-n+1} + z^{-n}}{1 + a_1 z^{-1} + a_2 z^{-2} + \dots + a_{n-1} z^{-n+1} + a_n z^{-n}}$$
Note: $p_0^*$ is the complex conjugate of $p_0$.
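The all-pass property can be checked numerically. The sketch below evaluates the first-order transfer function on the unit circle for the example pole p0 = ½ + ½i; the function name `allpass_response` and the choice of 16 test frequencies are illustrative assumptions, not part of the slides.

```python
import cmath
import math

def allpass_response(p0, w):
    """H(e^{jw}) for H(z) = (z^-1 - conj(p0)) / (1 - p0 * z^-1)."""
    z_inv = cmath.exp(-1j * w)
    return (z_inv - p0.conjugate()) / (1 - p0 * z_inv)

p0 = 0.5 + 0.5j                 # example pole from the slide
zero = 1 / p0.conjugate()       # matching zero, mirrored outside the unit circle

# Magnitude at 16 evenly spaced frequencies around the unit circle
magnitudes = [abs(allpass_response(p0, 2 * math.pi * k / 16)) for k in range(16)]

print("zero:", zero)
print("max deviation from unit gain:", max(abs(m - 1.0) for m in magnitudes))
```

The pole and zero share the same angle (45 degrees here), and the gain stays at one for every frequency tested, which is why only the phase carries the warping.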

All pass Filter Visualization

All-pass Filter Phase Response
Real coefficients
– λ controls the location of the pole (p = λ) and the zero (1/p).
– At frequencies 0, π, and 2π there is no phase shift beyond that of a unit delay.
Complex coefficients
– Similar phase responses
The coefficient alters the diagonal crossing frequency $f_x$:
$$f_x = \frac{f_s}{2\pi} \arccos(\lambda)$$
where $f_s$ is the sampling rate.
Phase response (phase lag $w'$ at input frequency $w$; the figure uses λ = 0.8):
$$w' = w + 2\arctan\left(\frac{\lambda \sin w}{1 - \lambda \cos w}\right)$$
Note: The crossover point is where the group delay equals one sample, so there is no local frequency warping there, only a delay.
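As a rough numerical check of the phase response and the crossover frequency, the sketch below compares the closed-form phase lag with the phase of the real first-order all-pass (z^-1 - λ)/(1 - λz^-1), and evaluates the group delay at f_x. The sampling rate of 16 kHz and all function names are assumptions chosen for illustration; the group-delay expression is the derivative of the phase lag with respect to w.

```python
import cmath
import math

def allpass(lam, w):
    """D(e^{jw}) for the real first-order all-pass (z^-1 - lam) / (1 - lam * z^-1)."""
    z_inv = cmath.exp(-1j * w)
    return (z_inv - lam) / (1 - lam * z_inv)

def phase_lag(lam, w):
    """Closed form: w + 2 * arctan(lam * sin(w) / (1 - lam * cos(w)))."""
    return w + 2 * math.atan(lam * math.sin(w) / (1 - lam * math.cos(w)))

def group_delay(lam, w):
    """Group delay in samples: derivative of the phase lag with respect to w."""
    return (1 - lam ** 2) / (1 + lam ** 2 - 2 * lam * math.cos(w))

def crossover_hz(lam, fs):
    """Crossover frequency f_x = fs / (2 * pi) * arccos(lam)."""
    return fs / (2 * math.pi) * math.acos(lam)

lam, fs = 0.8, 16000.0          # fs is an assumed value for this example
f_x = crossover_hz(lam, fs)
w_x = 2 * math.pi * f_x / fs

print("crossover frequency (Hz):", f_x)
print("group delay at crossover (samples):", group_delay(lam, w_x))
```

Below f_x the group delay is larger than one sample (frequencies are stretched apart) and above it the delay is smaller than one sample (frequencies are squeezed together) when λ is positive.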

Frequency Warping
All-pass filter: the magnitude remains constant, but the phase, and therefore the frequency axis, is warped.
Group delay
– Definition: the negative rate of change of phase with respect to frequency
– Interpretation: different frequencies pass through the filter with different delays, so a frequency warping operation occurs.
– Formula:
$$w' = \arctan\left(\frac{(1 - \lambda^2)\sin w}{(1 + \lambda^2)\cos w - 2\lambda}\right)$$
where w is the angle of the original frequency, w' is the angle of the warped frequency, and λ is the all-pass coefficient. The arctangent is taken in the quadrant given by the signs of the numerator and denominator, so that w' stays in [0, π].
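The warping map can be tabulated directly. The sketch below uses the four-quadrant arctangent form, written with (1 + λ²) in the denominator as it follows from the first-order all-pass (z^-1 - λ)/(1 - λz^-1), and cross-checks it against the equivalent two-term form w + 2·arctan(λ sin w / (1 - λ cos w)). The helper names, the 16 kHz sampling rate, and the sample frequencies are assumptions for illustration.

```python
import math

def warp(w, lam):
    """Warped angle w' via the four-quadrant arctangent (keeps w' in [0, pi])."""
    return math.atan2((1 - lam ** 2) * math.sin(w),
                      (1 + lam ** 2) * math.cos(w) - 2 * lam)

def warp_two_term(w, lam):
    """Equivalent form: w + 2 * arctan(lam * sin(w) / (1 - lam * cos(w)))."""
    return w + 2 * math.atan(lam * math.sin(w) / (1 - lam * math.cos(w)))

def warp_hz(f, lam, fs):
    """Map a frequency in Hz to its warped position in Hz."""
    return warp(2 * math.pi * f / fs, lam) * fs / (2 * math.pi)

fs = 16000.0                     # assumed sampling rate
for f in (250.0, 1000.0, 4000.0):
    print(f, "Hz ->", round(warp_hz(f, 0.57, fs), 1), "Hz (lam = 0.57),",
          round(warp_hz(f, -0.57, fs), 1), "Hz (lam = -0.57)")
```

With a positive λ the low frequencies are pushed upward and spread out, which is the behaviour wanted for a Bark-like scale; a negative λ does the opposite, and λ = 0 leaves the axis unchanged.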

Illustration

Application to LPC
Warping to match the human auditory system
– $\lambda = \left(\frac{2}{\pi}\arctan\left(\frac{f_s}{100}\right)\right)^{1/2}$
– Significant at higher sampling rates: > 8 kHz
– CELP coding, Degradation Mean Opinion Score (DMOS): 0.3 < λ < 0.4
Best Bark scale match: λ = 0.57
Modified LPC: $x'_n = (d * f)_n$; $y_n \approx \sum_{k=1}^{N} a_k x'_{n-k}$
– Convolve the frame, f, with the all-pass filter, d
– Apply linear prediction to the frequency-warped signal
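One common way to realize warped linear prediction is the autocorrelation method with each unit delay replaced by an all-pass section; the slides do not say which formulation they use, so the following is only a minimal sketch under that assumption. The function names, the test frame, and the prediction order are all hypothetical choices.

```python
def allpass_filter(x, lam):
    """y[n] = -lam*x[n] + x[n-1] + lam*y[n-1], i.e. D(z) = (z^-1 - lam)/(1 - lam*z^-1)."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = -lam * xn + x_prev + lam * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y

def warped_autocorrelation(frame, order, lam):
    """r[k] = sum_n frame[n] * (frame passed through k all-pass sections)[n]."""
    r, delayed = [], list(frame)
    for _ in range(order + 1):
        r.append(sum(a * b for a, b in zip(frame, delayed)))
        delayed = allpass_filter(delayed, lam)
    return r

def levinson_durbin(r, order):
    """Solve the normal equations; returns predictor coefficients and error energy."""
    a, err = [0.0] * order, r[0]
    for i in range(order):
        k = (r[i + 1] - sum(a[j] * r[i - j] for j in range(i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a, err = new_a, err * (1.0 - k * k)
    return a, err

def warped_lpc(frame, order, lam):
    return levinson_durbin(warped_autocorrelation(frame, order, lam), order)

frame = [0.9 ** n for n in range(200)]        # decaying test frame (assumed data)
plain_coeffs, plain_err = warped_lpc(frame, 2, 0.0)     # lam = 0: ordinary LPC
warped_coeffs, warped_err = warped_lpc(frame, 2, 0.57)  # Bark-like warping
print(plain_coeffs, warped_coeffs)
```

With λ = 0 the all-pass section is a plain unit delay, so the routine falls back to standard autocorrelation LPC. Note that the warped coefficients describe a predictor built from all-pass sections, so they cannot be dropped into an ordinary direct-form filter unchanged.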

Evaluation
Extra processing is minimal.
The LPC estimate is more accurate than when warping is not used.
For coding operations
– Save one bit per sample at 48 kHz and 32 kHz
– Save 0.6 bits per sample at 16 kHz
– Save 0.3 bits per sample at 8 kHz
Less peaky residue spectrum than standard methods
Insignificant improvement for more than 30 LPC coefficients
Matlab Toolbox:

Inverse LPC Filter
Transfer function: $Y(z) = H(z) X(z)$
– $X(z)$ is the original signal
– $H(z)$ is the LPC prediction-error filter, $H(z) = 1 - \sum_{i=1}^{P} a_i z^{-i}$
– $Y(z)$ is the filtered signal (residue)
Inverse filter: $X(z) = Y(z) / H(z)$
– $Y(z)$ is the filtered output (residue)
– $1/H(z)$ is the all-pole LPC filter $G / (1 - \sum_{i=1}^{P} a_i z^{-i})$, with $G = 1$ when the residue is not normalized
– $X(z)$ is the restored signal
Filter the residue with $1/H(z)$ to restore the original signal.
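The round trip from signal to residue and back can be sketched in a few lines, taking H(z) to be the prediction-error filter 1 - Σ a_i z^-i (an assumption about the intended reading of the slide). The coefficient values and the short test signal are made up for illustration.

```python
def analysis_filter(x, a):
    """Residue r[n] = x[n] - sum_i a[i] * x[n-1-i]  (H(z) = 1 - sum a_i z^-i)."""
    r = []
    for n in range(len(x)):
        pred = sum(a[i] * x[n - 1 - i] for i in range(len(a)) if n - 1 - i >= 0)
        r.append(x[n] - pred)
    return r

def synthesis_filter(r, a):
    """Inverse filter 1/H(z): x[n] = r[n] + sum_i a[i] * x[n-1-i]."""
    x = []
    for n in range(len(r)):
        pred = sum(a[i] * x[n - 1 - i] for i in range(len(a)) if n - 1 - i >= 0)
        x.append(r[n] + pred)
    return x

signal = [0.0, 1.0, 0.5, -0.25, 0.75, -1.0, 0.3, 0.1]   # assumed test data
a = [0.5, -0.2]                                          # assumed LPC coefficients
residue = analysis_filter(signal, a)
restored = synthesis_filter(residue, a)
print(max(abs(s - t) for s, t in zip(signal, restored)))
```

Because the synthesis filter is the exact inverse of the analysis filter, any modification made to the residue (for example, removing a click) carries through to the restored signal.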

Click Detection using WLP
Definition: A click is a short, localized discontinuity, typically less than 1 ms long, which corrupts a signal.
Click detection with both warped and standard linear prediction
– LPC: $y_k = \sum_{n=1}^{P} a_n x_{k-n} + r_k + c_k$
– $r_k$ is the residue and $c_k$ is the energy introduced by clicks
– Looking for spikes ($c_k$) locates the click points
The warped linear prediction coefficient: λ
– A value of 0.0 reverts to standard linear prediction
– Positive values increase low-frequency resolution (as with the positive Bark-scale value λ = 0.57)
– Negative values increase high-frequency resolution

Click Detection Algorithm
Compute the standard deviation (σ) of the audio signal's LPC residue (i.e., the amount of residue that we expect to remain)
FOR each frame
– Perform the linear prediction with various λ values
– Consider a click present in the frame when the residue magnitude exceeds the threshold K σ, where K is an empirically set gain factor
– Approach 1: Throw away frames determined to contain clicks
  Disadvantage: some distortion is present
– Approach 2: Use interpolation to smooth the clicks out of the residue signal
  Restore the signal: filter the residue with the inverse LPC filter
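The thresholding and interpolation steps can be sketched on a toy signal. This example uses ordinary (unwarped) prediction with a predictor that is assumed to be known exactly, namely the two-tap predictor of a pure sinusoid; in practice the coefficients would be estimated per frame and per λ. The gain K = 3, the signal, and every function name are illustrative assumptions.

```python
import math
import statistics

def residue_of(x, a):
    """r[n] = x[n] - sum_i a[i] * x[n-1-i]."""
    return [x[n] - sum(a[i] * x[n - 1 - i] for i in range(len(a)) if n - 1 - i >= 0)
            for n in range(len(x))]

def resynthesize(r, a):
    """Inverse LPC filter: x[n] = r[n] + sum_i a[i] * x[n-1-i]."""
    x = []
    for n in range(len(r)):
        x.append(r[n] + sum(a[i] * x[n - 1 - i] for i in range(len(a)) if n - 1 - i >= 0))
    return x

def detect_clicks(r, gain):
    """Flag samples whose residue magnitude exceeds gain * sigma."""
    sigma = statistics.pstdev(r)
    return [n for n, rn in enumerate(r) if abs(rn) > gain * sigma]

def interpolate(r, flagged):
    """Approach 2: replace flagged residue samples by linear interpolation."""
    out, bad = list(r), set(flagged)
    for n in flagged:
        lo, hi = n - 1, n + 1
        while lo in bad:
            lo -= 1
        while hi in bad:
            hi += 1
        left = r[lo] if lo >= 0 else 0.0
        right = r[hi] if hi < len(r) else 0.0
        out[n] = left + (right - left) * (n - lo) / (hi - lo)
    return out

w = 0.2
clean = [math.sin(w * n) for n in range(400)]
noisy = list(clean)
noisy[200] += 1.0                         # synthetic click
a = [2 * math.cos(w), -1.0]               # exact predictor for a sinusoid (assumed known)
r = residue_of(noisy, a)
clicks = detect_clicks(r, gain=3.0)
restored = resynthesize(interpolate(r, clicks), a)
print("flagged samples:", clicks)
```

A single corrupted sample disturbs the residue for as many samples as the predictor has taps, so the click shows up as a short burst rather than one spike. Real audio leaves a non-zero residue everywhere, which is why K has to be tuned empirically.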

Does WLP have an effect?
Prediction gain (improvement in signal-to-noise ratio)
– Divide clean signal energy by residue energy
– Note: The residue is computed by applying WLP to the noisy signal
– The higher the result, the better the detection
– $G_p = 10 \log_{10}\left(\sum_{n=1}^{N} |x_n|^2 \Big/ \sum_{n=1}^{N} |r_n|^2\right)$
Experiment
– 44 kHz sample rate, 215 frames of 1024 samples, musical signal corrupted with known click points, λ values varied between -0.8 and +0.8
– Result: the choice of λ affects the ratio between the clean signal and the residue with clicks
[Figure: G_p plotted against λ]
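The prediction gain is a one-line computation once the clean signal and the residue are available. The sketch below assumes both are equal-length sequences of samples and that the logarithm is base 10 (decibels); the function name and the toy inputs are illustrative.

```python
import math

def prediction_gain_db(clean, residue):
    """G_p = 10 * log10(sum |x_n|^2 / sum |r_n|^2), in decibels."""
    signal_energy = sum(abs(x) ** 2 for x in clean)
    residue_energy = sum(abs(r) ** 2 for r in residue)
    return 10.0 * math.log10(signal_energy / residue_energy)

# Toy values: signal energy is 100 times the residue energy
gain = prediction_gain_db([10.0, 0.0], [1.0, 0.0])
print(gain)
```

In an experiment like the one described, this function would be evaluated once per λ value, with the residue taken from WLP applied to the click-corrupted signal.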

Experiment
Approach 1: Throw away click frames
Approach 2: Interpolate click frames
Results:
– Both LPC and WLP can detect clicks
– WLP with warping coefficient -0.7 reduces false detections
– LPC and WLP miss approximately the same number of clicks