More On Linear Predictive Analysis


More On Linear Predictive Analysis (Lecturer: 虞台文)

Contents: Linear Prediction Error; Computation of the Gain; Frequency Domain Interpretation of LPC; Representations of LPC Coefficients: Direct Representation, Roots of Predictor Polynomials, PARCOR Coefficients, Log Area Ratio Coefficients, Line Spectrum Pair.

More On Linear Predictive Analysis Linear Prediction Error

LPC Error
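In standard LPC notation (with predictor coefficients a_k and inverse filter A(z) = 1 - sum of a_k z^{-k}), the prediction error and its short-time energy are:

$$e(n) = s(n) - \sum_{k=1}^{p} a_k\, s(n-k), \qquad E_n = \sum_{m} e_n^2(m),$$

where the sum over m runs over the analysis frame.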

Examples: the prediction error could be used for pitch detection (examples shown for pre-emphasized speech signals).
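As an illustration of the pitch-detection remark, here is a minimal sketch (my own, not from the lecture; function names and parameter values such as the 60-400 Hz search range are illustrative) that inverse-filters a frame and picks the strongest peak of the residual autocorrelation:

```python
import numpy as np

def lpc_autocorr(frame, p):
    """Autocorrelation-method LPC via Levinson-Durbin.
    Returns a_1..a_p for the convention A(z) = 1 - sum_k a_k z^{-k}."""
    r = np.array([np.dot(frame[:len(frame) - k], frame[k:]) for k in range(p + 1)])
    a, E = np.zeros(p + 1), r[0]            # a[0] is unused
    for i in range(1, p + 1):
        k = (r[i] - np.dot(a[1:i], r[i - 1:0:-1])) / E
        a_new = a.copy()
        a_new[i] = k
        a_new[1:i] = a[1:i] - k * a[i - 1:0:-1]
        a, E = a_new, (1.0 - k * k) * E
    return a[1:]

def pitch_from_residual(frame, fs, p=12, fmin=60.0, fmax=400.0):
    """Estimate the pitch (Hz) of a voiced frame from the autocorrelation
    of the LPC residual e(n) = s(n) - sum_k a_k s(n-k)."""
    frame = np.asarray(frame, dtype=float)
    a = lpc_autocorr(frame, p)
    e = frame.copy()
    for k in range(1, p + 1):
        e[k:] -= a[k - 1] * frame[:-k]
    r = np.correlate(e, e, mode="full")[len(e) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(r[lo:hi]))
    return fs / lag
```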

Normalized Mean-Squared Error: general form; autocorrelation method; covariance method.

Normalized Mean-Squared Error: autocorrelation method vs. covariance method.
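A convenient way to compare the methods is the normalized error, the prediction-error energy divided by the frame energy; for the autocorrelation method it can also be written in terms of the PARCOR coefficients k_i:

$$V_n = \frac{E_n}{R_n(0)}, \qquad V_n^{\text{(autocorrelation)}} = \prod_{i=1}^{p} \left(1 - k_i^2\right).$$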

Experimental Evaluation of LPC Parameters: frame width N, filter order p. Conditions: 1. covariance method vs. autocorrelation method; 2. synthetic vowel vs. natural speech; 3. pitch-synchronous vs. pitch-asynchronous analysis.

Pitch Synchronous Analysis (/i/): The covariance method is more suitable for pitch-synchronous analysis. Each frame begins at the start of a pitch period. Why does the error increase? The error drops to zero when the predictor order matches the order of the synthesizer.

Pitch Asynchronous Analysis (/i/): Both the covariance and autocorrelation methods exhibit similar performance; the error decreases monotonically with predictor order.

Frame Width Variation (/i/): The errors produced by the covariance and autocorrelation methods are comparable when N > 2p. Why do the errors jump when the frame size is near a multiple of the pitch period?

Pitch Synchronous Analysis: For both synthetic and natural speech, the covariance method is more suitable for pitch-synchronous analysis.

Pitch Asynchronous Analysis: For both synthetic and natural speech, the two methods are comparable.

Frame Width Variation: For both synthetic and natural speech, the errors produced by the covariance and autocorrelation methods are comparable when N > 2p.

More On Linear Predictive Analysis Computation of the Gain

Speech Production Model (Review): an impulse-train generator (voiced) or a random-noise generator (unvoiced) produces the excitation u(n), which is scaled by the gain G and drives a time-varying digital filter controlled by the vocal-tract parameters to produce the speech signal s(n).

Speech Production Model (Review): the time-varying digital filter is denoted H(z), so s(n) is the response of H(z) to the scaled excitation G u(n).

Linear Prediction Model (Review): prediction with error compensation.
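For reference, in a standard source-filter formulation (the notation here is an assumption) the production model and the prediction error with error compensation can be written as:

$$s(n) = \sum_{k=1}^{p} \alpha_k\, s(n-k) + G\,u(n), \qquad e(n) = s(n) - \sum_{k=1}^{p} a_k\, s(n-k),$$

so that when the estimated coefficients a_k equal the model coefficients α_k, the prediction error reduces to e(n) = G u(n).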

Speech Production vs. Linear Prediction: the vocal tract corresponds to the linear predictor and the excitation corresponds to the prediction error; when the model is exact, the predictor coefficients satisfy a_k = α_k.

Speech Production vs. Linear Prediction

The Gain: Generally, it is not possible to solve for G in a reliable way directly from the error signal itself. Instead, we assume that the energy of the error signal equals the energy of the scaled excitation.
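Written out over the analysis frame, this energy-matching assumption reads:

$$G^2 \sum_{n} u^2(n) = \sum_{n} e^2(n) = E_n.$$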

Assumptions about u(n): For voiced speech, u(n) = δ(n), a single impulse per pitch period; this requires that both the glottal pulse shape G(z) and the lip radiation R(z) are lumped, together with the vocal tract V(z), into the all-pole model 1/A(z) with impulse response h(n). For unvoiced speech, u(n) is modeled as white noise.

Gain Estimation for Voiced Speech: driving 1/A(z) with u(n) = δ(n) requires that the glottal pulse shape and lip radiation are lumped into the vocal-tract model, and that the order p is sufficiently large.

Gain Estimation for Voiced Speech: the excitation G δ(n) applied to 1/A(z) produces the scaled impulse response G h(n).

Correlation Matching: Define R̂(m) = Σ_{n≥0} h(n) h(n+m), the autocorrelation function of the impulse response h(n) (assumed causal).

Correlation Matching: Let R_n(m) be the autocorrelation function of the speech signal. If H(z) correctly models the speech production system, the autocorrelation of the model output, G² R̂(m), should match R_n(m).

Correlation Matching (n) h(n)

Correlation Matching Assumed causal.

Correlation Matching: this gives the same formulation as the autocorrelation method.

The Gain for Voiced Speech: the gain is determined by the minimum prediction error E_n.
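Combining the correlation-matching condition with the normal equations gives the standard textbook result (e.g., the autocorrelation-method analysis in Rabiner and Schafer):

$$G^2 = E_n = R_n(0) - \sum_{k=1}^{p} a_k\, R_n(k).$$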

More on Autocorrelation: a stationary signal x(n) is filtered by H(z) to produce y(n). Define the autocorrelation of x(n); the stationarity assumption implies that it depends only on the lag, not on the absolute time n.

Properties of LTI Systems: x(n) is filtered by H(z) to give y(n).

Properties of LTI Systems.

Properties of LTI Systems: the result is independent of n, so y(n) is also stationary.

Properties of LTI Systems ll+k

Properties of LTI Systems: the output autocorrelation can be estimated from the output and the input autocorrelation from the input; relating them through the filter is useful for filter design.
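The relation used here is the standard LTI result connecting input and output autocorrelations through the deterministic autocorrelation of the impulse response (the notation is mine):

$$R_y(k) = \sum_{m=-\infty}^{\infty} r_h(m)\, R_x(k-m), \qquad r_h(m) = \sum_{n=-\infty}^{\infty} h(n)\, h(n+m),$$

so the output autocorrelation (estimable from y(n)) and the input autocorrelation (estimable from x(n)) are linked through the filter.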

The Gain for Unvoiced Speech: here the excitation u(n) is white noise rather than an impulse, and the speech signal is s(n).

The Gain for Unvoiced Speech: the required ensemble averages cannot be computed exactly from one frame; they are estimated using the measured short-time autocorrelation R_n(m).

The Gain for Unvoiced Speech: Once again, we have the same formulation as the autocorrelation method; furthermore, the gain again satisfies G² = E_n.
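A sketch of the textbook argument for the unvoiced case, assuming u(n) is zero-mean, unit-variance white noise and writing h_A(n) for the impulse response of 1/A(z):

$$E\{u(n)\,u(n+m)\} = \delta(m) \;\Rightarrow\; R_s(m) = G^2 \sum_{n} h_A(n)\, h_A(n+m);$$

requiring R_s(m) to match the measured R_n(m) for 0 ≤ m ≤ p leads to the same normal equations as the autocorrelation method and, as before, to G² = E_n.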

More On Linear Predictive Analysis Frequency Domain Interpretation of LPC

Spectral Representation of Vocal Tract

Spectra

Frequency Domain Interpretation of Mean-Squared Prediction Error Parseval’s Theorem
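By Parseval's theorem the prediction-error energy can be written in the frequency domain, where S_n(e^{jω}) is the Fourier transform of the windowed frame and H(e^{jω}) = G/A(e^{jω}):

$$E_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} \left|S_n(e^{j\omega})\right|^2 \left|A(e^{j\omega})\right|^2 d\omega = \frac{G^2}{2\pi}\int_{-\pi}^{\pi} \frac{\left|S_n(e^{j\omega})\right|^2}{\left|H(e^{j\omega})\right|^2}\, d\omega.$$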

Frequency Domain Interpretation of Mean-Squared Prediction Error

Frequency Domain Interpretation of Mean-Squared Prediction Error |Sn(ej)| > |H(ej)| contributes more to the total error than |Sn(ej)| < |H(ej)|. Hence, the LPC spectral error criterion favors a good fit near the spectral peak.

Spectra

More On Linear Predictive Analysis Representations of LPC Coefficients --- Direct Representation

Direct Representation: code the a_i's directly; the synthesizer is the direct-form all-pole filter G/A(z) with coefficients a_1, a_2, ..., a_p driven by the excitation scaled by G.

Disadvantages: the dynamic range of the a_i's is relatively large, and quantization can cause instability problems.

More On Linear Predictive Analysis Representations of LPC Coefficients --- Roots of Predictor Polynomials

Roots of the Predictor Polynomial: code the p/2 complex root pairs z_k = r_k e^{jθ_k}. What is the dynamic range of the r_k's? Of the θ_k's?

Application: formant analysis.

Implementation: realize G/A(z) as a cascade of second-order sections; each stage represents one formant frequency and its corresponding bandwidth.
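To make the root-to-formant mapping concrete, here is a minimal sketch (my own, assuming A(z) = 1 - sum of a_k z^{-k}) using the common conversions F_k = θ_k f_s / 2π and B_k = -(f_s/π) ln r_k for a root z_k = r_k e^{jθ_k}:

```python
import numpy as np

def formants_from_lpc(a, fs):
    """Formant frequencies and bandwidths (Hz) from LPC coefficients a_1..a_p."""
    # Roots of A(z) = 1 - a_1 z^{-1} - ... - a_p z^{-p}, written as a polynomial in z.
    roots = np.roots(np.concatenate(([1.0], -np.asarray(a, dtype=float))))
    roots = roots[np.imag(roots) > 0]        # one root per complex-conjugate pair
    theta, r = np.angle(roots), np.abs(roots)
    freqs = theta * fs / (2.0 * np.pi)       # candidate formant frequency F_k
    bws = -np.log(r) * fs / np.pi            # corresponding 3-dB bandwidth B_k
    order = np.argsort(freqs)
    return freqs[order], bws[order]
```

In practice, roots with very large bandwidths or very low frequencies are usually discarded, since they tend to model spectral tilt rather than true formants.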

More On Linear Predictive Analysis Representations of LPC Coefficients --- PARCOR Coefficients

PARCOR Coefficients: Step-Up Procedure (from the k_i's to the a_i's).
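In its usual form (with the convention A(z) = 1 - sum of a_k z^{-k}; texts that use the opposite sign convention flip the sign of k_i), the step-up recursion is:

$$a_i^{(i)} = k_i, \qquad a_j^{(i)} = a_j^{(i-1)} - k_i\, a_{i-j}^{(i-1)}, \quad 1 \le j \le i-1,$$

applied for i = 1, 2, ..., p, with the final predictor coefficients a_j = a_j^{(p)}.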

PARCOR Coefficients: What is the dynamic range of the k_i's? (|k_i| < 1 for a stable filter.) Step-Down Procedure (from the a_i's to the k_i's): the recursion runs with n going from p to p-1, down to 1, and initially we set the order-p coefficients equal to the a_j's.
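A minimal sketch of the step-down recursion (with the step-up recursion included for checking), again assuming A(z) = 1 - sum of a_k z^{-k}; note the division by 1 - k_i², which is why |k_i| must stay strictly below one:

```python
import numpy as np

def step_down(a):
    """Predictor coefficients a_1..a_p  ->  PARCOR coefficients k_1..k_p."""
    a = list(np.asarray(a, dtype=float))
    p, k = len(a), []
    for i in range(p, 0, -1):
        ki = a[i - 1]
        k.append(ki)
        if i > 1:                      # lower the order by one
            a = [(a[j] + ki * a[i - 2 - j]) / (1.0 - ki * ki) for j in range(i - 1)]
    return k[::-1]

def step_up(k):
    """PARCOR coefficients k_1..k_p  ->  predictor coefficients a_1..a_p."""
    a = []
    for i, ki in enumerate(k, start=1):
        a = [aj - ki * a[i - 2 - j] for j, aj in enumerate(a)] + [ki]
    return a
```

As a quick consistency check, step_down(step_up(k)) recovers k (up to rounding) whenever every |k_i| < 1.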

More On Linear Predictive Analysis Representations of LPC Coefficients --- Log Area Ratio Coefficients

Log Area Ratio Coefficients: the k_i's are the reflection (PARCOR) coefficients; the g_i's are the log area ratios.
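Under the lossless-tube interpretation, where k_i relates the areas A_i and A_{i+1} of adjacent tube sections, the log area ratio is (sign conventions for k_i vary between texts):

$$g_i = \log\frac{A_{i+1}}{A_i} = \log\frac{1+k_i}{1-k_i},$$

which stretches the scale near |k_i| = 1, where the synthesis filter is most sensitive to quantization error.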

More On Linear Predictive Analysis Representations of LPC Coefficients --- Line Spectrum Pair

LPC Coefficients: A(z) is the inverse filter of order m. If the system is stable, all zeros of the inverse filter are inside the unit circle. The Line Spectrum Pair (LSP) is an alternative LPC spectral representation.

Line Spectrum Pair: The LSP consists of two polynomials. The zeros of the two polynomials have the following properties: they lie on the unit circle and are interlaced. Under quantization, the minimum-phase property of the filter is preserved, which is useful for vocoder applications.

Recursive Relation of the Inverse Filter: A_{m+1}(z) is built from A_m(z) and the reversed polynomial z^{-(m+1)} A_m(z^{-1}), where k_{m+1} is the reflection coefficient of the (m+1)-th tube. Special cases: k_{m+1} = +1 and k_{m+1} = -1.

LSP Polynomials
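Concretely, with A_m(z) = 1 - sum of a_k z^{-k} (the convention assumed throughout these notes), the LSP polynomials are:

$$P(z) = A_m(z) + z^{-(m+1)} A_m(z^{-1}), \qquad Q(z) = A_m(z) - z^{-(m+1)} A_m(z^{-1}),$$

so that A_m(z) = (P(z) + Q(z))/2; P(z) is symmetric, Q(z) is anti-symmetric, and each corresponds to terminating the lattice with a reflection coefficient of magnitude one.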

Properties of LSP Polynomials: Show that the zeros of P(z) and Q(z) lie on the unit circle and are interlaced.

Proof

Proof

Proof

Proof

Proof: Write P(z) = A_m(z)[1 + H(z)] and Q(z) = A_m(z)[1 - H(z)] with H(z) = z^{-(m+1)} A_m(z^{-1}) / A_m(z). Then P(z) = 0 iff H(z) = -1 and Q(z) = 0 iff H(z) = +1; since A_m(z) is minimum phase, |H(z)| = 1 only for |z| = 1. This concludes that the zeros of P(z) and Q(z) are on the unit circle.

Proof (interlaced zeros): Fact: H(z) is an all-pass filter. One can verify that its phase θ(ω) satisfies θ(0) = 0 and θ(2π) = -2(m+1)π.

Proof (interlaced zeros): The zeros of Q(z) occur where H(e^{jω}) = +1 and the zeros of P(z) where H(e^{jω}) = -1. Since θ(0) = 0, z = 1 is therefore a zero of Q(z).

Proof (interlaced zeros) (0) = 0 (2) = 2(m+1)  Proof (interlaced zeros) 2 () 2(m+1)   Is this possible?

Proof (interlaced zeros) (0) = 0 (2) = 2(m+1)  Proof (interlaced zeros) 2 () 2(m+1)   Is this possible?

Proof (interlaced zeros) (0) = 0 (2) = 2(m+1)  Proof (interlaced zeros) Group Delay > 0 () is monotonically decreasing.

Proof (interlaced zeros) (0) = 0 (2) = 2(m+1)  Proof (interlaced zeros) 2 () 2(m+1)    2 3 4 5 Typical shape of () .

Proof (interlaced zeros): Along this curve, the crossings where H(e^{jω}) = +1 (θ an even multiple of π) give Q(e^{jω}) = 0, and the crossings where H(e^{jω}) = -1 (θ an odd multiple of π) give P(e^{jω}) = 0, so the two sets of zeros alternate around the unit circle.

Proof (interlaced zeros): There are 2(m+1) crossing points for 0 ≤ ω < 2π; these constitute the 2(m+1) interlaced zeros of P(z) and Q(z).

Quantization of LSP Zeros: For efficient transmission, we quantize the LSP frequencies ω_i into a small number of levels, e.g., using 5 bits each. Is such a quantization detrimental?

Minimum Phase Preserving Property: Show that when the LSP frequencies are quantized, the reconstructed all-pole filter preserves its minimum-phase property as long as the quantized zeros remain on the unit circle and interlaced.

Find the Roots of P(z) and Q(z): P(z) is symmetric (palindromic) and Q(z) is anti-symmetric.

Find the Roots of P(z) and Q(z): Because of this symmetry, on the unit circle we only need to compute the values for 1 ≤ i ≤ m/2.

Find the Roots of P(z) and Q(z): P(z) has a zero at z = -1 and Q(z) has a zero at z = +1; after these factors are removed, both P'(z) and Q'(z) are symmetric.

Find the Roots of P(z) and Q(z): To find their zeros, evaluate P'(z) and Q'(z) on the unit circle, where the symmetric coefficients collapse into real-valued functions of ω (consider m = 10); we want to find the ω_i's at which these functions vanish.

Find the Roots of P(z) and Q(z): Algorithm: since the zeros occur in complex-conjugate pairs, we only need to find zeros over half of the unit circle, 0 ≤ ω ≤ π.
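A minimal numerical sketch of this procedure (my own illustration, assuming A(z) = 1 - sum of a_k z^{-k} with even order m): build P(z) and Q(z), divide out the trivial zeros at z = -1 and z = +1, reduce the symmetric remainders to real functions of ω on the unit circle, and locate their sign changes on 0 ≤ ω ≤ π.

```python
import numpy as np

def lsp_frequencies(a, grid=4096):
    """LSP frequencies in (0, pi) for predictor coefficients a_1..a_m,
    with A(z) = 1 - sum_k a_k z^{-k} and even order m."""
    c = np.concatenate(([1.0], -np.asarray(a, dtype=float)))   # A(z) in powers of z^{-1}
    crev = np.append(0.0, c[::-1])                              # z^{-(m+1)} A(z^{-1})
    P = np.append(c, 0.0) + crev                                # symmetric, zero at z = -1
    Q = np.append(c, 0.0) - crev                                # anti-symmetric, zero at z = +1
    Pp = np.polynomial.polynomial.polydiv(P, [1.0, 1.0])[0]     # P(z) / (1 + z^{-1})
    Qp = np.polynomial.polynomial.polydiv(Q, [-1.0, 1.0])[0]    # Q(z) / (z^{-1} - 1)
    w = np.linspace(0.0, np.pi, grid)
    roots = []
    for coeffs in (Pp, Qp):
        half = (len(coeffs) - 1) / 2.0
        # A symmetric polynomial evaluated on the unit circle reduces to a real cosine sum.
        vals = np.real(np.polyval(coeffs[::-1], np.exp(-1j * w)) * np.exp(1j * half * w))
        idx = np.where(np.diff(np.sign(vals)) != 0)[0]
        roots.extend(w[i] - vals[i] * (w[i + 1] - w[i]) / (vals[i + 1] - vals[i]) for i in idx)
    return np.sort(roots)
```

For a stable (minimum-phase) A(z), the frequencies obtained from P'(z) and Q'(z) interlace, which is exactly the property proved above.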

Find the Roots of P(z) and Q(z)   2 3