Digital Systems: Hardware Organization and Design

Presentation on theme: "Digital Systems: Hardware Organization and Design"— Presentation transcript:

1 Digital Systems: Hardware Organization and Design
11/21/2018 Speech Processing: Homomorphic Signal Processing. Architecture of a Representative 32 Bit Processor

2 Outline
Principles of Homomorphic Signal Processing
Details of Homomorphic Processing
Variants of Homomorphic Processing
Application of Homomorphic Systems to Speech Analysis and Synthesis
21 November 2018, Veton Këpuska

3 Principles of Homomorphic Processing
Superposition property of linear systems: for a linear system L, inputs x1[n] and x2[n], and scalars a1 and a2, L(a1 x1[n] + a2 x2[n]) = a1 L(x1[n]) + a2 L(x2[n]). In the block diagram, scaling and summing the inputs before L gives the same output as passing each input through L and then scaling and summing the outputs.
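As a small numerical illustration of the superposition property, the sketch below uses a hypothetical FIR filter as the linear system L (the filter taps, the random inputs, and the scalars are all assumptions made for the example, not part of the slides).

```python
import numpy as np

def linear_system(x, h=(0.5, 0.3, 0.2)):
    """Hypothetical linear system L: an FIR filter with arbitrary taps h."""
    return np.convolve(x, h)

rng = np.random.default_rng(0)
x1 = rng.standard_normal(32)
x2 = rng.standard_normal(32)
a1, a2 = 2.0, -0.7

# L(a1*x1 + a2*x2) versus a1*L(x1) + a2*L(x2)
lhs = linear_system(a1 * x1 + a2 * x2)
rhs = a1 * linear_system(x1) + a2 * linear_system(x2)
superposition_holds = np.allclose(lhs, rhs)
```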

4 Principles of Homomorphic Processing
Example 6.1: If signals fall in non-overlapping frequency bands, then they are separable. Let x[n] = x1[n] + x2[n], with X1(ω) = ℱ{x1[n]} nonzero only for ω ∈ [0, π/2] and X2(ω) = ℱ{x2[n]} nonzero only for ω ∈ [π/2, π]. Filtering with h[n] gives y[n] = h[n]*(x1[n] + x2[n]) = h[n]*x1[n] + h[n]*x2[n]. If the filter response is H(ω) = 0 for ω ∈ [0, π/2] and H(ω) = 1 for ω ∈ [π/2, π], the first term vanishes and y[n] = h[n]*x2[n] = x2[n].

5 Principles of Homomorphic Processing
Generalized superposition: a concept that supports separation of nonlinearly combined signals. It leads to the notion of generalized linear filtering. Properties: H(x1[n] □ x2[n]) = H(x1[n]) ○ H(x2[n]) and H(c : x[n]) = c ◈ H(x[n]). Systems that satisfy these two properties are referred to as homomorphic systems and are said to satisfy a generalized principle of superposition. Figure: x[n] → H( ) → y[n], with □ as the input combination rule and ○ as the output combination rule.

6 Principles of Homomorphic Processing
The importance of homomorphic systems for speech processing lies in their ability to transform nonlinearly combined signals into additively combined signals, so that linear filtering can be performed on them. A homomorphic system H can be expressed as a cascade of three homomorphic subsystems, referred to as the canonic representation: x[n] → (I) D□ → (II) L → (III) D○⁻¹ → y[n]. The input is combined under □, the signals between the stages are combined under addition, and the output is combined under ○.

7 Canonic Representation of a Homomorphic System
I. The characteristic system D□ transforms □ into addition "+".
II. The linear system L transforms addition into addition.
III. The inverse system D○⁻¹ transforms addition into ○.

8 Homomorphic Systems
Let the goal be removal of an undesired component of the signal (e.g., noise):
Type of combination rule | System | Operation
Signal & additive noise | Linear system | Linear filtering
Signal & multiplicative noise | Multiplicative system | Multiplicative filtering
Signal & convolutional noise | Convolutional system | Convolutional filtering

9 Multiplicative Homomorphic Systems
Consider the multiplicative homomorphic system M[ ] with input x[n] and output y[n]. Its canonic form is x[n] → (I) D● → (II) L → (III) D●⁻¹ → y[n]. Use D● to convert multiplication into addition, and use D●⁻¹ to convert addition back into multiplication. Which operation transforms multiplication into addition?

10 Multiplicative Homomorphic Systems
If x[n] = x1[n]●x2[n], with x1[n] > 0 and x2[n] > 0 for all n, then log(x1[n]●x2[n]) = log(x1[n]) + log(x2[n]). However, x[n] may not always be positive. The generalization to complex signals, x[n] = |x[n]| e^{j arg(x[n])}, requires a definition of the complex log operator.

11 Multiplicative Homomorphic Systems
An implementation of the multiplicative homomorphic system: x[n] → (I) complex log → (II) linear system → (III) complex exp → y[n]. Definition of the complex log: log(x[n]) = log|x[n]| + j arg(x[n]). The complex exponential is the inverse operation: exp(log|x[n]| + j arg(x[n])) = |x[n]| e^{j arg(x[n])} = x[n].
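A minimal sketch of the log → linear system → exp cascade is given below. The helper names `complex_log` and `multiplicative_system`, the test signals, and the choice of "scale by 0.5" as the linear stage are assumptions made only for illustration; with that linear stage the cascade should reduce to a square root for positive inputs.

```python
import numpy as np

def complex_log(x):
    """Complex log as defined above: log|x| + j*arg(x) (principal value of arg)."""
    x = np.asarray(x, dtype=complex)
    return np.log(np.abs(x)) + 1j * np.angle(x)

def multiplicative_system(x, linear_op):
    """Canonic form: complex log (I), linear system (II), complex exp (III)."""
    return np.exp(linear_op(complex_log(x)))

x1 = np.array([1.0, 2.0, 4.0, 8.0])
x2 = np.array([3.0, 0.5, 2.0, 1.0])
x = x1 * x2

# For positive signals the log of the product is the sum of the logs.
log_is_additive = np.allclose(complex_log(x), complex_log(x1) + complex_log(x2))

# Hypothetical linear stage: scaling by 0.5, so the output is x ** 0.5.
y = multiplicative_system(x, lambda v: 0.5 * v)
```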

12 Homomorphic Systems for Convolution
Consider the homomorphic system for convolution C[ ] with input x[n] and output y[n]. Its canonic form is x[n] → (I) D* → (II) L → (III) D*⁻¹ → y[n]. Use D* to convert convolution "*" into addition, and use D*⁻¹ to convert addition back into "*". How can "*" be transformed into addition?

13 Homomorphic Systems for Convolution
Let x[n] = x1[n]*x2[n].
I. Characteristic system D*: x[n] → Z[ ] → log[ ] → Z⁻¹[ ] → x̂[n]. The z-transform converts "*" into multiplication, the logarithm converts multiplication into addition, and the inverse z-transform returns to a "time" domain.
III. Inverse operation D*⁻¹: ŷ[n] → Z[ ] → exp[ ] → Z⁻¹[ ] → y[n], which converts addition in the "time" domain back into "*".

14 Homomorphic Systems for Convolution
For x[n] = x1[n]*x2[n]: X(z) = X1(z)X2(z) and log(X(z)) = log(X1(z)X2(z)) = log(X1(z)) + log(X2(z)), where log is the complex logarithm. This operation requires special handling because:
The real logarithm requires X(z) > 0, which does not hold in general.
For complex X(z) the phase is not uniquely defined (it is known only up to a multiple of 2π).
X(z) has to be defined on the unit circle (e.g., the z-transform of a stable sequence).
In practice, operate on the unit circle z = e^{jω}, that is, on the Fourier transform: $X(\omega) = \sum_{n=-\infty}^{\infty} x[n] e^{-j\omega n}$.

15 Homomorphic Systems for Convolution
Two cases are possible in computing x̂[n]:
Complex cepstrum (CC): $\hat{x}[n] = \frac{1}{2\pi}\int_{-\pi}^{\pi} \left[\log|X(\omega)| + j\angle X(\omega)\right] e^{j\omega n}\, d\omega$
Real cepstrum (RC): $c[n] = \frac{1}{2\pi}\int_{-\pi}^{\pi} \log|X(\omega)|\, e^{j\omega n}\, d\omega$

16 Homomorphic Systems for Convolution
Example 6.3: Consider a sequence x[n] consisting of a system impulse response h[n] convolved with an impulse train p[n]: x[n] = h[n]*p[n]. The goal is to estimate h[n]. First form the canonical representation for convolution. If D* is such that p̂[n] remains a train of pulses, and ĥ[n] falls between the impulses, then separation is possible.

17 Example 6.3 (cont.)
Let L denote such an operation (i.e., a rectangular window that separates p̂[n] from ĥ[n]).

18 Example 6.4
a, b real and positive ⇒ log(ab) = log(a) + log(b).
a, b real but b < 0 ⇒ log(ab) = log(a|b|e^{jπk}) = log(a) + log(|b|) + jπk, k = 1, 3, 5, …, so log(ab) is ambiguous.
This example indicates that special consideration must be made in defining the logarithm operator for complex X(z) in order to make the logarithm of a product equal to the sum of the logarithms.
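The ambiguity can be seen numerically with the principal-value logarithm. The sketch below goes one step beyond the example by taking both factors negative (an assumption chosen so that the failure is visible with NumPy's principal-value `np.log`); the sum of the logs then differs from the log of the product by j2π.

```python
import numpy as np

a, b = -2.0, -3.0  # both negative: a choice made for this illustration

log_of_product = np.log(complex(a * b))               # log(6), imaginary part 0
sum_of_logs = np.log(complex(a)) + np.log(complex(b))  # log(2) + log(3) + j*2*pi

difference = sum_of_logs - log_of_product
```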

19 Homomorphic Systems for Convolution-Complex Logarithm
Suppose that X(z) is evaluated on the unit circle (z = e^{jω}). Let x[n] = x1[n]*x2[n] ⇒ X(ω) = X1(ω)X2(ω). Consider then the complex log of X(ω): log[X(ω)] = log|X(ω)| + j∠X(ω). Considering that X(ω) = X1(ω)X2(ω), then: log[X(ω)] = log|X1(ω)| + log|X2(ω)| + j[∠X1(ω) + ∠X2(ω)].

20 Homomorphic Systems for Convolution-Complex Logarithm
In the previous expression the following was assumed: log(|X1(ω)||X2(ω)|) = log|X1(ω)| + log|X2(ω)|. Also: ∠[X1(ω)X2(ω)] = ∠X1(ω) + ∠X2(ω). The second expression generally does not hold due to the ambiguity in the definition of phase: ∠X(ω) = PV[∠X(ω)] + 2πk, for any integer k.

21 Homomorphic Systems for Convolution-Complex Logarithm
Note that PV denotes the principal value of the phase, which falls in the interval [−π, π]. An arbitrary multiple of 2π can be added to the principal phase value, thus the additive property generally does not hold. How to impose uniqueness?
Force continuity of phase: select k such that ∠X(ω) = PV[∠X(ω)] + 2πk is a continuous function (Figure 6.5, next slide).
Phase derivative approach: it can be shown that the phase derivative is uniquely defined, $\dot{\angle X}(\omega) = \frac{X_r(\omega)\dot{X}_i(\omega) - X_i(\omega)\dot{X}_r(\omega)}{|X(\omega)|^2}$, where X_r and X_i are the real and imaginary parts of X(ω), so that the phase can be recovered by integration: $\angle X(\omega) = \int_0^{\omega} \dot{\angle X}(\beta)\, d\beta$.

22 Fourier Transform Phase Continuity
Figure 6.5: Fourier transform phase continuity.

23 Homomorphic Systems for Convolution
Relationship of the complex cepstrum x̂[n] to the real cepstrum c[n]: if x[n] is real, then |X(ω)| is real and even, and thus log|X(ω)| is real and even; ∠X(ω) is odd. Hence x̂[n], the inverse Fourier transform of log|X(ω)| + j∠X(ω), is real; it is referred to as the complex cepstrum. The even component of the complex cepstrum, c[n] = (x̂[n] + x̂[−n])/2, is referred to as the real cepstrum.

24 Complex Cepstrum of Speech-Like Sequences
Sequences with rational z-transform. The general form of this class of sequences is: $X(z) = A\,\frac{\prod_{k=1}^{M_i}(1 - a_k z^{-1})\,\prod_{k=1}^{M_o}(1 - b_k z)}{\prod_{k=1}^{N_i}(1 - c_k z^{-1})\,\prod_{k=1}^{N_o}(1 - d_k z)}$
Mi, Ni are the numbers of zeros and poles inside the unit circle. Mo, No are the numbers of zeros and poles outside the unit circle. |ak|, |bk|, |ck|, |dk| are all < 1 ⇒ there are no singularities on the unit circle. A > 0.

25 Complex Cepstrum of Speech-Like Sequences
Applying the complex logarithm gives: $\hat{X}(z) = \log A + \sum_{k=1}^{M_i}\log(1 - a_k z^{-1}) + \sum_{k=1}^{M_o}\log(1 - b_k z) - \sum_{k=1}^{N_i}\log(1 - c_k z^{-1}) - \sum_{k=1}^{N_o}\log(1 - d_k z)$
X̂(z) is the z-transform of the sequence x̂[n]. We want the inverse z-transform x̂[n] to be absolutely summable ⇒ the ROC of X̂(z) must include the unit circle, |z| = 1. This condition is equivalent to having all constituent terms of X̂(z) have ROCs that include the unit circle, |z| = 1.

26 Complex Cepstrum of Speech-Like Sequences
In order to obtain the ROC for expressions of the form log(1 − αz⁻¹) and log(1 − βz), they are expressed as power series expansions:
$\log(1 - \alpha z^{-1}) = -\sum_{n=1}^{\infty} \frac{\alpha^n}{n} z^{-n}, \quad |z| > |\alpha|$
$\log(1 - \beta z) = -\sum_{n=1}^{\infty} \frac{\beta^n}{n} z^{n}, \quad |z| < 1/|\beta|$
Figures (z-plane): the ROC for log(1 − αz⁻¹) is the region outside the circle of radius |α|; the ROC for log(1 − βz) is the region inside the circle of radius 1/|β|.

27 Complex Cepstrum of Speech-Like Sequences
The ROC of X̂(z) is therefore given by an annulus defined by the poles and zeros of X(z) closest to the unit circle. Figure (z-plane): ROC for a typical rational X(z).

28 Complex Cepstrum of Speech-Like Sequences
The complex cepstrum associated with a rational X(z) can therefore be expressed as: $\hat{x}[n] = \log(A)\,\delta[n] - \sum_{k=1}^{M_i}\frac{a_k^{n}}{n}u[n-1] + \sum_{k=1}^{N_i}\frac{c_k^{n}}{n}u[n-1] + \sum_{k=1}^{M_o}\frac{b_k^{-n}}{n}u[-n-1] - \sum_{k=1}^{N_o}\frac{d_k^{-n}}{n}u[-n-1]$

29 Example A. x[n] = aⁿu[n], |a| < 1
Determine the complex cepstrum of the minimum-phase sequence x[n] = aⁿu[n], |a| < 1. Solution: Determine the z-transform of x[n]: $X(z) = \sum_{n=0}^{\infty} a^n z^{-n} = \frac{1}{1 - a z^{-1}}, \quad |z| > |a|$.

30 Example A. (cont.)
Compute X̂(z): $\hat{X}(z) = \log[X(z)] = -\log(1 - a z^{-1}) = \sum_{n=1}^{\infty} \frac{a^n}{n} z^{-n}$. The complex cepstrum values are simply the coefficients of the z⁻ⁿ terms above, that is: $\hat{x}[n] = \frac{a^n}{n}\, u[n-1]$.
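The closed form for Example A can be checked numerically with an FFT-based complex cepstrum. This is only a sketch: the function name `complex_cepstrum`, the value a = 0.5, the truncation of x[n] to 256 samples, and the FFT length are assumptions, and `np.unwrap` is used as a stand-in for the phase-continuity step discussed in these slides.

```python
import numpy as np

def complex_cepstrum(x, nfft):
    """FFT-based complex cepstrum: inverse DFT of log|X(k)| + j*unwrapped phase."""
    X = np.fft.fft(x, nfft)
    log_X = np.log(np.abs(X)) + 1j * np.unwrap(np.angle(X))
    return np.fft.ifft(log_X).real

a, nfft = 0.5, 1024
x = a ** np.arange(256)          # a^n u[n], truncated where a^n is negligible
x_hat = complex_cepstrum(x, nfft)

n = np.arange(1, 11)
expected = a ** n / n            # closed form a^n / n for n >= 1
```

For this minimum-phase sequence the negative-quefrency samples (stored at the end of `x_hat`) should be essentially zero, in line with the right-sided result above.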

31 Example B. x[n] = δ[n] + bδ[n+1], |b| < 1
Determine the complex cepstrum of the maximum-phase sequence x[n] = δ[n] + bδ[n+1], |b| < 1. Solution: Determine the z-transform of x[n]: X(z) = 1 + bz, whose zero at z = −1/b lies outside the unit circle.

32 Example B. (cont.)
Compute X̂(z): $\hat{X}(z) = \log(1 + bz) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1} b^k}{k} z^{k}$. The complex cepstrum values are simply the coefficients of the z⁻ⁿ terms above, that is: $\hat{x}[n] = \sum_{k=1}^{\infty} \frac{(-1)^{k+1} b^k}{k}\,\delta[n+k]$, a left-sided sequence.

33 Example C. x[n] = δ[n] + aδ[n−Np], |a| < 1
Determine the complex cepstrum of the sequence x[n] = δ[n] + aδ[n−Np], |a| < 1. Discrete convolution of any sequence x1[n] with this sequence produces an echo of the first sequence scaled by a, i.e.: x1[n]*(δ[n] + aδ[n−Np]) = x1[n] + a x1[n−Np]. Solution: Determine the z-transform of x[n]: $X(z) = 1 + a z^{-N_p}$.

34 Example C. (cont.)
Compute X̂(z): $\hat{X}(z) = \log(1 + a z^{-N_p}) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1} a^k}{k} z^{-kN_p}$. The complex cepstrum values are simply the coefficients of the z⁻ⁿ terms above, that is: $\hat{x}[n] = \sum_{k=1}^{\infty} \frac{(-1)^{k+1} a^k}{k}\,\delta[n - kN_p]$.

35 Example D. Determine the complex cepstrum of the convolution of the sequences of the previous examples (A-C). Solution:

36 Example D. (cont.) The complex cepstrum of x[n] is the sum of the complex cepstra of the three sequences.

37 Example D. (cont.)

38 Example 6.5
Let: $X(z) = \frac{A z^{-r}(1 - a z^{-1})(1 - b z)}{1 - c z^{-1}}$, where a, b, c are real and less than 1 in magnitude. The ROC of X(z) includes the unit circle, so that x[n] is stable. The delay z⁻ʳ corresponds to a shift in the sequence. Thus the complex cepstrum is given by: $\hat{x}[n] = \log(A)\,\delta[n] - \frac{a^n}{n}u[n-1] + \frac{b^{-n}}{n}u[-n-1] + \frac{c^n}{n}u[n-1] + Z^{-1}[\log z^{-r}]$

39 Example 6.5 (cont.)
The inverse z-transform of the shift term, evaluated on the unit circle, is given by: $Z^{-1}[\log z^{-r}] = -\frac{r\cos(\pi n)}{n}$ for n ≠ 0, and 0 for n = 0. The contribution of the z⁻ʳ term is significant. On the unit circle, z⁻ʳ = e^{−jωr} = 1∠(−ωr) contributes a linear ramp to the phase and thus, for a large shift r, dominates the phase representation and gives a large discontinuity at π and −π.

40 Complex Cepstrum of Speech-Like Sequences
Relation of the complex cepstrum and the real cepstrum for x[n] with a rational z-transform that is minimum phase: x̂[n] = 2c[n] for n > 0, x̂[n] = c[n] for n = 0, and x̂[n] = 0 for n < 0. That is, the complex cepstrum of a minimum-phase sequence with a rational z-transform is right-sided.

41 Impulse Train Convolved with Rational z-Transform Sequences
The second class of sequences of interest in the speech context is the train of uniformly spaced unit samples with varying weights, p[n], and its interaction with the system h[n]: x[n] = h[n]*p[n].

42 Impulse Train Convolved with Rational z-Transform Sequences
If p[n] is minimum phase, that is, |a_r (z^N)⁻¹| < 1 so that the zeros are inside the unit circle, log[P(z)] can be expressed as a power series in z⁻ᴺ. Thus p̂[n] is an infinite right-sided sequence of impulses spaced N samples apart. Note that in general, for non-minimum-phase sequences, the complex cepstrum is two-sided with uniformly spaced impulses.

43 Example 6.6
Consider a sequence x[n] = h[n]*p[n], where the z-transform of h[n] is defined by the complex-conjugate pairs a, a* and b, b* shown in the z-plane figure. Consider p[n] to be a train of periodic pulses.

44 Example 6.6 (cont.)
If β is real and |β| < 1, then p[n] is a train of exponentially decaying pulses: $p[n] = \sum_{k=0}^{\infty} \beta^k \delta[n - kP]$. The z-transform of p[n] is given by: $P(z) = \frac{1}{1 - \beta z^{-P}}$. Then, as derived earlier: $\hat{p}[n] = \sum_{k=1}^{\infty} \frac{\beta^k}{k}\,\delta[n - kP]$.

45 Example 6.6 (cont.)
Figure: contributions of h[n] and p[n].

46 Homomorphic Filtering
In the cepstral domain: pseudo-time ⇒ quefrency; low quefrency ⇒ slowly varying components; high quefrency ⇒ fast varying components. Removal of unwanted components (i.e., filtering) can be attempted in the cepstral domain (on the signal x̂[n]), in which case the filtering is referred to as liftering. When the complex cepstrum of h[n] resides in a quefrency interval shorter than a pitch period, the two components can be separated from each other.

47 Homomorphic Filtering
If log[X(ω)] is viewed as a "time signal" consisting of low-frequency and high-frequency contributions, this signal can be separated with a high-pass/low-pass "filter". One implementation of the low-pass "filter": x[n] = h[n]*p[n] → D* → l[n] → D*⁻¹ → y[n].

48 Homomorphic Filtering
Alternate view of the "liftering" operation: a filtering operation L(ω) applied in the log-spectral domain. Time and frequency domains are interchanged by viewing the frequency-domain signal log[X(ω)] as a time signal to be filtered. ⇒ The "cepstrum" can be thought of as the spectrum of log[X(ω)]. The time axis of x̂[n] is referred to as "quefrency", and the filter l[n] as the "lifter". Figure: x[n] = h[n]*p[n] → F → X(ω) → log → X̂(ω) → F⁻¹ → l[n] → F → Ŷ(ω) → exp → Y(ω) → F⁻¹ → y[n].

49 Homomorphic Filtering
The three elements inside the dotted lines of the previous figure can be replaced by L(ω), which can be viewed as a smoothing function: x[n] = h[n]*p[n] → F → X(ω) → log → X̂(ω) → L(ω) → Ŷ(ω) → exp → Y(ω) → F⁻¹ → y[n].
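The liftering chain can be sketched on a synthetic, minimum-phase example. Everything specific here is an assumption for illustration: h[n] = 0.6ⁿ, a two-pulse excitation with spacing P = 64, the 1024-point FFT, the helper names, and the use of `np.unwrap` for phase continuity. A low-time lifter keeping |n| < P/2 should then return a close approximation of h[n].

```python
import numpy as np

def complex_cepstrum(x, nfft):
    """Characteristic system D*: DFT, complex log with unwrapped phase, inverse DFT."""
    X = np.fft.fft(x, nfft)
    return np.fft.ifft(np.log(np.abs(X)) + 1j * np.unwrap(np.angle(X))).real

def inverse_complex_cepstrum(x_hat):
    """Inverse system D*^-1: DFT, complex exp, inverse DFT."""
    return np.fft.ifft(np.exp(np.fft.fft(x_hat))).real

nfft, P = 1024, 64
h = 0.6 ** np.arange(256)            # hypothetical minimum-phase impulse response
p = np.zeros(P + 1)
p[0], p[P] = 1.0, 0.5                # hypothetical minimum-phase pulse pair
x = np.convolve(h, p)                # x[n] = h[n] * p[n]

x_hat = complex_cepstrum(x, nfft)

# Low-time lifter l[n]: keep quefrencies |n| < P/2 (negative n wrap to the end).
lifter = np.zeros(nfft)
lifter[: P // 2] = 1.0
lifter[-(P // 2) + 1 :] = 1.0

h_est = inverse_complex_cepstrum(lifter * x_hat)[: len(h)]
```

In this setup the excitation contributes cepstral impulses only at multiples of P, so `x_hat[P]` is close to 0.5 and the liftered part carries only the contribution of h[n].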

50 Practical Implementation Issues
Use the FFT and IFFT for the Fourier transforms. X(ω) is computed by the N-point DFT: $X(k) = \sum_{n=0}^{N-1} x[n] e^{-j2\pi kn/N}$. log[X(ω)] is computed as: $\hat{X}(k) = \log|X(k)| + j\angle X(k)$. And for x̂[n] use the inverse DFT: $\hat{x}_N[n] = \frac{1}{N}\sum_{k=0}^{N-1} \hat{X}(k) e^{j2\pi kn/N}$.
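A minimal FFT-based real cepstrum is sketched below, applied to an echo sequence δ[n] + aδ[n − Np]. The function name and the values a = 0.5, Np = 50, and N = 1024 are assumptions for the example. Since the real cepstrum is the even part of the complex cepstrum, peaks of a/2 and −a²/4 are expected at quefrencies Np and 2Np.

```python
import numpy as np

def real_cepstrum(x, nfft):
    """Real cepstrum: inverse DFT of log|X(k)| (phase is discarded)."""
    X = np.fft.fft(x, nfft)
    return np.fft.ifft(np.log(np.abs(X))).real

a, Np, nfft = 0.5, 50, 1024
x = np.zeros(Np + 1)
x[0], x[Np] = 1.0, a                 # delta[n] + a * delta[n - Np]

c = real_cepstrum(x, nfft)
```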

51 Practical Implementation Issues
The cepstrum x̂[n] is infinitely long, thus x̂_N[n] is an aliased version of x̂[n]. That is: $\hat{x}_N[n] = \sum_{r=-\infty}^{\infty} \hat{x}[n + rN]$. Thus it is necessary to use as large an N as possible. The phase component j∠X(k) must be properly unwrapped to ensure phase continuity. The goal is to determine r[k] so that ∠X(k) = PV[∠X(k)] + 2πr[k] is continuous.
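The aliasing relation can be illustrated with the real cepstrum of aⁿu[n], for which c[n] = a^|n|/(2|n|) for n ≠ 0 and c[0] = 0. The sketch below deliberately uses a very short DFT (N = 16) and a = 0.9, both assumptions chosen to make the aliasing visible, and compares the N-point result with the aliased sum of the closed form.

```python
import numpy as np

a, N = 0.9, 16
k = np.arange(N)
X = 1.0 / (1.0 - a * np.exp(-2j * np.pi * k / N))   # exact DTFT samples of a^n u[n]
c_N = np.fft.ifft(np.log(np.abs(X))).real            # N-point real cepstrum

def true_real_cepstrum(n):
    """Closed form c[n] = a^|n| / (2|n|) for n != 0, and c[0] = 0."""
    n = np.abs(np.asarray(n, dtype=float))
    out = np.zeros_like(n)
    nonzero = n > 0
    out[nonzero] = a ** n[nonzero] / (2.0 * n[nonzero])
    return out

# Aliased version: sum over r of c[n + r*N], truncated where terms are negligible.
r = np.arange(-200, 201)
aliased = np.array([true_real_cepstrum(n + r * N).sum() for n in range(N)])
```

With such a small N the sample `c_N[1]` differs noticeably from the true value a/2, which is the motivation for choosing N as large as possible.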

52 Modulo 2 Phase Unwrapper
Digital Systems: Hardware Organization and Design 11/21/2018 Modulo 2 Phase Unwrapper Goal is to determine r[k] so that X(k) is continuous PV[X()] PV[X(k)] Principal Value PV - 2/N Phase Representation in Discrete Complex Spectrum 21 November 2018 Veton Këpuska Architecture of a Respresentative 32 Bit Processor

53 Modulo 2 Phase Unwrapper
Digital Systems: Hardware Organization and Design 11/21/2018 Modulo 2 Phase Unwrapper Algorithm: If PV[X(k)]-PV[X(k-1)]>2- r[k]=r[k-1]-1 # Subtract 2 Else if PV[X(k)]-PV[X(k-1)]<2- r[k]=r[k-1]+1 # Add 2 Else r[k]=r[k-1] # Do not change End Note: Even with fine grid of (determined by N) 2/N, it is possible that subsequent PV samples may be more than 2 rad apart (case of poles/zeros close together). 21 November 2018 Veton Këpuska Architecture of a Respresentative 32 Bit Processor
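The algorithm can be sketched directly in code. The function name `unwrap_mod_2pi`, the threshold ε = 1.0, and the linear test phase are assumptions for illustration; the result is compared against the known phase and against `np.unwrap`, which uses a different threshold (π) but agrees on this smooth example.

```python
import numpy as np

def unwrap_mod_2pi(pv, eps=1.0):
    """Modulo-2*pi unwrapper: track integer r[k] so that pv + 2*pi*r is continuous."""
    r = np.zeros(len(pv), dtype=int)
    for k in range(1, len(pv)):
        jump = pv[k] - pv[k - 1]
        if jump > 2 * np.pi - eps:
            r[k] = r[k - 1] - 1      # subtract 2*pi
        elif jump < -(2 * np.pi - eps):
            r[k] = r[k - 1] + 1      # add 2*pi
        else:
            r[k] = r[k - 1]          # do not change
    return pv + 2 * np.pi * r

# Hypothetical test phase: a linear ramp, as produced by a pure delay.
true_phase = -0.1 * np.arange(200)
pv = np.angle(np.exp(1j * true_phase))   # principal value in (-pi, pi]
unwrapped = unwrap_mod_2pi(pv)
```

As the note above warns, a sketch like this relies on the phase changing slowly between grid points; it is not expected to work when the true phase moves rapidly between adjacent samples.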

54 Phase Derivative-Based Phase Unwrapper
The phase derivative is uniquely defined by: $\dot{\angle X}(\omega) = \frac{X_r(\omega)\dot{X}_i(\omega) - X_i(\omega)\dot{X}_r(\omega)}{|X(\omega)|^2}$. Then: $\angle X(\omega) = \int_0^{\omega} \dot{\angle X}(\beta)\, d\beta$. However, since only X(k) is available, the integral must be estimated from discrete values.

55 Phase Derivative-Based Phase Unwrapper
Restate the problem: ∠X(ω_k) = PV[∠X(ω_k)] + 2πq(k), where q(k) is an integer-valued function. Assume that the phase has been correctly unwrapped up to ω_{k−1}. Then the unwrapped phase at ω_k is that value plus the integral of the phase derivative from ω_{k−1} to ω_k. An approximation of this integral from the discrete samples gives an estimate of the phase at ω_k. Select the value of q(k) such that the error E[k] between this estimate and PV[∠X(ω_k)] + 2πq(k) is minimized over q(k).

56 Example

57 Short-Time Homomorphic Analysis of Periodic Sequences

58 Short-Time Homomorphic Analysis of Periodic Sequences
Recall the source-system model of speech production: x[n] = h[n]*p[n]. For voiced speech p[n] is quasi-periodic; for unvoiced speech p[n] is noise-like. In practice a periodic waveform is windowed by a finite-length sequence w[n]: s[n] = w[n]x[n] = w[n](p[n]*h[n]). Approximation to s[n]: s[n] ≈ (w[n]p[n])*h[n].

59 Short-Time Homomorphic Analysis of Periodic Sequences
If w[n] is smooth relative to h[n], and P is large enough so that the h[n−kP] do not substantially overlap, then: s[n] ≈ (w[n]p[n])*h[n]. Then the cepstrum of s[n] is: ŝ[n] ≈ p̂_w[n] + ĥ[n], where p̂_w[n] is the complex cepstrum of w[n]p[n]. It can be shown that: $\hat{s}[n] = \hat{p}_w[n] + D[n]\sum_{k=-\infty}^{\infty} \hat{h}[n - kP]$, where D[n] is a weighting function depending on w[n].

60 Cepstral Domain (Quefrency) Perspective
Under what conditions can we perform deconvolution? Let x[n], a voiced speech signal, be produced by an infinite train of periodic impulses with period P. Thus the only samples of X(ω) and log[X(ω)] are defined at multiples of the fundamental frequency ω_o = 2π/P, i.e., ω_k = (2π/P)k: X(ω_k) = P(ω_k)H(ω_k) and log[X(ω_k)] = log[P(ω_k)] + log[H(ω_k)].

61 Cepstral Domain (Quefrency) Perspective
In the cepstral domain, x̂[n] appears as a set of replicas of ĥ[n] located at every kP. Thus aliasing is an issue and needs to be handled properly: the replicas of ĥ[n], spaced P samples apart, overlap. Can this aliasing be prevented or at least minimized? Consider: s[n] = w[n]x[n] = w[n](p[n]*h[n]).

62 Cepstral Domain (Quefrency) Perspective
Let's rewrite s[n] as: s[n] = (p[n]w[n])*g[n], where g[n] ≈ h[n]. Then, taking the logarithm of the Fourier transforms of the two expressions for s[n] and solving for log[G(ω)], equation (1) is obtained.

63 Cepstral Domain (Quefrency) Perspective
To simplify, assume W(ω) consists of only a single main lobe of rectangular shape, with width determined by ω_o = 2π/P.

64 Cepstral Domain (Quefrency) Perspective
Thus the second log term becomes zero, which gives equation (2).

65 Cepstral Domain (Quefrency) Perspective
From (1) and (2) we can write: $\hat{s}[n] = \hat{p}_w[n] + w[n]\sum_{k=-\infty}^{\infty} \hat{h}[n - kP]$, where p̂_w[n] is the complex cepstrum of p[n]w[n], ĥ[n] is the complex cepstrum of h[n], and w[n] is the inverse Fourier transform of the rectangular function W(ω). The result is illustrated in Figure 6.15.

66 Figure 6.15 (horizontal axis: quefrency).

67 Cepstral Domain (Quefrency) Perspective
The last equation is a special case of the earlier short-time result, with D[n] = w[n]. As with the purely convolutional model, the contributions of the windowed pulse train and the impulse response are additively combined, so that deconvolution is possible. Now, however, the impulse response contribution is repeated at the pitch period rate. This aliasing is dependent upon pitch, and is different from the aliasing due to an insufficient DFT length (see Section 6.4.4).

68 Cepstral Domain (Quefrency) Perspective
Conditions under which s[n] ≈ (w[n]p[n])*h[n]:
w[n], the time-domain window, should be long enough so that D[n] is smooth over |n| < P, that is, over the extent of ĥ[n].
w[n] should be short enough to reduce the contribution of the replicas of ĥ[n].
In practice w[n] is a Hamming window 2-3 pitch periods long.
w[n] should be centered at the time origin, n = 0, aligned with h[n].

69 Cepstral Domain (Quefrency) Perspective
Under those conditions, for a low-time lifter (filter in the cepstral domain) l[n] of extent |n| < P/2: l[n]ŝ[n] ≈ ĥ[n]. That is, the complex cepstrum is close to that derived from the conventional model. Note that with high-pitched speakers there is a stronger presence of p̂[n] close to the origin (as noted earlier), as well as more aliasing of the replicas of ĥ[n].

70 Frequency Domain Perspective
Let x[n] = h[n]*p[n], where p[n] is a periodic impulse train with period P. Then X(ω_k) = P(ω_k)H(ω_k), where X(ω_k) represents a line spectrum at ω_k = (2π/P)k. The question arises: under what conditions on the window is the windowed waveform s[n] = w[n]x[n] = w[n](p[n]*h[n]) close to the desired convolutional model?

71 Frequency Domain Perspective
Define an error measure E(ω) that reflects the degradation in the frequency domain, and minimize it. It was found empirically that for a Hamming window this spectral distance measure is minimized for a window length in the range of roughly 2-3 pitch periods. An implication of this result is that the length of the analysis window should be adapted to the pitch period, to make the windowed waveform as close as possible (in the sense described above) to the desired convolutional model.

72 Short-Time Speech Analysis

73 Short-Time Speech Analysis
Complex cepstrum of voiced speech. Recall: H(z) = A G(z) V(z) R_L(z), where A is the gain, G(z) is the glottal model, V(z) is the vocal tract model, and R_L(z) is the lip radiation model. The output speech is then: x[n] = h[n]*p[n].

74 Complex Cepstrum of Voiced Speech
General form for a stable V(z): zeros inside and outside the unit circle, poles inside the unit circle. The goal is to separate h[n] from p[n]. Let s[n] = w[n](p[n]*h[n]) be approximately equal to x̃[n] = (w[n]p[n])*h[n].

75 Complex Cepstrum of Voiced Speech
Recall that x̃[n] ≈ s[n] if the window is 2-3 pitch periods long and its center is aligned with h[n]. The discrete complex cepstrum is computed using a DFT of order N. For a typical speaker the duration of the short-time window lies in the range of 20 ms to 40 ms. Assuming that:
Source and system components lie roughly in separate quefrency regions.
There is negligible aliasing of the replicas of ĥ[n].
Most of ĥ[n] occurs within P/2 of the origin.
The distortion function D[n] is smooth in the same range, |n| < P/2, and makes the higher-order replicas negligible for |n| > P/2.
Then a cepstral lifter function can be applied.

76 Complex Cepstrum of Voiced Speech
Low-quefrency lifter: l[n] = 1 for |n| < P/2 and l[n] = 0 otherwise, used to separate ĥ[n] from p̂[n]. Similarly, a high-quefrency lifter can be used to recover the input pulse train (pitch estimation).

77 Example 6.11
Voiced female speech with a pitch period of 5 ms. Sampling rate fs = 10 kHz. Hamming window of 15 ms. A 1024-point FFT/IFFT is used to obtain the discrete complex cepstrum. The window is centered on h[n] (more about that later).
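The analysis settings of this example (10 kHz sampling, 5 ms pitch period, 15 ms Hamming window, 1024-point FFT) can be tried on synthetic data. The sketch below does not use the speech from the example: the impulse response h[n], the frame position, and the 2.5 ms to 9 ms search range are all assumptions, and the real cepstrum is used instead of the complex cepstrum to avoid phase unwrapping. The largest high-quefrency peak is taken as the pitch-period estimate.

```python
import numpy as np

fs = 10_000            # sampling rate in Hz
P = 50                 # 5 ms pitch period in samples
nfft = 1024
win_len = 150          # 15 ms Hamming window

# Hypothetical vocal-tract-like impulse response: one damped resonance near 1 kHz.
n = np.arange(60)
h = 0.8 ** n * np.cos(0.2 * np.pi * n)

# Periodic impulse train convolved with h[n].
p = np.zeros(1000)
p[::P] = 1.0
x = np.convolve(p, h)[: len(p)]

start = 500            # arbitrary frame position (assumption)
frame = x[start : start + win_len] * np.hamming(win_len)

# Real cepstrum of the windowed frame; a small constant guards against log(0).
spectrum = np.fft.fft(frame, nfft)
c = np.fft.ifft(np.log(np.abs(spectrum) + 1e-12)).real

# Search for the high-quefrency peak between 2.5 ms and 9 ms.
lo, hi = 25, 90
pitch_lag = lo + int(np.argmax(c[lo:hi]))
pitch_hz = fs / pitch_lag
```

On this synthetic frame the peak is expected near a lag of 50 samples, i.e. a pitch of roughly 200 Hz; real speech would generally give a less clean peak.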

78 Example 6.11 (cont.)

79 Example 6.11 (cont.)
Figure panels: maximum-phase and minimum-phase components.

80 Complex Cepstrum of Unvoiced Speech
Recall the transfer function model for unvoiced speech: H(z) = A V(z) R(z). In contrast to the voiced case, there is no glottal volume velocity contribution. The resulting speech waveform in the time domain is x[n] = u[n]*h[n] = u[n]*v[n]*r[n], where u[n] is white noise. The resulting signal after applying the short-time analysis window is s[n] = w[n](u[n]*h[n]).

81 Complex Cepstrum of Unvoiced Speech
Similarly to the arguments applied for voiced speech, if:
the duration of the analysis window w[n] is selected so that the formants of the unvoiced speech power spectral density are not significantly broadened, and
w[n] is sufficiently smooth so as to be nearly constant over h[n],
then the following can be assumed: s[n] ≈ (w[n]u[n])*h[n]. The windowed white noise is defined as q[n] = u[n]w[n], and the discrete complex cepstrum is computed with an N-point DFT.

82 Complex Cepstrum of Unvoiced Speech
q̂N[n], the discrete complex cepstrum of the noise source, covers all quefrencies, and thus separation by liftering is not possible. Phase unwrapping of noisy signals is very unreliable. The real cepstrum is adequate for unvoiced speech (phase information is not important in this case), resulting in minimum-phase versions of h[n]. The deconvolved excitation may contain interesting fine source structure for some classes of sounds, e.g., voiced fricatives.
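The remark that the real cepstrum leads to a minimum-phase version of a sequence can be sketched as follows. The real cepstrum is folded onto non-negative quefrencies and mapped back through the exponential and inverse DFT. The function names, the noise segment, and the resonance used for h are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def real_cepstrum(x, nfft):
    """Inverse DFT of the log-magnitude spectrum."""
    log_mag = np.log(np.abs(np.fft.fft(x, nfft)) + 1e-12)
    return np.fft.ifft(log_mag).real

def minimum_phase_from_real_cepstrum(c):
    """Fold the real cepstrum onto n >= 0 and map back to a sequence."""
    nfft = len(c)                  # assumed even
    fold = np.zeros(nfft)
    fold[0] = 1.0
    fold[1:nfft // 2] = 2.0
    fold[nfft // 2] = 1.0
    return np.fft.ifft(np.exp(np.fft.fft(c * fold))).real

nfft = 1024

# Assumed unvoiced segment: white noise through a short resonance, then windowed.
rng = np.random.default_rng(0)
u = rng.standard_normal(200)
k = np.arange(60)
h = (0.9 ** k) * np.cos(0.4 * np.pi * k)
s = np.hamming(259) * np.convolve(u, h)    # length 200 + 60 - 1 = 259
s_min = minimum_phase_from_real_cepstrum(real_cepstrum(s, nfft))

# Small case with a known answer: zeros at 2 and 0.5 become a double zero at 0.5.
x_mixed = np.array([1.0, -2.5, 1.0])
x_min = minimum_phase_from_real_cepstrum(real_cepstrum(x_mixed, nfft))
```

The result keeps the magnitude spectrum of the input and discards its phase, which is why this route is suited to cases where phase is unreliable or unimportant.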

83 Analysis/Synthesis Structure

84 Analysis/Synthesis Structure
In the speech analysis stage, the underlying parameters of the speech model are estimated. In the speech synthesis stage, the waveform is reconstructed from the model parameters. Liftering the low-quefrency region of the cepstrum provides an estimate of the system impulse response. Liftering the high-quefrency region of the cepstrum provides an estimate of the source excitation signal. The source estimate is then inverted with the inverse homomorphic system to obtain the excitation function. Convolution of the two resulting component estimates yields the original short-time segment exactly.
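The exact-reconstruction claim can be checked numerically with a minimal sketch. Two complementary lifters split the complex cepstrum, each part is mapped back through the inverse homomorphic system, and the two results are convolved. The test signal, the pitch period of 80 samples, and the lifter cutoff of 40 samples are hypothetical values. Linear-phase removal is omitted because the assumed sequence starts at n = 0.

```python
import numpy as np

nfft = 1024
pitch = 80     # hypothetical pitch period in samples
cutoff = 40    # hypothetical lifter cutoff, below the pitch period

# Assumed short-time segment: a decaying resonance excited by two pulses.
m = np.arange(300)
h = (0.95 ** m) * np.cos(0.3 * np.pi * m)
p = np.zeros(200)
p[0], p[pitch] = 1.0, 0.5
x = np.convolve(p, h)                      # length 499, shorter than nfft

# Complex cepstrum: inverse DFT of log|X| + j * (unwrapped phase).
X = np.fft.fft(x, nfft)
log_X = np.log(np.abs(X)) + 1j * np.unwrap(np.angle(X))
x_hat = np.fft.ifft(log_X)

# Complementary low- and high-quefrency lifters on the circular quefrency axis.
idx = np.arange(nfft)
quef = np.minimum(idx, nfft - idx)
low = (quef < cutoff).astype(float)
high = 1.0 - low

# Inverse homomorphic system applied to each liftered component.
h_est = np.fft.ifft(np.exp(np.fft.fft(x_hat * low)))
e_est = np.fft.ifft(np.exp(np.fft.fft(x_hat * high)))

# Circular convolution of the two estimates via the DFT.
x_rec = np.fft.ifft(np.fft.fft(h_est) * np.fft.fft(e_est)).real
```

Because the two lifters sum to one, the log-spectra of the components add back to the log-spectrum of x[n], so the reconstruction holds to within rounding error regardless of where the cutoff is placed.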

85 Analysis/Synthesis Structure
With an overlap-add reconstruction from the short-time segments, the entire waveform is recovered. The homomorphic system performs a transformation with no information reduction. This process is analogous to reconstructing the waveform, in linear prediction analysis/synthesis, from the convolution of the all-pole filter and the output of its inverse filter. In speech coding and speech modification applications, a more efficient representation is desired. The complex or real cepstrum provides an approach to such a representation because pitch and voicing can be estimated from the peak (or lack of a peak) in the high-quefrency region of the cepstrum.
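A minimal sketch of overlap-add reconstruction follows, assuming a periodic Hann window with 50% overlap so that the shifted windows sum to one. The window choice, segment length, and hop size are assumptions for illustration; the slides do not specify them. Each windowed segment stands in for a short-time segment that the homomorphic analysis/synthesis returns unchanged.

```python
import numpy as np

def overlap_add(segments, hop):
    """Sum equal-length segments placed hop samples apart."""
    seg_len = segments.shape[1]
    out = np.zeros(hop * (len(segments) - 1) + seg_len)
    for i, seg in enumerate(segments):
        out[i * hop:i * hop + seg_len] += seg
    return out

seg_len, hop = 256, 128
# Periodic Hann window: w[n] + w[n + seg_len/2] = 1 for every n.
window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(seg_len) / seg_len)

rng = np.random.default_rng(0)
signal = rng.standard_normal(2048)

starts = range(0, len(signal) - seg_len + 1, hop)
segments = np.array([signal[s:s + seg_len] * window for s in starts])
rebuilt = overlap_add(segments, hop)
```

Away from the first and last half-segment, where only one window contributes, the rebuilt signal matches the input.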

86 Zero and Minimum-Phase Synthesis
Assuming that we have a succinct and accurate characterization of the speech production source (as with linear prediction-based analysis/synthesis), we are able to synthesize an estimate of the speech waveform. This synthesis can be performed based on any one of several possible phase functions: zero-phase, minimum-phase, maximum-phase, and mixed-phase functions.

87 Zero and Minimum-Phase Synthesis
General framework for homomorphic analysis/synthesis: [Figure: block diagram with labels Real Cepstrum, 1024-point, P/2, Analysis window of ms]

88 Mixed-Phase Synthesis
[Figure: Example 6.13]

89 Contrasting Linear Prediction and Homomorphic Filtering
Homomorphic filtering is viewed as an alternative to linear prediction.
Linear Prediction | Homomorphic Filtering
Parametric | Non-parametric
Sharp smooth resonances | Wider spurious resonances
All-pole representation | Poles and zeros can be represented
Minimum-phase response estimate only | Minimum-phase as well as mixed-phase if the complex cepstrum is used
Synthesized speech "crisper" but more "mechanical" | Synthesized speech more "natural" but "muffled"

90 Contrasting Linear Prediction and Homomorphic Filtering
Similar problems with both methods (Linear Prediction, Homomorphic Filtering): Increased speech distortion with increasing pitch. Aliasing of the vocal tract impulse response at the pitch period repetition rate. Linear prediction windowing results in the prediction of nonzero values of the waveform from zeros outside the window. Windowing a periodic waveform distorts the convolutional model. The number of poles must be specified. The length of the low-quefrency lifter must be chosen. The best window and order selection is often a function of the pitch of the speaker.

91 Homomorphic Prediction
A number of speech analysis methods rely on combining homomorphic filtering with linear prediction and are referred to collectively as homomorphic prediction. Two primary advantages of combining the methods: (1) By reducing the effects of waveform periodicity, an all-pole estimate suffers less from the effect of high-pitch aliasing. (2) By removing ambiguity in waveform alignment, zero estimation can be performed without the requirement of pitch-synchronous analysis.

92 Homomorphic Prediction
Waveform periodicity: Recall that for a waveform consisting of the convolution of a short-time impulse train and an impulse response, x[n] = p[n]*h[n], the autocorrelation function is given by the convolution of the autocorrelation function of the response and that of the impulse train: rx[τ] = rh[τ]*rp[τ]. Thus, as the spacing between impulses (the pitch period) decreases, the autocorrelation function of the impulse response suffers from increasing distortion.

93 Homomorphic Prediction
Thus, if the spectral magnitude of h[n] can be estimated accurately, then linear prediction analysis can be performed with an estimate of rh[τ] free of the waveform periodicity. This leads to the following idea: (1) Use homomorphic filtering to deconvolve an estimate of h[n] by low-pass liftering the real or complex cepstrum of x[n]. (2) Apply the autocorrelation method of linear prediction analysis to the resulting impulse response estimate to obtain the model parameters.
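The two steps can be sketched as follows, under stated assumptions: the real cepstrum is used, the autocorrelation is obtained from the liftered (smoothed) power spectrum instead of from a time-domain impulse response estimate, and the normal equations are solved directly. The function name homomorphic_lpc, the second-order test model, the pitch period of 64 samples, and the cutoff of 48 samples are all hypothetical.

```python
import numpy as np

def homomorphic_lpc(x, order, cutoff, nfft=1024):
    """Low-pass lifter the real cepstrum, then run autocorrelation-method LPC."""
    # Step 1: real cepstrum and symmetric low-quefrency lifter.
    log_mag = np.log(np.abs(np.fft.fft(x, nfft)) + 1e-12)
    c = np.fft.ifft(log_mag).real
    idx = np.arange(nfft)
    quef = np.minimum(idx, nfft - idx)
    c_low = np.where(quef < cutoff, c, 0.0)
    # Smoothed log-magnitude -> power spectrum -> autocorrelation estimate.
    smooth_log_mag = np.fft.fft(c_low).real
    r = np.fft.ifft(np.exp(2.0 * smooth_log_mag)).real
    # Step 2: normal equations of the autocorrelation method.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    alpha = np.linalg.solve(R, r[1:order + 1])
    return np.concatenate(([1.0], -alpha))   # coefficients of A(z)

# Assumed all-pole model with poles at 0.9 * exp(+/- j 0.3 pi).
theta, radius = 0.3 * np.pi, 0.9
a_true = np.array([1.0, -2 * radius * np.cos(theta), radius ** 2])
m = np.arange(200)
h = radius ** m * np.sin((m + 1) * theta) / np.sin(theta)

# Assumed excitation: four decaying pulses spaced 64 samples apart.
pitch = 64
p = np.zeros(3 * pitch + 1)
p[::pitch] = 0.8 ** np.arange(4)
x = np.convolve(p, h)

a_est = homomorphic_lpc(x, order=2, cutoff=48)
print(a_true, a_est)
```

In this construction the cepstral contribution of the pulse train sits at multiples of the pitch period, so a cutoff below the pitch period removes it before the autocorrelation is formed.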

94 Example 6.14
Suppose h[n] is a minimum-phase all-pole sequence of order p. Consider a waveform x[n] constructed by convolving h[n] with a sequence p[n], where p[n] = δ[n] + βδ[n-N], with β < 1. The complex cepstrum of x[n] is given by x̂[n] = p̂[n] + ĥ[n], where p̂[n] and ĥ[n] are the complex cepstra of p[n] and h[n], respectively. The autocorrelation function is given by: rx[τ] = (1+β²)rh[τ] + βrh[τ-N] + βrh[τ+N]. That is, rx[τ] is rh[τ] distorted by its neighboring terms centered at τ = +N and τ = -N.
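The autocorrelation relation in this example can be verified numerically. The sketch below assumes β = 0.6, N = 40, and a second-order all-pole h[n] truncated to 200 samples; these values are illustrative only.

```python
import numpy as np

def autocorr(x):
    """Full deterministic autocorrelation; index len(x) - 1 is lag 0."""
    return np.correlate(x, x, mode="full")

beta, N = 0.6, 40                      # assumed values
theta, radius = 0.3 * np.pi, 0.9       # assumed pole pair for h[n]
m = np.arange(200)
h = radius ** m * np.sin((m + 1) * theta) / np.sin(theta)

# p[n] = delta[n] + beta * delta[n - N]
p = np.zeros(N + 1)
p[0], p[N] = 1.0, beta
x = np.convolve(p, h)

r_x = autocorr(x)
r_h = autocorr(h)

# Place r_h[tau], r_h[tau - N], and r_h[tau + N] on the lag axis of r_x.
center = np.pad(r_h, N)
delayed = np.pad(r_h, (2 * N, 0))
advanced = np.pad(r_h, (0, 2 * N))
predicted = (1 + beta ** 2) * center + beta * (delayed + advanced)
```

Comparing r_x with (1 + β²) r_h near lag 0 shows how much the shifted replicas distort the low-order lags that linear prediction relies on. The distortion grows as N shrinks.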

95 Homomorphic Prediction
Important points of the previous example: The first p coefficients of the real cepstrum of x[n] are undistorted (if a long-enough DFT length is used in the computation). The first p coefficients of the autocorrelation function rx[τ] of the waveform are distorted by aliasing of autocorrelation replicas (regardless of the DFT length). A cepstral lowpass lifter of duration less than p extracts a smoothed and not aliased version of the spectrum. Linear prediction coefficients can alternatively be obtained exactly through the recursive relation between the real cepstrum and the predictor coefficients of the all-pole model when h[n] is all-pole (Exercise 6.13).
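One form of the recursion between cepstrum and predictor coefficients can be sketched as follows. It assumes a unit-gain, minimum-phase model H(z) = 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and it takes the complex cepstrum ĥ[n] as input; for a minimum-phase sequence this equals twice the real cepstrum for n > 0. The derivation here follows from differentiating log A(z) and is not copied from Exercise 6.13, so the sign convention may differ from the text. The function name and the test poles are hypothetical.

```python
import numpy as np

def cepstrum_to_lpc(h_hat, order):
    """Recover [1, a_1, ..., a_p] from cepstral coefficients h_hat[1..p].

    Uses n * h_hat[n] + sum_{k=1}^{n-1} (n - k) * h_hat[n - k] * a[k] = -n * a[n].
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    for n in range(1, order + 1):
        acc = sum((n - k) * h_hat[n - k] * a[k] for k in range(1, n))
        a[n] = -h_hat[n] - acc / n
    return a

# Assumed fourth-order model: two complex-conjugate pole pairs inside the unit circle.
poles = np.array([0.9 * np.exp(1j * 0.3 * np.pi), 0.9 * np.exp(-1j * 0.3 * np.pi),
                  0.7 * np.exp(1j * 0.6 * np.pi), 0.7 * np.exp(-1j * 0.6 * np.pi)])
a_true = np.poly(poles).real

# Cepstrum of a unit-gain all-pole model: h_hat[n] = sum_k poles[k]**n / n, n >= 1.
order = len(poles)
h_hat = np.zeros(order + 1)
for n in range(1, order + 1):
    h_hat[n] = np.sum(poles ** n).real / n

a_rec = cepstrum_to_lpc(h_hat, order)
```

Only the first p cepstral coefficients are needed, which is why the undistorted low-quefrency part of the cepstrum is sufficient in the example above.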

96 Homomorphic Prediction
Zero estimation: Consider a transfer function of poles and zeros of the form: Also consider a sequence x[n] = h[n]*p[n], where p[n] is a periodic impulse train. Suppose that: (1) an estimate of h[n] is obtained through homomorphic filtering of x[n], (2) the number of poles and zeros is known, and (3) the linear-phase component z^(-r) has been removed. Then the poles of h[n] can be estimated using the covariance method of linear prediction. Other methods (e.g., the Shanks method described in Chapter 5) can be used to estimate the zeros.

97 Homomorphic Prediction
[Figure: Homomorphic Prediction]

98 Summary
The focus of this chapter was the use of homomorphic filtering with application to deconvolution, that is, separation of a source from a system. The presented methodology is general and is not limited to deconvolution of the vocal tract from the glottal source. Example applications: (1) Control of the dynamic range of multiplicatively combined signals (Exercise 6.19). (2) Recovery of speech from degraded recordings. Old acoustic recordings suffer from convolutional distortion imparted by an acoustic horn that can be approximated by a linear resonant filter. See Exercise 6.20 for details. (3) In image processing, homomorphic filtering can be used for contrast enhancement (see Oppenheim and Schafer, "Digital Signal Processing", p. 487, Prentice Hall, 1975).

99 Summary
Homomorphic processing is applied in the phase vocoder and sinewave analysis/synthesis. It has also been found useful in speech coding (Chapter 12) and speaker recognition (Chapter 14). It is also the basis for the mel-cepstrum: the Fourier transform of a constant-Q filtered log-spectrum. It is hypothesized that the mel-cepstrum approximates signal processing in the early stages of human auditory perception. Homomorphic filtering applied along the temporal trajectories of the mel-cepstral coefficients can be used to remove convolutional channel distortions even when the cepstrum of these distortions overlaps the cepstrum of speech (Chapter 13): cepstral mean subtraction and RASTA processing.
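Cepstral mean subtraction, mentioned above, can be illustrated with a minimal sketch. It assumes a time-invariant channel, so that the channel adds the same cepstral vector to every frame, and it uses random numbers as stand-ins for per-frame cepstral vectors. The function name and array shapes are hypothetical.

```python
import numpy as np

def cepstral_mean_subtraction(cepstra):
    """Subtract the per-coefficient mean over frames.

    cepstra has shape (num_frames, num_coeffs).
    """
    return cepstra - cepstra.mean(axis=0, keepdims=True)

rng = np.random.default_rng(0)
clean = rng.standard_normal((200, 13))   # stand-in cepstral vectors, one per frame
channel = rng.standard_normal(13)        # assumed fixed channel cepstrum
observed = clean + channel               # convolution in time adds in the cepstrum

normalized_observed = cepstral_mean_subtraction(observed)
normalized_clean = cepstral_mean_subtraction(clean)
```

Subtracting the mean over time is a simple filter along the temporal trajectory of each coefficient. It removes the constant channel term even though that term overlaps the speech cepstrum in quefrency.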

