Speech Enhancement with Binaural Cues Derived from a Priori Codebook

Presentation on theme: "Speech Enhancement with Binaural Cues Derived from a Priori Codebook"— Presentation transcript:

1 Speech Enhancement with Binaural Cues Derived from a Priori Codebook
Students and teachers, good afternoon. I am glad to have the chance to give my presentation here. Today I would like to talk to you about some of our work in the field of codebook-based speech enhancement. The title of my presentation is "…". Reporter: Nan Chen, Beijing University of Technology

2 Contents
1 Introduction. 2 The Proposed Method. 3 Results and Conclusions. I'd like to give this presentation in three parts. First, I will introduce the background. Then the proposed method is described in detail. Finally, we give the experimental results and summarize the conclusions.

3 Introduction 1

4 Introduction Noise: street, car, babble, office

5 Introduction The traditional methods of speech enhancement
1 Spectral-subtractive algorithms. 2 Wiener filtering. 3 Statistical-model-based methods. 4 Subspace algorithms. Monaural speech enhancement remains a challenging task in speech communication applications such as speech coding and speech recognition. The traditional methods … perform well for stationary noise, but their performance degrades when non-stationary noise is present, because an accurate noise estimate cannot be obtained from the noisy observation. If some prior information about the speech and the noise is known in advance, the performance can be improved.

6 Introduction Binaural Cue Coding (BCC) Framework
Purpose: recover the perception of the original input signals. BCC analysis: extract the side information of the input signals. BCC synthesis: recover the input signals by making use of the side information and the mono signal. Now I want to say a little about the BCC framework, which is shown in Figure 1. The purpose of BCC is … From Figure 1, we can see that … Figure 1: Block diagram of analysis and synthesis for BCC

7 Introduction Once the discrete Fourier transform (DFT) coefficients of the mono signal are known, the DFT coefficients of each output channel, S_{c,k}, can be calculated as in (1) and (2). In these expressions, one term is the inter-channel level difference (ICLD) between channel 1 and channel c for the nth sub-band, and another is a random variable controlled by the inter-channel correlation (ICC). As can be seen in Figure 1, … where f is used to determine a level modification of the DFT coefficients, c is the index of the channel and n is the frequency index (3).

8 Introduction BCC: recover the perception of the original input signals. Speech enhancement: separate the clean signal from the noisy signal. The BCC principle is introduced to estimate the clean signal: the noisy speech is enhanced by the BCC principle, where channel 1 is assumed to be the clean speech and channel 2 is regarded as the noise. Diagram labels: clean speech, noise, noisy speech. BCC aims at recovering the perception of the original input signals, while the main purpose of speech enhancement is to separate the clean signal from the noisy signal. Because of this parallel, we introduce the BCC technique into monaural speech enhancement … But we need to find appropriate side information.

9 The Proposed Method 2

10 The Proposed Method Side Information
The clean cue: speech and noise level difference (SNLD); speech and noise correlation (SNC). The pre-enhanced cue: pre-enhanced speech and noise level difference (PNLD); pre-enhanced speech and noise correlation (PNC); posterior SNR (PSNR); speech presence probability (SPP). In the BCC scheme, the binaural cues are treated as side information. Here, however, the clean cue, which corresponds to the binaural cues, cannot be obtained directly, so we obtain it through the pre-enhanced cue. In our method, the side information therefore contains both the clean cue and the pre-enhanced cue. The clean cue is … The pre-enhanced cue is …
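By analogy with the inter-channel level difference and inter-channel correlation used in BCC, the two clean cues can be sketched for a single sub-band as below. This is only a minimal sketch: the slides do not give the exact definitions, so the power ratio in dB, the normalised cross-correlation, and the function names `level_difference_db` and `correlation` are all assumptions made for illustration.

```python
import math


def level_difference_db(speech_band, noise_band, eps=1e-12):
    """Speech-to-noise power ratio of one sub-band in dB (assumed SNLD-style cue)."""
    p_speech = sum(abs(x) ** 2 for x in speech_band)
    p_noise = sum(abs(x) ** 2 for x in noise_band)
    # eps avoids log(0) and division by zero for silent bands
    return 10.0 * math.log10((p_speech + eps) / (p_noise + eps))


def correlation(speech_band, noise_band, eps=1e-12):
    """Normalised cross-correlation of one sub-band (assumed SNC-style cue)."""
    cross = sum(a * complex(b).conjugate() for a, b in zip(speech_band, noise_band))
    p_speech = sum(abs(a) ** 2 for a in speech_band)
    p_noise = sum(abs(b) ** 2 for b in noise_band)
    return abs(cross) / (math.sqrt(p_speech * p_noise) + eps)


if __name__ == "__main__":
    # Hypothetical DFT coefficients of one sub-band
    speech = [2.0 + 0.0j, 2.0 + 0.0j]
    noise = [1.0 + 0.0j, 1.0 + 0.0j]
    print(level_difference_db(speech, noise), correlation(speech, noise))
```

The same two functions could be applied to the pre-enhanced speech and its residual noise to obtain PNLD-like and PNC-like values, again under the assumptions stated above.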

11 The Proposed Method Figure 2 describes the proposed method, which has two parts: an offline training stage and an online enhancing stage. At the training stage, the pre-enhanced speech is obtained through pre-processing, and the pre-enhanced cue is then extracted from it. The clean cue is extracted from the clean speech. The noisy speech and the clean speech are in one-to-one correspondence. Finally, the pre-enhanced cue and the clean cue are used to train the codebook. At the enhancing stage, we first obtain the pre-enhanced speech; the online clean cue is then estimated by weighted codebook mapping, using the trained codebook and the online pre-enhanced cue. Figure 2: Block diagram of the proposed monaural speech enhancement method

12 The Proposed Method Weighted codebook mapping algorithm
Figure 3 shows the scheme of the weighted codebook mapping (WCBM) algorithm. Figure 3: Block diagram of the weighted codebook mapping

13 The Proposed Method Estimation of the clean cue:
1) By comparing the Euclidean distance (ED) between the online pre-enhanced cue and each trained pre-enhanced cue, choose the M code-vectors with relatively small ED from the trained codebook.
2) Calculate the degree of membership ρ of the chosen code-vectors (4).
3) Define the weight of each chosen code-vector (5).
4) Obtain the online clean cue by weighting the trained clean cues stored in the chosen code-vectors.
The way to … by the WCBM algorithm is introduced here.
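The four steps above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the formulas for the membership ρ and for the weights are not reproduced in the transcript, so the inverse-distance membership and the sum-to-one normalisation used here are assumptions, as are the function name `weighted_codebook_mapping` and the codebook layout (a list of pairs of trained pre-enhanced cue and trained clean cue).

```python
import math


def weighted_codebook_mapping(pre_cue, codebook, m=2, eps=1e-9):
    """Estimate the online clean cue from the online pre-enhanced cue.

    codebook: list of (trained_pre_cue, trained_clean_cue) pairs.
    """
    # Step 1: Euclidean distance to every trained pre-enhanced cue; keep the M nearest
    nearest = sorted(
        ((math.dist(pre_cue, pre), clean) for pre, clean in codebook),
        key=lambda pair: pair[0],
    )[:m]
    # Step 2 (assumed form): membership grows as the distance shrinks
    memberships = [1.0 / (dist + eps) for dist, _ in nearest]
    # Step 3 (assumed form): weights are memberships normalised to sum to one
    total = sum(memberships)
    weights = [rho / total for rho in memberships]
    # Step 4: weighted sum of the trained clean cues of the chosen code-vectors
    dim = len(nearest[0][1])
    return [
        sum(w * clean[i] for w, (_, clean) in zip(weights, nearest))
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Hypothetical toy codebook with 2-dimensional cues
    toy_codebook = [
        ([0.0, 0.0], [1.0, 10.0]),
        ([2.0, 0.0], [3.0, 30.0]),
        ([10.0, 10.0], [100.0, 100.0]),
    ]
    print(weighted_codebook_mapping([1.0, 0.0], toy_codebook, m=2))
```

Blending several code-vectors rather than picking only the nearest one lets the estimated cue fall between trained entries, which is presumably the motivation for the weighting.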

14 The Proposed Method Speech Enhancement:
According to the BCC principle, we have (6) and (7), where the random function has zero mean and constant variance. Finally, the noisy speech is enhanced by (8). Once we have the online clean cue, which contains the speech and noise level difference and the speech and noise correlation, we can enhance the noisy speech.

15 Results and Conclusions 3

16 Results SSNR: This table shows the segmental SNR (SSNR) improvement under different input SNR conditions for various noise types. Ref. A denotes the MMSE spectral amplitude estimation method, and Ref. B denotes the codebook-based MMSE method. From this table, we can see that the proposed method performs better than the two references in most cases.
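For readers unfamiliar with the metric, segmental SNR can be sketched as below. The slides do not state the frame length or the clamping range, so `frame_len=256` and the limits of -10 dB and 35 dB are assumptions taken from common practice, and the function name `segmental_snr` is illustrative only.

```python
import math


def segmental_snr(clean, processed, frame_len=256, lo=-10.0, hi=35.0, eps=1e-12):
    """Mean of per-frame SNR values in dB, each clamped to [lo, hi]."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        y = processed[start:start + frame_len]
        signal = sum(a * a for a in s)
        error = sum((a - b) ** 2 for a, b in zip(s, y))
        snr = 10.0 * math.log10((signal + eps) / (error + eps))
        values.append(min(max(snr, lo), hi))
    if not values:
        raise ValueError("signal is shorter than one frame")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical toy signals: the "enhanced" one is closer to the clean one
    clean = [1.0] * 8
    noisy = [0.5] * 8
    enhanced = [0.9] * 8
    improvement = segmental_snr(clean, enhanced, 4) - segmental_snr(clean, noisy, 4)
    print(improvement)
```

The SSNR improvement is then the SSNR of the enhanced signal minus the SSNR of the noisy input, both measured against the same clean reference.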

17 Results PESQ: This table gives the PESQ test results. We can see that the proposed method performs better than the other two references, especially under noisy conditions with high input SNR.

18 Results LSD: Table 3 shows the test results for log-spectral distance (LSD). According to these results, the proposed method performs better than Ref. A. Compared with Ref. B, however, it does not perform as well in some noise conditions. Ref. B models the spectral envelope, which explains its good performance in this table.
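A common way to compute the log-spectral distance is sketched below, assuming the power spectra of the clean and processed signals are already available frame by frame. The exact variant used for Table 3 is not given in the slides, so this per-frame root-mean-square form and the name `log_spectral_distance` are assumptions.

```python
import math


def log_spectral_distance(clean_power, processed_power, eps=1e-12):
    """Average over frames of the RMS difference between log power spectra, in dB.

    Both arguments are lists of frames; each frame is a list of power-spectrum values.
    """
    per_frame = []
    for s_frame, y_frame in zip(clean_power, processed_power):
        diffs = [
            10.0 * math.log10((s + eps) / (y + eps))
            for s, y in zip(s_frame, y_frame)
        ]
        per_frame.append(math.sqrt(sum(d * d for d in diffs) / len(diffs)))
    return sum(per_frame) / len(per_frame)


if __name__ == "__main__":
    # Hypothetical one-frame example with two frequency bins
    print(log_spectral_distance([[1.0, 1.0]], [[0.1, 10.0]]))
```

A lower value means the processed spectrum is closer to the clean one, so smaller is better for this metric.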

19 Results 5 dB babble: clean, Ref. A, Ref. B, proposed
These are some demos for the two reference methods and the proposed method; I will play them at the end.

20 Results 10 dB babble: clean, Ref. A, Ref. B, proposed

21 Conclusions In this presentation, we presented two contributions. First, we enhance the noisy speech by modeling the spectral detail, which is why the method can reduce the noise between harmonics. Second, noise classification is no longer required, because we introduce binaural cues, which are not correlated with the type of noise, as a priori information.

22 Thank You!

