1
On improving the intelligibility of synchronized over-lap-and-add (SOLA) at low TSM factor
Wong, P.H.W.; Au, O.C.; Wong, J.W.C.; Lau, W.H.B.
TENCON '97. IEEE Region 10 Annual Conference.

Fast time scale modification using envelope-matching technique (EM-TSM)
Wong, J.W.C.; Au, O.C.; Wong, P.H.W.
Circuits and Systems, 1998. ISCAS '98.
2
Outline
• Introduction
• Review of Synchronized Overlap-and-Add (SOLA)
• Modified SOLA
• Simulation and Results
• Conclusions
3
Introduction
• TSM (Time-Scale Modification)
  – to change the time scale of a signal
  – to make degraded speech more intelligible
• TSM factor α
  – α = 1: the signal is unchanged
  – α > 1: the signal is time-expanded
  – α < 1: the signal is time-compressed
4
Introduction
• TSM algorithms
  – time-domain techniques: OLA, SOLA, ...
  – frequency-domain techniques: LSEE_MSTFTM (Least Square Error Estimation from Modified Short Time Fourier Transform Magnitude)
5
Introduction
• SOLA
  – based on OLA, which simply overlaps and adds adjacent frames
  – overlaps frames only at the point of highest similarity between the two overlapping frames
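The overlap-and-add step that both OLA and SOLA share can be sketched as follows. This is a minimal illustration, assuming a linear cross-fade over the overlap region; the slides do not specify the weighting, and the function name `overlap_add` is hypothetical.

```python
def overlap_add(out, frame, pos):
    """Add `frame` into the list `out` starting at index `pos`.

    Samples where `frame` overlaps existing output are cross-faded
    linearly (an assumption of this sketch); the rest are appended.
    Requires 0 <= pos <= len(out).
    """
    if not 0 <= pos <= len(out):
        raise ValueError("pos must lie within the existing output")
    overlap = min(len(out) - pos, len(frame))
    for j in range(overlap):
        w = (j + 1) / (overlap + 1)  # fade-in weight for the new frame
        out[pos + j] = (1.0 - w) * out[pos + j] + w * frame[j]
    out.extend(frame[overlap:])  # non-overlapping tail is simply appended
    return out
```

In plain OLA, `pos` is fixed by the frame spacing; in SOLA it is shifted to the point where the two overlapping segments are most similar.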
6
Review of SOLA
• x[n]: the analysis signal (input)
  – segmented into frames spaced Sa apart
• y[n]: the synthesis signal (output)
  – segmented into frames spaced Ss apart
7
Review of SOLA
[Figure: frames of x[n] start at 0, Sa, 2Sa, 3Sa; frames of y[n] start at 0, Ss, 2Ss, 3Ss; the search range runs from k_min to k_max]
Ss = Sa × α
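The relation between Sa, Ss and α fixes where each frame nominally starts in the input and in the output. A small sketch, assuming integer sample indices and rounding Ss to whole samples (the function name `frame_positions` and the rounding are illustrative choices, not taken from the slides):

```python
def frame_positions(num_frames, sa, alpha):
    """Return (analysis_starts, synthesis_starts) for frames 0..num_frames-1.

    sa    : analysis frame spacing Sa, in samples
    alpha : TSM factor, with Ss = Sa * alpha (rounded here by assumption)
    """
    ss = int(round(sa * alpha))
    analysis = [m * sa for m in range(num_frames)]
    synthesis = [m * ss for m in range(num_frames)]
    return analysis, synthesis


analysis, synthesis = frame_positions(4, 100, 0.5)
print(analysis)   # [0, 100, 200, 300]
print(synthesis)  # [0, 50, 100, 150]
```

With α = 0.5 the output frames are spaced half as far apart as the input frames, so the signal is compressed to about half its duration.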
8
Review of SOLA
• The normalized cross-correlation function
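A form of the normalized cross-correlation commonly used in SOLA descriptions is R_m[k] = Σ_j y[m·Ss + k + j] · x[m·Sa + j] / sqrt(Σ_j y[m·Ss + k + j]² · Σ_j x[m·Sa + j]²), with j running over the overlap length. It is given here as an assumption, not as the slide's exact expression. The sketch below evaluates it and searches k_min..k_max for the best offset; the function names and the `length` parameter are hypothetical.

```python
import math


def normalized_xcorr(y, x, y_start, x_start, length):
    """Normalized cross-correlation of y[y_start:] and x[x_start:] over
    up to `length` samples. Returns 0.0 if either segment has no energy."""
    ys = y[y_start:y_start + length]
    xs = x[x_start:x_start + length]
    n = min(len(ys), len(xs))
    num = sum(ys[j] * xs[j] for j in range(n))
    den = math.sqrt(sum(v * v for v in ys[:n]) * sum(v * v for v in xs[:n]))
    return num / den if den > 0.0 else 0.0


def best_offset(y, x, m, sa, ss, k_min, k_max, length):
    """Return (k, r): the offset k in [k_min, k_max] that maximizes the
    correlation between output y near m*ss + k and input frame m of x.
    Offsets that would start before the beginning of y are skipped."""
    best_k, best_r = k_min, -2.0  # -2.0 is below any valid correlation
    for k in range(k_min, k_max + 1):
        pos = m * ss + k
        if pos < 0:
            continue
        r = normalized_xcorr(y, x, pos, m * sa, length)
        if r > best_r:
            best_k, best_r = k, r
    return best_k, best_r
```

For a periodic signal the best offset is the one that brings the two segments into phase, which is why SOLA avoids the pitch discontinuities that plain OLA can introduce.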
9
On improving the intelligibility of synchronized over-lap-and-add (SOLA) at low TSM factor
11
Modified SOLA for small TSM factor
• Use a time-varying TSM factor α(t) rather than a fixed constant α.
  – α(t) should be small when adjacent analysis frames are very similar and large when they are less similar.
• In addition, remove the silent frames, which are characterized by very low frame energy.
• Use the cross-correlation as a check.
12
Modified SOLA for small TSM factor
1. All frames are tested for silence; silent frames are discarded.
2. All non-silent frames are initially assumed to be vowel-like and use a smaller-than-target TSM factor.
3. If the cross-correlation exceeds 0.9 anywhere within the search range, the frame is confirmed to be vowel-like, and the first peak above 0.9 is taken as the optimal position.
13
Modified SOLA for small TSM factor
4. If the cross-correlation does not exceed 0.9 anywhere in the search range, the frame is considered a transient frame and a larger TSM factor is used. The search range is extended to cover the range for … and further searching is done.
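The four steps above can be sketched as a per-frame decision. This is only an illustration under stated assumptions: frame energy is taken as the mean square of the samples, `energy_threshold` is a hypothetical tuning value, and a "peak" is read as a local maximum of the correlation values. Only the 0.9 threshold comes from the slides.

```python
def first_peak_above(corr, threshold):
    """Index of the first local maximum of `corr` whose value exceeds
    `threshold`, or None if no value exceeds it."""
    n = len(corr)
    for i, r in enumerate(corr):
        if r <= threshold:
            continue
        left_ok = i == 0 or corr[i - 1] <= r
        right_ok = i == n - 1 or corr[i + 1] <= r
        if left_ok and right_ok:
            return i
    return None


def classify_frame(frame, corr, energy_threshold, corr_threshold=0.9):
    """Classify one frame as 'silent', 'vowel-like' or 'transient'.

    frame : samples of the analysis frame
    corr  : cross-correlation values over the search range, in order
    Returns (label, index into corr of the chosen position or None).
    """
    energy = sum(v * v for v in frame) / max(len(frame), 1)
    if energy < energy_threshold:
        return "silent", None        # step 1: discard
    peak = first_peak_above(corr, corr_threshold)
    if peak is not None:
        return "vowel-like", peak    # step 3: keep the smaller TSM factor
    return "transient", None         # step 4: larger factor, wider search


print(classify_frame([0.5, -0.5] * 5, [0.2, 0.85, 0.93, 0.96, 0.91, 0.97], 1e-4))
# ('vowel-like', 3)
```

A caller would discard "silent" frames, overlap "vowel-like" frames at the returned position, and rerun the search over a wider range with a larger TSM factor for "transient" frames.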
14
Simulation and Results
16
Fast time scale modification using envelope-matching technique (EM-TSM)
17
The envelope matching technique
18
Simulation and Results
19
• The mean square difference
• A smaller mean square difference indicates better quality.
20
Simulation and Results
21
Conclusions
• Using a time-varying time-scale factor rather than a constant one improves intelligibility when the TSM factor is small.
• With the fast technique for measuring signal similarity, a speed-up factor on the order of 10^2 can be obtained with very good speech quality.