1
PATTERN COMPARISON TECHNIQUES
Test Pattern: Reference Pattern:
2
4.2 SPEECH (ENDPOINT) DETECTION
3
4.3 DISTORTION MEASURES-MATHEMATICAL CONSIDERATIONS
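x and y: two feature vectors defined on a vector space X.
The properties of a metric or distance function d:
(a) 0 ≤ d(x, y) < ∞ for all x, y in X, and d(x, y) = 0 if and only if x = y
(b) d(x, y) = d(y, x) (symmetry)
(c) d(x, y) ≤ d(x, z) + d(z, y) (triangle inequality)
A distance function is called invariant if d(x + z, y + z) = d(x, y).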
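The metric properties (non-negativity, symmetry, triangle inequality) and translation invariance can be spot-checked numerically. The sketch below assumes the Euclidean distance as the candidate d; the helper name `check_metric_properties` is hypothetical, and a check on one triple of vectors is only a sanity test, not a proof.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def check_metric_properties(d, x, y, z, tol=1e-12):
    """Spot-check the metric axioms and translation invariance of d on one triple."""
    def shift(u):
        # Translate u by the vector z.
        return [a + b for a, b in zip(u, z)]
    return {
        "nonnegative": d(x, y) >= 0.0,
        "zero_on_identical": d(x, x) <= tol,
        "symmetric": abs(d(x, y) - d(y, x)) <= tol,
        "triangle": d(x, y) <= d(x, z) + d(z, y) + tol,
        "invariant": abs(d(shift(x), shift(y)) - d(x, y)) <= tol,
    }

result = check_metric_properties(euclidean, [1.0, 2.0], [4.0, 6.0], [0.5, -1.0])
```

Any other candidate distortion measure can be passed in place of `euclidean`; many of the spectral distortions later in the deck fail the symmetry or triangle checks.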
4
PERCEPTUAL CONSIDERATIONS
Spectral changes that do not fundamentally change the perceived sound include:
5
PERCEPTUAL CONSIDERATIONS
Spectral changes that lead to phonetically different sounds include:
6
PERCEPTUAL CONSIDERATIONS
Just-discriminable change: known as JND (just-noticeable difference), DL (difference limen), or differential threshold
7
4.4 DISTORTION MEASURES-PERCEPTUAL CONSIDERATIONS
8
4.4 DISTORTION MEASURES-PERCEPTUAL CONSIDERATIONS
9
Spectral Distortion Measures
Spectral Density
Fourier Coefficients of Spectral Density
Autocorrelation Function
10
Spectral Distortion Measures
Short-term autocorrelation:
Then its Fourier transform is an energy spectral density.
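The relation between the short-term autocorrelation and the energy spectral density can be illustrated numerically. This is a minimal sketch assuming a finite frame x(0..N-1) with no window applied; the function names are hypothetical. It evaluates the spectrum once from the autocorrelation lags and once directly as |X(e^{jω})|².

```python
import cmath
import math

def short_term_autocorrelation(x, max_lag):
    """r(i) = sum_k x(k) x(k+i) over the frame, for i = 0..max_lag."""
    n = len(x)
    return [sum(x[k] * x[k + i] for k in range(n - i)) for i in range(max_lag + 1)]

def spectrum_from_autocorrelation(r, omega):
    """S(w) = r(0) + 2 * sum_{i>=1} r(i) cos(i w), using r(-i) = r(i)."""
    return r[0] + 2.0 * sum(r[i] * math.cos(i * omega) for i in range(1, len(r)))

def spectrum_direct(x, omega):
    """|X(e^{jw})|^2 computed from the frame samples."""
    big_x = sum(v * cmath.exp(-1j * omega * k) for k, v in enumerate(x))
    return abs(big_x) ** 2

frame = [1.0, -0.5, 0.25, 2.0]
r = short_term_autocorrelation(frame, len(frame) - 1)
```

The two evaluations agree only when all N-1 lags are kept; truncating the lags smooths the spectrum.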
11
Spectral Distortion Measures
Autocorrelation matrices
12
Spectral Distortion Measures
If σ/A(z) is the all-pole model for the speech spectrum, the residual energy resulting from “inverse filtering” the input signal with an all-zero filter A(z) is:
13
Spectral Distortion Measures
Important properties of all-pole modeling:
The recursive minimization relationship:
14
LOG SPECTRAL DISTANCE
15
LOG SPECTRAL DISTANCE
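The log spectral distance is commonly defined as the L_p norm of the difference between two log spectra. The sketch below assumes both spectra are sampled on the same uniform frequency grid and are strictly positive, so the integral over frequency becomes a mean; the function name is hypothetical.

```python
import math

def log_spectral_distance(s, s_ref, p=2):
    """L_p distance between log spectra sampled on a common uniform grid."""
    diffs = [abs(math.log(a) - math.log(b)) for a, b in zip(s, s_ref)]
    return (sum(v ** p for v in diffs) / len(diffs)) ** (1.0 / p)

s = [1.0, 2.0, 4.0]
scaled = [math.e * v for v in s]
d_same = log_spectral_distance(s, s)
d_gain = log_spectral_distance(s, scaled, p=2)
```

With p = 2 this is the rms log spectral distance. A pure gain change by a factor of e gives a distance of 1 for any p.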
16
CEPSTRAL DISTANCES
The complex cepstrum of a signal is defined as the Fourier transform of the log of the signal spectrum.
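As an illustration of the definition, the sketch below computes cepstral coefficients of a short real frame as the inverse DFT of its log power spectrum. It assumes a naive O(N²) DFT, no windowing, and a spectrum with no exact zeros (log of zero is undefined); the function names are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def real_cepstrum(frame):
    """Inverse DFT of the log power spectrum of a real frame."""
    n = len(frame)
    log_power = [math.log(abs(v) ** 2) for v in dft(frame)]
    # The log power spectrum of a real frame is real and even,
    # so the inverse DFT reduces to a cosine sum.
    return [sum(log_power[m] * math.cos(2 * math.pi * m * q / n) for m in range(n)) / n
            for q in range(n)]

c = real_cepstrum([2.0, 0.0, 0.0, 0.0])
```

A scaled impulse has a flat spectrum, so only c[0] is non-zero and it carries the log gain; this is why c0 is usually left out of cepstral distances.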
17
CEPSTRAL DISTANCES
Truncated cepstral distance
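A truncated cepstral distance keeps only the first L cepstral coefficients and leaves out the gain term c0. A minimal sketch, assuming each cepstrum is a list indexed from 0 so that `c[n]` is c_n; the function name is hypothetical.

```python
def truncated_cepstral_distance(c, c_ref, L):
    """Squared distance over cepstral indices 1..L; c[0] (gain) is excluded."""
    return sum((c[n] - c_ref[n]) ** 2 for n in range(1, L + 1))

c_test = [5.0, 1.0, 0.5, 0.25]
c_reference = [0.0, 0.0, 0.5, 0.0]
d2 = truncated_cepstral_distance(c_test, c_reference, 2)
d3 = truncated_cepstral_distance(c_test, c_reference, 3)
```

The large difference in c[0] has no effect on either value, and raising L from 2 to 3 adds only the squared difference of the third coefficient.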
18
CEPSTRAL DISTANCES
19
CEPSTRAL DISTANCES
20
Weighted Cepstral Distances and Liftering
It can be shown that, under certain regular conditions, the cepstral coefficients, except c0, have:
Zero means
Variances essentially inversely proportional to the square of the coefficient index:
If we normalize the cepstral distance by the inverse variance:
21
Weighted Cepstral Distances and Liftering
Differentiating both sides of the Fourier series equation of the spectrum:
This is an L2 distance based upon the differences between the spectral slopes.
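Since the variance of c_n falls off roughly as 1/n², normalizing by the inverse variance amounts to weighting each coefficient difference by its index n, which is also what differentiating the log spectrum with respect to frequency produces. A minimal sketch of this index-weighted distance, assuming truncation at L terms; the function name is hypothetical.

```python
def index_weighted_cepstral_distance(c, c_ref, L):
    """Sum over n = 1..L of (n * (c_n - c'_n))^2; c[0] is excluded."""
    return sum((n * (c[n] - c_ref[n])) ** 2 for n in range(1, L + 1))

d = index_weighted_cepstral_distance([0.0, 1.0, 1.0], [0.0, 0.0, 0.0], 2)
```

Equal differences in c_1 and c_2 contribute 1 and 4 respectively, so higher-order coefficients are emphasized.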
22
Cepstral Weighting or Liftering Procedure
h is usually chosen as L/2 and L is typically 10 to 16
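The liftering window is not shown in this transcript. The sketch below assumes the commonly used raised-sine lifter w(m) = 1 + h sin(mπ/L) for m = 1..L, with h defaulting to L/2 as stated above; the function names are hypothetical.

```python
import math

def raised_sine_lifter(L, h=None):
    """Weights w(1)..w(L) of an assumed raised-sine lifter, h defaulting to L/2."""
    if h is None:
        h = L / 2.0
    return [1.0 + h * math.sin(m * math.pi / L) for m in range(1, L + 1)]

def liftered_cepstral_distance(c, c_ref, L, h=None):
    """Squared distance between liftered cepstra over indices 1..L."""
    w = raised_sine_lifter(L, h)
    return sum((w[m - 1] * (c[m] - c_ref[m])) ** 2 for m in range(1, L + 1))

w = raised_sine_lifter(12)
```

For L = 12 the weight peaks at m = 6 and returns to about 1 at m = 12, de-emphasizing both the lowest and the highest retained coefficients.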
23
Weighted Cepstral Distance
A useful form of weighted cepstral distance:
24
Likelihood Distortions
Previously defined: the Itakura-Saito distortion measure between two spectra S(ω) and S′(ω),
where σ∞² and σ′∞² are the one-step prediction errors of S(ω) and S′(ω), as defined:
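The Itakura-Saito distortion can be evaluated numerically on sampled spectra. The sketch below assumes the standard form (mean of S/S′, minus the log ratio of the one-step prediction errors, minus 1), with each one-step prediction error taken as the exponential of the mean log spectrum; both spectra are assumed strictly positive and sampled on the same uniform grid. Function names are hypothetical.

```python
import math

def one_step_prediction_error(s):
    """exp of the mean log spectrum (geometric mean of the samples)."""
    return math.exp(sum(math.log(v) for v in s) / len(s))

def itakura_saito(s, s_ref):
    """Discrete Itakura-Saito distortion between two sampled spectra."""
    ratio_mean = sum(a / b for a, b in zip(s, s_ref)) / len(s)
    log_term = math.log(one_step_prediction_error(s) / one_step_prediction_error(s_ref))
    return ratio_mean - log_term - 1.0

d_ab = itakura_saito([2.0, 2.0], [1.0, 1.0])
d_ba = itakura_saito([1.0, 1.0], [2.0, 2.0])
```

The two values differ, which shows that the measure is not symmetric and is sensitive to gain.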
26
Likelihood Distortions
The residual energy can be easily evaluated by:
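The exact expression on this slide is not in the transcript. A sketch under the usual assumption: the residual energy is the quadratic form aᵀRa, where R is the Toeplitz autocorrelation matrix of the signal and a = [1, a1, ..., ap] are the inverse-filter coefficients. Because R is Toeplitz, the same value follows from the autocorrelation of the coefficients, which costs O(p) multiplications per evaluation once that autocorrelation is stored. Function names are hypothetical.

```python
def residual_energy_quadratic(a, r):
    """a^T R a with R[i][j] = r(|i - j|); direct O(p^2) evaluation."""
    p = len(a) - 1
    return sum(a[i] * a[j] * r[abs(i - j)] for i in range(p + 1) for j in range(p + 1))

def residual_energy_fast(a, r):
    """r(0) ra(0) + 2 * sum_{i=1..p} r(i) ra(i), ra = autocorrelation of a."""
    p = len(a) - 1
    ra = [sum(a[k] * a[k + i] for k in range(p + 1 - i)) for i in range(p + 1)]
    return r[0] * ra[0] + 2.0 * sum(r[i] * ra[i] for i in range(1, p + 1))

a = [1.0, -0.9]   # first-order inverse filter A(z) = 1 - 0.9 z^-1
r = [1.0, 0.9]    # signal autocorrelation lags r(0), r(1)
alpha_direct = residual_energy_quadratic(a, r)
alpha_fast = residual_energy_fast(a, r)
```

Both routes give the same residual energy; in a recognizer the coefficient autocorrelation of each reference pattern would typically be precomputed.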
27
Likelihood Distortions
By replacing the signal spectrum with its optimal p-th order LPC model spectrum:
If we set σ² to match the residual energy α:
which is often referred to as the Itakura distortion measure.
28
Likelihood Distortions
Another way to write the Itakura distortion measure is:
Another gain-independent distortion measure is called the Likelihood Ratio distortion:
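Both gain-independent measures can be written in terms of the ratio of two residual energies. The sketch below assumes the usual forms: Itakura distortion = log(aᵀR a / a_optᵀR a_opt) and likelihood ratio distortion = aᵀR a / a_optᵀR a_opt − 1, where a_opt is the optimal predictor for the autocorrelation R and a is the predictor being compared. Function and variable names are hypothetical.

```python
import math

def quad_form(a, r):
    """a^T R a for the Toeplitz matrix R[i][j] = r(|i - j|)."""
    n = len(a)
    return sum(a[i] * a[j] * r[abs(i - j)] for i in range(n) for j in range(n))

def itakura_distortion(a, a_opt, r):
    """Log of the residual-energy ratio."""
    return math.log(quad_form(a, r) / quad_form(a_opt, r))

def likelihood_ratio_distortion(a, a_opt, r):
    """Residual-energy ratio minus one."""
    return quad_form(a, r) / quad_form(a_opt, r) - 1.0

r = [1.0, 0.9]
a_opt = [1.0, -0.9]   # optimal first-order predictor for r: a1 = -r(1)/r(0)
a = [1.0, -0.8]       # a mismatched predictor
d_i = itakura_distortion(a, a_opt, r)
d_lr = likelihood_ratio_distortion(a, a_opt, r)
```

Because log(1 + x) ≈ x for small x, the two values are close when the predictors are similar, and the likelihood ratio value is never smaller than the Itakura value.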
29
4.5.4 Likelihood Distortions
30
4.5.4 Likelihood Distortions
That is, when the distortion is small, the Itakura distortion measure is not very different from the LR distortion measure
31
4.5.4 Likelihood Distortions
32
4.5.4 Likelihood Distortions
Consider the Itakura-Saito distortion between the input and output of a linear system H(z)
33
4.5.4 Likelihood Distortions
34
4.5.4 Likelihood Distortions
35
4.5.5 Variations of Likelihood Distortions
Symmetric distortion measures:
36
4.5.5 Variations of Likelihood Distortions
COSH distortion
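The COSH distortion is the symmetrized Itakura-Saito distortion: averaging the measure in both directions cancels the log terms and leaves the mean of cosh(log(S/S′)) − 1. The sketch below checks this identity on sampled spectra, assuming a common uniform frequency grid and strictly positive values; function names are hypothetical.

```python
import math

def itakura_saito(s, s_ref):
    """Discrete Itakura-Saito distortion: mean of r - log(r) - 1 with r = S/S'."""
    total = 0.0
    for a, b in zip(s, s_ref):
        ratio = a / b
        total += ratio - math.log(ratio) - 1.0
    return total / len(s)

def cosh_distortion(s, s_ref):
    """Mean of cosh(log(S/S')) - 1 over the frequency samples."""
    return sum(math.cosh(math.log(a / b)) - 1.0 for a, b in zip(s, s_ref)) / len(s)

s = [1.0, 2.0, 4.0]
t = [2.0, 2.0, 1.0]
d_cosh = cosh_distortion(s, t)
d_sym = 0.5 * (itakura_saito(s, t) + itakura_saito(t, s))
```

The result is the same in either argument order, unlike the Itakura-Saito measure itself.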
37
4.5.5 Variations of Likelihood Distortions
38
4.5.6 Spectral Distortion Using a Warped Frequency Scale
Psychophysical studies have shown that human perception of the frequency content of sounds does not follow a linear scale. This research has led to the idea of defining the subjective pitch of pure tones. For each tone with an actual frequency, f, measured in Hz, a subjective pitch is measured on a scale called the “mel” scale. As a reference point, the pitch of a 1 kHz tone, 40 dB above the perceptual hearing threshold, is defined as 1000 mels.
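The slide defines the mel scale by its reference point but gives no formula. One widely used analytic approximation, assumed here and not taken from the slides, is mel = 2595 log10(1 + f/700); it reproduces the 1 kHz ≈ 1000 mel anchor closely. Function names are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """Assumed approximation: mel = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of the approximation above."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

mel_at_1khz = hz_to_mel(1000.0)
```

The mapping is close to linear below 1 kHz and close to logarithmic above it.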
40
4.5.6 Spectral Distortion Using a Warped Frequency Scale
41
4.5.6 Spectral Distortion Using a Warped Frequency Scale
42
4.5.6 Spectral Distortion Using a Warped Frequency Scale
43
Examples of Critical bandwidth
44
Warped cepstral distance
b is the frequency in Barks, S(θ(b)) is the spectrum on a Bark scale, and B is the Nyquist frequency in Barks.
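To work with frequencies in Barks, a Hz-to-Bark mapping is needed; the one used in the slides is not in this transcript. The sketch below assumes the common Zwicker-Terhardt approximation b = 13 arctan(0.76 f) + 3.5 arctan((f/7.5)²) with f in kHz. Function names are hypothetical.

```python
import math

def hz_to_bark(f_hz):
    """Assumed Zwicker-Terhardt approximation of the Bark scale (input in Hz)."""
    f_khz = f_hz / 1000.0
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)

def nyquist_in_barks(sample_rate_hz):
    """B: the Nyquist frequency (half the sampling rate) expressed in Barks."""
    return hz_to_bark(sample_rate_hz / 2.0)

bark_at_1khz = hz_to_bark(1000.0)
big_b = nyquist_in_barks(8000.0)
```

The upper integration limit B in the warped cepstral distance would then be `nyquist_in_barks` of the sampling rate in use.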
45
4.5.6 Spectral Distortion Using a Warped Frequency Scale
Where the warping function is defined by
46
4.5.6 Spectral Distortion Using a Warped Frequency Scale
47
4.5.6 Spectral Distortion Using a Warped Frequency Scale
48
4.5.6 Spectral Distortion Using a Warped Frequency Scale
49
4.5.6 Spectral Distortion Using a Warped Frequency Scale
Mel-frequency cepstrum:
where S̃k is the output power of the k-th triangular filter
Mel-frequency cepstral distance
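The mel-frequency cepstrum is obtained from the log output powers of the triangular filters by a cosine transform. The sketch below assumes the usual form c_n = Σ_{k=1..K} log(S_k) cos(n (k − 1/2) π / K) for n = 1..L and takes the filter output powers as given (the filter bank itself is not implemented). Function names are hypothetical.

```python
import math

def mel_cepstrum(filter_powers, num_coeffs):
    """Cosine transform of the log filter-bank powers, for n = 1..num_coeffs."""
    k_total = len(filter_powers)
    logs = [math.log(p) for p in filter_powers]
    return [sum(logs[k - 1] * math.cos(n * (k - 0.5) * math.pi / k_total)
                for k in range(1, k_total + 1))
            for n in range(1, num_coeffs + 1)]

def mel_cepstral_distance(powers, powers_ref, num_coeffs):
    """Squared Euclidean distance between two mel-frequency cepstra."""
    c = mel_cepstrum(powers, num_coeffs)
    c_ref = mel_cepstrum(powers_ref, num_coeffs)
    return sum((a - b) ** 2 for a, b in zip(c, c_ref))

flat = mel_cepstrum([math.e] * 8, 4)
gain_only = mel_cepstral_distance([math.e] * 8, [1.0] * 8, 4)
```

A flat filter-bank output gives coefficients that are zero up to rounding, and a pure gain difference leaves the distance at zero because the n = 0 term is not included.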
50
4.5.7 Alternative Spectral Representations and Distortion Measures
51
4.5.7 Alternative Spectral Representations and Distortion Measures
52
4.5.7 Alternative Spectral Representations and Distortion Measures
53
Summary of Spectral Distortion Measures
Columns: Distortion Measure, Notation, Expression, Computation
54
Summary of Spectral Distortion Measures
Columns: Distortion Measure, Notation, Expression, Computation
55
Summary of Spectral Distortion Measures
Columns: Distortion Measure, Notation, Expression, Computation
56
4.6 INCORPORATION OF SPECTRAL DYNAMIC FEATURES INTO THE DISTORTION MEASURE
57
4.6 INCORPORATION OF SPECTRAL DYNAMIC FEATURES INTO THE DISTORTION MEASURE
Fitting the cepstral trajectory by a second-order polynomial:
Choose h1, h2, h3 such that E is minimized.
Differentiating E with respect to h1, h2, and h3 and setting to zero results in 3 equations:
58
4.6 INCORPORATION OF SPECTRAL DYNAMIC FEATURES INTO THE DISTORTION MEASURE
The solutions to these equations are:
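The closed-form solutions are not in this transcript, so the sketch below derives them from the least-squares normal equations under stated assumptions: the trajectory of one cepstral coefficient is sampled at frames t = −M..M and fitted by h1 + h2 t + h3 t². The symmetric window makes the odd moments vanish, which decouples h2 from h1 and h3. The function name is hypothetical.

```python
def fit_quadratic_trajectory(c):
    """Least-squares fit of h1 + h2*t + h3*t^2 to c(t), t = -M..M (odd length)."""
    m = len(c) // 2
    ts = list(range(-m, m + 1))
    n = len(ts)                              # number of frames, 2M + 1
    t2 = sum(t ** 2 for t in ts)             # sum of t^2
    t4 = sum(t ** 4 for t in ts)             # sum of t^4
    s0 = sum(c)                              # sum of c(t)
    s1 = sum(t * v for t, v in zip(ts, c))   # sum of t c(t)
    s2 = sum(t * t * v for t, v in zip(ts, c))  # sum of t^2 c(t)
    h2 = s1 / t2
    h3 = (t2 * s0 - n * s2) / (t2 * t2 - n * t4)
    h1 = (s0 - h3 * t2) / n
    return h1, h2, h3

trajectory = [2.0 + 3.0 * t + 0.5 * t * t for t in range(-2, 3)]
h1, h2, h3 = fit_quadratic_trajectory(trajectory)
```

Here h2 estimates the first time derivative of the coefficient at the window centre (the delta cepstrum), and 2·h3 estimates the second derivative.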
59
4.6 INCORPORATION OF SPECTRAL DYNAMIC FEATURES INTO THE DISTORTION MEASURE
60
4.6 INCORPORATION OF SPECTRAL DYNAMIC FEATURES INTO THE DISTORTION MEASURE
61
4.6 INCORPORATION OF SPECTRAL DYNAMIC FEATURES INTO THE DISTORTION MEASURE
A differential spectral distance:
A second differential spectral distance:
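A differential spectral distance compares the time derivatives of two cepstral trajectories rather than the cepstra themselves. The sketch below assumes the derivative of each coefficient is estimated as the least-squares slope over 2M+1 consecutive frames and that the distance is the squared Euclidean distance between the resulting delta vectors; function names and the data layout (a list of frames, each a list of coefficients) are hypothetical.

```python
def delta_cepstrum(frames):
    """Regression slope of each coefficient over 2M+1 frames centred at t = 0."""
    m = len(frames) // 2
    denom = sum(t * t for t in range(-m, m + 1))
    num_coeffs = len(frames[0])
    return [sum(t * frames[t + m][n] for t in range(-m, m + 1)) / denom
            for n in range(num_coeffs)]

def differential_cepstral_distance(frames, frames_ref):
    """Squared Euclidean distance between two delta-cepstrum vectors."""
    d = delta_cepstrum(frames)
    d_ref = delta_cepstrum(frames_ref)
    return sum((a - b) ** 2 for a, b in zip(d, d_ref))

rising = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]   # slopes 1 and 2 per frame
steady = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]   # no change over time
dist = differential_cepstral_distance(rising, steady)
```

A second differential distance can be built the same way from the curvature of the fitted trajectory instead of the slope.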
62
4.6 INCORPORATION OF SPECTRAL DYNAMIC FEATURES INTO THE DISTORTION MEASURE
Cepstral weighting or liftering by differentiating
63
4.6 INCORPORATION OF SPECTRAL DYNAMIC FEATURES INTO THE DISTORTION MEASURE
A weighted differential cepstral distance:
64
4.6 INCORPORATION OF SPECTRAL DYNAMIC FEATURES INTO THE DISTORTION MEASURE
Taking the L2 distance as an example: