1
Speech Recognition
Christian Schulze
Design of a speech recognition system that distinguishes the digits 0 to 9 and the words yes/no
Applications:
- speech input of telephone numbers for cellular phones (necessary in cars)
- announcing the desired floor in an elevator
2
Problem
- Storing all patterns requires too much memory
- An algorithm that compares each spoken word with all stored patterns requires a lot of computing power
=> too expensive
- Instead of storing the whole signal, store representative features of the signal
=> One possibility: formants
3
What are formants?
- Speech consists of different tones that are combined with each other
- Every tone has a characteristic spectrum in the frequency domain
- The maxima of the contour (envelope) of the spectrum are called formants
- Every tone has its own representative formants (especially vowels)
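The idea of reading formants off the spectrum can be sketched in a few lines. This is only a toy illustration, not the system from the slides: two pure tones at 700 Hz and 1200 Hz stand in for the formant peaks of a vowel, and the 8 kHz sampling rate, the 10% peak threshold and all function names are assumptions. Real speech would need a smoothed spectral contour before peak picking.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz; assumed, the slides give no sampling rate
FRAME_LEN = 80      # 10 ms at 8 kHz, so one DFT bin is 100 Hz wide


def dft_magnitude(frame):
    """Magnitude spectrum (bins 0 .. n/2) of a real frame via a direct DFT."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def first_maxima(contour, count=2, rel_threshold=0.1):
    """Bins of the first `count` local maxima above rel_threshold * max(contour)."""
    floor = rel_threshold * max(contour)
    peaks = [
        k for k in range(1, len(contour) - 1)
        if contour[k] > floor and contour[k - 1] < contour[k] >= contour[k + 1]
    ]
    return peaks[:count]


# Toy "vowel": two tones standing in for the first two formants (hypothetical values).
frame = [
    math.sin(2 * math.pi * 700 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 1200 * t / SAMPLE_RATE)
    for t in range(FRAME_LEN)
]

formant_bins = first_maxima(dft_magnitude(frame))
formant_hz = [k * SAMPLE_RATE / FRAME_LEN for k in formant_bins]
print(formant_hz)  # [700.0, 1200.0]
```

Only two numbers per frame remain, which is the storage saving the formant approach is after.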
5
Data collection
- Recording of 50 analog samples per word
- Division of the signal into parts of 10 ms length
- Calculation of the spectrum using the Discrete Fourier Transform
[Figure: the digit 8 (500 ms)]
- Storage of the first two maxima => 2-formant recognition system
- Smoothing of the spectrum using a cepstral algorithm
- (98 x 1) vector used as input vector for training an MLP network
- Assignment of the signal to 1 of 12 classes
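The steps above could be sketched roughly as follows. This is a hedged illustration, not the author's implementation: the 8 kHz sampling rate, non-overlapping frames, the liftering cutoff `keep`, the choice to smooth before picking maxima, and every function name are assumptions. With these assumptions a 500 ms signal yields 50 frames and 100 values; 49 frames would give the 98 values of the (98 x 1) vector, but the slides do not say how that number arises.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz; assumed, not stated in the slides
FRAME_MS = 10       # frame length taken from the slides


def dft(values, inverse=False):
    """Direct O(n^2) discrete Fourier transform, or its inverse."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def split_frames(signal, frame_len):
    """Cut the signal into consecutive, non-overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def cepstral_envelope(frame, keep=10):
    """Smoothed log-magnitude spectrum (bins 0 .. n/2) via low-quefrency liftering."""
    n = len(frame)
    log_mag = [math.log(abs(c) + 1e-9) for c in dft(frame)]
    cepstrum = [c.real for c in dft(log_mag, inverse=True)]
    # Keep only the low quefrencies (and their mirror image) to smooth the spectrum.
    liftered = [c if (q < keep or q > n - keep) else 0.0
                for q, c in enumerate(cepstrum)]
    return [c.real for c in dft(liftered)[: n // 2 + 1]]


def first_two_maxima_hz(envelope, frame_len):
    """Frequencies of the first two local maxima; padded with 0.0 if fewer exist."""
    peaks = [k for k in range(1, len(envelope) - 1)
             if envelope[k - 1] < envelope[k] >= envelope[k + 1]]
    freqs = [k * SAMPLE_RATE / frame_len for k in peaks[:2]]
    return freqs + [0.0] * (2 - len(freqs))


def feature_vector(signal):
    """Concatenate two formant estimates per 10 ms frame into one flat vector."""
    frame_len = SAMPLE_RATE * FRAME_MS // 1000
    features = []
    for frame in split_frames(signal, frame_len):
        features.extend(first_two_maxima_hz(cepstral_envelope(frame), frame_len))
    return features


# Toy input: 50 ms of a two-tone signal (5 frames), standing in for a recording.
signal = [math.sin(2 * math.pi * 700 * t / SAMPLE_RATE)
          + 0.5 * math.sin(2 * math.pi * 1200 * t / SAMPLE_RATE)
          for t in range(400)]
features = feature_vector(signal)
print(len(features))  # two values per frame
```

The direct DFT is slow but keeps the sketch dependency-free; an FFT routine would be the usual choice in practice.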
6
Network and results
- MLP trained with the backpropagation algorithm
- 3 hidden layers, each with 12 hidden neurons
- Learning rate = 0.01, momentum = 0.1, epochs [number not given]
- Best solution so far: learning success rate = 86.11%, testing success rate = 61.67%
=> has to be improved upon
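A network of this shape can be sketched in plain Python. The layer sizes, learning rate and momentum come from the slides; everything else is an assumption made for illustration: sigmoid units, squared-error loss, per-sample updates, the weight initialisation range, the class order (digits 0 to 9, then yes, no), and random numbers standing in for a normalised 98-value formant vector.

```python
import math
import random

LAYER_SIZES = [98, 12, 12, 12, 12]  # input, 3 hidden layers of 12, 12 output classes
LEARNING_RATE = 0.01
MOMENTUM = 0.1


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(sizes, rng):
    """weights[l][j] holds the incoming weights of neuron j; the last entry is the bias."""
    return [[[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(weights, x):
    """Return the activations of every layer, input first, output last."""
    activations = [list(x)]
    for layer in weights:
        prev = activations[-1] + [1.0]  # constant 1.0 feeds the bias weight
        activations.append([sigmoid(sum(w * a for w, a in zip(neuron, prev)))
                            for neuron in layer])
    return activations


def loss(weights, x, target):
    output = forward(weights, x)[-1]
    return 0.5 * sum((o - t) ** 2 for o, t in zip(output, target))


def train_step(weights, velocity, x, target):
    """One backpropagation update with momentum for a single training sample."""
    activations = forward(weights, x)
    deltas = [(o - t) * o * (1.0 - o) for o, t in zip(activations[-1], target)]
    for idx in range(len(weights) - 1, -1, -1):
        prev = activations[idx] + [1.0]
        below = None
        if idx > 0:
            # Propagate the error backwards using the weights before they change.
            below = [sum(weights[idx][j][i] * deltas[j] for j in range(len(deltas)))
                     * a * (1.0 - a)
                     for i, a in enumerate(activations[idx])]
        for j, delta in enumerate(deltas):
            for i, a in enumerate(prev):
                velocity[idx][j][i] = (MOMENTUM * velocity[idx][j][i]
                                       - LEARNING_RATE * delta * a)
                weights[idx][j][i] += velocity[idx][j][i]
        deltas = below


rng = random.Random(0)
weights = init_network(LAYER_SIZES, rng)
velocity = [[[0.0] * len(neuron) for neuron in layer] for layer in weights]

x = [rng.random() for _ in range(98)]  # stand-in for a normalised feature vector
target = [0.0] * 12
target[8] = 1.0                        # hypothetical one-hot label for the digit 8

loss_before = loss(weights, x, target)
train_step(weights, velocity, x, target)
loss_after = loss(weights, x, target)
print(loss_before, loss_after)
```

Training would repeat `train_step` over all recordings for a number of epochs and report the share of correctly classified words separately on training and test data, which is where the gap between the two success rates shows up.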