Speech Recognition Christian Schulze

Design of a speech recognition system that distinguishes the digits 0 to 9 and the words yes/no
Applications:
- speech input of telephone numbers for cellular phones (necessary in cars)
- announcement of the desired floor in an elevator

Problem
- Storing all patterns requires too much memory
- An algorithm that compares each spoken word with all stored patterns requires a lot of computing power => too expensive
- Instead of storing the whole signal, store representative features of the signal => one possibility: formants
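To give a rough sense of the saving, the short calculation below compares the number of raw samples in one word with the 98-element feature vector used later in the slides. The 8 kHz sampling rate and the 500 ms word duration are assumptions made for illustration; the slides do not state the sampling rate.

```python
# Rough storage comparison: raw samples versus feature vector, per word.
SAMPLE_RATE_HZ = 8000     # assumption: telephone-quality sampling, not given in the slides
WORD_DURATION_S = 0.5     # assumption: one word occupies about 500 ms
FEATURES_PER_WORD = 98    # size of the input vector named in the slides

raw_values = int(SAMPLE_RATE_HZ * WORD_DURATION_S)
reduction = raw_values / FEATURES_PER_WORD

print(f"raw samples per word:    {raw_values}")
print(f"feature values per word: {FEATURES_PER_WORD}")
print(f"reduction factor:        {reduction:.1f}x")
```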

What are formants?
- Speech consists of different tones which are combined with each other
- Every tone has a characteristic spectrum in the frequency domain
- The maxima of the contour of the spectrum are called formants
- Every tone has its own representative formants (especially vowels)

Data collection
- Recording of 50 analog samples per word
- Division of the signal into parts of 10 ms length
- Calculation of the spectrum using the Discrete Fourier Transform
- Storage of the first two maxima => 2-formant recognition system
- Smoothing of the spectrum using the cepstral algorithm
- (98 x 1) vector used as input vector for training an MLP network
- Assignment of the signal to 1 of 12 classes
[Figure: digit 8 (500 ms)]
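The feature extraction described above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: it assumes the cepstral smoothing is applied before the maxima are picked, uses a naive DFT from the standard library, and the function names, the 8 kHz sampling rate in the usage note, and the number of retained cepstral coefficients (`n_keep`) are all hypothetical choices.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform, O(n^2); adequate for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def split_frames(signal, sample_rate, frame_ms=10):
    """Cut the signal into non-overlapping frames of frame_ms milliseconds."""
    size = int(sample_rate * frame_ms / 1000)
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def smoothed_log_spectrum(frame, n_keep=8):
    """Cepstral smoothing: keep only the low-quefrency cepstral coefficients.

    n_keep is an illustrative value, not taken from the slides.
    """
    n = len(frame)
    log_mag = [math.log(abs(c) + 1e-12) for c in dft(frame)]
    cepstrum = [c.real for c in idft(log_mag)]
    # Keep the coefficients near index 0 and their mirror images, zero the rest.
    liftered = [cepstrum[i] if (i < n_keep or i > n - n_keep) else 0.0
                for i in range(n)]
    return [c.real for c in dft(liftered)]

def first_two_peaks(envelope, sample_rate):
    """Frequencies (Hz) of the first two local maxima below the Nyquist bin."""
    n = len(envelope)
    peaks = [k for k in range(1, n // 2)
             if envelope[k - 1] < envelope[k] >= envelope[k + 1]]
    peaks = (peaks + [0, 0])[:2]  # pad with 0 Hz if fewer than two maxima exist
    return [k * sample_rate / n for k in peaks]

def extract_features(signal, sample_rate):
    """Concatenate the two peak frequencies of every 10 ms frame."""
    features = []
    for frame in split_frames(signal, sample_rate):
        features.extend(first_two_peaks(smoothed_log_spectrum(frame), sample_rate))
    return features
```

Calling `extract_features(samples, 8000)` on a list of samples yields two values per 10 ms frame. The slides do not say how exactly 98 values arise; 49 frames with two maxima each would match that size, but this is only a guess.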

Network and results
- MLP using the backpropagation algorithm
- 3 hidden layers, each with 12 hidden neurons
- Learning rate = 0.01, momentum = 0.1, epochs: [not given]
- Best solution so far: learning success rate = 86.11%, testing success rate = 61.67% => has to be improved upon
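A network of this shape can be sketched in plain Python as shown below. Only the layer sizes, the learning rate, and the momentum come from the slides; the sigmoid activation, squared-error loss, weight initialisation range, per-sample updates, and the names `MLP` and `one_hot` are assumptions for illustration. Inputs are assumed to be scaled to a small range such as [0, 1] before training.

```python
import math
import random

def sigmoid(z):
    """Logistic activation, clamped to avoid overflow in math.exp."""
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def one_hot(label, n_classes=12):
    """Target vector for one of the 12 classes (0-9, yes, no)."""
    return [1.0 if k == label else 0.0 for k in range(n_classes)]

class MLP:
    def __init__(self, sizes, lr=0.01, momentum=0.1, seed=0):
        rng = random.Random(seed)
        self.lr, self.momentum = lr, momentum
        # w[l][j][i]: weight from unit i of layer l to unit j of layer l+1;
        # the last entry of each row is the bias weight.
        self.w = [[[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_out)]
                  for n_in, n_out in zip(sizes[:-1], sizes[1:])]
        # Previous weight changes, needed for the momentum term.
        self.v = [[[0.0] * len(row) for row in layer] for layer in self.w]

    def forward(self, x):
        """Return the activations of every layer, input layer included."""
        acts = [list(x)]
        for layer in self.w:
            prev = acts[-1] + [1.0]  # append constant bias input
            acts.append([sigmoid(sum(w * a for w, a in zip(row, prev)))
                         for row in layer])
        return acts

    def train_step(self, x, target):
        """One backpropagation update; returns the squared error before it."""
        acts = self.forward(x)
        out = acts[-1]
        delta = [(o - t) * o * (1 - o) for o, t in zip(out, target)]
        for l in range(len(self.w) - 1, -1, -1):
            prev = acts[l] + [1.0]
            if l > 0:
                # Propagate the error backwards before the weights change.
                new_delta = [sum(self.w[l][j][i] * delta[j]
                                 for j in range(len(delta)))
                             * acts[l][i] * (1 - acts[l][i])
                             for i in range(len(acts[l]))]
            for j, row in enumerate(self.w[l]):
                for i in range(len(row)):
                    self.v[l][j][i] = (self.momentum * self.v[l][j][i]
                                       - self.lr * delta[j] * prev[i])
                    row[i] += self.v[l][j][i]
            if l > 0:
                delta = new_delta
        return sum((o - t) ** 2 for o, t in zip(out, target)) / 2

    def predict(self, x):
        """Index of the output neuron with the largest activation."""
        out = self.forward(x)[-1]
        return out.index(max(out))
```

With the sizes from the slides, the network would be built as `MLP([98, 12, 12, 12, 12])` and trained by calling `train_step(x, one_hot(label))` for every recorded sample in each epoch.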
