Speech Recognition Christian Schulze

Speech Recognition Christian Schulze
Design of a speech recognition system that distinguishes the digits 0 to 9 and the words yes/no.
Applications:
- speech input of telephone numbers for cellular phones (necessary in cars)
- announcement of the different floors in an elevator

Problem
Storing all raw patterns requires too much memory.
An algorithm that compares each incoming word with all stored patterns requires too much computing power.
=> too expensive in both memory and computation
Instead of storing the whole signal, store representative features of the signal.
=> One possibility: formants

What are formants?
Speech consists of different sounds that are combined with each other.
Every sound has a characteristic spectrum in the frequency domain.
The maxima of the contour (envelope) of the spectrum are called formants.
Every sound has its own representative formants (this holds especially for vowels).
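To make this concrete, here is a minimal sketch (not from the original slides) of how the two strongest maxima of a smoothed spectral contour of one short frame could be located in Python. The function name formants_from_frame, the lifter length n_lifter, the Hamming window and the use of NumPy/SciPy are all assumptions for illustration, not the author's implementation.

```python
import numpy as np
from scipy.signal import find_peaks

def formants_from_frame(frame, fs, n_lifter=20):
    """Estimate the two strongest formants of one short frame by
    picking the maxima of a cepstrally smoothed magnitude spectrum
    (illustrative sketch; assumes the frame contains voiced speech)."""
    windowed = frame * np.hamming(len(frame))
    log_spec = np.log(np.abs(np.fft.rfft(windowed)) + 1e-10)
    cepstrum = np.fft.irfft(log_spec)            # real cepstrum of the frame
    cepstrum[n_lifter:-n_lifter] = 0.0           # low-pass lifter: keep only the envelope part
    envelope = np.fft.rfft(cepstrum).real        # smoothed log-magnitude contour
    peaks, _ = find_peaks(envelope)              # local maxima of the contour
    strongest = peaks[np.argsort(envelope[peaks])[::-1][:2]]
    freqs = np.fft.rfftfreq(len(windowed), d=1.0 / fs)
    return np.sort(freqs[strongest])             # the two formant frequencies in Hz
```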

Data collection
- Recording of 50 analog samples per word (e.g. the digit 8, about 500 ms long)
- Division of the signal into frames of 10 ms length
- Calculation of the spectrum of each frame using the Discrete Fourier Transform
- Smoothing of the spectrum using the cepstral algorithm
- Storage of the first two maxima => 2-formant recognition system
- The resulting (98 x 1) vector is used as the input vector for training an MLP network
- Each signal is assigned to 1 of 12 classes
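A rough sketch of how the per-word (98 x 1) input vector might be assembled from these steps, reusing the formants_from_frame helper sketched above. The 49-frames-times-2-formants layout, the zero-padding of a short last frame and the constant names are my assumptions; the slide only states the frame length, the final vector size and the 12 classes.

```python
import numpy as np

FRAME_MS = 10            # frame length stated on the slide
N_CLASSES = 12           # digits 0-9 plus "yes" and "no"

def word_to_feature_vector(signal, fs, n_frames=49):
    """Build the (98 x 1) feature vector for one recorded word:
    two formant estimates per 10 ms frame, stacked frame by frame
    (49 frames x 2 formants = 98 values -- an assumed layout)."""
    frame_len = int(fs * FRAME_MS / 1000)
    features = []
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        if len(frame) < frame_len:                       # zero-pad a short last frame
            frame = np.pad(frame, (0, frame_len - len(frame)))
        features.extend(formants_from_frame(frame, fs))  # two formant values per frame
    return np.asarray(features).reshape(-1, 1)           # shape (98, 1)
```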

Network and results
MLP trained with the backpropagation algorithm
3 hidden layers, each with 12 neurons
Learning rate = 0.01, momentum = 0.1
100,000 epochs
Best result so far:
- training success rate = 86.11%
- test success rate = 61.67%
=> still has to be improved
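For orientation, a hedged sketch of an equivalent network configuration in Python with scikit-learn. Only the layer sizes, learning rate, momentum and epoch count come from the slide; scikit-learn itself, the sigmoid activation, the early-stopping settings and the variable names X_train/y_train/X_test/y_test are assumptions for illustration.

```python
from sklearn.neural_network import MLPClassifier

# X_train: (n_words, 98) formant feature matrix, y_train: class labels 0..11 (assumed names)
mlp = MLPClassifier(hidden_layer_sizes=(12, 12, 12),  # 3 hidden layers, 12 neurons each
                    activation="logistic",            # sigmoid units (an assumption)
                    solver="sgd",                     # gradient descent with momentum
                    learning_rate_init=0.01,          # learning rate from the slide
                    momentum=0.1,                     # momentum from the slide
                    max_iter=100_000,                 # epoch count from the slide
                    n_iter_no_change=100_000,         # do not stop early on a loss plateau
                    tol=0.0)
mlp.fit(X_train, y_train)
print("training accuracy:", mlp.score(X_train, y_train))
print("test accuracy:", mlp.score(X_test, y_test))
```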