Demetz Clément ECE 539 Final Project Fall 2003

Lip-recognition Software using a Kohonen Algorithm for Image Compression

Outline
-Problem and motivation
-Data creation: preprocessing
-Kohonen self-organizing map (SOM)
-Multi-layer perceptron
-Final results
-Conclusion
-References

Problem
Problem of voice recognition:
-A combined approach leads to better results than voice recognition alone.
-For cell phones and PDAs: combine voice recognition with visual recognition (lip-recognition).
(Diagram: lip-recognition + voice-recognition -> combined recognition)

Problem of lip-recognition software
-It needs high computational power.
-It needs to be implemented on low-power systems (PDA, cell phone).
-How can we reduce the size of the information?
Pb: find a way to implement such an algorithm with few computations.

Motivation
Reduce the size of the image with a Kohonen self-organizing map.
(Diagram: image from a cell phone digital camera -> filter -> contour of the mouth -> Kohonen SOM -> multi-layer perceptron)

Preprocessing
-We start with low-quality JPEG pictures.
-Gradient filters are applied to keep only the contour of the mouth.
-The opening of the mouth is a relevant input: it has to follow a certain pattern to pronounce a given sound.
(Figures: JPEG picture of the mouth, dark part of the mouth, contour of the dark part)
Pb: a contour corresponds to thousands of points, which is still too many for a low computation time.
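The preprocessing step can be sketched as follows. This is only an illustration under stated assumptions, not the project's Matlab code: the function name `mouth_contour_points`, the fixed `dark_threshold`, and the synthetic ellipse standing in for a mouth picture are all hypothetical, and a simple finite-difference gradient (`np.gradient`) stands in for whichever gradient filters were actually used.

```python
import numpy as np

def mouth_contour_points(gray, dark_threshold=0.3):
    """Return (row, col) coordinates lying on the contour of the dark region."""
    # Assumption: the dark part of the mouth is every pixel below a fixed threshold.
    dark = (gray < dark_threshold).astype(float)
    # Finite-difference gradient of the binary mask: non-zero only next to the boundary.
    grad_rows, grad_cols = np.gradient(dark)
    magnitude = np.hypot(grad_rows, grad_cols)
    return np.argwhere(magnitude > 0)

# Synthetic stand-in for a mouth picture: bright image with a dark ellipse.
rows, cols = np.mgrid[0:60, 0:100]
image = np.ones((60, 100))
inside = ((rows - 30) / 12.0) ** 2 + ((cols - 50) / 35.0) ** 2 <= 1.0
image[inside] = 0.0

contour = mouth_contour_points(image)
print(contour.shape)  # (number of contour points, 2)
```

Even on this small synthetic image the contour holds far more points than a small classifier should take as input, which is the problem the SOM stage addresses.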

Kohonen Self-Organizing Map (SOM)
-Idea: use a Kohonen self-organizing map to reduce the information to 12 neurons.
Problems:
-Initialization
-Bad stretching or turning of the SOM

Kohonen SOM
Problems:
-Initialization
-Bad stretching or turning of the SOM
We want to keep all the information: here we are losing the left part.

Kohonen SOM
A way to avoid these problems: we link the first and the last neurons, so that the map forms a closed loop.
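A minimal sketch of a closed (ring-shaped) one-dimensional Kohonen map, assuming 2-D contour points as input. The function name `fit_ring_som`, the decay schedules, the initialization on a small circle around the centroid, and the synthetic elliptical contour are all assumptions made for illustration. Only the 12 neurons and the link between the first and last neuron come from the slides.

```python
import numpy as np

def fit_ring_som(points, n_neurons=12, n_epochs=50, lr0=0.5, seed=0):
    """Fit a closed 1-D Kohonen map (ring of neurons) to 2-D points."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Assumption: start from a small circle around the centroid of the data.
    angles = 2 * np.pi * np.arange(n_neurons) / n_neurons
    circle = np.column_stack([np.cos(angles), np.sin(angles)])
    weights = points.mean(axis=0) + 0.1 * points.std(axis=0) * circle
    index = np.arange(n_neurons)
    for epoch in range(n_epochs):
        frac = epoch / n_epochs
        lr = lr0 * (1.0 - frac)                          # learning rate decays linearly
        sigma = max(n_neurons / 4 * (1.0 - frac), 0.5)   # neighbourhood width shrinks
        for p in points[rng.permutation(len(points))]:
            winner = np.argmin(((weights - p) ** 2).sum(axis=1))
            # Ring distance: neuron 0 and neuron n_neurons - 1 are neighbours.
            d = np.abs(index - winner)
            d = np.minimum(d, n_neurons - d)
            h = np.exp(-(d ** 2) / (2 * sigma ** 2))
            weights += lr * h[:, None] * (p - weights)
    return weights

# Synthetic closed contour (an ellipse) standing in for a mouth contour.
t = np.linspace(0, 2 * np.pi, 400, endpoint=False)
contour = np.column_stack([50 + 35 * np.cos(t), 30 + 12 * np.sin(t)])

som_points = fit_ring_som(contour)
print(som_points.shape)  # (12, 2)
```

Because the neighbourhood wraps around, the chain has no free ends, which is one plausible reading of why linking the first and last neurons helps on a closed contour.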

Kohonen SOM
Results of the Kohonen map: we keep 12 points representing the contour.

Multi-Layer perceptron
-We take the 12 points given by the SOM as inputs.
-The SOM is applied many times to each picture to create the database.
-3 classes of pictures: only 3 sounds, because the lip-recognition is a support for voice recognition.
-Training on 15 pictures, testing on 3 pictures.
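The classifier stage can be sketched as a small multi-layer perceptron trained by gradient descent with momentum. Everything here is a hypothetical illustration: the names `softmax`, `train_mlp`, and `predict`, the tanh hidden layer, the softmax/cross-entropy output, and the synthetic 12-dimensional data are assumptions. The default hyperparameters (10 hidden neurons, alpha 0.01, momentum 0.8, 400 iterations) are loosely borrowed from values that appear in the slides.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_mlp(X, y, n_hidden=10, n_classes=3, alpha=0.01, momentum=0.8,
              n_iter=400, seed=0):
    """Batch gradient descent with momentum on a one-hidden-layer MLP."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.5, (n_hidden, n_classes))
    b2 = np.zeros(n_classes)
    params = [W1, b1, W2, b2]
    velocity = [np.zeros_like(p) for p in params]
    targets = np.eye(n_classes)[y]          # one-hot class labels
    losses = []
    for _ in range(n_iter):
        hidden = np.tanh(X @ W1 + b1)
        probs = softmax(hidden @ W2 + b2)
        losses.append(-np.mean(np.sum(targets * np.log(probs + 1e-12), axis=1)))
        d_out = (probs - targets) / len(X)              # gradient of mean cross-entropy
        d_hidden = (d_out @ W2.T) * (1 - hidden ** 2)   # back-propagate through tanh
        grads = [X.T @ d_hidden, d_hidden.sum(axis=0),
                 hidden.T @ d_out, d_out.sum(axis=0)]
        for p, v, g in zip(params, velocity, grads):
            v *= momentum        # keep a fraction of the previous step
            v -= alpha * g       # add the new gradient step
            p += v               # in-place update of the parameter
    return params, losses

def predict(params, X):
    W1, b1, W2, b2 = params
    return np.argmax(np.tanh(X @ W1 + b1) @ W2 + b2, axis=1)

# Synthetic stand-in data: 3 classes, 15 training vectors of dimension 12.
rng = np.random.default_rng(1)
centres = rng.normal(0, 1, (3, 12))
y_train = np.repeat(np.arange(3), 5)
X_train = centres[y_train] + 0.1 * rng.normal(size=(15, 12))

params, losses = train_mlp(X_train, y_train)
train_rate = 100.0 * np.mean(predict(params, X_train) == y_train)
print(f"training classification rate: {train_rate:.1f}%")
```

The classification rate is the percentage of pictures assigned to the correct class, computed the same way on the training and testing sets.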

Multi-Layer perceptron: Result
Layers  alpha  momentum  Configuration (hidden l)  Testing classification rate (%)  Training classification rate (%)
2       0.1    0.8       10                        27                               33
        0.05                                       73.33                            93
        0.01                                       92                               100
3                        10 10                     52                               76
A 100% classification rate is obtained.

Multi-Layer perceptron: Result
A 100% classification rate is obtained with 400 training iterations.

Conclusion
-The Kohonen SOM reduces the problem to a 12-dimensional problem (previously, working on pictures meant thousands of dimensions).
-The multi-layer perceptron needs training, but once it is trained, computations are very fast.
-We can obtain a 100% classification rate with 3 sounds.
Pb: because of Matlab, transforming a picture into a matrix needs computations (solution: use another language that is more oriented toward picture processing).

Some references
-Image compression by self-organized Kohonen map. Christophe Amerijckx, Philippe Thissen. IEEE Transactions on Neural Networks, 1998.
-SRAM bitmap shape recognition and sorting using neural networks. Randall S. Collica. IEEE.
-From your lips to your printer. James Fallow.
-SRAM bitmap shape recognition and sorting using neural networks. Collica, R.S., Card, J.P., and Martin, W. ISBN
-A Kohonen neural network controlled all-optical router system. E.E.E. Frietman, M.T. Hill, G.D. Khoe.
