Automated lip reading technique for people with speech disabilities by converting identified visemes into direct speech using image processing and machine learning techniques

Automated lip reading technique for people with speech disabilities by converting identified visemes into direct speech using image processing and machine learning techniques
Presented by: Ahmed Mesbah Ahmed El-taybany
Mentor: Dr. Marwan Torki

Problem

Statistics

Background research
- Sign language recognition

- Watch keyboard
- Electronic larynx

Main idea
- Decreasing physiological impacts
- Semi-normal state
- It has been shown that humans can rely on their eyes in place of their ears for speech reading.

Audio-visual speech recognition (AVSR)

Capturing Hardware and design

Design advantages and proof of concept
- The Mouthesizer: A Facial Gesture Musical Interface (2004)
- No more face detection

Lip feature extraction
- Image-based approaches
- Model-based approaches

Lip feature extraction: methods used

Classifiers
- Hidden Markov models and neural networks were the most common classifiers.
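To make the hidden Markov model option concrete, here is a minimal sketch of how a discrete HMM can classify a short sequence of lip states: each candidate utterance gets its own model, the forward algorithm scores the observed sequence under each model, and the best-scoring model wins. The two models ("ma" and "am"), their hand-picked probabilities, and the two-symbol observation alphabet (0 = lips closed, 1 = lips open) are assumptions made for illustration; they are not taken from the slides, and a real system would train its models on continuous lip features.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) for a discrete HMM using the forward algorithm."""
    n = len(start)
    # Initialise with the start distribution and the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    # Propagate through the remaining observations.
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)

# Hypothetical two-state left-to-right models (hand-picked, not trained).
# Observation symbols: 0 = lips closed, 1 = lips open.
START = [1.0, 0.0]
TRANS = [[0.5, 0.5], [0.0, 1.0]]
MODELS = {
    "ma": [[0.9, 0.1], [0.1, 0.9]],  # closed first, then open
    "am": [[0.1, 0.9], [0.9, 0.1]],  # open first, then closed
}

def classify(obs):
    """Pick the model under which the observation sequence is most likely."""
    return max(MODELS, key=lambda name: forward_likelihood(obs, START, TRANS, MODELS[name]))

label = classify([0, 0, 1, 1])
```

With these assumed parameters, a closed-then-open sequence scores higher under the "ma" model and an open-then-closed sequence under "am".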

Datasets
- AVLetters (University of East Anglia)
- Oulu database (University of Oulu)
- CUAVE database (Clemson University)
- Home-made dataset

Lip reading system problems for multiple speakers
Variation in:
- Accents
- Talking speeds
- Skin color
- Lip shapes
- Illumination conditions
- Confusing recognition tasks
- Facial hair

International Phonetic Alphabet (IPA)
- Visible speech
- Phonemes
- Visemes
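The relationship between phonemes and visemes is many-to-one: several phonemes look the same on the lips. The sketch below shows one way this could be represented as a lookup table. The viseme class names and the grouping of phonemes are a simplified assumption for illustration, not the table used in the presentation.

```python
# Illustrative viseme classes (assumed grouping, not from the slides).
VISEME_GROUPS = {
    "bilabial": ["p", "b", "m"],
    "labiodental": ["f", "v"],
    "alveolar": ["t", "d", "n"],
    "open": ["aa", "ae"],
}

# Invert the groups into a phoneme -> viseme lookup.
PHONEME_TO_VISEME = {
    phoneme: viseme
    for viseme, phonemes in VISEME_GROUPS.items()
    for phoneme in phonemes
}

def to_visemes(phonemes):
    """Map a phoneme sequence to viseme labels; unlisted phonemes become 'unknown'."""
    return [PHONEME_TO_VISEME.get(p, "unknown") for p in phonemes]

bat = to_visemes(["b", "ae", "t"])
pat = to_visemes(["p", "ae", "t"])
```

Under this grouping, "bat" and "pat" map to the same viseme sequence, which is why lip shape alone cannot separate them.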

Seen and unseen phonemes
- Use a prediction technique, such as the Microsoft Speech API or Google letter prediction methods, to recover unseen letters.
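As a rough illustration of the prediction idea, the sketch below ranks the words that share one lip appearance by how common they are, so the most likely word can be proposed for an ambiguous sequence. It does not use the Microsoft or Google services named above; the letter classes, the tiny lexicon, and the word counts are all invented for the example.

```python
from collections import defaultdict

# Hypothetical classes of letters that look alike on the lips.
LETTER_CLASS = {"b": "B", "p": "B", "m": "B", "f": "F", "v": "F"}

# Hypothetical lexicon with made-up frequency counts.
WORD_COUNTS = {"bat": 50, "pat": 30, "mat": 20, "fan": 40, "van": 10}

def viseme_key(word):
    """Collapse visually identical letters so look-alike words share one key."""
    return "".join(LETTER_CLASS.get(ch, ch) for ch in word)

# Index the lexicon by visual appearance.
INDEX = defaultdict(list)
for word in WORD_COUNTS:
    INDEX[viseme_key(word)].append(word)

def predict(word_as_seen):
    """Return the words matching the seen lip pattern, most frequent first."""
    candidates = INDEX.get(viseme_key(word_as_seen), [])
    return sorted(candidates, key=lambda w: WORD_COUNTS[w], reverse=True)

ranked = predict("pat")
```

A production system would replace the word counts with a language model that also uses the surrounding words as context.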

Lip reading system
1. Input
2. Feature extraction
3. Classification
4. Output
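The four stages above can be sketched as a small pipeline. This is a toy under stated assumptions rather than the presenters' implementation: frames are tiny grayscale grids of the mouth region, the single feature is the fraction of dark pixels (a crude proxy for mouth opening), and classification is a fixed threshold. The function names and both thresholds are hypothetical.

```python
def extract_feature(frame, dark_threshold=50):
    """Fraction of dark pixels in a grayscale mouth region (crude openness proxy)."""
    pixels = [p for row in frame for p in row]
    return sum(p < dark_threshold for p in pixels) / len(pixels)

def classify_frame(feature, open_threshold=0.3):
    """Label a frame as 'open' or 'closed' using an assumed fixed threshold."""
    return "open" if feature >= open_threshold else "closed"

def lip_reading_pipeline(frames):
    """Run input frames through feature extraction, classification, and output."""
    features = [extract_feature(f) for f in frames]   # 2. feature extraction
    labels = [classify_frame(x) for x in features]    # 3. classification
    return " ".join(labels)                           # 4. output

# 1. Input: two hypothetical 2x2 mouth-region frames (0 = black, 255 = white).
closed_mouth = [[200, 200], [180, 190]]
open_mouth = [[200, 20], [10, 30]]
result = lip_reading_pipeline([closed_mouth, open_mouth])
```

In a full system the output stage would pass the recognised visemes to a speech synthesiser instead of returning text labels.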

Applications


Thanks