Multimodal Caricatural Mirror


Multimodal Caricatural Mirror FINAL PROJECT PRESENTATION

Project Summary
Create a multimodal caricatural mirror:
Multimodal = facial + vocal
Caricatural = amplify emotions
Mirror = face your avatar!
2/24/2019

Outline
Project Architecture
Visual Modality:
  Face Tracking
  Facial Features Detection
  Facial Features Tracking
  Facial Expression Recognition
  Emotion Modeling
  Facial Animation
Audio Modality:
  Vocal Features Extraction
  Prosody Amplification
Multimodal Fusion:
  Multimodal Synchronized Emotion Synthesis

Project Architecture: the 'Mamama' option
Visual chain: user → face tracking → facial features tracking → emotion recognition → facial animation → avatar
Audio chain: speech signal → prosodic features extraction → prosody processing (t' = f(t)) → prosody amplification
Fusion: the two chains are fused to synchronize the animated avatar with the amplified speech.

Face Tracking
We chose to use open-source software: the OpenCV face tracker.
Trained on a large database (no tuning necessary).
Color tracking using the CAMSHIFT algorithm (based on mean shift).
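The project uses OpenCV's built-in CAMSHIFT tracker rather than custom code. As an illustration of the idea behind it, here is a minimal pure-NumPy sketch of the mean-shift step that CAMSHIFT iterates: the search window repeatedly moves to the centroid of a color-likelihood (histogram back-projection) map. The function name and parameters are my own, not the project's.

```python
import numpy as np

def mean_shift(prob, window, n_iter=20, eps=1.0):
    """Shift a tracking window toward the centroid of a probability map.

    prob   : 2-D array, per-pixel likelihood of belonging to the target
             (e.g. a color-histogram back-projection, as in CAMSHIFT)
    window : (x, y, w, h) initial search window
    """
    x, y, w, h = window
    for _ in range(n_iter):
        patch = prob[y:y + h, x:x + w]
        total = patch.sum()
        if total == 0:  # no target mass under the window
            break
        ys, xs = np.mgrid[0:h, 0:w]
        cx = (xs * patch).sum() / total      # centroid inside the window
        cy = (ys * patch).sum() / total
        dx = int(round(cx - (w - 1) / 2))    # shift toward the centroid
        dy = int(round(cy - (h - 1) / 2))
        x = int(np.clip(x + dx, 0, prob.shape[1] - w))
        y = int(np.clip(y + dy, 0, prob.shape[0] - h))
        if abs(dx) < eps and abs(dy) < eps:  # converged
            break
    return x, y, w, h
```

CAMSHIFT additionally adapts the window size and orientation each iteration; this sketch keeps the window fixed to show only the core shift.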

Facial Features Detection
Step 1: facial features detection (1st frame)
Transformation to a grayscale image
Computation of the image's trace transform (luminance on M vertical lines)
From the sets of local minima, infer the positions of the facial features (eyebrows, eyes and mouth) using a priori knowledge of face morphology (heuristics)
→ automatic initialization of the Candide grid (1st frame)
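The slides give no code for the luminance-minima heuristic, so the sketch below shows one plausible reading of it, assuming that features such as eyebrows, eyes and mouth appear as dark dips in the luminance profile along a vertical scan line. The function name and the threshold are illustrative, not the project's.

```python
import numpy as np

def dark_minima(column, margin=0.15):
    """Local minima of a luminance profile that are clearly darker than average.

    column : 1-D array of gray levels along one vertical scan line.
    Returns the indices of candidate feature rows (eyebrow, eye, mouth ...),
    ordered top to bottom.
    """
    # Only keep minima noticeably darker than the line's mean luminance.
    thresh = column.mean() - margin * (column.max() - column.min() + 1e-9)
    idx = []
    for i in range(1, len(column) - 1):
        if column[i] < column[i - 1] and column[i] <= column[i + 1] \
                and column[i] < thresh:
            idx.append(i)
    return idx
```

Running this on several vertical lines and grouping the minima by row gives the candidate eyebrow/eye/mouth positions that the morphology heuristics then validate.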

Facial Features Tracking

Facial Features Tracking
Step 2: facial features tracking (all frames > 1)

Emotion Recognition (visual modality)
We use Support Vector Machines (SVMs) as the emotion classifier:
Find the hyperplanes such that the margins between classes are maximized (in a low-dimensional subspace), using an appropriate kernel function.
Classification is done for every frame, but temporal dependencies between successive decisions can be introduced to recover from short tracking errors (though not 'error bursts').
Good robustness against overfitting (only the training samples defining the margins are kept).
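The slides do not say which SVM implementation was used. As a self-contained illustration, the sketch below trains a linear SVM by primal sub-gradient descent (Pegasos-style) and adds the kind of sliding-window majority vote the slide alludes to for smoothing per-frame decisions. All names and hyperparameters are assumptions; a real system would more likely use a library SVM with a kernel.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=300, seed=0):
    """Primal sub-gradient training of a linear SVM (hinge loss + L2).

    X : (n, d) feature vectors (e.g. tracked facial-feature coordinates)
    y : labels in {-1, +1}
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)            # decaying step size
            if y[i] * (X[i] @ w + b) < 1:    # margin violated: hinge gradient
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
                b += eta * y[i]
            else:                            # only the regularizer acts
                w = (1 - eta * lam) * w
    return w, b

def smooth_labels(labels, win=5):
    """Majority vote over a sliding window of per-frame decisions,
    absorbing short tracking glitches (not long error bursts)."""
    half = win // 2
    out = []
    for i in range(len(labels)):
        seg = labels[max(0, i - half): i + half + 1]
        out.append(max(set(seg), key=seg.count))
    return out
```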

Emotion Modeling

Facial Animation
Among 3D face models, we chose Candide-3 for the animation.
It includes animation units and MPEG-4 FAPs.
The animation software is written in C++ using the OpenGL and SDL APIs, which are open source and run on many platforms.
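Candide-style models animate a fixed wireframe by adding weighted animation-unit displacement fields to the neutral vertices; amplifying the weights is exactly what caricatures the expression. A minimal NumPy sketch of that deformation (function name and array shapes are assumptions, not the project's C++ code):

```python
import numpy as np

def animate(base, aus, alphas, gain=1.0):
    """Deform a Candide-style wireframe.

    base   : (n, 3) neutral vertex positions
    aus    : (k, n, 3) animation-unit displacement fields
    alphas : (k,) activation of each animation unit
    gain   : > 1 amplifies the activations, caricaturing the expression
    """
    alphas = np.asarray(alphas, dtype=float) * gain
    # Weighted sum of displacement fields added to the neutral shape.
    return base + np.tensordot(alphas, aus, axes=1)
```

With gain = 1 this reproduces the recognized expression; the caricatural mirror simply renders the same activations with gain > 1.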

Vocal Features Extraction
Pitch variations amplification:
Pitch is extracted using an algorithm based on the autocorrelation function.
Pitch variations are then modified using PSOLA.
Figure: the downtrend of the pitch is first removed, then the pitch movements are amplified, and finally the downtrend is restored.
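The slides only name the method, so the following is a generic sketch of autocorrelation-based pitch estimation for a single voiced frame: pick the lag with the strongest autocorrelation inside a plausible pitch range. The function name and parameter values are illustrative.

```python
import numpy as np

def pitch_autocorr(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the pitch of one voiced frame from its autocorrelation peak.

    frame : 1-D array of samples, fs : sampling rate in Hz.
    Searches lags between fs/fmax and fs/fmin samples.
    """
    frame = frame - frame.mean()
    # One-sided autocorrelation: lags 0 .. len(frame)-1
    ac = np.correlate(frame, frame, mode='full')[len(frame) - 1:]
    lo = int(fs / fmax)
    hi = min(int(fs / fmin), len(ac) - 1)
    lag = lo + int(np.argmax(ac[lo:hi]))  # strongest periodicity in range
    return fs / lag
```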

'Speaking Rate' Processing
Low-pass filtering (emphasizes voiced regions)
Sliding window (~0.75 s) → count of energy maxima
Estimation of the 'speaking rate' over time
'Speaking-rate' distortion function: sigmoidal transition function
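A sigmoidal distortion function of the kind the slide mentions could look as follows; the neutral rate, span and maximum stretch values are invented for illustration, not taken from the project.

```python
import math

def rate_to_stretch(rate, neutral=4.0, span=2.0, max_stretch=0.5):
    """Sigmoidal time-scale distortion of the speaking rate.

    rate    : estimated syllable rate (energy maxima per second)
    neutral : rate considered normal; left unchanged (factor 1.0)
    Fast speech gets a factor < 1 (sped up further), slow speech a
    factor > 1 (slowed down further), caricaturing the speaking style.
    """
    z = (neutral - rate) / span
    # Logistic sigmoid rescaled to (-1, 1), then to (1-max, 1+max).
    return 1.0 + max_stretch * (2.0 / (1.0 + math.exp(-z)) - 1.0)
```

The sigmoid keeps the distortion bounded, so extreme rate estimates cannot produce unbounded time-scale factors.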

Multimodal Emotion Synthesis
Changing the 'speaking rate' is equivalent to changing the time scale.
The 'speaking rate processing' stage generates a time-scale distortion function t' = F(t) for t < T0, which is fed to the facial animation engine; the engine then generates an animation synchronized with the output speech signal ('mamama' only ;-).
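One way to realize the time-scale map t' = F(t) handed to the animation engine is a piecewise-linear warp built from per-segment stretch factors; a sketch in which all names are assumptions:

```python
import numpy as np

def build_time_warp(seg_dur, stretch):
    """Build the piecewise-linear time-scale map t' = F(t).

    seg_dur : durations (s) of successive speech segments
    stretch : per-segment time-scale factors (e.g. from the speaking-rate
              distortion function)
    Returns F, mapping original time to time in the scaled speech, so the
    facial animation can be sampled in sync with the modified audio.
    """
    seg_dur = np.asarray(seg_dur, dtype=float)
    stretch = np.asarray(stretch, dtype=float)
    t = np.concatenate([[0.0], np.cumsum(seg_dur)])            # input times
    tp = np.concatenate([[0.0], np.cumsum(seg_dur * stretch)])  # warped times
    def F(x):
        return np.interp(x, t, tp)
    return F
```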

Come and try it for yourself this afternoon!