Project #2: Multimodal Caricatural Mirror
Intermediate report
Project Summary
Create a multimodal caricatural mirror:
  Multimodal = facial + vocal
  Caricatural = amplify emotions
  Mirror = face your avatar!
2/24/2019
Outline
Project Architecture: 2 versions
Visual Modality:
  Face Tracking
  Facial Features Tracking
  Facial Expression Recognition
  Facial Animation
Audio Modality:
  Vocal Features Extraction
  Emotion Detection in Speech
  Prosody Amplification
Project Architecture #1
[Block diagram. Blocks: Face Tracking, Facial Features Tracking, Speech Signal, Vocal Features Extraction, Emotion Detection, Facial Animation, User, Fusion, Wide Screen, Prosody Amplification, Movements Amplification]
Project Architecture #2: The ‘Mamama’ option
[Block diagram. Blocks: User, Face Tracking, Speech Signal, Facial Features Tracking, Vocal Features Extraction, Emotion Recognition, Prosody Amplification, Wide Screen, Facial Animation, Fusion]
Face Tracking
We chose an open-source solution: the OpenCV face tracker.
It provides real-time face tracking using the open-source Intel Computer Vision Library (OpenCV), written in C/C++.
Example using the OpenCV face tracker, with OUR faces tracked!
[Picture/video to be inserted]
Facial Features Tracking
Step 1: Facial Features Detection (1st frame)
  Compute the image's trace transform (luminance along M vertical lines)
  From the sets of local minima, infer the positions of the facial features (eyebrows, eyes and mouth)
  Build a binary image from the N darkest pixels on each line
  The facial features' positions are detected among these dark pixels
  Automatic initialization of the Candide grid (1st frame)
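The detection step above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's implementation: the image is a plain list of rows of luminance values, the M lines are evenly spaced columns, and the helper names (`sample_columns`, `local_minima`, `darkest_pixels_mask`) are hypothetical.

```python
def sample_columns(width, num_lines):
    """Indices of num_lines evenly spaced vertical lines (assumed spacing)."""
    return [(2 * k + 1) * width // (2 * num_lines) for k in range(num_lines)]


def local_minima(profile):
    """Row indices where the luminance profile has a local minimum."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i] < profile[i - 1] and profile[i] <= profile[i + 1]
    ]


def darkest_pixels_mask(image, num_lines, n_darkest):
    """Binary image marking the n_darkest pixels on each sampled vertical line."""
    height, width = len(image), len(image[0])
    mask = [[0] * width for _ in range(height)]
    for col in sample_columns(width, num_lines):
        profile = [image[row][col] for row in range(height)]
        # Rows sorted from darkest to brightest; keep the first n_darkest.
        darkest_rows = sorted(range(height), key=lambda r: profile[r])[:n_darkest]
        for row in darkest_rows:
            mask[row][col] = 1
    return mask


# Synthetic 10x8 "face": bright skin with three dark horizontal bands
# standing in for eyebrows, eyes and mouth.
image = [[200] * 8 for _ in range(10)]
for row, value in ((2, 60), (4, 40), (7, 50)):
    image[row] = [value] * 8

column_profile = [image[row][1] for row in range(10)]
feature_rows = local_minima(column_profile)
mask = darkest_pixels_mask(image, num_lines=4, n_darkest=3)
```

On a real frame the local minima would be noisy, so some smoothing of each profile before the minima search is likely needed.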
Grid Initialization
Facial Features Tracking
Step 2: Facial Features Tracking (all frames > 1)
[Video missing here …]
Emotions modeling
Four positions to interpolate for each emotion:
CLOSED MOUTH CLOSED MOUTH MAMA INTERMEDIATE FINAL EMOTION
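Interpolating between key positions can be sketched as follows. The slides do not say which interpolation scheme is used, so piecewise-linear blending is an assumption here, and `interpolate_keyframes` and the one-vertex poses are hypothetical.

```python
def interpolate_keyframes(keyframes, t):
    """Piecewise-linear blend through a list of key poses.

    keyframes: list of poses, each a list of (x, y, z) vertex positions.
    t: animation progress in [0, 1]; 0 is the first pose, 1 the last.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    segments = len(keyframes) - 1
    position = t * segments
    index = min(int(position), segments - 1)  # clamp so t == 1 uses the last segment
    local = position - index
    start, end = keyframes[index], keyframes[index + 1]
    return [
        tuple((1 - local) * a + local * b for a, b in zip(v_start, v_end))
        for v_start, v_end in zip(start, end)
    ]


# Four hypothetical key poses of a single vertex (e.g. a lower-lip point).
poses = [
    [(0.0, 0.0, 0.0)],
    [(0.0, -1.0, 0.0)],
    [(0.0, -2.0, 0.0)],
    [(3.0, -2.0, 0.0)],
]
halfway = interpolate_keyframes(poses, 0.5)
```

With four poses there are three segments, so `t = 0.5` falls in the middle of the second one.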
Emotions modeling
[Panels: Happiness | Sadness]
Facial Animation
Among 3D face models, we chose Candide-3 for the animation.
It includes animation units and MPEG-4 FAPs.
The animation software is written in C++ using OpenGL (an open standard) and SDL (open source), both of which run on many platforms.
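To illustrate how animation units could drive an amplified ("caricatural") face, here is a minimal sketch. It assumes the usual linear form of Candide-style models, where each animation unit is a set of per-vertex displacements scaled by an activation. The `gain` parameter, the function name and the toy data are hypothetical, and the sketch is in Python rather than the project's C++.

```python
def apply_animation_units(base_vertices, animation_units, activations, gain=1.0):
    """Deform a mesh: vertex += gain * activation * displacement, per unit.

    base_vertices: list of (x, y, z) neutral positions.
    animation_units: list of dicts {vertex_index: (dx, dy, dz)}.
    activations: one activation value per animation unit.
    gain: amplification factor; values above 1 exaggerate the expression.
    """
    result = [list(v) for v in base_vertices]
    for unit, activation in zip(animation_units, activations):
        for vertex_index, displacement in unit.items():
            for axis in range(3):
                result[vertex_index][axis] += gain * activation * displacement[axis]
    return [tuple(v) for v in result]


# Toy mesh with two vertices and one "jaw drop"-like unit moving vertex 1 down.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
jaw_drop = {1: (0.0, -0.5, 0.0)}

tracked = apply_animation_units(neutral, [jaw_drop], [0.4])
caricature = apply_animation_units(neutral, [jaw_drop], [0.4], gain=2.0)
```

In practice the amplified activations would probably need clamping so the mesh does not fold over itself.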
Vocal Features Extraction
For the moment: pitch only.
Pitch is extracted by means of the autocorrelation method and modified by means of PSOLA.
Example: the downtrend of the pitch is removed, the pitch movements are amplified, and the downtrend is then restored.
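The two pitch operations above can be sketched as below. This is an illustration under assumptions, not the project's code: the pitch search range (75 to 500 Hz), the straight-line least-squares model of the downtrend, and working directly in Hz are all choices made for this example. The PSOLA resynthesis itself is not shown; it would take the amplified contour as its target.

```python
import math


def estimate_pitch_autocorr(frame, sample_rate, f_min=75.0, f_max=500.0):
    """Pitch of one frame in Hz from the peak of its autocorrelation."""
    min_lag = int(sample_rate / f_max)
    max_lag = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_value = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else 0.0  # 0.0 means "no pitch found"


def amplify_pitch_contour(pitch_hz, gain):
    """Remove the linear downtrend, scale the residual, add the trend back."""
    n = len(pitch_hz)
    mean_t = (n - 1) / 2
    mean_f = sum(pitch_hz) / n
    var_t = sum((t - mean_t) ** 2 for t in range(n))
    slope = sum((t - mean_t) * (f - mean_f) for t, f in enumerate(pitch_hz)) / var_t
    trend = [mean_f + slope * (t - mean_t) for t in range(n)]
    return [tr + gain * (f - tr) for f, tr in zip(pitch_hz, trend)]


# 50 ms of a 200 Hz sine sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(400)]
pitch = estimate_pitch_autocorr(frame, sample_rate)

# A falling contour (200 Hz, -10 Hz per frame) with small movements on top.
contour = [205.0, 185.0, 175.0, 175.0]
amplified = amplify_pitch_contour(contour, gain=2.0)
```

Because the trend is restored unchanged, a contour with no movement around its downtrend comes out identical whatever the gain.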
Emotion Detection & Prosody Amplification
For vocal features, our aim is to distinguish two classes:
  Emotions inducing small pitch variations
  Emotions inducing large pitch variations
This can be done based on pitch or on other features, such as spectral ones.
[Examples: Original | Pitch-powered]
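A pitch-only version of this two-class decision could look like the sketch below. The measure (standard deviation of the pitch in semitones around its median) and the 2-semitone threshold are assumptions made for illustration. A real classifier would be tuned on recorded data and could add spectral features.

```python
import math
import statistics


def pitch_variation_class(pitch_hz, threshold_semitones=2.0):
    """Label an utterance "small" or "high" by the spread of its pitch.

    pitch_hz: per-frame pitch values in Hz; 0 marks unvoiced frames.
    threshold_semitones: hypothetical decision threshold.
    """
    voiced = [f for f in pitch_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    reference = statistics.median(voiced)
    semitones = [12 * math.log2(f / reference) for f in voiced]
    spread = statistics.pstdev(semitones)
    return "high" if spread > threshold_semitones else "small"


calm = pitch_variation_class([200, 202, 198, 201, 0, 199])
excited = pitch_variation_class([150, 300, 150, 300])
```

Measuring the spread in semitones rather than Hz keeps the decision comparable between low and high voices.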