An Infant Facial Expression Recognition System Based on Moment Feature Extraction C. Y. Fang, H. W. Lin, S. W. Chen Department of Computer Science and Information Engineering National Taiwan Normal University Taipei, Taiwan
Outline Introduction System Flowchart Infant Face Detection Feature Extraction Correlation Coefficient Calculation Infant Expression Classification Experimental Results Conclusions and Future Work
Introduction Infants generally cannot protect themselves. Vision-based surveillance systems can be used for infant care: they can warn the babysitter and help avoid dangerous situations. This paper presents a vision-based infant facial expression recognition system for infant safety surveillance. A video camera is set above the crib. Here is an example image captured by the video camera.
The classes of infant expressions Five infant facial expressions: crying, dazing, laughing, yawning, and vomiting. Three poses of the infant head: front, turn left, and turn right. Total: 15 classes. [Example images of the five expressions under the three head poses.] This example shows the infant dazing with his head turned right. In this example, the infant is laughing and faces the camera. In this example, the infant is vomiting with his head turned left.
System Flowchart Infant face detection: to remove noise, reduce the effects of lighting and shadows, and segment the image based on skin-color information. Feature extraction: to extract three types of moments as features, namely Hu moments, R moments, and Zernike moments. Feature correlation calculation: to calculate the correlation coefficients between each pair of moments of the same type for each 15-frame sequence. Classification: to construct decision trees that classify the infant facial expressions. To detect the infant face, … In feature extraction, the system will… Then the system will calculate the feature correlation coefficients… These correlation coefficients will be used as attributes to classify the infant facial expressions by the classification trees.
Infant Face Detection Stage 1: Lighting compensation. Stage 2: Infant face extraction. Step 1: Skin color detection, using three bands: S of HSI, Cb of YCbCr, and U of LUX. Step 2: Noise reduction, using a 10×10 median filter. Step 3: Infant face identification, using temporal information. Here we discuss the above flowchart in detail. We use three bands of different color models to define the skin color range. They are … Here is the result after skin color detection and noise reduction.
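As a rough illustration, the skin-color detection and median-filter steps might be sketched with OpenCV as follows. The threshold values are illustrative assumptions rather than the paper's actual ranges, HSV is used here as a stand-in for HSI, and the U band of the LUX color model is omitted because OpenCV has no built-in LUX conversion.

```python
# Sketch of skin-color detection and noise reduction, assuming OpenCV.
import cv2
import numpy as np

def skin_mask(bgr):
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)      # S band (HSV used in place of HSI)
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)  # Cb band
    s = hsv[:, :, 1]
    cb = ycrcb[:, :, 2]
    # keep pixels whose S and Cb values fall inside an assumed skin-color range
    mask = ((s > 30) & (s < 150) & (cb > 95) & (cb < 130)).astype(np.uint8) * 255
    # noise reduction with a median filter; OpenCV needs an odd kernel size,
    # so 11 is used here in place of the 10x10 filter on the slide
    return cv2.medianBlur(mask, 11)
```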
Infant Face Detection Step 3: Infant face identification. Here are two examples of infant face identification.
Moments The system calculates three types of moments: Hu moments [Hu1962], R moments [Liu2008], and Zernike moments [Zhi2008]. Given an image I, let f be its image function. The digital (p, q)th moment of I is given by m_pq = Σ_x Σ_y x^p y^q f(x, y). The central (p, q)th moment of I is defined as μ_pq = Σ_x Σ_y (x − x̄)^p (y − ȳ)^q f(x, y), where x̄ = m_10 / m_00 and ȳ = m_01 / m_00. The normalized central moments of I are η_pq = μ_pq / μ_00^γ, where γ = (p + q)/2 + 1 for p + q ≥ 2. In this study, we want to find out whether moments can be good features for facial expression recognition. As mentioned above, the system calculates these three types of moments as features.
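A minimal NumPy sketch of these definitions, written directly from the formulas above for a 2-D grayscale image array:

```python
import numpy as np

def normalized_central_moment(f, p, q):
    """eta_pq computed from the raw and central moment definitions above."""
    f = f.astype(float)
    y, x = np.indices(f.shape)
    m00 = f.sum()                                   # raw moment m_00
    xbar = (x * f).sum() / m00                      # x̄ = m_10 / m_00
    ybar = (y * f).sum() / m00                      # ȳ = m_01 / m_00
    mu_pq = (((x - xbar) ** p) * ((y - ybar) ** q) * f).sum()  # central moment mu_pq
    gamma = (p + q) / 2 + 1                         # valid for p + q >= 2
    return mu_pq / (m00 ** gamma)                   # mu_00 equals m_00
```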
Hu Moment Hu moments are translation, scale, and rotation invariant. They are seven combinations of the normalized central moments:
H1 = η20 + η02
H2 = (η20 − η02)² + 4η11²
H3 = (η30 − 3η12)² + (3η21 − η03)²
H4 = (η30 + η12)² + (η21 + η03)²
H5 = (η30 − 3η12)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²] + (3η21 − η03)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²]
H6 = (η20 − η02)[(η30 + η12)² − (η21 + η03)²] + 4η11(η30 + η12)(η21 + η03)
H7 = (3η21 − η03)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²] − (η30 − 3η12)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²]
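In practice the seven Hu moments can be obtained directly from the image moments with OpenCV; a short sketch, assuming the input is the segmented face region as a grayscale image:

```python
import cv2

def hu_moments(gray_face):
    m = cv2.moments(gray_face)         # raw, central and normalized central moments
    return cv2.HuMoments(m).flatten()  # the seven values H1 ... H7
```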
Example: Hu Moments crying This example shows a crying sequence and its corresponding values of Hu moments.
Example: Hu Moments yawning This example shows a yawning sequence and its corresponding values of Hu moments.
Example: Hu Moments yawning vs. crying Comparing these two sequences, we can observe that the values of the Hu moments differ when the infant facial expressions differ. Thus, the values of the Hu moments may be used to classify the different expressions.
R Moment Liu (2008) proposed ten R moments, which improve the scale invariance of the Hu moments. [Formulas of the ten R moments, defined in terms of the Hu moments.] R moments were developed to improve the scale invariance of the Hu moments; we want to determine whether R moments are better features than Hu moments.
Example: R Moments crying This example shows a crying sequence and its corresponding values of the R moments. We can compare this result with the earlier figures of the Hu moment values; the two types of moments may have different properties.
Zernike Moment The Zernike moment of order p with repetition q for an image function f is Z_pq = ((p + 1)/π) Σ_x Σ_y f(x, y) V*_pq(x, y), computed over the unit disk x² + y² ≤ 1, where V_pq(ρ, θ) = R_pq(ρ) e^{jqθ} gives the real and imaginary parts, and the radial polynomial is R_pq(ρ) = Σ_{s=0}^{(p−|q|)/2} (−1)^s (p − s)! / [s! ((p + |q|)/2 − s)! ((p − |q|)/2 − s)!] ρ^{p−2s}. To simplify the index, we use Z1, Z2, …, Z10 to represent Z_{8,0}, Z_{8,2}, …, Z_{9,9}, respectively. Zernike moments are the third type of moment we consider in this study. They are also rotation, scale, and translation invariant.
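A NumPy sketch of this definition, mapping the image onto the unit disk; the way the disk is fitted to the face region here is an assumption for illustration rather than the paper's exact normalization:

```python
import numpy as np
from math import factorial

def zernike_moment(img, p, q):
    """Z_pq of order p with repetition q (|q| <= p, p - |q| even),
    computed over the unit disk inscribed in the grayscale image array."""
    h, w = img.shape
    y, x = np.indices((h, w), dtype=float)
    xn = (2.0 * x - (w - 1)) / (w - 1)          # map columns to [-1, 1]
    yn = (2.0 * y - (h - 1)) / (h - 1)          # map rows to [-1, 1]
    rho = np.hypot(xn, yn)
    theta = np.arctan2(yn, xn)
    inside = rho <= 1.0
    # radial polynomial R_pq(rho)
    R = np.zeros_like(rho)
    for s in range((p - abs(q)) // 2 + 1):
        c = ((-1) ** s * factorial(p - s) /
             (factorial(s) * factorial((p + abs(q)) // 2 - s)
                           * factorial((p - abs(q)) // 2 - s)))
        R += c * rho ** (p - 2 * s)
    # Z_pq = (p + 1)/pi * sum over the unit disk of f(x, y) * conj(V_pq)
    V = R * np.exp(-1j * q * theta)
    return (p + 1) / np.pi * np.sum(img[inside] * V[inside])  # |Z_pq| is the usual feature
```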
Example: Zernike Moments crying This example shows a crying sequence and its corresponding values of Zernike moments.
Correlation Coefficients A facial expression is a sequential change in the values of the moments, so the correlation coefficients may be used to represent the facial expressions. Let A_i = (a_i1, a_i2, …, a_in), i = 1, 2, …, m, denote the sequence of the ith moment over the frames I_k, k = 1, 2, …, n. The correlation coefficient between A_i and A_j is defined as r(A_i, A_j) = Σ_k (a_ik − Ā_i)(a_jk − Ā_j) / sqrt( Σ_k (a_ik − Ā_i)² · Σ_k (a_jk − Ā_j)² ), where Ā_i is the mean of the elements in A_i. Now we calculate the correlation coefficient of each pair of moments, because we regard a facial expression as a sequential change in the moment values; the correlation coefficients may therefore represent the facial expressions. The formulas above show how the correlation coefficients are calculated.
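Since r(A_i, A_j) is the Pearson correlation of two moment sequences, the full set of pairwise coefficients for one 15-frame sequence can be sketched with NumPy as follows:

```python
import numpy as np

def correlation_features(moment_sequences):
    """moment_sequences: array of shape (m, n), row i holding the ith moment
    over the n frames. Returns the upper-triangular correlation coefficients
    used as attributes for the decision tree."""
    r = np.corrcoef(moment_sequences)          # r[i, j] = r(A_i, A_j)
    iu = np.triu_indices(r.shape[0], k=1)      # keep each pair once (i < j)
    return r[iu]
```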
Correlation Coefficients The correlation coefficients between the seven Hu moment sequences (yawning):
      H1    H2      H3      H4      H5       H6       H7
H1    1     0.8778  0.9481  -0.033  -0.571   -0.8052  0.8907
H2          1       0.9474  0.1887  -0.4389  -0.8749  0.9241
H3                  1       0.1410  -0.6336  -0.9044  0.9719
H4                          1       0.0568   -0.3431  0.2995
H5                                  1        0.7138   -0.6869
H6                                           1        -0.9727
H7                                                    1
Decision Tree Decision trees are used to classify the infant facial expressions, with the correlation coefficients (e.g., H1H2, H1H3, H2H3) as candidate splitting attributes. [Example: a set of five triangles and five squares split by the test H1H3 > 0 into a + branch and a − branch.] In this example, five triangles and five squares are mixed in one set. We select a feature to divide these two kinds of objects; among the three candidate features shown here, H1H3 is a good feature.
Decision Tree The correlation coefficients between two attributes A_i and A_j of a training instance are used to split the training instances. Let the training instances in S be split into two subsets S1 and S2 by a correlation coefficient r(A_i, A_j); the measure function is the weighted entropy of the split, M(r(A_i, A_j)) = (|S1|/|S|) Entropy(S1) + (|S2|/|S|) Entropy(S2). The best correlation coefficient selected by the system is the one that minimizes M. We need a measure function to evaluate the goodness of a feature; the measure function is an entropy function, and the best correlation coefficient is the one that obtains the minimum value of the measure function.
Decision tree construction Step 1: Initially, put all the training instances into the root S_R, regard S_R as an internal decision node, and insert S_R into a decision node queue. Step 2: Select an internal decision node S from the decision node queue and calculate the entropy of node S. If the entropy of node S is larger than a threshold T_s, go to Step 3; otherwise label node S as a leaf node and go to Step 4. Step 3: Find the best correlation coefficient to split the training instances in node S. Split the training instances in S into two nodes S1 and S2 by that correlation coefficient and add S1 and S2 to the decision node queue. Go to Step 2. Step 4: If the queue is not empty, go to Step 2; otherwise stop the algorithm.
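A compact Python sketch of the measure function and this construction procedure; the way each candidate coefficient splits the instances (by its sign, as in the H1H3 > 0 example) and the threshold value are assumptions for illustration:

```python
from collections import Counter, deque
import math

def entropy(instances):
    counts = Counter(label for _, label in instances)
    total = len(instances)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def split_measure(s1, s2):
    # weighted entropy of the two subsets produced by one correlation coefficient
    n = len(s1) + len(s2)
    return len(s1) / n * entropy(s1) + len(s2) / n * entropy(s2)

def build_tree(instances, n_features, threshold=0.3):
    """instances: list of (feature_vector, label) pairs. Returns a nested dict tree."""
    root = {"instances": instances}
    queue = deque([root])                          # Step 1: root into the queue
    while queue:                                   # Steps 2-4
        node = queue.popleft()
        data = node.pop("instances")
        if entropy(data) <= threshold:             # low entropy -> leaf node
            node["label"] = Counter(l for _, l in data).most_common(1)[0][0]
            continue
        # Step 3: find the best correlation coefficient to split on; here each
        # feature splits the instances by the sign of the coefficient
        best = min(range(n_features), key=lambda i: split_measure(
            [d for d in data if d[0][i] > 0], [d for d in data if d[0][i] <= 0]))
        s1 = [d for d in data if d[0][best] > 0]
        s2 = [d for d in data if d[0][best] <= 0]
        if not s1 or not s2:                       # cannot split further -> leaf
            node["label"] = Counter(l for _, l in data).most_common(1)[0][0]
            continue
        node["feature"] = best
        node["pos"], node["neg"] = {"instances": s1}, {"instances": s2}
        queue.extend([node["pos"], node["neg"]])   # back to Step 2
    return root
```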
Experimental Results Training data: 59 sequences. Testing data: 30 sequences. Five infant facial expressions: crying, laughing, dazing, yawning, and vomiting. Three different poses of the infant head: front, turn left, and turn right. Fifteen classes are classified. [Example images: the five expressions under the three head poses.]
Feature type: Hu moments [Decision tree diagram with yes/no branches; leaf classes: crying, yawning, laughing, dazing, vomiting.] This decision tree is constructed from the correlation coefficients of the Hu moments. Each internal node contains a decision rule, and each leaf node contains one facial expression class. Two leaf nodes may represent the same facial expression; for example, this node represents the dazing class and this other node represents the same class. The height of this tree is 8.
Experimental Results Testing sequences and classification results: laughing, dazing, laughing, vomiting. The left column shows four testing sequences; the right column indicates their corresponding classification results.
Feature type: R moments [Decision tree diagram with yes/no branches; leaf classes: vomiting, yawning, dazing, crying, laughing.] This decision tree is constructed from the correlation coefficients of the R moments. Each internal node contains a decision rule, and each leaf node contains one facial expression class. Two leaf nodes may represent the same facial expression; for example, this node represents the dazing class and this other node represents the same class. The height of this tree is 10.
Experimental Results Testing sequences and classification results: crying, yawning, dazing, dazing. The left column shows four testing sequences; the right column indicates their corresponding classification results.
Feature type: Zernike moments [Decision tree diagram with yes/no branches; leaf classes: crying, vomiting, laughing, dazing, yawning.] This decision tree is constructed from the correlation coefficients of the Zernike moments. Each internal node contains a decision rule, and each leaf node contains one facial expression class. Two leaf nodes may represent the same facial expression; for example, this node represents the dazing class and this other node represents the same class. The height of this tree is 7.
Experimental Results Testing sequences and classification results: crying, vomiting, crying, crying. The left column shows four testing sequences; the right column indicates their corresponding classification results.
Conclusions and Future Work Results:
Feature type      Training sequences  Number of nodes  Tree height  Testing sequences  Classification rate
Hu moments        59                  16 + 17          8            30                 90%
R moments         59                  15 + 17          10           30                 80%
Zernike moments   59                  19 + 20          7            30                 87%
Conclusions and Future Work This paper presents a vision-based infant facial expression recognition system comprising infant face detection, moment feature extraction, correlation coefficient calculation, and decision tree classification. Future work: collect more experimental data and fuzzify the decision tree. Binary classification trees may have limited noise tolerance: if the correlation coefficients are close to zero, noise will greatly affect the classification results.