Published by Mitchell Peters. Modified over 9 years ago.
1 Facial Expression Recognition using KCCA with Combining Correlation Kernels and Kansei Information
Yo Horikawa
Kagawa University, Japan
2 1. Purpose of this study
Apply kernel canonical correlation analysis (KCCA) with correlation kernels to facial expression recognition.
・ Combining multi-order correlation kernels
・ Use of Kansei information
3 2. Facial expression recognition
Facial images → Expressions
Six basic expressions: happiness, sadness, surprise, anger, disgust, fear
[Figure: sample face images labeled Happy, Sad, Surprised, Angry, Disgusted, Fearful, Neutral]
4 3. Kernel canonical correlation analysis (KCCA)
Pairs of feature vectors of sample objects: $(x_i, y_i)$ $(1 \le i \le n)$
Canonical variates $(u, v)$: projections of the (implicit) nonlinear functions $h(x_i)$ and $h'(y_i)$ with the maximum correlation between them.
$u = w_\phi \cdot h(x)$
$v = w_\theta \cdot h'(y)$
[Diagram relating $u$, $h(x)$, $h'(y)$ and $v$]
5 Canonical variates are calculated with the kernel functions.
$u = \sum_{i=1}^{n} f_i \, \phi(x_i, x)$
$v = \sum_{i=1}^{n} g_i \, \theta(y_i, y)$
Kernel functions: the inner products of the implicit functions of x and y
$\phi(x_i, x_j) = h(x_i) \cdot h(x_j)$
$\theta(y_i, y_j) = h'(y_i) \cdot h'(y_j)$
$f = {}^t(f_1, \dots, f_n)$ and $g = {}^t(g_1, \dots, g_n)$: the eigenvectors of the generalized eigenvalue problem
[Equation: generalized eigenvalue problem, not preserved in the transcript]
$\Phi = (\Phi_{ij})$, $\Phi_{ij} = \phi(x_i, x_j)$
$\Theta = (\Theta_{ij})$, $\Theta_{ij} = \theta(y_i, y_j)$ $(1 \le i, j \le n)$
$I$: the $n \times n$ identity matrix
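How $f$ and $g$ might be obtained in practice can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes a commonly used regularized KCCA formulation (maximize $f^t \Phi \Theta g$ subject to $f^t(\Phi^2 + \gamma I) f = g^t(\Theta^2 + \gamma I) g = 1$), and the function names `center_kernel` and `kcca` as well as the regularization constant `gamma` are hypothetical.

```python
import numpy as np

def center_kernel(K):
    """Center a kernel matrix in feature space: K <- H K H with H = I - (1/n) 11^t."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kcca(Phi, Theta, gamma=0.1, n_components=1):
    """Regularized KCCA on precomputed n x n kernel matrices (assumed formulation).

    Returns F (n x r), G (n x r) and the r largest canonical correlations.
    """
    n = Phi.shape[0]
    A = Phi @ Phi + gamma * np.eye(n)      # constraint matrix for f
    B = Theta @ Theta + gamma * np.eye(n)  # constraint matrix for g
    La = np.linalg.cholesky(A)             # A = La La^t
    Lb = np.linalg.cholesky(B)             # B = Lb Lb^t
    # Whitened cross term M = La^-1 (Phi Theta) Lb^-t
    M = np.linalg.solve(La, Phi @ Theta)
    M = np.linalg.solve(Lb, M.T).T
    U, s, Vt = np.linalg.svd(M)
    # Map the singular vectors back to the coefficient vectors f and g
    F = np.linalg.solve(La.T, U[:, :n_components])
    G = np.linalg.solve(Lb.T, Vt[:n_components].T)
    return F, G, s[:n_components]
```

The canonical variates of the training samples are then `Phi @ F` and `Theta @ G`; the singular values play the role of the canonical correlations under the assumed constraints.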
6 4. KCCA for classification problems and use of Kansei information
4.1 CCA for classification problems
Use an indicator vector (IV) as the second feature vector y.
$y = (y_1, \dots, y_{n_c})$ corresponding to x:
$y_c = 1$ if x belongs to class c
$y_c = 0$ otherwise
($n_c$: the number of classes)
In the recognition of the 6 basic facial expressions plus a neutral face (happiness, sadness, surprise, anger, disgust, fear, neutral), $n_c = 7$, and an image of the first class (happiness) has
$y = (1, 0, 0, 0, 0, 0, 0)$
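The indicator vector can be illustrated with a short sketch. The class order follows the list on the slide; the names `EXPRESSIONS` and `indicator_vector` are hypothetical.

```python
# Class order as listed on the slide (six basic expressions plus neutral).
EXPRESSIONS = ["happiness", "sadness", "surprise", "anger",
               "disgust", "fear", "neutral"]

def indicator_vector(label, classes=EXPRESSIONS):
    """Return y with y_c = 1 for the class of the sample and 0 otherwise."""
    if label not in classes:
        raise ValueError(f"unknown class: {label}")
    return [1 if c == label else 0 for c in classes]
```

For example, `indicator_vector("happiness")` gives the vector shown on the slide.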
7 Canonical variates $u_r$ $(1 \le r \le n_c - 1)$ for a new object (x, ?) are calculated by
$u_r = \sum_{i=1}^{n} f_{ri} \, \phi(x_i, x)$ $(1 \le r \le n_c - 1)$
Standard classification methods are applied in the canonical variate space $(u_1, \dots, u_{n_c - 1})$.
[Diagram: image x and IV y = (1, 0, 0, 0, 0, 0, 0) → KCCA → canonical variates $(u_1, \dots, u_6)$]
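Projecting a new object and classifying it in the canonical variate space might look like the sketch below. It assumes the coefficients are stored as an array `F` with `F[i, r]` $= f_{ri}$ and uses a nearest neighbor rule as one example of a standard classifier; the function names are hypothetical.

```python
import numpy as np

def project(F, k_new):
    """Canonical variates of a new object.

    F: (n, r) coefficient matrix; k_new: (n,) kernel values phi(x_i, x).
    Returns u with u[r] = sum_i F[i, r] * k_new[i].
    """
    return F.T @ k_new

def nearest_neighbor(U_train, labels, u_new):
    """Label of the training sample closest to u_new (Euclidean distance)."""
    distances = np.linalg.norm(U_train - u_new, axis=1)
    return labels[int(np.argmin(distances))]
```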
8 4.2 Kansei information and its use in KCCA
Semantic ratings on the expressions of facial images by humans
Ratings on the six basic expressions (happiness, sadness, surprise, anger, disgust, fear), e.g., (5, 2, 3, 1, 0, 1), form a semantic rating vector (SRV).
[Diagram: image x and SRV y = (4.39, 1.35, 2.29, 1.16, 1.23, 1.26) → KCCA → canonical variates $(u_1, \dots, u_5)$ → classifiers]
9 5. Correlation kernel
Correlation kernel: an inner product of the autocorrelation functions of the feature vectors.
$r_{x_i}(t_1) = \int x_i(t)\, x_i(t + t_1)\, dt$
$r_{x_j}(t_1) = \int x_j(t)\, x_j(t + t_1)\, dt$
$\phi(x_i, x_j) = \int r_{x_i}(t_1)\, r_{x_j}(t_1)\, dt_1$
[Figure: example signals $x_i(t)$ and $x_j(t)$]
10 The kth-order autocorrelation of data $x_i(t)$:
$r_{x_i}(t_1, t_2, \dots, t_{k-1}) = \int x_i(t)\, x_i(t + t_1) \cdots x_i(t + t_{k-1})\, dt$
The inner product between $r_{x_i}$ and $r_{x_j}$ is calculated with the kth power of the 2nd-order cross-correlation function:
$r_{x_i} \cdot r_{x_j} = \int \{cc_{x_i, x_j}(t_1)\}^k\, dt_1$
$cc_{x_i, x_j}(t_1) = \int x_i(t)\, x_j(t + t_1)\, dt$
The calculation of explicit values of the autocorrelations is avoided.
→ Higher-order autocorrelations are tractable at practical computational cost.
Linear correlation kernel: $\phi(x_i(t), x_j(t)) = r_{x_i} \cdot r_{x_j}$
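The identity between the inner product of kth-order autocorrelations and the kth power of the cross-correlation can be checked numerically. The sketch below is a discrete analogue under the assumption of circular (periodic) shifts, for which the identity holds exactly; the function names are hypothetical.

```python
import numpy as np
from itertools import product

def autocorr(x, k):
    """Explicit k-th order autocorrelation r(t_1, ..., t_{k-1}), circular shifts."""
    n = len(x)
    r = np.zeros((n,) * (k - 1))
    for lags in product(range(n), repeat=k - 1):
        term = x.copy()
        for t in lags:
            term = term * np.roll(x, -t)  # np.roll(x, -t)[s] == x[(s + t) % n]
        r[lags] = term.sum()
    return r

def cross_corr(xi, xj):
    """2nd-order cross-correlation cc(t_1) = sum_t xi(t) xj(t + t_1), circular."""
    n = len(xi)
    return np.array([np.sum(xi * np.roll(xj, -t)) for t in range(n)])

def correlation_kernel(xi, xj, k):
    """Inner product of k-th order autocorrelations via the k-th power of cc."""
    return np.sum(cross_corr(xi, xj) ** k)
```

The explicit route needs an array with $n^{k-1}$ entries per signal, while the kernel route needs only the $n$ cross-correlation values, which is the saving the slide refers to.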
11 Calculation of correlation kernels $r_{x_i} \cdot r_{x_j}$ for 2-dimensional image data: $x(l, m)$ $(1 \le l \le L,\ 1 \le m \le M)$
・ Calculate the cross-correlations between $x_i(l, m)$ and $x_j(l, m)$:
$cc_{x_i, x_j}(l_1, m_1) = \sum_{l=1}^{L - l_1} \sum_{m=1}^{M - m_1} x_i(l, m)\, x_j(l + l_1, m + m_1) / (LM)$ $(0 \le l_1 \le L_1 - 1,\ 0 \le m_1 \le M_1 - 1)$
・ Sum up the kth power of the cross-correlations:
$r_{x_i} \cdot r_{x_j} = \sum_{l_1 = 0}^{L_1 - 1} \sum_{m_1 = 0}^{M_1 - 1} \{cc_{x_i, x_j}(l_1, m_1)\}^k / (L_1 M_1)$
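The two steps above can be sketched directly with NumPy. The lags run over $0, \dots, L_1 - 1$ and $0, \dots, M_1 - 1$ as in the summation on the slide; the function names are hypothetical and the double loop favors readability over speed.

```python
import numpy as np

def cross_correlation_2d(xi, xj, L1, M1):
    """cc(l1, m1) = sum_{l,m} xi(l, m) xj(l + l1, m + m1) / (L M) for non-negative lags."""
    L, M = xi.shape
    cc = np.zeros((L1, M1))
    for l1 in range(L1):
        for m1 in range(M1):
            # Overlapping region of xi and the shifted xj
            cc[l1, m1] = np.sum(xi[:L - l1, :M - m1] * xj[l1:, m1:]) / (L * M)
    return cc

def correlation_kernel_2d(xi, xj, k, L1, M1):
    """k-th order correlation kernel: mean of the k-th powers of cc over all lags."""
    cc = cross_correlation_2d(xi, xj, L1, M1)
    return np.sum(cc ** k) / (L1 * M1)
```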
12 Modified correlation kernels
Higher-order and odd-order correlation kernels perform less well.
Correlation kernel (C): $r_{x_i} \cdot r_{x_j} = \sum_{l_1, m_1} \{cc_{x_i, x_j}(l_1, m_1)\}^k$ (14)
→ Modified correlation kernels with the kth root and absolute values
$L_p$ norm kernel (P): $r_{x_i} \cdot r_{x_j} = \mathrm{sgn}\bigl(\sum_{l_1, m_1} \{cc_{x_i, x_j}(l_1, m_1)\}^k\bigr)\, \bigl|\sum_{l_1, m_1} \{cc_{x_i, x_j}(l_1, m_1)\}^k\bigr|^{1/k}$ (15)
Absolute correlation kernel (A): $r_{x_i} \cdot r_{x_j} = \sum_{l_1, m_1} |cc_{x_i, x_j}(l_1, m_1)|^k$ (16)
Absolute $L_p$ norm kernel (AP): $r_{x_i} \cdot r_{x_j} = \bigl|\sum_{l_1, m_1} \{cc_{x_i, x_j}(l_1, m_1)\}^k\bigr|^{1/k}$ (17)
Absolute $L_p$ norm absolute kernel (APA): $r_{x_i} \cdot r_{x_j} = \bigl|\sum_{l_1, m_1} |cc_{x_i, x_j}(l_1, m_1)|^k\bigr|^{1/k}$ (18)
Max norm kernel (Max): $r_{x_i} \cdot r_{x_j} = \max_{l_1, m_1} cc_{x_i, x_j}(l_1, m_1)$ (19)
Max norm absolute kernel (MaxA): $r_{x_i} \cdot r_{x_j} = \max_{l_1, m_1} |cc_{x_i, x_j}(l_1, m_1)|$ (20)
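Given an array of cross-correlation values, the variants (14) to (20) might be computed as in the sketch below. One assumption is made for the $L_p$ norm kernel (P): the sign is taken of the summed kth powers, since the lag indices are only defined inside the sum. The function name `modified_kernels` is hypothetical.

```python
import numpy as np

def modified_kernels(cc, k):
    """Kernel values (14)-(20) from cross-correlations cc over all lags (l1, m1)."""
    s = np.sum(cc ** k)               # signed sum of k-th powers
    s_abs = np.sum(np.abs(cc) ** k)   # sum of k-th powers of absolute values
    return {
        "C": s,                                      # (14)
        "P": np.sign(s) * np.abs(s) ** (1.0 / k),    # (15), sign of the sum (assumption)
        "A": s_abs,                                  # (16)
        "AP": np.abs(s) ** (1.0 / k),                # (17)
        "APA": s_abs ** (1.0 / k),                   # (18)
        "Max": np.max(cc),                           # (19)
        "MaxA": np.max(np.abs(cc)),                  # (20)
    }
```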
13 6. Combining correlation kernels
Multiple classifiers may give higher performance than a single classifier.
Use the Cartesian product of the canonical variate spaces obtained with a set of kernel functions, e.g.,
$U = (u_1, \dots, u_{n_c - 1})$, $U' = (u'_1, \dots, u'_{n_c - 1})$, $U'' = (u''_1, \dots, u''_{n_c - 1})$
→ $U \otimes U' \otimes U'' = (u_1, \dots, u_{n_c - 1}, u'_1, \dots, u'_{n_c - 1}, u''_1, \dots, u''_{n_c - 1})$
[Diagram: Classifier 1 → '7', Classifier 2 → '9', Classifier 3 → '7', Classifier 4 → '7', ..., Classifier M → '1'; final decision for the object: '7']
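Both ideas on this slide can be sketched briefly: concatenating the canonical variates obtained with several kernels, and a majority vote over classifier outputs as in the diagram. The function names are hypothetical, and the voting rule is one simple reading of the diagram.

```python
from collections import Counter
import numpy as np

def combine_variates(*variate_sets):
    """Concatenate canonical variates from several kernels along the last axis.

    Works for single vectors (r,) or for stacked samples (n, r).
    """
    return np.concatenate(variate_sets, axis=-1)

def majority_vote(decisions):
    """Final decision: the label output by the largest number of classifiers."""
    return Counter(decisions).most_common(1)[0][0]
```

With three kernels and $n_c = 7$, the combined space has 18 dimensions, and a single classifier such as the nearest neighbor rule can be applied to it.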
14 7. Facial expression recognition experiment
Object: JAFFE (Japanese Female Facial Expression) database
・ 213 facial images of 10 Japanese females
・ 3 or 4 examples of each of the 6 basic facial expressions (happiness, sadness, surprise, anger, disgust, fear) and a neutral face
・ 8-bit grayscale images of 256×256 pixels
Figure 1. Sample images in the JAFFE database. From left to right: happiness, sadness, surprise, anger, disgust, fear, neutral.
15 All 213 images in the JAFFE database
16 Center regions of 200×200 pixels are taken.
They are resized to 20×20 pixels by averaging over 10×10-pixel blocks.
Preprocessing: linear normalization to mean 0 and SD 1.0.
Figure 2. The images of Fig. 1 as 20×20 real-valued matrix data, used as the first feature x in KCCA.
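The preprocessing steps can be sketched as follows, assuming that the normalization is applied to each image separately (the slide does not say whether it is per image or over the whole set). The function name `preprocess` is hypothetical.

```python
import numpy as np

def preprocess(image, crop=200, block=10):
    """Crop the center region, block-average, and normalize to mean 0 and SD 1."""
    h, w = image.shape
    top, left = (h - crop) // 2, (w - crop) // 2
    center = image[top:top + crop, left:left + crop].astype(float)
    # Average over non-overlapping block x block patches: 200x200 -> 20x20
    size = crop // block
    small = center.reshape(size, block, size, block).mean(axis=(1, 3))
    # Linear normalization (assumed per image)
    return (small - small.mean()) / small.std()
```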
17 Semantic rating vectors (SRVs) in the JAFFE database
Averages of semantic ratings on the 6 expressions (happiness, sadness, surprise, anger, disgust, fear) on 5-point scales, obtained from 60 Japanese females.
Image      SRV (HAP, SAD, SUR, ANG, DIS, FEA)
Happy      (4.39, 1.35, 2.29, 1.16, 1.23, 1.26)
Happy      (4.77, 1.29, 2.45, 1.26, 1.23, 1.23)
Sad        (1.39, 3.97, 1.68, 2.19, 3.68, 3.61)
Surprised  (2.87, 1.55, 4.68, 1.52, 1.52, 1.65)
Angry      (1.55, 1.90, 2.10, 4.32, 3.90, 1.81)
Neutral    (3.03, 2.16, 2.06, 1.94, 1.84, 1.87)
18 Experiment (Ⅰ)
Facial expression recognition with 2 images for each expression per person
Sample set: 2 images × 7 expressions × 10 persons = 140 images
Test set: the remaining 73 images
Experiment (Ⅱ)
Facial expression recognition with a leave-one-person-out method
Sample set: images of 9 persons (about 190 images)
Test set: images of the remaining 1 person (about 20 images)
Results are averaged over the 10 tests.
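The leave-one-person-out protocol of Experiment (Ⅱ) can be sketched as a generator of index splits. The function name and the use of a per-image list of person identifiers are assumptions for illustration.

```python
def leave_one_person_out(person_ids):
    """Yield (held_out_person, train_indices, test_indices), one split per person."""
    for held_out in sorted(set(person_ids)):
        train = [i for i, p in enumerate(person_ids) if p != held_out]
        test = [i for i, p in enumerate(person_ids) if p == held_out]
        yield held_out, train, test
```

With 10 persons this produces 10 splits, and the correct classification rate would be averaged over them.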
19 The kernel function φ and the second feature vector y are denoted by the pair (φ, y), using the following symbols.
For the kernel function φ of image data:
・ kth-order correlation kernel: Ck
・ kth-order $L_p$ norm kernel: Pk
・ kth-order absolute correlation kernel: Ak
・ Max norm kernel: Max
・ etc. (44 kinds in total)
For the second feature vector y:
・ Indicator vector: IV
・ Semantic rating vector: SRV
E.g., (C2, SRV) denotes the 2nd-order correlation kernel with the semantic rating vectors.
Classifier in the canonical space: nearest neighbor method
20 8. Results of the experiment
Experiment (Ⅰ): correct classification rates (CCRs) with single classifiers
Figure 3. CCRs with single kernel functions of 44 kinds in Experiment (Ⅰ). IV: indicator vector, SRV: semantic rating vector.
The highest CCR (rightmost in the figure) is obtained with indicator vectors (IV), not with semantic rating vectors (SRV).
Highest CCR (SRV): 89%
Highest CCR (IV): 94.5%
21 Table 1(Ⅰ). Highest CCRs with single kernel functions and with combinations of two and three kernel functions in Experiment (Ⅰ).
Combining correlation kernels increases CCRs.
The highest CCR (97.3%) is superior to that of past studies (94.6%).
It is obtained by using not only indicator vectors (IV) but also semantic rating vectors (SRV).
22 Experiment (Ⅱ)
Figure 4. Highest CCRs with kernel functions of 44 kinds in the 10 tests of Experiment (Ⅱ): (a) single kernel, (b) combinations of two kernels, (c) combinations of three kernels. IV: indicator vector, SRV: semantic rating vector.
The highest average CCR (rightmost in Fig. 4(c)) is again obtained by combining 3 kernels, including ones with semantic rating vectors (SRV).
23 Table 1(Ⅱ). Highest CCRs with single kernel functions and with combinations of two and three kernel functions in Experiment (Ⅱ).
The highest CCR (67.0%) is again obtained by combining 3 kernels that use not only indicator vectors (IV) but also semantic rating vectors (SRV).
24 9. Conclusion
KCCA with multiple correlation kernels and Kansei information was applied to facial expression recognition in experiments with the JAFFE database.
High correct classification rates (CCRs), comparable to those of past studies, were obtained with correlation kernel CCA without any feature extraction.
Combining multiple correlation kernels and using Kansei information in the form of semantic rating vectors contributed to increasing CCRs.