Published by Brooke Mosley. Modified over 9 years ago.
1
Character Identification in Feature-Length Films Using Global Face-Name Matching. IEEE Transactions on Multimedia, vol. 11, no. 7, November 2009. Yi-Fan Zhang, Student Member, IEEE, Changsheng Xu, Senior Member, IEEE, Hanqing Lu, Senior Member, IEEE, and Yeh-Min Huang, Member, IEEE
2
Outline Introduction Face Clustering Face-Name Association Applications Experiment Conclusions
4
Introduction
In a film, the interactions among the characters assemble them into a relationship network, so a film can be treated as a small society.
In the video, faces stand for characters, and the co-occurrence of faces in a scene represents an interaction between characters.
In the film script, the spoken lines of different characters appearing in the same scene also represent an interaction.
(Figure: structure of a film script: scene title; brief description of the environment and actions; speaker name; spoken line.)
5
Introduction
Key components: speaking face tracks; face affinity network; name affinity network; a graph matching method; an EMD-based measure of face track distance; leading characters and cliques.
To stay as consistent as possible with the name statistics in the script, we select the speaking face tracks to build the face affinity network, which is based on the co-occurrence of the speaking face tracks.
6
Outline Introduction Face Clustering Face-Name Association Applications Experiment Conclusions
7
Face Clustering
face detection (frame): detect faces on each frame of the video.
face track (the same person): store the face position, the scale, and the start and end frame numbers of the track.
scene segmentation (video):
1. Scene segmentation points are inserted at the boundaries between two shots that have a high degree of discontinuity.
2. To align with the scene partition in the film script, we adjust the discontinuity threshold until the video has the same number of scenes as the script.
(Figure: scene segmentation points separating scene 5, scene 6 and scene 7.)
speaking face track detection:
1. The mouth ROI is located.
2. SIFT
3. Normalized sum of absolute difference (NSAD)
4. If more than 10% of the frames in a face track are labeled as speaking, it is classified as a speaking face track.
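The two thresholding rules on this slide can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the per-boundary discontinuity scores and per-frame speaking labels are assumed to be computed elsewhere, and lowering the threshold until the scene count matches the script is treated as equivalent to keeping the strongest boundaries (ties are broken by shot order).

```python
def segment_scenes(discontinuity, num_scenes):
    """Pick scene cuts so the video has `num_scenes` scenes.

    discontinuity[i] is the (assumed precomputed) discontinuity score at
    the boundary between shot i and shot i + 1. Lowering the threshold
    until num_scenes - 1 boundaries pass it is the same as keeping the
    num_scenes - 1 highest-scoring boundaries.
    """
    if not 1 <= num_scenes <= len(discontinuity) + 1:
        raise ValueError("num_scenes is incompatible with the number of shots")
    ranked = sorted(range(len(discontinuity)), key=lambda i: -discontinuity[i])
    return sorted(ranked[:num_scenes - 1])


def is_speaking_track(frame_labels, min_fraction=0.10):
    """Rule from the slide: a track is a speaking face track if more than
    10% of its frames are labeled as speaking."""
    return sum(frame_labels) > min_fraction * len(frame_labels)


cuts = segment_scenes([0.1, 0.9, 0.2, 0.8, 0.3], num_scenes=3)
print(cuts)
```

Here `num_scenes` would come from counting scene titles in the script, so the video partition and the script partition can be aligned one to one.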
8
Face Clustering
face representation by locally linear embedding (LLE):
1. It is a dimensionality reduction technique.
2. It projects high-dimensional face features into an embedding space that still preserves the neighborhood relationships.
extract the dominant clusters:
1. We apply spectral clustering to all the faces in the LLE space.
2. The number of clusters K is set from prior knowledge derived from the film script.
(Figure: spectral clustering yields K dominant clusters.)
9
Face Clustering
earth mover’s distance (EMD): a metric for evaluating the dissimilarity between two distributions.
measure face track distance by EMD:
represent a face track as $P = \{(p_1, w_{p_1}), \ldots, (p_m, w_{p_m})\}$
$p_i$: the cluster center
$w_{p_i}$: the number of faces belonging to this cluster
(Figure: the faces of a face track P distributed over the dominant clusters cluster1 to cluster5.)
$\mathrm{EMD}(P, Q) = \dfrac{\sum_{i}\sum_{j} d_{ij} f_{ij}}{\sum_{i}\sum_{j} f_{ij}}$
$d_{ij}$: the ground distance between cluster centers $p_i$ and $q_j$
$f_{ij}$: the flow between $p_i$ and $q_j$
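To make the EMD between two face tracks concrete, here is a minimal sketch that treats each track as a histogram of face counts over the dominant clusters and solves the underlying transportation problem with a small min-cost-flow routine (successive shortest paths). The function name `emd`, the solver choice, and the toy ground distances are assumptions for illustration; the slides do not say how the EMD is computed.

```python
import math


def emd(weights_p, weights_q, ground):
    """Earth mover's distance between two face-track signatures.

    weights_p[i] / weights_q[j]: integer face counts per dominant cluster.
    ground[i][j]: ground distance between cluster centers i and j.
    Returns (minimal total cost) / (total flow).
    """
    n, m = len(weights_p), len(weights_q)
    src, sink, size = n + m, n + m + 1, n + m + 2
    graph = [[] for _ in range(size)]  # edge = [to, capacity, cost, rev_idx]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for i, w in enumerate(weights_p):
        add_edge(src, i, w, 0.0)
    for j, w in enumerate(weights_q):
        add_edge(n + j, sink, w, 0.0)
    for i in range(n):
        for j in range(m):
            add_edge(i, n + j, math.inf, ground[i][j])

    target = min(sum(weights_p), sum(weights_q))
    flow, cost = 0, 0.0
    while flow < target:
        # Bellman-Ford shortest path in the residual graph.
        dist = [math.inf] * size
        prev = [None] * size
        dist[src] = 0.0
        for _ in range(size - 1):
            changed = False
            for u in range(size):
                if dist[u] == math.inf:
                    continue
                for k, (v, cap, c, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + c < dist[v] - 1e-12:
                        dist[v], prev[v], changed = dist[u] + c, (u, k), True
            if not changed:
                break
        if prev[sink] is None:
            break
        # Find the bottleneck along the path, then push flow along it.
        push, v = target - flow, sink
        while v != src:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = sink
        while v != src:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        flow += push
        cost += push * dist[sink]
    return cost / flow if flow else 0.0


# Two tracks over two dominant clusters whose centers are 2.0 apart.
print(emd([3, 1], [1, 3], [[0.0, 2.0], [2.0, 0.0]]))
```

In the toy call, two of the four faces have to move between clusters at ground distance 2.0, so the average cost per unit of flow is 1.0.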
10
Face Clustering
constrained K-Means clustering:
1. K-Means clustering is performed to group the scattered face tracks.
2. Two face tracks that share common frames cannot be clustered together.
3. The target number of face track clusters is the same K that we set for spectral clustering on the faces.
4. We also ignore characters who have fewer than three spoken lines in the script.
5. To clean the noise from the clustering results, a pruning method is employed in the next step.
(Figure: speaking face track clusters cluster1 to cluster5, with noise face tracks.)
face track cluster pruning: we refine the clustering results by pruning the marginal points that have low confidence of belonging to their current cluster. The confidence of a face track F depends on: the EMD between F and its cluster center; the number of K-nearest neighbors of F; and the number of those K-nearest neighbors that belong to the same cluster as F.
For all the marginal points, we do a re-classification that incorporates speaker voice features for enhancement. It uses the likelihood of a cluster's voice model for X, where X is the feature vector of the corresponding audio clip.
1. To clean noise, we set a threshold.
2. The face track is classified into the cluster whose function score is maximal.
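Two ingredients of this slide are easy to illustrate: the cannot-link constraint (tracks sharing frames must be different people) and the fraction of a track's K-nearest neighbors that share its cluster. The sketch below is an assumption-laden illustration, not the authors' algorithm: the greedy assignment order, the function names, and the input layout (precomputed distances) are all hypothetical, and the exact confidence formula from the slide is not reproduced.

```python
def tracks_overlap(a, b):
    """True if two inclusive (start_frame, end_frame) ranges share a frame."""
    return a[0] <= b[1] and b[0] <= a[1]


def constrained_assign(distances, frame_ranges):
    """One assignment step of K-Means with cannot-link constraints.

    distances[i][c]: distance (e.g. EMD) from face track i to center c.
    frame_ranges[i]: (start_frame, end_frame) of face track i.
    Each track goes to its nearest center that holds no track overlapping
    it in time; None means every cluster was blocked.
    """
    members = [[] for _ in distances[0]]
    labels = []
    for i, row in enumerate(distances):
        chosen = None
        for c in sorted(range(len(row)), key=row.__getitem__):
            blocked = any(
                tracks_overlap(frame_ranges[i], frame_ranges[j])
                for j in members[c]
            )
            if not blocked:
                chosen = c
                break
        labels.append(chosen)
        if chosen is not None:
            members[chosen].append(i)
    return labels


def knn_same_cluster_fraction(pairwise, labels, i, k):
    """Fraction of the k nearest neighbors of track i sharing its label."""
    others = sorted((j for j in range(len(labels)) if j != i),
                    key=lambda j: pairwise[i][j])[:k]
    same = sum(1 for j in others if labels[j] == labels[i])
    return same / len(others) if others else 0.0


labels = constrained_assign(
    [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]],
    [(0, 50), (40, 90), (100, 150)],
)
print(labels)
```

In the toy call, the second track overlaps the first in frames 40 to 50, so it is pushed to its second-nearest cluster even though the first cluster is closer.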
11
Outline Introduction Face Clustering Face-Name Association Applications Experiment Conclusions
12
Face-Name Association
We use named entity recognition software to extract every name in front of the spoken lines and in the scene titles.
name occurrence matrix (m x n): m is the number of names, n is the number of scenes, and entry (i, k) is the name count of the i-th character in the k-th scene.
name affinity matrix: the affinity value between two names is represented by their co-occurrence.
face affinity matrix: based on the co-occurrence of the speaking face tracks.
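As a small illustration of the two matrices, the sketch below builds a name occurrence matrix from (scene, speaker) pairs and derives an affinity matrix from it. The co-occurrence formula used here (summing, over scenes, the smaller of the two name counts) is an assumption chosen for illustration, since the slide only says that affinity is represented by co-occurrence; the function names and the toy script are hypothetical as well.

```python
def name_occurrence_matrix(script_lines, names, num_scenes):
    """Rows are names, columns are scenes, entries are spoken-line counts.

    script_lines: iterable of (scene_index, speaker_name) pairs, assumed
    to have been extracted from the script beforehand.
    """
    index = {name: i for i, name in enumerate(names)}
    occurrence = [[0] * num_scenes for _ in names]
    for scene, speaker in script_lines:
        if speaker in index:
            occurrence[index[speaker]][scene] += 1
    return occurrence


def affinity_matrix(occurrence):
    """Assumed co-occurrence measure: sum over scenes of min(count_i, count_j)."""
    m = len(occurrence)
    return [
        [sum(min(a, b) for a, b in zip(occurrence[i], occurrence[j]))
         for j in range(m)]
        for i in range(m)
    ]


script = [(0, "ALICE"), (0, "BOB"), (0, "ALICE"),
          (1, "BOB"), (1, "CAROL"),
          (2, "ALICE"), (2, "CAROL")]
O = name_occurrence_matrix(script, ["ALICE", "BOB", "CAROL"], num_scenes=3)
print(O)
print(affinity_matrix(O))
```

The same `affinity_matrix` function could be applied to a face occurrence matrix whose rows are speaking face track clusters, which gives the two graphs that are matched in the next step.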
13
Face-Name Association
vertex matching between two graphs: the name affinity network and the face affinity network can each be represented as an undirected, weighted graph.
We use a spectral matching method to find the final name-face association.
14
Face-Name Association
Spectral matching method: it is commonly used for finding consistent correspondences between two sets of features.
(Figure: a graph with vertices A, B, C, D and a graph with vertices 1, 2, 3, 4.)
M(a,a): for an assignment a = (i, i’), it measures how well the feature i matches the feature i’. ex: M(A,3)=4, M(A,1)=1
M(a,b): for assignments a = (i, i’) and b = (j, j’), it measures how well the edge (i,j) matches the edge (i’,j’). ex: M((A,3),(B,1))=4, M((A,3),(B,4))=0
15
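Matrix M over the candidate assignments (A1 = vertex A matched to vertex 1, and so on):

     A1 A2 A3 A4 B1 B2 B3 B4 C1 C2 C3 C4 D1 D2 D3 D4
A1    2  0  0  0  0  1  4  3  0  3  1  2  0  2  3  4
A2    0  3  0  0  1  0  3  1  3  0  2  4  2  0  4  2
A3    0  0  4  0  4  3  0  1  1  2  0  4  3  4  0  2
A4    0  0  0  1  3  1  1  0  2  4  4  0  4  2  2  0
B1    0  2  4  3  4  0  0  0  0  3  3  4  0  4  2  3
B2    2  0  3  1  0  3  0  0  3  0  4  2  4  0  3  3
B3    4  3  0  1  0  0  2  0  3  4  0  2  2  3  0  3
B4    3  1  1  0  0  0  0  3  4  2  2  0  3  3  3  0
C1    0  3  1  2  0  3  3  4  3  0  0  0  0  3  1  2
C2    3  0  2  4  3  0  4  2  0  2  0  0  3  0  2  4
C3    1  2  0  4  3  4  0  2  0  0  1  0  1  2  0  4
C4    2  4  4  0  4  2  2  0  0  0  0  4  2  4  4  0
D1    0  3  3  4  0  4  2  2  0  3  1  2  3  0  0  0
D2    3  0  4  2  4  0  3  3  3  0  2  4  0  4  0  0
D3    3  4  0  2  2  3  0  3  1  2  0  4  0  0  3  0
D4    4  2  2  0  2  3  3  0  2  4  4  0  0  0  0  2

(Figure: the graph with vertices A, B, C, D and the graph with vertices 1, 2, 3, 4.)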
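A minimal sketch of how spectral matching is commonly carried out on such a matrix: take the principal eigenvector of M (here by power iteration) and then greedily accept the highest-scoring assignments that respect the one-to-one constraint. The function names and the small 4 x 4 toy matrix are assumptions for illustration; the slides do not show the solver, and the sketch assumes a symmetric, non-negative M.

```python
def principal_eigenvector(M, iterations=200):
    """Power iteration for the dominant eigenvector of a non-negative M."""
    n = len(M)
    x = [1.0] * n
    for _ in range(iterations):
        y = [sum(M[r][c] * x[c] for c in range(n)) for r in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        if norm == 0.0:
            return x
        x = [v / norm for v in y]
    return x


def spectral_match(M, candidates):
    """Greedy one-to-one discretization of the principal eigenvector.

    candidates[a] = (name, face) is the assignment behind row/column a of M.
    """
    x = principal_eigenvector(M)
    order = sorted(range(len(candidates)), key=lambda a: -x[a])
    used_names, used_faces, matching = set(), set(), {}
    for a in order:
        if x[a] <= 0.0:
            break
        name, face = candidates[a]
        if name in used_names or face in used_faces:
            continue  # conflicts with an assignment already accepted
        matching[name] = face
        used_names.add(name)
        used_faces.add(face)
    return matching


# Toy problem: two names, two face clusters. Assignments (A,1) and (B,2)
# support each other strongly (pairwise score 5), so they should win.
candidates = [("A", 1), ("A", 2), ("B", 1), ("B", 2)]
M = [
    [1, 0, 0, 5],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [5, 0, 0, 1],
]
print(spectral_match(M, candidates))
```

The greedy pass is what enforces that each name gets at most one face cluster and each face cluster at most one name; the eigenvector alone only ranks the candidate assignments.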
16
Outline Introduction Face Clustering Face-Name Association Applications Experiment Conclusions
17
Applications
19
Character-Centered Browsing
20
Outline Introduction Face Clustering Face-Name Association Applications Experiment Conclusions
21
Experiment (Tables: film information; speaking face track detection.)
22
Experiment The higher the value of the threshold, the more speaking face tracks will be pruned. (Figure: precision/recall curves of face track clustering.)
23
Experiment (Results: name-face association; relationship mining.)
24
Outline Introduction Face Clustering Face-Name Association Applications Experiment Conclusions
25
Conclusions A graph matching method has been used to build the name-face association between the name affinity network and the face affinity network. As an application, we have mined the relationships between characters and provided a platform for character-centered film browsing.