Loris Bazzani*, Marco Cristani*†, Alessandro Perina*, Michela Farenzena*, Vittorio Murino*†
*Computer Science Department, University of Verona, Italy
†Istituto Italiano di Tecnologia (IIT), Genova, Italy
Multiple-shot Person Re-identification by HPE signature
This research is funded by the EU-Project FP7 SAMURAI, grant FP7-SEC No
Analysis of the problem (1)
- Person re-identification: recognizing an individual in diverse locations over different (non-)overlapping camera views
[Figure: example frames at T = 222, T = 145, T = 1 and T = 23; panels labeled "Different cameras" and "Same camera"]
Analysis of the problem (2)
We focus on the problem with non-overlapping cameras.
Problems in real scenarios:
- Very low resolution
- Severe occlusions
- Illumination variations
- Pedestrians with very similar clothes
- Pose and view-point changes
- No geometry of the environment
Solution:
- Histogram Plus Epitome (HPE) descriptor, and
- Multiple-shot approach
Outline
- Overview of the proposed method
- Pre-processing: Background Subtraction
- “Images selection” for Multiple-shot
- HPE descriptor
  - Global descriptor
  - Local descriptors
- HPEs’ Matching
- Results
- Conclusions
Overview of the proposed method
- Employing global and local appearance-based features
- Exploiting temporal consistency to make the descriptor robust
Background Subtraction
- We employ a novel generative model: STEL [Jojic et al. 2009]
- It captures the structure of an image class as a mixture of component segmentations
- It isolates meaningful parts that exhibit tight feature distributions
[Figure: learned mixture components]
“Images selection” for Multiple-shot
Objective: discard redundant information and images with occlusions
- Gaussian Mixture Model clustering [Figueiredo and Jain 2002] of HSV histograms
- Automatic model selection employing the Bayesian Information Criterion [Figueiredo and Jain 2002]
- Discard the clusters with a low number of instances
- Keep a random instance for each remaining cluster
[Figure: examples of discarded images]
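The last two selection steps can be sketched as follows. This is only an illustration, assuming the cluster labels were already produced by the GMM clustering step (not shown). The names `bic`, `select_instances` and the threshold `min_cluster_size` are hypothetical; the slides do not state the threshold actually used. The `bic` helper is the textbook form of the criterion, where the model with the lowest score is preferred.

```python
import math
import random
from collections import defaultdict


def bic(log_likelihood, n_params, n_samples):
    """Textbook Bayesian Information Criterion; lower values are better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def select_instances(labels, min_cluster_size=2, rng=None):
    """Return one randomly chosen image index per sufficiently large cluster.

    labels[i] is the cluster id of image i, assumed to come from the
    clustering of HSV histograms. min_cluster_size is a hypothetical
    threshold chosen for this sketch.
    """
    rng = rng or random.Random()
    clusters = defaultdict(list)
    for index, label in enumerate(labels):
        clusters[label].append(index)

    kept = []
    for label in sorted(clusters):
        members = clusters[label]
        if len(members) < min_cluster_size:
            continue  # small cluster: treated as an outlier and discarded
        kept.append(rng.choice(members))  # one random representative
    return kept
```

For example, `select_instances([0, 0, 0, 1, 2, 2])` drops the single image in cluster 1 and returns one index from cluster 0 and one from cluster 2.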
HPE descriptor: Global feature
- 36-dimensional HSV histogram (H = 16, S = 16, V = 4 bins)
- Average the histograms of the multiple instances
- Robust to illumination and pose variations, keeping only the predominant chromatic information
- Captures global chromatic information
[Figure annotation: "Caused by illumination changes"]
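A minimal sketch of the global feature, assuming the 36 dimensions come from concatenating the per-channel histograms (16 + 16 + 4 = 36) and that each image is given as a list of foreground RGB pixels with channels in [0, 1]. The function names are hypothetical, and the binning details are an assumption of this sketch rather than the authors' implementation.

```python
import colorsys

H_BINS, S_BINS, V_BINS = 16, 16, 4  # 16 + 16 + 4 = 36 bins in total


def hsv_histogram(pixels):
    """Normalized 36-bin HSV histogram of RGB pixels with channels in [0, 1]."""
    hist = [0.0] * (H_BINS + S_BINS + V_BINS)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        # min(...) keeps a channel value of exactly 1.0 inside the last bin.
        hist[min(int(h * H_BINS), H_BINS - 1)] += 1.0
        hist[H_BINS + min(int(s * S_BINS), S_BINS - 1)] += 1.0
        hist[H_BINS + S_BINS + min(int(v * V_BINS), V_BINS - 1)] += 1.0
    total = sum(hist)
    return [x / total for x in hist] if total else hist


def global_feature(images):
    """Average the histograms of the selected images of one person."""
    hists = [hsv_histogram(image) for image in images]
    return [sum(column) / len(hists) for column in zip(*hists)]
```

Averaging keeps the bins that are populated in most images and attenuates those that appear in only a few of them.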
HPE descriptor: Local feature (1)
- Epitome [Jojic et al. 2003]: generative model that analyzes the presence of recurrent, structured local patterns
[Figure: panels labeled "Generic Epitome" and "Local Epitome" (two panels)]
HPE descriptor: Local feature (2)
Generic Epitome:
- 36-dimensional HSV histogram of the epitome
Local Epitome:
- Keep the patches with a high probability that the epitome patch with upper-left corner (i, j) represents several ingredient patches
- Discard the patches with low entropy
- Extract a 36-dimensional HSV histogram of the surviving patches
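The patch filtering for the Local Epitome can be illustrated with the sketch below. It assumes each epitome patch comes with its probability and a histogram of its content, and it uses Shannon entropy of that histogram as the entropy measure. Both thresholds, the input layout, and the function names are hypothetical; the slides do not specify them.

```python
import math


def shannon_entropy(hist):
    """Shannon entropy (in bits) of a histogram of non-negative counts."""
    total = float(sum(hist))
    if total == 0.0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in hist if c > 0)


def surviving_patches(patches, min_prob, min_entropy):
    """Return the indices of the patches that pass both filters.

    patches is a list of (probability, histogram) pairs, one per epitome
    patch. min_prob and min_entropy are hypothetical thresholds.
    """
    return [
        index
        for index, (prob, hist) in enumerate(patches)
        if prob >= min_prob and shannon_entropy(hist) >= min_entropy
    ]
```

A flat, single-color patch has zero entropy and is dropped even if its probability is high, which matches the intent of discarding uninformative patches.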
HPEs’ Matching
- Re-identification: associating each element in the probe set B to the corresponding element in the gallery set A
- Minimize the following distance: [equation lost in extraction]
- The distance is built on the Bhattacharyya distance [remaining definitions lost in extraction]
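A minimal matching sketch, under explicit assumptions: the Bhattacharyya distance is taken in the form sqrt(1 - sum_i sqrt(p_i q_i)) for normalized histograms, each HPE signature is a tuple of histograms (global, generic epitome, local epitome), and the per-part distances are summed with equal weights. The equal weighting and all function names are hypothetical, since the original formula is not available here.

```python
import math


def bhattacharyya(p, q):
    """Bhattacharyya distance between two normalized histograms."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # max(...) guards against tiny negative values from rounding.
    return math.sqrt(max(0.0, 1.0 - coefficient))


def hpe_distance(sig_a, sig_b):
    """Sum of per-part distances; equal weights are an assumption."""
    return sum(bhattacharyya(x, y) for x, y in zip(sig_a, sig_b))


def match(probe, gallery):
    """Map each probe id to the gallery id with the smallest distance."""
    return {
        probe_id: min(gallery, key=lambda gid: hpe_distance(sig, gallery[gid]))
        for probe_id, sig in probe.items()
    }
```

Here `probe` and `gallery` are dictionaries from person id to signature, so `match(probe, gallery)` returns one gallery id per probe element.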
Results (1)
iLIDS dataset:
- Multiple images of 119 pedestrians, 128x64 pixels
- Comparison with the context-based method [Zheng et al. 2009]
- Cross-validation: SvsS (single-shot vs. single-shot) 10 trials, MvsS/MvsM (multiple-shot vs. single-shot / multiple-shot vs. multiple-shot) 100 trials
Results (2)
ETHZ dataset:
- Three datasets of 83, 35 and 28 pedestrians, 64x32 pixels
- Comparison with the Partial Least Squares (PLS) method [Schwartz and Davis 2009]
- Cross-validation: same settings as for iLIDS
Results (3)
How many images do we need to perform a “good” person re-identification?
- N = number of images for the multiple-shot approach
- N = 5 seems to be the best trade-off
Conclusions
- We proposed a novel descriptor for the person re-identification problem: the HPE descriptor
- The descriptor is robust to low resolution, occlusions, illumination variations, pedestrians with very similar clothes, and pose changes
- It is based on the accumulation of images to gain robustness
- The person re-identification problem is still far from being solved
- The results suggest that further improvements can be reached
References
[Jojic et al. 2009] N. Jojic, A. Perina, M. Cristani, V. Murino, and B. Frey, “Stel component analysis: Modeling spatial correlations in image class structure,” IEEE Conference on Computer Vision and Pattern Recognition, pp. 2044–2051,
[Figueiredo and Jain 2002] M. Figueiredo and A. Jain, “Unsupervised learning of finite mixture models,” IEEE Trans. PAMI, vol. 24, no. 3, pp. 381–396,
[Jojic et al. 2003] N. Jojic, B. J. Frey, and A. Kannan, “Epitomic analysis of appearance and shape,” in IEEE International Conference on Computer Vision. Washington, DC, USA: IEEE Computer Society, 2003, p. 34.
[Schwartz and Davis 2009] W. Schwartz and L. Davis, “Learning discriminative appearance-based models using partial least squares,” in XXII SIBGRAPI,
[Zheng et al. 2009] W. Zheng, S. Gong, and T. Xiang, “Associating groups of people,” in BMVC,