Published by Diane Holland. Modified over 8 years ago.
1
Eyes detection in compressed domain using classification
Eng. Alexandru POPA
alexandru_popa@autenticmedia.com
Technical University of Cluj-Napoca
Faculty of Electronics, Telecommunications and Information Technology
2
Object detection in digital images
The principle of image processing in the compressed domain
The Discrete Cosine Transform (DCT)
The spatial relationship of DCT coefficients between a block and its sub-blocks
Object recognition using classification
The linear discriminant classifier (LDA, Fisher classifier)
Demo
Results
Conclusions
3
The method consists of feature extraction using image transforms, construction of a new feature space, and classification of the objects in that space.
Feature extraction methods: DCT (Discrete Cosine Transform), wavelet, Gabor.
The DCT generally gives good features for object description. It is the basis of the JPEG standard, and the properties of blocks of DCT coefficients make them well suited to generating feature spaces.
The idea is to classify objects directly in the JPEG compressed domain.
4
Almost all image processing algorithms are defined at the pixel level; rewriting them for the compressed domain is not straightforward.
Standard implementation schemes decompress the image, apply the algorithm, and then recompress the image. The disadvantage is that these schemes are time-consuming.
The goal is to rewrite these algorithms directly in the compressed domain, to shorten the processing chain.
5
The formula for the DCT applied to an image: (1) (2)
Properties:
Decorrelation: the main advantage of the transform is that it removes the redundancy between neighbouring pixels. The result is uncorrelated coefficients, which can be coded independently.
Energy compactness: the ability of the transform to pack the input data into as few coefficients as possible.
Separability: the 2D DCT can be computed in two steps, by applying the 1D formula successively to the rows and to the columns of an image.
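The separability and energy compactness properties can be checked numerically. The sketch below assumes the orthonormal DCT-II normalisation (the exact form of equations (1) and (2) is not reproduced in this transcript) and reuses the 4x4 luminance block shown later in the deck; the function names are illustrative.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix: entry [k][i] weights sample i in coefficient k."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def dct_1d(vec):
    """1D DCT of a list of samples."""
    t = dct_matrix(len(vec))
    return [sum(t[k][i] * vec[i] for i in range(len(vec))) for k in range(len(vec))]

def dct_2d_separable(block):
    """2D DCT as two passes of the 1D DCT: first the rows, then the columns."""
    rows_done = [dct_1d(row) for row in block]
    cols_done = [dct_1d(list(col)) for col in zip(*rows_done)]
    return [list(row) for row in zip(*cols_done)]  # transpose back

def dct_2d_direct(block):
    """2D DCT as the full double sum, for comparison."""
    n = len(block)
    t = dct_matrix(n)
    return [[sum(t[u][x] * t[v][y] * block[x][y]
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]

PIXELS = [[106, 97, 83, 85],
          [106, 95, 84, 85],
          [105, 84, 74, 69],
          [77, 60, 57, 89]]

separable = dct_2d_separable(PIXELS)
direct = dct_2d_direct(PIXELS)
max_diff = max(abs(separable[u][v] - direct[u][v])
               for u in range(4) for v in range(4))

# Energy compactness: share of the total energy carried by the DC coefficient.
total_energy = sum(c * c for row in separable for c in row)
dc_share = separable[0][0] ** 2 / total_energy
print(f"max difference: {max_diff:.2e}, DC share of energy: {dc_share:.3f}")
```

For this block most of the energy sits in the single DC coefficient, which is what makes a small number of low-frequency coefficients usable as features.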
6
A new problem arises because different DCT block sizes are needed for best performance: 8x8 blocks are used in JPEG, 4x4 blocks in image indexing, and 16x16 macroblocks in MPEG.
To transfer DCT coefficients between blocks of different sizes, the existing approach has to decompress to pixel data in the spatial domain via the IDCT, redivide the pixels into new blocks of the required size, and then apply the DCT again to produce the new coefficients.
This approach is clearly inefficient.
Bibliography: Jianmin Jiang and Guocan Feng, "The Spatial Relationship of DCT Coefficients Between a Block and Its Sub-blocks".
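The existing approach described above (IDCT, redivide, DCT again) can be sketched for the smallest case, four 2x2 blocks merged into one 4x4 block. This is a minimal illustration assuming the orthonormal DCT-II; the helper names and the sample pixel block (taken from the worked example in this deck) are used only for demonstration.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos((2 * i + 1) * k * math.pi / (2 * n))
             for i in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct2(block):
    """Forward 2D DCT: T * block * T^T."""
    t = dct_matrix(len(block))
    return matmul(matmul(t, block), transpose(t))

def idct2(coeffs):
    """Inverse 2D DCT: T^T * coeffs * T (T is orthogonal)."""
    t = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(t), coeffs), t)

def merge_via_pixels(c11, c12, c21, c22):
    """Pixel-domain route: IDCT each sub-block, tile the pixels, DCT again."""
    p11, p12, p21, p22 = (idct2(c) for c in (c11, c12, c21, c22))
    top = [left + right for left, right in zip(p11, p12)]
    bottom = [left + right for left, right in zip(p21, p22)]
    return dct2(top + bottom)

PIXELS = [[106, 97, 83, 85],
          [106, 95, 84, 85],
          [105, 84, 74, 69],
          [77, 60, 57, 89]]

def quadrant(r, c):
    return [row[c:c + 2] for row in PIXELS[r:r + 2]]

# DCT coefficients of the four 2x2 sub-blocks, in the order C11, C12, C21, C22.
blocks = [dct2(quadrant(r, c)) for r in (0, 2) for c in (0, 2)]
merged = merge_via_pixels(*blocks)
reference = dct2(PIXELS)
max_diff = max(abs(merged[u][v] - reference[u][v])
               for u in range(4) for v in range(4))
print(f"max difference from the direct 4x4 DCT: {max_diff:.2e}")
```

The result is correct, but every merge pays for four inverse transforms and one forward transform; avoiding that round trip is the motivation for working with the coefficients directly.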
7
Transformation from four blocks of 2x2 pixels into one block of 4x4 pixels. Equation: (3)

Original image, the 4x4 block of pixel luminance values:
106   97   83   85
106   95   84   85
105   84   74   69
 77   60   57   89

The DCT coefficients of the four blocks of 2x2 pixels:
202   10  168    …
  1    …   84   85
163   19  144  -13
 26    2    …   19

The matrix A*:
 1        0        1        0
 0.9239   0.3827  -0.9239   0.3827
 0        1        0       -1
-0.3827   0.9239   0.3827   0.9239

The DCT coefficients of the 4x4 block:
339   23   22   -3
 34   13  -13    0
-12  -16    8   -5
  …   13   -4    4
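A numerical sketch of this merge step follows. It rests on two assumptions that are not stated on the slide: the DCT is the orthonormal DCT-II, and A* equals sqrt(2) * T4 * blockdiag(T2^T, T2^T), where T2 and T4 are the 2-point and 4-point DCT matrices. Under those assumptions the 4x4 coefficients are (1/2) * A* * C * A*^T, with C the 4x4 array holding the four 2x2 coefficient blocks; the exact form of equation (3) in the original may be normalised differently.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos((2 * i + 1) * k * math.pi / (2 * n))
             for i in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct2(block):
    t = dct_matrix(len(block))
    return matmul(matmul(t, block), transpose(t))

def build_a_star():
    """Assumed construction: A* = sqrt(2) * T4 * blockdiag(T2^T, T2^T)."""
    t2t = transpose(dct_matrix(2))
    blockdiag = [[0.0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            blockdiag[i][j] = t2t[i][j]
            blockdiag[i + 2][j + 2] = t2t[i][j]
    a = matmul(dct_matrix(4), blockdiag)
    return [[math.sqrt(2) * v for v in row] for row in a]

def merge_2x2_blocks(sub_coeffs, a_star):
    """4x4 DCT coefficients from four 2x2 DCT blocks, without touching pixels."""
    m = matmul(matmul(a_star, sub_coeffs), transpose(a_star))
    return [[0.5 * v for v in row] for row in m]

PIXELS = [[106, 97, 83, 85],
          [106, 95, 84, 85],
          [105, 84, 74, 69],
          [77, 60, 57, 89]]

# Lay the four 2x2 coefficient blocks out in one 4x4 array.
sub_coeffs = [[0.0] * 4 for _ in range(4)]
for r in (0, 2):
    for c in (0, 2):
        d = dct2([row[c:c + 2] for row in PIXELS[r:r + 2]])
        for i in range(2):
            for j in range(2):
                sub_coeffs[r + i][c + j] = d[i][j]

a_star = build_a_star()
merged = merge_2x2_blocks(sub_coeffs, a_star)
reference = dct2(PIXELS)
max_diff = max(abs(merged[u][v] - reference[u][v])
               for u in range(4) for v in range(4))
print([round(v, 4) for v in a_star[1]])
print(f"max difference from the direct 4x4 DCT: {max_diff:.2e}")
```

The second row of the constructed matrix rounds to the entries 0.9239, 0.3827, -0.9239, 0.3827 printed for A*, which supports the assumed construction. The code computes unrounded coefficients, so individual values can differ slightly from the rounded numbers on the slide.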
8
Transformation from a 4x4 block into four blocks of 2x2 pixels. Equation: (4)

Original image, the 4x4 block of pixel luminance values:
106   97   83   85
106   95   84   85
105   84   74   69
 77   60   57   89

The DCT coefficients of the 4x4 block:
339   23   22   -3
 34   13  -13    0
-12  -16    8   -5
  …   13   -4    4

The inverse matrix of A*:
0.5   0.4619   0     -0.1913
0     0.1913   0.5    0.4619
0.5  -0.4619   0      0.1913
0     0.1913  -0.5    0.4619

The DCT coefficients of the four blocks of 2x2 pixels:
202   10  168    …
  1    …   84   85
163   19  144  -13
 26    2    …   19
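The split in the other direction can be sketched the same way. The code below assumes the orthonormal DCT-II and writes A* in closed form with cos(pi/8) = 0.9239 and sin(pi/8) = 0.3827; identifying the printed entries with these two values is an assumption of this sketch. With that A*, the inverse is simply the transpose divided by 2, and the four 2x2 coefficient blocks are 2 * inv(A*) * C4 * inv(A*)^T, where C4 holds the 4x4 coefficients.

```python
import math

C, S = math.cos(math.pi / 8), math.sin(math.pi / 8)  # 0.9239 and 0.3827

A_STAR = [[1, 0, 1, 0],
          [C, S, -C, S],
          [0, 1, 0, -1],
          [-S, C, S, C]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

# A* has orthogonal rows of squared length 2, so inv(A*) = A*^T / 2.
A_STAR_INV = [[v / 2 for v in row] for row in transpose(A_STAR)]

def split_4x4(c4):
    """Four 2x2 DCT blocks (as one 4x4 array) from the 4x4 DCT coefficients."""
    m = matmul(matmul(A_STAR_INV, c4), transpose(A_STAR_INV))
    return [[2 * v for v in row] for row in m]

def dct_4x4(block):
    """Direct orthonormal 4x4 DCT, used here only to produce test input."""
    t = [[(0.5 if k == 0 else math.sqrt(0.5))
          * math.cos((2 * i + 1) * k * math.pi / 8)
          for i in range(4)] for k in range(4)]
    return matmul(matmul(t, block), transpose(t))

PIXELS = [[106, 97, 83, 85],
          [106, 95, 84, 85],
          [105, 84, 74, 69],
          [77, 60, 57, 89]]

identity_check = matmul(A_STAR, A_STAR_INV)
coeffs_2x2 = split_4x4(dct_4x4(PIXELS))
for row in coeffs_2x2:
    print([round(v, 1) for v in row])
```

The top-left 2x2 block of the output holds the coefficients of the top-left pixel quadrant, and likewise for the other three quadrants.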
9
Geometric classifiers are classifiers that derive decision boundaries in the feature space.
A classifier requires a set of training data (samples plus labels).
The training set must be large enough for the classifier to learn correctly and to generalise to unseen data.
Data classification: an unknown sample is presented to the classifier, its position relative to the decision boundaries is computed, and a label is assigned accordingly.
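For a single linear decision boundary, the classification step described above reduces to checking on which side of the boundary the sample falls. The sketch below is illustrative only: the weight vector, the offset, and the label names are hypothetical values, not parameters from the presentation.

```python
def classify(sample, w, w0, labels=("eye", "non-eye")):
    """Label a sample by its position relative to the boundary w.x + w0 = 0."""
    score = sum(wi * xi for wi, xi in zip(w, sample)) + w0
    return labels[0] if score >= 0 else labels[1]

# Hypothetical boundary in a two-dimensional feature space.
W = (1.0, 1.0)
W0 = -3.0

print(classify((2.5, 1.5), W, W0))  # score = 1.0, positive side
print(classify((0.5, 1.0), W, W0))  # score = -1.5, negative side
```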
10
LDA (Linear Discriminant Analysis) using Fisher's classifier finds a line in the feature space and projects the training data onto it; the data are then described by their projections.
Considering a two-dimensional space, we have:
Fisher's criteria for selecting the parameters w and w0:
The optimal direction w is the direction of the line for which: 1) the distance between the projections of the class centres onto w is maximal; 2) the variance of the projections within each class is minimal.
The optimal value w0 is the scalar that minimises the classification error on the training set.
The output is the label assigned to the i-th sample by the Fisher classifier.
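A minimal two-dimensional sketch of these two steps follows. It uses the standard closed-form solution of the Fisher criterion, w = Sw^-1 (m_a - m_b), where Sw is the within-class scatter matrix, and then scans candidate thresholds to pick the w0 with the fewest training errors. The training points are made up for illustration, and the function names are not from the presentation.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def mean(points):
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(2)]

def scatter(points, m):
    """2x2 scatter matrix: sum of outer products of deviations from the mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = (p[0] - m[0], p[1] - m[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b):
    """w = Sw^-1 (m_a - m_b): far-apart projected means, small projected spread."""
    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = (ma[0] - mb[0], ma[1] - mb[1])
    return [dot(inv[0], diff), dot(inv[1], diff)]

def best_offset(class_a, class_b, w):
    """w0 that minimises training errors for the rule: w.x + w0 >= 0 means class a."""
    proj = sorted([(dot(w, p), +1) for p in class_a]
                  + [(dot(w, p), -1) for p in class_b])
    candidates = [(proj[i][0] + proj[i + 1][0]) / 2 for i in range(len(proj) - 1)]

    def errors(threshold):
        return sum((1 if s >= threshold else -1) != label for s, label in proj)

    return -min(candidates, key=errors)

def predict(sample, w, w0):
    return +1 if dot(w, sample) + w0 >= 0 else -1

# Made-up training data: class a (+1) and class b (-1).
CLASS_A = [(4, 5), (5, 6), (6, 5), (5, 4)]
CLASS_B = [(1, 2), (2, 1), (2, 3), (3, 2)]

w = fisher_direction(CLASS_A, CLASS_B)
w0 = best_offset(CLASS_A, CLASS_B, w)
print("w =", w, "w0 =", w0)
print([predict(p, w, w0) for p in CLASS_A + CLASS_B])
```

In a compressed-domain setting the two coordinates would be replaced by a vector of DCT coefficients per block, and the 2x2 inverse by a general matrix inverse.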
11
12
13
The image from which the training set was taken
14
Implementing Fisher's classifier in the compressed domain proved to be a sound choice: it gives good results in detecting eye regions.
It is a novelty in the image processing field, because this algorithm had not previously been written for the compressed domain.
Using the spatial relationship of DCT coefficients between a block and its sub-blocks makes it faster and computationally cheaper to obtain the coefficients of large blocks from those of small blocks.
Other applications that can be derived:
gaze tracking/focusing
automatic systems for monitoring driver vigilance
biometric applications: person identification using iris recognition
Its structure matters a great deal, as does the training set.
15
Thank you for your attention! Questions?