Published by Marshall Ford. Modified over 6 years ago.
1
An Image Database Retrieval Scheme Based Upon Multivariate Analysis and Data Mining
Presented by C.C. Chang, Dept. of Computer Science and Information Engineering, National Chung Cheng University
2
Outline
Introduction
Image Retrieval
The Proposed Scheme Based Upon PCA and Data Mining: Image Feature Extraction; Data Mining for Image Features
Illustration
Future Works
Conclusions
3
Introduction Image database Query image
Developing an efficient and effective image retrieval system that can reach the desired images deep inside a database has become an increasingly interesting and challenging research topic.
4
Introduction
Image retrieval system: text-based or content-based.
Text-based retrieval: query by keywords. Keywords: setting sun, mountain, ocean, purple, …
5
Introduction
Content-based image retrieval: images are indexed by features of their content, such as color, shape, and texture.
Feature extraction methods: histogram, neural network (NN), support vector machines (SVM), genetic algorithm (GA), principal component analysis (PCA), …
6
The proposed scheme based upon PCA and data mining
If a digital image can be transformed into a transaction database, then we can use its corresponding derived association rules as its main features to filter out all the undesired digital images for a query image.
7
Principal component analysis (PCA)
Given a set of points Y1, Y2, …, YM, where every Yi is characterized by a set of variables X1, X2, …, XN, we want to find a unit-length direction D = (d1, d2, …, dN) such that the variance of the points projected onto D is maximized.
8
Principal component analysis (PCA)
Algorithm of PCA
1. Start by coding the variables X = (X1, X2, …, XN) to have zero means and unit variances.
2. Calculate the covariance matrix C of the samples.
3. Find the eigenvalues λ1, λ2, …, λN of C, where λi ≥ λi+1, i = 1, 2, …, N-1.
4. Let D1, D2, …, DN denote the corresponding eigenvectors. D1 is the first principal component direction, D2 is the second principal component direction, …, and DN is the Nth principal component direction.
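The PCA steps can be sketched in plain Python as follows. This is a minimal illustration, not the presenters' implementation: it recovers only the first direction D1, it uses power iteration in place of a full eigen-solver, it divides by the number of samples when computing variances, and the sample data and function names are hypothetical.

```python
import math

def column_means(samples):
    m = len(samples)
    return [sum(row[j] for row in samples) / m for j in range(len(samples[0]))]

def standardize(samples):
    """Step 1: code every variable (column) to zero mean and unit variance."""
    m = len(samples)
    means = column_means(samples)
    n = len(means)
    # Population standard deviation (divide by m): an assumption of this sketch.
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in samples) / m)
            for j in range(n)]
    return [[(row[j] - means[j]) / stds[j] for j in range(n)] for row in samples]

def covariance_matrix(samples):
    """Step 2: covariance matrix C of the samples (rows = points)."""
    m = len(samples)
    means = column_means(samples)
    n = len(means)
    return [[sum((row[s] - means[s]) * (row[t] - means[t]) for row in samples) / m
             for t in range(n)] for s in range(n)]

def first_principal_direction(cov, iterations=200):
    """Steps 3-4 for the largest eigenvalue only, via power iteration."""
    n = len(cov)
    # Arbitrary non-symmetric start vector; it must not be orthogonal to D1.
    d = [1.0 / (j + 1) for j in range(n)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * d[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        d = [x / norm for x in w]
    cd = [sum(cov[i][j] * d[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(d[i] * cd[i] for i in range(n))  # Rayleigh quotient
    return eigenvalue, d

# Hypothetical data: 5 points with 2 strongly correlated variables.
samples = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8]]
C = covariance_matrix(standardize(samples))
lambda1, D1 = first_principal_direction(C)
```

With two standardized, positively correlated variables, C is the correlation matrix, so D1 points along the diagonal and lambda1 equals 1 plus the correlation coefficient.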
9
Principal component analysis (PCA)
Let A be an n × n covariance matrix. λ is an eigenvalue of A, and x is an eigenvector associated with λ, if Ax = λx = λIx, where I is the n × n identity matrix. The characteristic polynomial of the matrix A is det(A − λI); its roots are the eigenvalues of A.
10
Principal component analysis (PCA)
For example, let A be a 2 × 2 matrix.
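Since the matrix on this slide is not preserved, the following sketch uses a hypothetical symmetric 2 × 2 matrix A = [[2, 1], [1, 2]] to show how the eigenvalues fall out of the characteristic polynomial det(A − λI) = λ² − (a + d)λ + (ad − bc). The function name is an assumption for illustration.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Roots of det(A - lambda*I) for A = [[a, b], [c, d]], largest first."""
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues; not expected for a covariance matrix")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

# Hypothetical matrix, not the one from the slide.
lam1, lam2 = eigenvalues_2x2(2.0, 1.0, 1.0, 2.0)

# Check A x = lambda x for the eigenvector x = (1, 1) belonging to lam1.
x = (1.0, 1.0)
Ax = (2.0 * x[0] + 1.0 * x[1], 1.0 * x[0] + 2.0 * x[1])
```

Here the polynomial is λ² − 4λ + 3, so the eigenvalues are 3 and 1, and Ax equals 3x for x = (1, 1).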
11
PCA example: 40 samples with 2 variables, X1 and X2.
Covariance matrix and eigenvalues: λ1 = [value missing], λ2 = 36.780
12
Principal component analysis (PCA)
D1 = [ ] D2 = [ ]
13
Image Feature Extraction -PCA
Next, we illustrate how PCA can be used to extract features from images. Consider an example image M with 10 × 10 pixels, given as a matrix of gray-level values.
14
Image Feature Extraction -PCA
We partition the 10 × 10 image into 5 × 5 blocks of 4 pixels each (2 × 2 pixels per block), so the number of blocks NB is 25.
15
Image Feature Extraction -PCA
Let A be an NB × N matrix that collects the blocks of the image, one block per row; here NB = 25 and N = 4.
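A minimal sketch of how such a block matrix could be built, assuming 2 × 2 blocks scanned left to right and top to bottom, with the pixels inside each block also read row by row. The scan order and the synthetic image are assumptions; the slides do not preserve either.

```python
def image_to_block_matrix(image, block_size=2):
    """Return one row per block; each row lists the block's pixels row by row."""
    rows, cols = len(image), len(image[0])
    blocks = []
    for r in range(0, rows, block_size):
        for c in range(0, cols, block_size):
            blocks.append([image[r + dr][c + dc]
                           for dr in range(block_size)
                           for dc in range(block_size)])
    return blocks

# Hypothetical 10 x 10 gray-level image (not the image M from the slide).
image = [[(r * 10 + c) % 256 for c in range(10)] for r in range(10)]
A = image_to_block_matrix(image)
```

For a 10 × 10 image this yields 25 rows of 4 values, matching NB = 25 blocks with 4 pixels each.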
16
Image Feature Extraction -PCA
(1) Compute the covariance matrix of an image. Next, we construct a variance-covariance matrix (VCM), CM, for A. Each column of A can be regarded as a variable, so the number of variables is N. Let Ck denote the variable given by the kth column of A (C1 through C4 in the example). Given two variables Cs and Ct with means C̄s and C̄t, the covariance Cov(Cs, Ct) is computed from the products of their deviations from these means over all NB blocks. The variance is the special case Var(Ck) = Cov(Ck, Ck).
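The covariance computation can be sketched as below. The exact formula on the slide is not preserved, so the divisor is an assumption: this sketch uses the sample covariance (divide by NB − 1) and exposes it as the hypothetical parameter `ddof`; set `ddof=0` to divide by NB instead.

```python
def covariance(cs, ct, ddof=1):
    """Covariance of two equally long columns; ddof=1 divides by (n - 1)."""
    n = len(cs)
    mean_s = sum(cs) / n
    mean_t = sum(ct) / n
    return sum((x - mean_s) * (y - mean_t) for x, y in zip(cs, ct)) / (n - ddof)

def covariance_matrix(A, ddof=1):
    """Variance-covariance matrix of the columns of A (rows = blocks)."""
    columns = list(zip(*A))
    return [[covariance(cs, ct, ddof) for ct in columns] for cs in columns]

# Hypothetical 3 x 2 block matrix, small enough to check by hand.
A_small = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
CM = covariance_matrix(A_small)
```

The diagonal of the result holds the variances, since Var(Ck) = Cov(Ck, Ck), and the matrix is symmetric.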
17
Image Feature Extraction -PCA
(1) Compute the covariance matrix of an image. CM denotes the variance-covariance matrix of A.
18
Image Feature Extraction -PCA
(2) Determine eigenvalues and eigenvectors. The eigenvalues of CM are λ1 = 21860, λ2 = 1743, λ3 = [value missing], and λ4 = 393.73.
19
Image Feature Extraction -PCA
(2) Determine eigenvalues and eigenvectors. Each eigenvector corresponds to an eigenvalue; therefore, there are as many eigenvectors as eigenvalues. Each eigenvector can be seen as the direction of an axis.
20
Image Feature Extraction -PCA
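(3) Form the principal components (PCs). Each block of M is projected onto a principal component direction as a weighted sum of its four pixel values, e.g. 23.9 = 20 × … + … + … × 0.511.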
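The projection step is an inner product between each block and an eigenvector. The sketch below assumes raw pixel values are projected without subtracting the mean, which the surviving fragment of the slide suggests but does not confirm; the direction and block values are hypothetical.

```python
def project(blocks, direction):
    """Project every block (a row of pixel values) onto a direction vector."""
    return [sum(p * d for p, d in zip(block, direction)) for block in blocks]

# Hypothetical unit-length direction and blocks, not the values from the slide.
D1 = (0.5, 0.5, 0.5, 0.5)
blocks = [[20, 22, 24, 26], [100, 100, 100, 100]]
projected = project(blocks, D1)
```

Each block of four pixels is reduced to a single projected value, so a 5 × 5 grid of blocks becomes a 5 × 5 projected image.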
21
Image Feature Extraction -PCA
(4) Normalize the projected values
22
Image Feature Extraction -PCA
(4) Normalize the projected values
23
Principal component analysis (PCA)
PCA is a popular multivariate analysis technique that can be used to extract features from images and to filter candidate images from an image database. Nevertheless, the number of candidate images returned by PCA is usually very large for a huge image database. Therefore, a data mining technique is applied to speed up retrieval and increase the accuracy rate.
24
Data Mining – Association Rules
Items: I = {A, B, C, D}. Minimum support = 3. The candidate 1-itemsets are filtered to obtain the frequent 1-itemsets.
25
Data Mining – Association Rules
Items: I = {A, B, C, D}. Minimum support = 3. The candidate 2-itemsets are filtered to obtain the frequent 2-itemsets.
26
Data Mining – Association Rules
Minimum confidence = 100%. Association rules are derived from the frequent 2-itemsets.
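The transaction table from these slides is not preserved, so the following sketch runs the same three steps (frequent 1-itemsets, frequent 2-itemsets, rules) on a hypothetical transaction database over I = {A, B, C, D}, with minimum support 3 and minimum confidence 100%. The function names are illustrative only.

```python
from itertools import combinations

def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)

def frequent_pairs(transactions, items, min_support):
    """Frequent 1-itemsets, then frequent 2-itemsets built only from them."""
    frequent_1 = [i for i in items if support((i,), transactions) >= min_support]
    candidates_2 = list(combinations(frequent_1, 2))
    frequent_2 = [c for c in candidates_2 if support(c, transactions) >= min_support]
    return frequent_1, frequent_2

def rules_from_pairs(frequent_2, transactions, min_confidence):
    """Rules lhs -> rhs with confidence = support(lhs, rhs) / support(lhs)."""
    rules = []
    for x, y in frequent_2:
        pair_support = support((x, y), transactions)
        for lhs, rhs in ((x, y), (y, x)):
            confidence = pair_support / support((lhs,), transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, confidence))
    return rules

# Hypothetical transactions, not the table from the slide.
transactions = [{"A", "B", "C"}, {"A", "B", "D"}, {"A", "B", "C", "D"},
                {"B", "C"}, {"A", "B"}]
items = ["A", "B", "C", "D"]
frequent_1, frequent_2 = frequent_pairs(transactions, items, min_support=3)
rules = rules_from_pairs(frequent_2, transactions, min_confidence=1.0)
```

With this data, D is dropped at the first step (support 2), the pair {A, C} is dropped at the second (support 2), and the surviving rules are A → B and C → B.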
27
Data Mining for Image Features
28
Data Mining for Image Features
29
Data Mining for Image Features
Database for the Normalized Projected Image (NPIDB) in the Horizontal Direction
30
Data Mining for Image Features
Minimum Support = 3 Candidate 1-itemsets Candidate 2-itemsets Frequent 1-itemsets
31
Data Mining for Image Features
Minimum Confidence = 75% Frequent 2-itemsets
32
Data Mining for Image Features
Association Rules in Horizontal Direction
33
Data Mining for Image Features
Database for the Normalized Projected Image (NPIDB) in the Vertical Direction
34
Data Mining for Image Features
Association Rules in Vertical Direction
35
Data Mining for Image Features
Database for the Normalized Projected Image (NPIDB) in the Diagonal Direction
36
PCA and data mining
37
Illustration: 450 full-color images, 300 blocks for each image, 4 × 4 pixels for a block.
38
Illustration A query image Q
The set of eigenvalues of Q is {0, 2, 4, 6, 8}
39
Illustration: rules of Q. The file name is “SW003.JPG”.
40
Future works - VQ and PCA
Vector Quantization (VQ): An image is separated into a set of input vectors. Each input vector is matched with a codeword of the codebook.
41
Vector Quantization (VQ)
Definition of vector quantization (VQ): a mapping Q: Rk → Y, where Y is a finite subset of Rk. VQ is composed of the following three parts: the codebook generation process, the encoding process, and the decoding process.
42
Vector Quantization (VQ)
Image Index table
43
Vector Quantization (VQ)
Codebook generation: the training images are separated into vectors, which form the training set (vector indices up to N).
44
Vector Quantization (VQ)
Codebook generation, codebook initialization: an initial codebook (codeword indices up to 255) is formed for the training set (vector indices up to N).
45
Vector Quantization (VQ)
Training using an iterative algorithm: each vector of the training set is assigned to its closest codeword in codebook Ci, giving one index set per codeword, e.g. (1, 2, 5, 9, 45, …), (101, 179, 201, …), (8, 27, 38, 19, 200, …), (23, 0, 67, 198, 224, …). The mean of the vectors in each index set is computed and replaces the old codeword, yielding the new codebook Ci+1.
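One pass of this training loop can be sketched as follows. It is a minimal illustration assuming squared Euclidean distance and a tiny hypothetical training set; keeping an unused codeword unchanged is a choice of this sketch, not something stated on the slide.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest_index(vector, codebook):
    """Index of the codeword closest to the vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))

def train_once(training_set, codebook):
    """Build the index sets for codebook Ci and return them with Ci+1."""
    index_sets = [[] for _ in codebook]
    for i, vector in enumerate(training_set):
        index_sets[nearest_index(vector, codebook)].append(i)
    new_codebook = []
    for codeword, members in zip(codebook, index_sets):
        if not members:
            new_codebook.append(list(codeword))  # unused codeword: keep as is
            continue
        new_codebook.append([sum(training_set[i][k] for i in members) / len(members)
                             for k in range(len(codeword))])
    return index_sets, new_codebook

# Hypothetical 2-dimensional training set and 2-codeword initial codebook.
training_set = [[0, 0], [2, 0], [10, 10], [12, 10]]
codebook = [[1, 1], [9, 9]]
index_sets, new_codebook = train_once(training_set, codebook)
```

In practice `train_once` would be repeated until the codebook stops changing or the total distortion improves by less than a chosen threshold.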
46
Example codebook. To encode an input vector, for example v = (150, 145, 121, 130):
(1) Compute the distance between v and every vector in the codebook: d(v, cw1) = …, d(v, cw2) = …, d(v, cw3) = 112.3, d(v, cw4) = …, d(v, cw5) = …, d(v, cw6) = 235.1, d(v, cw7) = …, d(v, cw8) = 63.2.
(2) cw8 is the closest codeword, so we choose index 8 to replace the input vector v.
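The full-search encoding above can be sketched as below, assuming d is the Euclidean distance and using a hypothetical three-codeword codebook, since the codebook from the slide is not preserved. Indices are 1-based to match the cw1, cw2, … naming.

```python
import math

def encode_full_search(v, codebook):
    """Return the 1-based index of the codeword closest to v and its distance."""
    distances = [math.dist(v, cw) for cw in codebook]
    best = min(range(len(codebook)), key=lambda i: distances[i])
    return best + 1, distances[best]

# Hypothetical codebook, not the one from the slide.
codebook = [(100, 100, 100, 100), (150, 140, 120, 130), (200, 200, 200, 200)]
v = (150, 145, 121, 130)
index, distance = encode_full_search(v, codebook)
```

Every codeword is compared with v, so the cost grows linearly with the codebook size; this is the cost the PCA-based encoding tries to reduce.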
47
The Encoding algorithm using PCA
Codebook The covariance matrix
48
The Encoding algorithm using PCA
From the covariance matrix, we compute the eigenvectors and eigenvalues: D1 = (0.5038, …, …, …) with λ1 = 19552, D2 with λ2 = 151, D3 with λ3 = 86, and D4 = (0.7098, …, …, …) with λ4 = 6. D1 is used as the coordinate axis: it preserves 98.77% of the variance of the codewords, since λ1 / (λ1 + λ2 + λ3 + λ4) = 19552 / 19795 ≈ 0.9877.
49
The Encoding algorithm using PCA
The codewords are sorted by their projected values on D1 = (0.5038, …, …, …), giving the new sorted codebook together with the corresponding projected value of each codeword.
50
The Encoding algorithm using PCA
Encode an input vector v = (150, 145, 121, 130). Transform v to α = D1 · vT = (0.5038, …, …, …) · (150, 145, 121, 130)T. Find the sorted codeword whose projected value is closest to α, then compare v with that codeword and its neighbors in the sorted codebook: d(v, cw’5) = 63.2, d(v, cw’4) = 122.3, and d(v, cw’6) = 114.2. So, we choose cw’5 to replace v.
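A minimal sketch of this PCA-guided search: sort the codewords by their projection on D1, locate the projected value closest to α with a binary search, and compute full distances only for that codeword and its immediate neighbors. The direction, codebook, and `window` parameter are hypothetical; the search is approximate and can miss the true nearest codeword.

```python
import bisect
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def build_sorted_codebook(codebook, d1):
    """Sort codewords by their projected value on d1."""
    pairs = sorted((dot(cw, d1), cw) for cw in codebook)
    return [p for p, _ in pairs], [cw for _, cw in pairs]

def encode_with_pca(v, projected, sorted_codewords, d1, window=1):
    """Return the position (0-based, in sorted order) of the chosen codeword."""
    alpha = dot(v, d1)
    pos = bisect.bisect_left(projected, alpha)
    # The closest projected value sits just below or at the insertion point.
    nearby = [i for i in (pos - 1, pos) if 0 <= i < len(projected)]
    center = min(nearby, key=lambda i: abs(projected[i] - alpha))
    lo = max(0, center - window)
    hi = min(len(sorted_codewords), center + window + 1)
    return min(range(lo, hi), key=lambda i: math.dist(v, sorted_codewords[i]))

# Hypothetical unit-length direction and codebook, not the slide's values.
D1 = (0.5, 0.5, 0.5, 0.5)
codebook = [(200, 200, 200, 200), (100, 100, 100, 100),
            (150, 140, 120, 130), (50, 60, 55, 45)]
projected, sorted_codewords = build_sorted_codebook(codebook, D1)
v = (150, 145, 121, 130)
chosen = encode_with_pca(v, projected, sorted_codewords, D1)
```

Only three distances are computed per input vector when `window=1`, regardless of the codebook size, at the price of an occasional suboptimal match.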
51
VQ and PCA for image retrieval
Association Rules:
52
VQ and PCA for image retrieval
Association Rules:
53
Query image, projected image
54
Conclusions: We presented an efficient image retrieval scheme based upon a multivariate analysis technique and a data mining technique. PCA extracts the image features, and association rules match the candidate images. VQ and PCA can be combined for similar-image retrieval.