Pattern Recognition & Detection: Texture Classifier
ECE 4226
Presented by: Denis Petrusenko
December 10, 2007
Overview
Introduction – Problem description – Feature extraction – Textures used – Models considered – Overview of work being presented
Implementation Details
Experimentation
Summary
Conclusion
Demo
References
Overview: Problem Description
Texture Recognition – Texture: "The characteristic appearance of a surface having a tactile quality"
Problem – Given: texture samples for different classes – Objective: recognize which class a never-before-seen texture sample belongs to
Overview: Feature Extraction
Histogram H(k), k = 0…L-1 [1] – L gray intensity levels vs. number of occurrences – Represented by an L-dimensional feature vector – Several approaches for summarizing H(k)
Intensity Moments – First, calculate the mean from the normalized histogram $p(k) = H(k)/N$: $m = \sum_{k=0}^{L-1} k\,p(k)$ – Then, the central moments $\mu_n = \sum_{k=0}^{L-1} (k - m)^n\,p(k)$ » $\mu_2$: Variance, $\mu_3$: Skewness, $\mu_4$: Kurtosis
Entropy: $e = -\sum_{k=0}^{L-1} p(k)\,\log_2 p(k)$
Uniformity: $U = \sum_{k=0}^{L-1} p(k)^2$
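To make the histogram summaries concrete, here is a minimal C# sketch of the computation, assuming the pixel gray levels are already quantized to 0…L-1 with L = 16; the class and method names are illustrative, not taken from the actual project code.

```csharp
using System;

static class HistogramFeatures
{
    // Computes the five histogram features used in the presentation:
    // entropy, uniformity, and the central moments mu2 (variance),
    // mu3 (skewness), mu4 (kurtosis). 'pixels' holds levels 0..L-1.
    public static double[] Compute(int[] pixels, int L = 16)
    {
        // Normalized histogram p(k) = H(k) / N
        var p = new double[L];
        foreach (int g in pixels) p[g]++;
        for (int k = 0; k < L; k++) p[k] /= pixels.Length;

        // Mean m = sum k * p(k)
        double m = 0;
        for (int k = 0; k < L; k++) m += k * p[k];

        // Central moments, entropy and uniformity in one pass
        double mu2 = 0, mu3 = 0, mu4 = 0, entropy = 0, uniformity = 0;
        for (int k = 0; k < L; k++)
        {
            double d = k - m;
            mu2 += d * d * p[k];
            mu3 += d * d * d * p[k];
            mu4 += d * d * d * d * p[k];
            uniformity += p[k] * p[k];
            if (p[k] > 0) entropy -= p[k] * Math.Log(p[k], 2);
        }
        return new[] { entropy, uniformity, mu2, mu3, mu4 };
    }
}
```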
Feature Extraction: Gray Level Co-Occurrence Matrix
GLCM [1] – Relationship R: pairs of pixels at a fixed offset, usually a shift of 1 – Comes out to L² features
Again, several ways to summarize – Maximum – Square Sum – Intensity Moments
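A co-occurrence matrix for a horizontal shift of 1 can be built and summarized along these lines. The sketch below assumes L = 16 and shows only the Maximum and Square Sum summaries; it is illustrative rather than the presentation's actual implementation.

```csharp
using System;

static class GlcmFeatures
{
    // Builds a normalized L x L gray-level co-occurrence matrix for a
    // horizontal offset of 1, then summarizes it with Cmax (largest entry)
    // and Csq (sum of squared entries). 'img' holds quantized levels 0..L-1.
    public static double[] Compute(int[,] img, int L = 16)
    {
        int rows = img.GetLength(0), cols = img.GetLength(1);
        var C = new double[L, L];
        int pairs = 0;

        // Count co-occurrences of (pixel, its right neighbor)
        for (int r = 0; r < rows; r++)
            for (int c = 0; c + 1 < cols; c++)
            {
                C[img[r, c], img[r, c + 1]]++;
                pairs++;
            }

        double cmax = 0, csq = 0;
        for (int i = 0; i < L; i++)
            for (int j = 0; j < L; j++)
            {
                double v = C[i, j] / pairs;   // normalize counts to probabilities
                cmax = Math.Max(cmax, v);
                csq += v * v;
            }
        return new[] { cmax, csq };
    }
}
```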
Feature Extraction: Actual Features Used
Histogram features – Entropy – Uniformity – Variance – Skewness – Kurtosis
GLCM features – Cmax – Csq – Qp-2 – Qp+2
For all calculations, L = 16
Introduction: Textures Used
Picked 8 textures from the Brodatz set [10]
Cut out a 200x200 chunk of each for speed
Samples were taken with a sliding window – Window size 60x60 pixels, kept close to the example size for consistency – Arbitrary overlap of 15 pixels
3 to 4 random samples per class for testing – Taken before resizing to allow for fresh data
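The sliding-window sampling can be expressed as a pair of nested loops. The sketch below assumes the 60x60 window and 15-pixel overlap quoted above, deriving the stride as window size minus overlap; the helper name is hypothetical.

```csharp
using System.Collections.Generic;

static class Sampler
{
    // Slides a windowSize x windowSize window across the image with the
    // given overlap, yielding the top-left corner of each sample chunk.
    public static IEnumerable<(int Row, int Col)> Windows(
        int imageRows, int imageCols, int windowSize = 60, int overlap = 15)
    {
        int step = windowSize - overlap;          // 45-pixel stride here
        for (int r = 0; r + windowSize <= imageRows; r += step)
            for (int c = 0; c + windowSize <= imageCols; c += step)
                yield return (r, c);
    }
}
```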
[Figure: the eight texture samples]
Introduction: Models Considered
Parzen Windows Classifier (PWC) [2] – Basically, find the least error between the training data and an arbitrary example – Uses the class label from the closest training data – Training is just computing priors; all the "hard work" happens during classification
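One plausible shape for the PWC decision rule, using a Gaussian kernel over normalized feature vectors; the default spread, data layout, and class name here are assumptions for illustration, not the project's actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ParzenClassifier
{
    // Sums a Gaussian kernel over every training vector of each class,
    // weights the class-conditional density by the class prior, and picks
    // the class with the largest response.
    public static int Classify(
        double[] sample,
        Dictionary<int, List<double[]>> trainingByClass,
        double spread = 0.5)
    {
        int total = trainingByClass.Sum(kv => kv.Value.Count);
        int bestClass = -1;
        double bestScore = double.NegativeInfinity;

        foreach (var kv in trainingByClass)
        {
            double prior = (double)kv.Value.Count / total;
            double kernelSum = 0;
            foreach (var x in kv.Value)
            {
                double sq = 0;                       // squared distance
                for (int i = 0; i < sample.Length; i++)
                {
                    double d = sample[i] - x[i];
                    sq += d * d;
                }
                kernelSum += Math.Exp(-sq / (2 * spread * spread));
            }
            double score = prior * kernelSum / kv.Value.Count;
            if (score > bestScore) { bestScore = score; bestClass = kv.Key; }
        }
        return bestClass;
    }
}
```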
Introduction: Models Considered
Multi-Layer Perceptron (MLP) [3]
Introduction: Overview of Work
Created a working texture recognizer
– Training images are broken down into chunks: an example-sized window scans across with overlap; dimensions are controlled from the GUI
– Feature extraction pulls 9 unique features; means and standard deviations are calculated for normalization, and arbitrary samples are normalized prior to classification
– Training data is passed to the classifiers: PWC just uses the training data directly, MLP runs training until some error threshold
– Arbitrary samples are classified correctly most of the time
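The normalization step described above can be a per-feature z-score using training-set statistics; the helper below is a sketch with assumed names.

```csharp
static class FeatureNormalizer
{
    // Per-feature z-score: subtract the training-set mean and divide by the
    // training-set standard deviation, so every feature has comparable scale.
    public static double[] Normalize(double[] features, double[] mean, double[] stdDev)
    {
        var result = new double[features.Length];
        for (int i = 0; i < features.Length; i++)
            result[i] = stdDev[i] > 0 ? (features[i] - mean[i]) / stdDev[i] : 0;
        return result;
    }
}
```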
Implementation Details
Each classifier (PWC, MLP) is a class – An abstract base class takes care of the generic work: keeping track of training data and the static feature extraction code (histogram intensity moments, GLCM calculations)
Examples are kept in vector form, with a class label – Vectors can do most basic vector operations
Used C# and Windows Forms for the UI
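A rough picture of the class layout this implies, with an abstract base holding the shared training data and the two derived classifiers supplying their own logic; member names are guesses, not the actual source.

```csharp
using System.Collections.Generic;

// Labeled feature vector: the extracted features plus a class label.
class Example
{
    public double[] Features;
    public int ClassLabel;
}

// Generic base: holds training data and declares the training and
// classification operations. PWC and MLP would derive from this.
abstract class Classifier
{
    protected readonly List<Example> TrainingData = new List<Example>();

    public void AddTrainingExample(Example e) => TrainingData.Add(e);

    public abstract void Train();
    public abstract int Classify(double[] normalizedFeatures);
}
```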
Implementation Details
PWC uses a Gaussian as the kernel function – Spread was set to 0.5 and to 100,000 – It does not make any difference after normalization
MLP – Maximum error rate 8% – Class count × 2 hidden nodes – Learning rate set to 0.20 – Everything is adjustable from the GUI – Uses backpropagation [4] for updating weights
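For reference, a single backpropagation update for a one-hidden-layer MLP with sigmoid units, using the settings listed above (learning rate 0.20, hidden count of class count × 2). This is a generic sketch of backpropagation [4], not the project's code.

```csharp
using System;

// Minimal single-hidden-layer MLP trained with plain backpropagation.
class SimpleMlp
{
    readonly int nIn, nHid, nOut;
    readonly double[,] wIn;   // input  -> hidden weights (+1 bias column)
    readonly double[,] wOut;  // hidden -> output weights (+1 bias column)
    readonly Random rnd = new Random();

    public SimpleMlp(int inputs, int classes)
    {
        nIn = inputs; nOut = classes; nHid = classes * 2;
        wIn = new double[nHid, nIn + 1];
        wOut = new double[nOut, nHid + 1];
        for (int i = 0; i < nHid; i++) for (int j = 0; j <= nIn; j++) wIn[i, j] = rnd.NextDouble() - 0.5;
        for (int i = 0; i < nOut; i++) for (int j = 0; j <= nHid; j++) wOut[i, j] = rnd.NextDouble() - 0.5;
    }

    static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // One backpropagation step for a single example; 'target' is 1-of-K.
    public void TrainStep(double[] x, double[] target, double rate = 0.20)
    {
        // Forward pass
        var hid = new double[nHid];
        for (int i = 0; i < nHid; i++)
        {
            double s = wIn[i, nIn];                       // bias
            for (int j = 0; j < nIn; j++) s += wIn[i, j] * x[j];
            hid[i] = Sigmoid(s);
        }
        var outp = new double[nOut];
        for (int i = 0; i < nOut; i++)
        {
            double s = wOut[i, nHid];                     // bias
            for (int j = 0; j < nHid; j++) s += wOut[i, j] * hid[j];
            outp[i] = Sigmoid(s);
        }

        // Backward pass: output deltas, then hidden deltas
        var dOut = new double[nOut];
        for (int i = 0; i < nOut; i++)
            dOut[i] = (target[i] - outp[i]) * outp[i] * (1 - outp[i]);

        var dHid = new double[nHid];
        for (int j = 0; j < nHid; j++)
        {
            double sum = 0;
            for (int i = 0; i < nOut; i++) sum += dOut[i] * wOut[i, j];
            dHid[j] = sum * hid[j] * (1 - hid[j]);
        }

        // Weight updates (gradient descent on squared error)
        for (int i = 0; i < nOut; i++)
        {
            for (int j = 0; j < nHid; j++) wOut[i, j] += rate * dOut[i] * hid[j];
            wOut[i, nHid] += rate * dOut[i];
        }
        for (int i = 0; i < nHid; i++)
        {
            for (int j = 0; j < nIn; j++) wIn[i, j] += rate * dHid[i] * x[j];
            wIn[i, nIn] += rate * dHid[i];
        }
    }
}
```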
Implementation Details
Training data is cached when training starts, so features don't have to be loaded every time
Classifier states are not cached – PWC only has the priors for state (negligible time to compute them every time) – MLP produces different results every run, due to random weight assignment
Training progress is shown in real time
Experimentation: Linear Feature Separation
Per-feature separation plots for the histogram features (Entropy, Uniformity, Variance, Skewness, Kurtosis) and the GLCM features (Cmax, Csq, Qn) [figures]
Experimentation: Confusion Matrix
Experimentation: MLP Min Error
Outcomes for the minimum-error settings tried: – Default: 1 error – Same as default – 22 errors – 22 errors – Total failure
Experimentation: MLP Hidden Nodes
Error curves for varying hidden-node counts [figures]
Epoch count: 500; none had time to converge
Experimentation: MLP Hidden Nodes
Full convergence with 4 nodes
Perfect performance with 5+ hidden nodes
Experimentation: MLP Hidden Nodes
Epochs to converge vs. hidden nodes [plot]
Color assignment and original mosaic [figure]
Classification mosaics with 8 and 16 hidden neurons [figures]
Summary
GLCM features work better than histogram features; both types combined work well
PWC and MLP can achieve the same quality – Not necessarily true for less separated textures
PWC has only one modifiable parameter – It makes no difference on normalized features!
MLP has 3 parameters – The minimal error rate is crucial – Lambda mostly affects the convergence rate – The hidden layer seems to need at least as many nodes as there are classes; trouble converging with too few
Conclusion
Created a texture recognizer
Computed 9 features – 5 from the histogram – 4 from the GLCM
Employed two different classifiers – PWC: effectively no parameters – MLP: several parameters to tweak
The UI allows working with multiple files
Demo
References
1. Robert M. Haralick, K. Shanmugam, Its'hak Dinstein (1973). "Textural Features for Image Classification."
2. Emanuel Parzen (1962). "On Estimation of a Probability Density Function and Mode."
3. Warren S. McCulloch and Walter Pitts (1943). "A Logical Calculus of the Ideas Immanent in Nervous Activity."
4. Robert Hecht-Nielsen (1989). "Theory of the Backpropagation Neural Network."
5. Lee K. Jones (1990). "Constructive Approximations for Neural Networks by Sigmoidal Functions."
6. Keinosuke Fukunaga (1990). Introduction to Statistical Pattern Recognition.
7. R. Rojas (1996). Neural Networks. Springer-Verlag, Berlin.
8. Claude Mbusa Takenga, Koteswara Rao Anne, K. Kyamakya, Jean Chamberlain Chedjou. "Comparison of Gradient Descent Method, Kalman Filtering and Decoupled Kalman in Training Neural Networks Used for Fingerprint-Based Positioning."