Efficient Image Classification on Vertically Decomposed Data

Efficient Image Classification on Vertically Decomposed Data
Taufik Abidin, Aijuan Dong, Hongli Li, and William Perrizo
Computer Science, North Dakota State University
The 1st IEEE International Workshop on Multimedia Databases and Data Management (MDDM-06)

Outline
- Image classification
- The application of the SMART-TV algorithm to image classification
- The SMART-TV algorithm
- Experimental results
- Summary

Image Classification
Why classify images?
- The proliferation of digital images
- The need to organize them into semantic categories for effective browsing and retrieval
Techniques for image classification: SVM, Bayesian, neural networks, KNN

Image Classification Cont.
In this work, we focus on the KNN method
KNN is widely used in image classification:
- Simple and easy to implement
- Good classification results
Problems:
- Classification time is linear in the size of the image repository
- When repositories are very large, containing millions of images, KNN is impractical

Our Contributions
- We apply our recently developed classification algorithm, SMART-TV, to the image classification task and analyze its performance
- We demonstrate that SMART-TV, a classification algorithm that uses the P-tree vertical data structure, is fast and scalable to very large image databases
- We show that for Corel images a combination of color and texture features is a good way to represent the low-level content of images

Image Preprocessing
We extracted color and texture features from the original pixels of the images
We created a 54-dimensional color histogram in HSV (6x3x3) color space for the color features, and applied 8 multi-resolution Gabor filters (4 orientations and 2 scales) to extract the texture features of the images (see B.S. Manjunath, IEEE Trans. on Pattern Analysis and Machine Intelligence, 1996, for more detail about the filters)

Image Preprocessing Cont.
Color Features
- Convert RGB to HSV; HSV corresponds to the way humans tend to perceive color
- Each component value is in the range 0..1
- Quantize the image into 54 bins, i.e., (6 x 3 x 3) bins
- Record the frequency of the quantized HSV value of each pixel in the image
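The quantization step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the exact bin boundaries (uniform splits of hue into 6 bins and saturation/value into 3 each) are an assumption consistent with the 6x3x3 layout.

```python
import colorsys
import numpy as np

def hsv_histogram(rgb_pixels):
    """Quantize each pixel's HSV value into a 6x3x3 = 54-bin histogram.

    rgb_pixels: iterable of (r, g, b) tuples with components in 0..255.
    Uniform per-channel bin boundaries are assumed, not taken from the paper.
    """
    hist = np.zeros(54)
    for r, g, b in rgb_pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * 6), 5)   # 6 hue bins
        si = min(int(s * 3), 2)   # 3 saturation bins
        vi = min(int(v * 3), 2)   # 3 value bins
        hist[hi * 9 + si * 3 + vi] += 1
    return hist / max(len(rgb_pixels), 1)  # normalized frequencies
```

For example, an image that is half pure red and half black would place half its mass in the bin for (hue 0, high saturation, high value) and half in the darkest bin.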

Image Preprocessing Cont.
Texture Features
- Transform the images into the frequency domain using the 8 generated filters (4 orientation and 2 scale parameters) and record the mean and standard deviation of the pixels in each filtered image
- This process produces 16 texture features for each image
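A minimal sketch of this 8-filter bank follows. The Gabor kernel parametrization (`sigma`, `freq`, kernel size) is an assumption for illustration; the paper uses the Manjunath filter design cited above.

```python
import numpy as np

def gabor_kernel(theta, scale, size=15):
    """Real-valued Gabor kernel at orientation theta for a given scale.
    The sigma/frequency parametrization here is an illustrative assumption."""
    sigma, freq = 2.0 * scale, 0.25 / scale
    y, x = np.mgrid[-(size // 2):size // 2 + 1, -(size // 2):size // 2 + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)

def texture_features(gray_image):
    """Mean and std of each of 8 filtered images (4 orientations x 2 scales)
    -> 16 texture features per image, as on the slide."""
    gray = np.asarray(gray_image, dtype=float)
    feats = []
    for scale in (1, 2):
        for theta in np.arange(4) * np.pi / 4:
            k = gabor_kernel(theta, scale)
            # filter in the frequency domain (circular convolution via FFT)
            resp = np.real(np.fft.ifft2(np.fft.fft2(gray) * np.fft.fft2(k, gray.shape)))
            feats.extend([resp.mean(), resp.std()])
    return np.array(feats)
```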

Overview of SMART-TV
Preprocessing phase (on the large training set):
- Compute the root counts and store them
- Measure the TV of each object in each class
- Store the root count and TV values
Classifying phase (for an unclassified object):
- Approximate the candidate set of NNs
- Search the k-nearest neighbors from the candidate set
- Vote

SMART-TV Algorithm
SMART-TV: SMall Absolute diffeRence of ToTal Variation
- Approximates a candidate set of nearest neighbors by examining the absolute difference between the total variation of each data object in the training set and the total variation of the unclassified object
- The k-nearest neighbors are then searched from the candidate set
- The core operation is computing the total variation (TV)

Total Variation
The total variation of a set X about a point a (e.g., the mean), TV(X, a), measures the total squared separation of the objects in X from a, defined as follows:

TV(X, a) = Σ_{x ∈ X} (x − a) · (x − a) = Σ_{x ∈ X} Σ_{i=1..d} (x_i − a_i)²

(The slide's graph of TV as a function of a did not survive transcription.)
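The definition above translates directly into a few lines of code. This is a plain NumPy sketch of the quantity itself; the paper computes it from P-tree root counts rather than by scanning the data.

```python
import numpy as np

def total_variation(X, a):
    """TV(X, a): total squared separation of the rows of X from the point a,
    i.e. the sum over x in X of ||x - a||^2."""
    X = np.asarray(X, dtype=float)
    a = np.asarray(a, dtype=float)
    return float(((X - a) ** 2).sum())
```

For example, for X = {(0, 0), (2, 0)} and a = (1, 0), TV is 1 + 1 = 2.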

SMART-TV Algorithm

The Independence of RC
- The root count operations are independent of a, which allows us to run the operations once in advance and retain the count results
- In a classification task, the set of classes is known and unchanged; thus, the total variation of an object about its class can be pre-computed

Preprocessing Phase
Preprocessing:
- Compute the root counts of each class Cj, where 1 ≤ j ≤ number of classes. O(kdb²), where k is the number of classes, d is the number of dimensions, and b is the bit-width
- Compute the total variation of each training image about its class, for 1 ≤ j ≤ number of classes. O(n), where n is the number of images in the training set
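The "vertically decomposed data" of the title refers to splitting each feature column into bit slices, whose 1-bit counts (root counts) drive the TV computation. The sketch below shows only the bit-slice idea with plain boolean arrays; it omits the quadrant-based tree compression that makes real P-trees space-efficient, so treat the names and layout as illustrative assumptions.

```python
import numpy as np

def bit_slices(values, bit_width=8):
    """Vertically decompose a column of unsigned integers into bit slices.
    Returns a (bit_width, n) boolean array; row b holds bit b of each value."""
    values = np.asarray(values, dtype=np.uint64)
    return np.array([(values >> b) & 1 for b in range(bit_width)], dtype=bool)

def root_counts(values, bit_width=8):
    """Root count of each bit slice: the number of 1-bits in that slice.
    These counts depend only on the data, not on the point a, so they can
    be computed once in the preprocessing phase and reused."""
    return bit_slices(values, bit_width).sum(axis=1)
```

For the column [3, 1, 4] (binary 011, 001, 100), bit 0 has two 1s, and bits 1 and 2 each have one.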

Classifying Phase
Classifying: for each class Cj, where 1 ≤ j ≤ number of classes, do:
a. Compute the total variation of the unclassified image's feature vector about Cj
b. Find the hs images in Cj such that the absolute difference between the total variation of each image in Cj and the total variation of the unclassified image is smallest
c. Store the IDs of those images in an array TVGapList

Classifying Phase (Cont.)
- For each objectID_t, 1 ≤ t ≤ Len(TVGapList), where Len(TVGapList) equals hs times the total number of classes, retrieve the corresponding object features from the training set, measure the pairwise Euclidean distance to the unclassified image, and determine its k nearest neighbors
- Vote the class label for the unclassified image from the k nearest neighbors

Dataset
We used Corel images (http://wang.ist.psu.edu/docs/related)
- 10 categories; originally, each category has 100 images
- Number of feature attributes: 70 (54 from color and 16 from texture)
- We randomly generated several larger datasets to evaluate the speed and scalability of the algorithms
- 50 images for the testing set, 5 from each category

Dataset Cont.

Experimental Results
Experimental setup: Intel P4 2.6 GHz machine with 3.8 GB RAM, running Red Hat Linux
Classification accuracy comparison: per-class accuracies (C1–C10) for SMART-TV (hs = 15, 25, 35) versus KNN (k = 3, 5, 7); reported accuracies range from 0.43 to 0.96 across classes, with the two methods closely comparable. (The table layout did not survive transcription.)

Example on Corel Dataset

Experimental Results Cont.
(Charts: Loading Time, Classification Time; figures not included in the transcript)

Summary
- We have presented SMART-TV, a classification algorithm that uses a vertical data structure, and applied it to the image classification task
- We found that the speed of our algorithm outperforms that of the classical KNN algorithm
- Our method scales well to large image repositories, and its classification accuracy is very comparable to that of the KNN algorithm