Large-Scale Content-Based Image Retrieval Project Presentation CMPT 880: Large Scale Multimedia Systems and Cloud Computing Under supervision of Dr. Mohamed.

Large-Scale Content-Based Image Retrieval Project Presentation CMPT 880: Large Scale Multimedia Systems and Cloud Computing Under supervision of Dr. Mohamed Hefeeda By: Ahmed Abdelsadek (aabdelsa@sfu.ca)

Outline ▫Introduction ▫Project Scope ▫Work Flow ▫Image Features ▫Indexing and Retrieval ▫Matching ▫Evaluation ▫Conclusion

Introduction Current image search engines rely heavily on text to retrieve images. ▫The user provides keywords, and images that have those keywords in the filename or in nearby HTML are candidates for retrieval. In this project we try content-based retrieval techniques, where the query is itself an image.

Project Scope Similarity using local features. ▫Extract features from the reference images. ▫Index these features in an efficient data structure in a scalable, large-scale environment. ▫Process query images. ▫Search and match. This project is NOT ▫Recognition, Classification, Categorization

Work Flow

Image Features Using SIFT features (Scale-invariant feature transform). ▫A SIFT feature is a selected image region (also called keypoint) with an associated descriptor. ▫A SIFT descriptor is a histogram of the image gradients surrounding a keypoint. ▫Using PCA for Dimension Reduction
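To make "a histogram of the image gradients surrounding a keypoint" concrete, here is a minimal sketch in plain Python. It is not the SIFT descriptor itself (real SIFT, as computed by VLFeat, uses a grid of spatial cells, Gaussian weighting and orientation normalization); it only builds a single magnitude-weighted orientation histogram over one patch. The function name `gradient_histogram` and the 8-bin layout are assumptions chosen for illustration.

```python
import math

def gradient_histogram(patch, bins=8):
    """Magnitude-weighted histogram of gradient orientations over a patch.

    `patch` is a 2-D list of grayscale intensities. Only interior pixels
    are used so that central differences are always defined.
    """
    hist = [0.0] * bins
    rows, cols = len(patch), len(patch[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Central differences approximate the image gradient.
            dx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            dy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
    # L2-normalize so the histogram is less sensitive to contrast changes.
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm > 0 else hist

# A horizontal intensity ramp: every gradient points along +x.
patch = [[float(x) for x in range(4)] for _ in range(4)]
descriptor = gradient_histogram(patch)
```

For the ramp patch all gradient energy falls into the first orientation bin, which is the kind of signature a descriptor uses to distinguish local image structure.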

KD-Tree Using KD-Trees ▫Each tree level represents one dimension of a feature ▫Searching the index for the K nearest neighbours
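The KD-tree idea can be sketched as follows: each level splits the points on one dimension, and a K-nearest-neighbour query descends toward the query point, then backtracks only into branches that could still hold a closer point. This is a small in-memory sketch of exact search, not the project's distributed Java index; the dictionary-based node layout and the names `build_kdtree` and `knn_search` are assumptions for illustration.

```python
import heapq

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_kdtree(points, depth=0):
    """Build a KD-tree from (vector, label) pairs, cycling through dimensions."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2  # the median point becomes this node
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def knn_search(root, query, k):
    """Return the k nearest (squared_distance, label) pairs, nearest first."""
    heap = []  # max-heap on distance, stored as (-distance, label)

    def visit(node):
        if node is None:
            return
        vec, label = node["point"]
        d = sq_dist(vec, query)
        if len(heap) < k:
            heapq.heappush(heap, (-d, label))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, label))
        diff = query[node["axis"]] - vec[node["axis"]]
        near, far = ((node["left"], node["right"]) if diff < 0
                     else (node["right"], node["left"]))
        visit(near)
        # Cross the splitting plane only if a closer point could lie beyond it.
        if len(heap) < k or diff * diff < -heap[0][0]:
            visit(far)

    visit(root)
    return sorted((-neg_d, label) for neg_d, label in heap)

points = [((2, 3), "a"), ((5, 4), "b"), ((9, 6), "c"),
          ((4, 7), "d"), ((8, 1), "e"), ((7, 2), "f")]
tree = build_kdtree(points)
nearest = knn_search(tree, (9, 2), k=2)
```

Exact KD-tree search degrades in high dimensions, which is one reason to reduce descriptor dimensionality (for example with PCA) and to bound how much of the tree a query is allowed to scan.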

Logical View

Physical View

Matching For each query image we extract its features and then search the index for the K nearest-neighbour (K-NN) features. For each query feature, each of its neighbouring features votes for the image it belongs to, with a score given by its rank. The 10 images with the highest totals in the voting array are reported as the most similar images.
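The voting step can be sketched as below. The slides do not give the exact scoring formula, so this sketch assumes the nearest neighbour contributes K points, the next K-1, and so on down to 1; the function name `rank_votes` and the tie-breaking by image id are likewise assumptions.

```python
from collections import defaultdict

def rank_votes(neighbour_lists, k, top_n=10):
    """Accumulate rank-weighted votes per image and return the top_n images.

    `neighbour_lists` holds, for each query feature, the image ids of its
    nearest reference features ordered from nearest to farthest.
    """
    votes = defaultdict(int)
    for neighbours in neighbour_lists:
        for rank, image_id in enumerate(neighbours[:k]):
            votes[image_id] += k - rank  # assumed weighting: K, K-1, ..., 1
    # Highest score first; ties broken by image id for a stable order.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Three query features, each with its 3 nearest reference features.
neighbour_lists = [
    ["img1", "img2", "img3"],
    ["img2", "img1", "img1"],
    ["img1", "img3", "img2"],
]
ranking = rank_votes(neighbour_lists, k=3)
```

Here `img1` collects 9 points, `img2` 6 and `img3` 3, so `img1` is reported as the most similar image.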

Evaluation Core KNN ▫Experiments on a local machine. ▫Our results vs. brute force. Image retrieval ▫Caltech and TRECVID datasets. ▫On the Amazon AWS cloud. ▫We use 8 machines, each dual core with 4 GB RAM.
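One plausible way to compare index results against brute force is the fraction of the true K nearest neighbours that the index also returns. The slides do not define the metric, so the definition and the name `knn_precision` below are assumptions.

```python
def knn_precision(approx_ids, exact_ids):
    """Fraction of the exact (brute-force) neighbours that the index returned."""
    if not exact_ids:
        return 0.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

def mean_knn_precision(results):
    """Average precision over (approx_ids, exact_ids) pairs, one per query."""
    if not results:
        return 0.0
    return sum(knn_precision(a, e) for a, e in results) / len(results)

# Hypothetical feature ids for two queries with K = 3.
results = [
    (["f1", "f2", "f9"], ["f1", "f2", "f3"]),  # 2 of 3 true neighbours found
    (["f4", "f5", "f6"], ["f6", "f5", "f4"]),  # all 3 found; order is ignored
]
score = mean_knn_precision(results)
```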

Precision of KNN

Scanned Bins Size

Effect of Data Size

Image Recall @ K

First Correct @ K
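The slides do not define these two retrieval metrics, so the following is a sketch under assumed definitions: recall @ K is the fraction of a query's relevant images that appear in the top K results, and "first correct" is the rank at which the first relevant image appears. The function names are hypothetical.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant images that appear among the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)

def first_correct_rank(ranked, relevant):
    """1-based rank of the first relevant result, or None if none is returned."""
    for rank, image_id in enumerate(ranked, start=1):
        if image_id in relevant:
            return rank
    return None

ranked = ["a", "b", "c", "d"]   # system output, best match first
relevant = {"b", "d", "z"}      # ground-truth matches for this query
r2 = recall_at_k(ranked, relevant, 2)
r4 = recall_at_k(ranked, relevant, 4)
first = first_correct_rank(ranked, relevant)
```

Under these definitions, a "First Correct @ K" curve would report the share of queries whose `first_correct_rank` is at most K.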

Implementation Details The system is implemented in Java. We use Hadoop 1.0.3. We run cloud experiments on AWS services ▫S3 ▫EMR We use some open-source libraries ▫For image preprocessing we use FFmpeg ▫For extracting SIFT features we use VLFeat
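The slides do not spell out the Map/Reduce design, so the following is only a sketch of one common pattern, written in Python rather than as Hadoop Java code: each mapper searches its own partition of the reference features and emits its local K best candidates per query, and the reducer merges those candidates into a global top K. The names `map_partition` and `reduce_topk` and the data layout are assumptions.

```python
import heapq
from collections import defaultdict

def map_partition(partition, queries, k):
    """Mapper: yield (query_id, (sq_distance, image_id)) for the k nearest
    reference features held by this partition."""
    for qid, qvec in queries.items():
        scored = ((sum((a - b) ** 2 for a, b in zip(qvec, vec)), image_id)
                  for vec, image_id in partition)
        for pair in heapq.nsmallest(k, scored):
            yield qid, pair

def reduce_topk(pairs, k):
    """Reducer: merge per-partition candidates into a global top k per query."""
    grouped = defaultdict(list)
    for qid, pair in pairs:
        grouped[qid].append(pair)
    return {qid: heapq.nsmallest(k, cands) for qid, cands in grouped.items()}

# Two hypothetical partitions of (feature_vector, image_id) pairs.
partitions = [
    [((0.0, 0.0), "img1"), ((5.0, 5.0), "img2")],
    [((1.0, 0.0), "img3"), ((9.0, 9.0), "img4")],
]
queries = {"q1": (0.0, 0.0)}
emitted = [pair for part in partitions for pair in map_partition(part, queries, 2)]
merged = reduce_topk(emitted, 2)
```

Because every partition returns its own K best candidates, the merged result equals the exact global top K over all partitions, at the cost of shuffling K candidates per query per partition.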

Conclusion We implement a full pipeline for the image retrieval problem. ▫The framework can easily support different types of features and different indexing methods. We show how a big cloud system can be built from small components.

Conclusion Intersection with my research Contributions ▫Feature Selection and Extraction ▫Implement Dimension Reduction ▫Design and Implement Map/Reduce Index ▫Implement Image Matching and Ranking

Questions?

Thank you!
