A Structured Learning Framework for Content-Based Image Indexing and Visual Query (Joo-Hwee Lim, Jesse S. Jin)
Presentation by: Salman Ahmad (270279)

Introduction
Motivation
  – To perform content-based image retrieval on non-specific images in a broad domain.

Literature Review
Semantic labeling approach by Town & Sinclair: labeling into 11 visual categories by an artificial neural network (ANN).
Monotonic tree approach for classifying an image into semantic regions (8 in total).
Associating images with words, which is not scalable for images with diverse content.

Image Retrieval Cycle
Storage: user images -> feature extraction -> feature & image storage
Retrieval: user query -> query term comparison with features -> result

Semantic Gap
Semantic Extraction
  – Requires object recognition & scene understanding
  – Monotonic tree
Semantic Interpretation
  – Pre-query (manual annotation), query, post-query (relevance feedback)

Semantic Gap
Diagram labels: image; low-level feature extraction; object recognition; user search term & requirement

Structured Learning for Image Indexing
Based on SSRs: salient image patches that exhibit semantic meaning.
SSRs are learned a priori and detected during image indexing.
No region segmentation step is required.
Images are indexed onto the classification space spanned by the semantic labels.

Semantic Support Region (SSR)
Introduced to address the issue of high content diversity.
Modular view-based object detector.
Generates spatial semantic signatures.
Supports similarity-based and fuzzy-logic-based query processing.
Not restricted to the main area of attention in an image.

Semantic Support Region (SSR)
Example SSR labels (from the figure): Face, Figure, Crowd, Skin, Flower, Green, Far, Branch, Cloudy, Clear, Floor, Blue, Sand, Rocky, Pool, Grass, Far, City, Wall, Old, Wooden, Pond, China, River, Fabric, Light

Semantic Support Regions (SSR)
SSRs are segmentation-free image regions.
They have semantic meanings.
Detected from tessellated image blocks.
Reconciled across multiple resolutions.
Aggregated spatially.

SSR Learning
Use of Support Vector Machines
Features employed
  – Color (YIQ): 6 dimensions
  – Texture (Gabor coefficients): 60 dimensions
26 classes of SSR in 8 super-classes: people (4), sky (3), ground (3), water (3), foliage (3), mountain (2), building (3), interior (5)
Kernel: polynomial with degree 2 and a constant
Total data for training & testing: 554 image regions from 138 images
Training data: 375 image regions from 105 images
Test data: 179 image regions
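The kernel named on this slide can be sketched in a few lines. This is only an illustration, not the authors' implementation: the slide gives the degree (2) but not the value of the constant, so `coef0=1.0` is an assumption, and the names `poly_kernel` and `svm_decision` are hypothetical.

```python
def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel K(x, y) = (x . y + coef0) ** degree.

    degree=2 follows the slide; coef0=1.0 is an assumed value for
    "a constant", which the slide does not specify.
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def svm_decision(z, support_vectors, dual_coefs, bias):
    """Decision value f(z) = sum_i coef_i * K(sv_i, z) + bias.

    One such binary detector would be trained per SSR class; the sign
    (or magnitude) of f(z) says whether region z looks like that class.
    """
    return sum(
        coef * poly_kernel(sv, z)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + bias
```

In practice `z` would be a region's feature vector (for example the 6 color or the 60 texture dimensions); how the two feature types are combined is not stated on the slide.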

SSR Detection

Feature vectors z_c and z_t (for color & texture)
3 color maps and 30 texture maps from Gabor coefficients.
Windows of different scales are used for scale invariance.
Each pixel consolidates the SSR classification vector T_i(z).
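A minimal sketch of scanning an image with square windows at one scale is given below. The window sizes (60 x 60 down to 20 x 20) and the 240 x 360 image size come from other slides in this deck; the stride and the function name `scan_windows` are assumptions made for illustration.

```python
def scan_windows(height, width, size, stride):
    """Yield (top, left) corners of size x size windows that fit in the image.

    The stride is an assumed parameter; the slides do not state how far
    the window moves between detections.
    """
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            yield top, left


# Non-overlapping 60 x 60 windows on a 240 x 360 image.
corners = list(scan_windows(240, 360, size=60, stride=60))
```

Repeating the scan for each window size gives one detection map per scale, which the reconciliation step then fuses.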

Multiscale Reconciliation
Objects are detected in different regions of the image.
Fuses multiple SSRs detected at different image scales.
Compares two detection maps at a time (from 60 x 60 & 50 x 50 down to 30 x 30 & 20 x 20).
The smallest scan windows consolidate the result.
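The pairwise, coarse-to-fine fusion described here might look like the sketch below. The slide does not give the rule used to compare two maps, so the per-pixel rule (keep the score vector with the larger peak) is an assumption, as are the names `reconcile_pair` and `reconcile_all`.

```python
def reconcile_pair(coarse, fine):
    """Fuse two detection maps of equal size, pixel by pixel.

    Each map is a list of rows; each pixel holds a list of SSR scores.
    Assumed rule: keep the vector whose highest score is larger,
    preferring the finer scale on ties.
    """
    return [
        [c if max(c) > max(f) else f for c, f in zip(coarse_row, fine_row)]
        for coarse_row, fine_row in zip(coarse, fine)
    ]


def reconcile_all(maps_coarse_to_fine):
    """Fold the maps two at a time, from the largest window to the smallest."""
    result = maps_coarse_to_fine[0]
    for finer in maps_coarse_to_fine[1:]:
        result = reconcile_pair(result, finer)
    return result
```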

Spatial Aggregation
Summarizes the reconciled detection map over larger spatial regions.
Spatial Aggregation Map (SAM) with variable emphasis (weights).
SAMs are invariant to image rotation & translation.
SAMs are affected slightly by changes in angle of view, changes of scale, and occlusion.
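As a rough illustration of the aggregation step, the sketch below averages per-pixel SSR scores inside each block of a grid. The 3 x 3 default matches the tessellation mentioned on a later slide; uniform weighting and the name `spatial_aggregation_map` are assumptions, since the slide only says the emphasis is variable.

```python
def spatial_aggregation_map(det_map, rows=3, cols=3):
    """Average per-pixel SSR score vectors over a rows x cols grid of blocks.

    det_map is a list of pixel rows; each pixel is a list of SSR scores.
    Uniform weights are assumed here.
    """
    height, width = len(det_map), len(det_map[0])
    if height < rows or width < cols:
        raise ValueError("detection map is smaller than the grid")
    n_classes = len(det_map[0][0])
    sam = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        sam_row = []
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            totals = [0.0] * n_classes
            for y in range(top, bottom):
                for x in range(left, right):
                    for k in range(n_classes):
                        totals[k] += det_map[y][x][k]
            area = (bottom - top) * (right - left)
            sam_row.append([t / area for t in totals])
        sam.append(sam_row)
    return sam
```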

Spatial Aggregation Map

Scalability
Modular nature
Independent training of binary detectors.
Parallel computation of feature maps.
Simultaneous detection of multiple SSRs.
Concurrent spatial aggregation by different nodes in the SAM.
Retraining of SVMs when new SSRs are added.

Query Methods
Low-level features
  – QBE (Query By Example)
  – QBC (Query By Canvas)
Semantic information
  – QBK (Query By Keywords)
  – QBS (Query By Sketches)
  – QBSI (Query By Spatial Icons)

Query Formulation & Parsing
QBME (Query by Multiple Examples)
Similarity between images is computed from the similarity between their tessellated blocks.
Larger blocks for similar semantics but different spatial arrangement.
Smaller blocks for spatial specificity.
City block distance provides the best performance.
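A small sketch of block-wise matching with the city block (L1) distance follows. Averaging over blocks and scoring a database image by its closest query example are assumptions made for illustration; the function names are hypothetical.

```python
def city_block(u, v):
    """City block (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def image_distance(blocks_a, blocks_b):
    """Mean L1 distance over corresponding tessellated blocks (assumed rule)."""
    dists = [city_block(a, b) for a, b in zip(blocks_a, blocks_b)]
    return sum(dists) / len(dists)


def rank_by_examples(database, examples):
    """Rank database images by distance to the closest query example.

    database maps an image name to its list of block vectors; taking the
    minimum over examples is an assumption, not the slide's stated rule.
    """
    def score(name):
        return min(image_distance(database[name], ex) for ex in examples)
    return sorted(database, key=score)
```

For instance, an image whose blocks are dominated by the same SSRs as a query example ends up near the top of the ranking.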

Query Formulation & Parsing
QBSI (Query by Spatial Icons)
  – Spatial arrangement of visual semantics
  – A visual query term (VQT) Q specifies a region R for SSR i.
  – VQTs can be chained.
  – Two-level is-a hierarchy of SSRs
  – Use of max for abstract visual semantics.

Query Formulation & Parsing
A disjunctive normal form of VQTs can be used (with or without negation).
Fuzzy operations to remove the uncertainty in values.
The vocabulary for QBSI is limited by the semantics.
Graphical interface provided for VQTs.
Images are indexed with a 3 x 3 spatial tessellation with 26 SSRs.
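One way to picture the evaluation of a QBSI query is the sketch below. It assumes the standard fuzzy operators (min for AND, max for OR, 1 - x for NOT), which the slides do not name explicitly; the use of max over member SSRs for an abstract term follows the previous slide. The region names, the data layout, and the function names are hypothetical.

```python
def vqt_value(index, ssr, region, hierarchy=None):
    """Score of one visual query term: SSR (or super-class) in a region.

    index maps region -> {ssr: score}. For an abstract term such as "sky",
    the score is the max over its member SSRs.
    """
    members = (hierarchy or {}).get(ssr, [ssr])
    return max(index[region].get(m, 0.0) for m in members)


def evaluate_dnf(index, dnf, hierarchy=None):
    """Evaluate a query in disjunctive normal form with fuzzy operators.

    dnf is a non-empty list of conjunctions; each conjunction is a
    non-empty list of (ssr, region, negated) terms.
    Assumed operators: AND = min, OR = max, NOT = 1 - value.
    """
    def term(ssr, region, negated):
        value = vqt_value(index, ssr, region, hierarchy)
        return 1.0 - value if negated else value

    return max(min(term(*t) for t in conj) for conj in dnf)
```

A query such as "sky at the top AND sand at the bottom, OR pool at the bottom" would be written as two conjunctions, and images would be ranked by the returned score.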

Experimental Results
Tested on consumer images
  » More challenging & complex
  » Diverse content
  » Faded, overexposed, blurred, dark
  » Different focuses, distances and occlusion
Heterogeneous photos of a single family taken over a span of 5 years
Indoor and outdoor settings
Resolution of 256 x 384 converted to 240 x 360
No pre-selection of images.

QBME Experiment
24 semantic queries for 2400 images
Truth values based on the opinion of 3 subjects
Comparison with feature-based approach (CTO).
  » Best-performing parameters selected

QBSI Experiment
15 QBSI queries for 2400 photos
Query examples for QBSI

QBSI Experiment Results
Precision on top retrieved images for QBSI experiment
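For reference, precision on the top retrieved images is conventionally computed as below. This is the usual precision-at-k definition, not code from the presentation; `precision_at_k` and its arguments are illustrative names.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k retrieved items that are relevant.

    ranked is the retrieval order; relevant is the set of ground-truth
    matches. The denominator is k even if fewer items were retrieved.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k
```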

Advantages of QBSI
Explicit specification of visual semantics with combination
Better and more accurate expression than sketches and visual icons.

Conclusion & Future Work
SSR allows image indexing based on local semantics without region segmentation.
A unique and powerful query language.
Extendable to other domains like medical images.

Questions