Automatic Image Annotation by Using Concept-Sensitive Salient Objects for Image Content Representation Jianping Fan, Yuli Gao, Hangzai Luo, Guangyou Xu.

Presentation transcript:

Automatic Image Annotation by Using Concept-Sensitive Salient Objects for Image Content Representation Jianping Fan, Yuli Gao, Hangzai Luo, Guangyou Xu SIGIR 2004

Abstract In this paper, we propose a multi-level approach to annotate the semantics of natural scenes by using both the dominant image components and the relevant semantic concepts. We use the salient objects as the dominant image components to achieve automatic image annotation at the content level. To detect the salient objects automatically, a set of detection functions is learned from the labeled image regions by using Support Vector Machine (SVM) classifiers.

Abstract To generate the semantic concepts, finite mixture models are used to approximate the class distributions of the relevant salient objects. An adaptive EM algorithm has been proposed to determine the optimal model structure and model parameters simultaneously.

Salient Object Detection The salient objects are defined as the visually distinguishable image components or the global visual properties of whole images that can be identified by using the spectrum templates in the frequency domain. We will focus on modeling and detecting the salient objects that correspond to the visually distinguishable image components.

Salient Object Detection The objective of image analysis is to parse natural images into the salient objects in the basic vocabulary. We have already implemented 32 functions to detect 32 types of salient objects in natural scenes; each function detects one type of salient object in the basic vocabulary.

Salient Object Detection

Each detection function consists of three parts: (a) automatic image segmentation by using mean shift technique; (b) image region classification by using the SVM classifiers with an optimal model parameter search scheme; (c) label-based region aggregation for salient object generation.
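Part (c) above can be illustrated with a minimal sketch. It assumes that every segmented region already has a predicted label and that region adjacency is available as pairs of region ids. The names `aggregate_regions`, `labels`, and `adjacency` are hypothetical, and the union-find merge is only one plausible way to do label-based aggregation; the slides do not state the authors' exact procedure.

```python
def aggregate_regions(labels, adjacency):
    """Group adjacent regions that share a label into salient objects.

    labels: dict region_id -> predicted label (e.g. from a region classifier)
    adjacency: iterable of (region_id, region_id) pairs that touch in the image
    Returns a sorted list of (label, sorted region ids) tuples.
    """
    parent = {r: r for r in labels}

    def find(r):
        # Follow parent links to the root, shortening the path on the way.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b in adjacency:
        # Only neighbouring regions with the same label are merged.
        if labels[a] == labels[b]:
            parent[find(a)] = find(b)

    groups = {}
    for r in labels:
        groups.setdefault(find(r), []).append(r)
    return sorted((labels[root], sorted(members)) for root, members in groups.items())


# Toy example (hypothetical data): five regions of a beach image.
regions = {0: "sky", 1: "sky", 2: "sea water", 3: "sand field", 4: "sand field"}
touching = [(0, 1), (1, 2), (2, 3), (3, 4)]
objects = aggregate_regions(regions, touching)
```

In this toy input the two "sky" regions and the two "sand field" regions are each merged into one salient object, while regions with the same label that do not touch would stay separate.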

Salient Object Detection The image region classifier is learned from the available labeled training samples. Each labeled training sample is a pair $(X_l, L_j(X_l))$ that consists of a set of region-based low-level visual features $X_l$ and the semantic label $L_j(X_l)$ for the corresponding labeled homogeneous image region. We use the well-known SVM classifiers for binary image region classification.

Semantic Image Classification and Multi-level Annotation We use the finite mixture model (FMM) to approximate the class distribution of the salient objects that are relevant to a certain semantic concept $C_j$.
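For reference, a finite mixture model with $\kappa$ components generally takes the form below. The left-hand side follows the notation used later in this transcript; the per-component densities $P(X \mid S_i, \theta_{s_i})$ and mixing weights $\omega_{s_i}$ are assumed notation and may differ from the equation on the original slide.

```latex
P(X, C_j \mid \kappa, \omega_{c_j}, \Theta_{c_j})
  = \sum_{i=1}^{\kappa} \omega_{s_i}\, P(X \mid S_i, \theta_{s_i}),
\qquad \sum_{i=1}^{\kappa} \omega_{s_i} = 1, \quad \omega_{s_i} \ge 0
```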

Semantic Image Classification and Multi-level Annotation The optimal model structure and parameters $(\kappa, \omega_{c_j}, \Theta_{c_j})$ for a certain semantic concept $C_j$ are determined by optimizing $L(C_j, X, \kappa, \omega_{c_j}, \Theta_{c_j}) = -\log P(X, C_j, \kappa, \omega_{c_j}, \Theta_{c_j}) + \log p(\kappa, \omega_{c_j}, \Theta_{c_j})$, where $L$ is the likelihood function, $P(X, C_j, \kappa, \omega_{c_j}, \Theta_{c_j})$ is the class distribution as described in Eq. (3), and $\log p(\kappa, \omega_{c_j}, \Theta_{c_j})$ is the minimum description length (MDL) term that penalizes complex models.

Semantic Image Classification and Multi-level Annotation The estimate of maximum likelihood can be achieved by using the EM algorithm. Unfortunately, the EM iteration needs the knowledge of κ. To estimate the optimal number of mixture components, we propose an adaptive EM algorithm by integrating parameter estimation and model selection (i.e., selecting the optimal number κ of mixture components) seamlessly in a single algorithm.

Adaptive EM Algorithm Step 1: Our adaptive EM algorithm starts from a reasonably large value of $\kappa$ to explain all the structure of the training samples, and then reduces the number of mixture components sequentially. Step 2: If two strongly overlapped mixture components provide similar densities and overpopulate the relevant sample areas, they can potentially be merged into a single mixture component. To detect the best candidate for merging, our adaptive EM algorithm tests all $\kappa(\kappa-1)/2$ pairs of mixture components, and the pair with the minimum value of the local Kullback divergence is selected as the best candidate for merging.

Adaptive EM Algorithm Step 3: Given a certain number of mixture components, the traditional EM algorithm is performed to estimate their mixture parameters. After the EM iteration procedure converges, a weak classifier is built. If the average performance of this weak classifier is good enough, i.e., $P(C_j \mid X, \kappa, \omega_{c_j}, \Theta_{c_j}) \ge \delta_1$, go to Step 4. Otherwise, go back to Step 2. Step 4: Output the mixture parameters $\kappa$, $\Theta_{c_j}$, and $\omega_{c_j}$.
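The merge-and-refit loop of Steps 1 to 4 can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses 1-D Gaussian components, replaces the local Kullback divergence with a symmetric KL divergence between two Gaussians, and replaces the weak-classifier test against $\delta_1$ with a hypothetical `merge_threshold` on that divergence. All function names and default values are assumptions.

```python
import math
import random


def em(x, comps, iters=50):
    """Standard EM for a 1-D Gaussian mixture; comps is a list of (weight, mean, var)."""
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for xn in x:
            p = [w * math.exp(-0.5 * (xn - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                 for w, m, v in comps]
            s = sum(p) or 1e-300
            resp.append([pk / s for pk in p])
        # M-step: re-estimate weights, means, and variances.
        nks = [sum(r[k] for r in resp) + 1e-12 for k in range(len(comps))]
        total = sum(nks)
        new = []
        for k, nk in enumerate(nks):
            m = sum(r[k] * xn for r, xn in zip(resp, x)) / nk
            v = sum(r[k] * (xn - m) ** 2 for r, xn in zip(resp, x)) / nk + 1e-6
            new.append((nk / total, m, v))
        comps = new
    return comps


def sym_kl(a, b):
    """Symmetric KL divergence between two 1-D Gaussians (weights are ignored)."""
    (_, m1, v1), (_, m2, v2) = a, b
    d2 = (m1 - m2) ** 2
    return 0.5 * ((v1 + d2) / v2 + (v2 + d2) / v1 - 2.0)


def merge(a, b):
    """Merge two components by matching the first two moments."""
    (w1, m1, v1), (w2, m2, v2) = a, b
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    v = (w1 * (v1 + m1 ** 2) + w2 * (v2 + m2 ** 2)) / w - m ** 2
    return (w, m, v)


def adaptive_em(x, k_init=6, merge_threshold=1.0, seed=0):
    """Start with k_init components and merge the most similar pair until none overlap."""
    rng = random.Random(seed)
    mean = sum(x) / len(x)
    var = sum((xn - mean) ** 2 for xn in x) / len(x)
    comps = [(1.0 / k_init, m, var) for m in rng.sample(x, k_init)]
    while True:
        comps = em(x, comps)  # fit parameters for the current number of components
        if len(comps) == 1:
            break
        # Test all k*(k-1)/2 pairs and pick the pair with the smallest divergence.
        div, i, j = min((sym_kl(comps[i], comps[j]), i, j)
                        for i in range(len(comps)) for j in range(i + 1, len(comps)))
        if div > merge_threshold:  # stand-in stopping rule (assumption)
            break
        merged = merge(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    return comps


# Toy data (hypothetical): two well-separated clusters.
random.seed(0)
data = [random.gauss(-5, 1) for _ in range(150)] + [random.gauss(5, 1) for _ in range(150)]
model = adaptive_em(data, k_init=6)
```

The number of components that survives depends on `merge_threshold` and on the initialization, so the sketch should be read as a picture of the control flow rather than as a tuned model selector.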

Semantic Image Classification and Multi-level Annotation Given a certain test image $I_l$, its salient objects are detected automatically, so that $I_l = \{S_1, \dots, S_i, \dots, S_n\}$. The class distribution of these salient objects is then modeled as a finite mixture model $P(X, C_j \mid \kappa, \omega_{c_j}, \Theta_{c_j})$. The test image $I_l$ is finally classified into the best matching semantic image concept $C_j$ with the maximum posterior probability.
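The maximum-posterior decision can be illustrated with a small sketch. It assumes scalar features, one 1-D Gaussian mixture per concept, concept priors, and independence of the salient-object features given the concept. None of these simplifications come from the slides, and the names `log_mixture`, `classify`, `models`, and `priors` are hypothetical.

```python
import math


def log_mixture(x, comps):
    """Log-density of a scalar feature x under a 1-D Gaussian mixture [(w, mean, var), ...]."""
    return math.log(sum(w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                        for w, m, v in comps))


def classify(features, models, priors):
    """Return the concept with the largest log-posterior for one image's features."""
    scores = {
        concept: math.log(priors[concept]) + sum(log_mixture(x, comps) for x in features)
        for concept, comps in models.items()
    }
    return max(scores, key=scores.get)


# Toy mixture models for two concepts (hypothetical parameters).
models = {
    "beach": [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)],
    "garden": [(1.0, 10.0, 1.0)],
}
priors = {"beach": 0.5, "garden": 0.5}
label = classify([0.2, 3.9], models, priors)
```

Working in log space avoids multiplying many small densities together, which is the usual reason to sum log-likelihoods in this kind of decision rule.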

Semantic Image Classification and Multi-level Annotation The text keywords for interpreting the salient objects provide the annotations of the images at the content level. The text keywords for interpreting the relevant semantic concepts provide the annotations of the images at the concept level. Figure 8: The result for our multi-level image annotation system, where the image annotation includes the keywords for the salient objects “sky”, “sea water”, “sand field” and the semantic concept “beach”.

Performance Evaluation Two image databases are used: (a) a photography database of pictures obtained from the Google search engine; (b) a Corel image database with 125,000 pictures. These images (160,000 in total) are classified into 15 pre-defined classes of semantic concepts and one additional category for outliers. The training sets for the 15 semantic concepts consist of 1,800 labeled samples, where each semantic concept has 120 positive labeled samples.

Performance Evaluation

Figure 10: The performance comparison (i.e., classification precision versus classification recall) between finite mixture model and SVM approach.

Figure 12: The semantic image classification and annotation results of the natural scenes that consist of the semantic concept “garden” and the most relevant salient objects.

Performance Evaluation Figure 13: The semantic image classification and annotation results of the natural scenes that consist of the semantic concept “sailing” and the most relevant salient objects.

Conclusion This paper has proposed a novel framework to enable more effective semantic image classification and multi-level annotation. It is worth noting that the proposed automatic salient object detection and semantic image classification techniques can also be used for other image domains when the labeled training samples are available.