Using Large-Scale Web Data to Facilitate Textual Query Based Retrieval of Consumer Photos Yiming Liu, Dong Xu, Ivor W. Tsang, Jiebo Luo Nanyang Technological.

Presentation transcript:

Using Large-Scale Web Data to Facilitate Textual Query Based Retrieval of Consumer Photos Yiming Liu, Dong Xu, Ivor W. Tsang, Jiebo Luo Nanyang Technological University & Kodak Research Lab

Motivation
Digital cameras and mobile phone cameras are rapidly becoming popular:
– More and more personal photos are taken;
– Retrieving images from enormous collections of personal photos has become an important topic.
How can these photos be retrieved?

Previous Work
Content-Based Image Retrieval (CBIR):
– Users provide images as queries to retrieve personal photos.
The paramount challenge is the semantic gap:
– The gap between low-level visual features and high-level semantic concepts.
[Diagram: the query image is converted to a low-level feature vector and compared with the feature vectors in the database to produce the result; the semantic gap separates the low-level feature vector from the image's high-level concept.]

A More Natural Way for Consumer Applications
Let the user retrieve the desired personal photos using textual queries.
Image annotation is used to classify images with respect to high-level semantic concepts:
– Semantic concepts are analogous to the textual terms describing document contents.
– Annotation is an intermediate stage for textual query based image retrieval.
[Diagram: database photos are annotated with high-level concepts; the textual query (e.g. “Sunset”) is compared with the annotation results to rank the retrieved photos.]

Our Goal
Web images are accompanied by tags, categories and titles (e.g. “building”; “people, family”; “people, wedding”; “sunset”).
Leverage information from web images to retrieve consumer photos in personal photo collections.
– No intermediate image annotation process.
A real-time textual query based consumer photo retrieval system without any intermediate annotation stage.
[Diagram: contextual information is transferred from web images to consumer photos.]

System Framework
When the user provides a textual query:
– The query is used to find relevant and irrelevant images in the web image collection.
– A classifier is then trained on these web images.
– Consumer photos are ranked by the classifier’s decision values.
– The user can also give relevance feedback to refine the retrieval results.
[Diagram: Textual Query → Automatic Web Image Retrieval (using WordNet and a large collection of web images with descriptive words) → Relevant/Irrelevant Images → Classifier → Consumer Photo Retrieval (over raw consumer photos) → Top-Ranked Consumer Photos → Relevance Feedback → Refined Top-Ranked Photos.]

Automatic Web Image Retrieval
The user’s textual query (e.g. “boat”) is first looked up in the semantic word trees based on WordNet (e.g. boat: ark, barge, dredger, houseboat, …).
– Web images whose text contains the query word are retrieved through the inverted file and considered “relevant web images”.
– Web images whose text contains neither the query word nor any of its two-level descendants are considered “irrelevant web images”.
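The selection of relevant and irrelevant web images could be sketched as follows. This is a minimal illustration, not the authors' implementation: the word tree is a hand-written dictionary standing in for WordNet, and the names `build_inverted_file`, `descendants` and `split_web_images` as well as the toy data are hypothetical.

```python
from collections import defaultdict


def build_inverted_file(image_words):
    """Map each word to the set of image ids whose text contains it."""
    inverted = defaultdict(set)
    for image_id, words in image_words.items():
        for word in words:
            inverted[word].add(image_id)
    return inverted


def descendants(tree, word, depth=2):
    """Collect the descendants of `word` down to `depth` levels of a child-list tree."""
    found, frontier = set(), [word]
    for _ in range(depth):
        frontier = [child for w in frontier for child in tree.get(w, [])]
        found.update(frontier)
    return found


def split_web_images(query, image_words, tree):
    """Return (relevant, irrelevant) image id sets for a query word."""
    inverted = build_inverted_file(image_words)
    relevant = set(inverted.get(query, set()))
    # Images mentioning the query or any two-level descendant are never "irrelevant".
    excluded = set(relevant)
    for word in descendants(tree, query, depth=2):
        excluded |= inverted.get(word, set())
    irrelevant = set(image_words) - excluded
    return relevant, irrelevant


# Toy word tree and image descriptions (hypothetical stand-ins for WordNet and photoSIG text).
tree = {"boat": ["ark", "barge", "houseboat"], "barge": ["dredger"]}
image_words = {
    1: {"boat", "lake"},
    2: {"barge", "river"},
    3: {"dredger"},
    4: {"sunset", "beach"},
    5: {"wedding", "people"},
}
relevant, irrelevant = split_web_images("boat", image_words, tree)
print(relevant, irrelevant)
```

Images 2 and 3 mention descendants of the query word, so in this sketch they are left out of both sets rather than being used as negative examples.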

Decision Stump Ensemble
– Train a decision stump on each feature dimension.
– Combine the stumps, weighting them according to their training error rates.
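A decision stump ensemble of this kind might look like the sketch below. The slide does not give the threshold rule or the exact weighting formula, so two assumptions are made here and marked in the comments: each threshold is the midpoint between the class means, and each stump is weighted by 0.5 minus its training error. The function names are hypothetical.

```python
import numpy as np


def train_stumps(X, y):
    """Fit one threshold stump per feature dimension; y holds labels in {-1, +1}."""
    pos_mean = X[y == 1].mean(axis=0)
    neg_mean = X[y == -1].mean(axis=0)
    thresholds = (pos_mean + neg_mean) / 2.0          # assumption: midpoint threshold
    signs = np.where(pos_mean >= neg_mean, 1.0, -1.0)  # orientation of each stump
    preds = signs * np.sign(X - thresholds)
    preds[preds == 0] = 1.0
    errors = (preds != y[:, None]).mean(axis=0)        # training error per stump
    weights = 0.5 - errors                             # assumption: error-based weight
    return thresholds, signs, weights


def decision_values(X, thresholds, signs, weights):
    """Weighted vote of all stumps; larger values mean 'more relevant'."""
    votes = signs * np.sign(X - thresholds)
    votes[votes == 0] = 1.0
    return votes @ weights


# Tiny toy example: two positive and two negative samples in three dimensions.
X = np.array([[2.0, 3.0, 1.0], [3.0, 2.0, 2.0], [-2.0, -1.0, -3.0], [-1.0, -3.0, -2.0]])
y = np.array([1, 1, -1, -1])
thresholds, signs, weights = train_stumps(X, y)
scores = decision_values(X, thresholds, signs, weights)
print(scores)
```

Each stump only needs one pass over one feature column, which is why training and testing are cheap and trivially parallel across dimensions.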

Why Decision Stump Ensemble?
Main reason: low time cost.
– Our goal: a (quasi) real-time retrieval system.
– As base classifiers, SVMs are much slower.
– As a combination scheme, boosting is also much slower.
Advantages of the decision stump ensemble:
– Low training cost;
– Low testing cost;
– Very easy to parallelize.

Asymmetric Bagging
Class imbalance: count(irrelevant) >> count(relevant).
– Side effects, e.g. overfitting.
Solution: asymmetric bagging.
– Build 100 training sets, each combining the relevant web images with a different random sample of the irrelevant web images.
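The construction of the training sets could be sketched as below. The slide only says that the irrelevant images are re-sampled 100 times; drawing as many irrelevant samples as there are relevant ones is an assumption of this sketch, and `asymmetric_bags` is a hypothetical name.

```python
import numpy as np


def asymmetric_bags(X_rel, X_irr, n_bags=100, seed=0):
    """Yield (X, y) training sets: all relevant samples plus a random irrelevant subset."""
    rng = np.random.default_rng(seed)
    n_rel = len(X_rel)
    n_draw = min(n_rel, len(X_irr))  # assumption: balanced bags
    for _ in range(n_bags):
        idx = rng.choice(len(X_irr), size=n_draw, replace=False)
        X = np.vstack([X_rel, X_irr[idx]])
        y = np.concatenate([np.ones(n_rel), -np.ones(n_draw)])
        yield X, y


# Toy data: 5 relevant and 50 irrelevant feature vectors.
X_rel = np.ones((5, 3))
X_irr = np.zeros((50, 3))
bags = list(asymmetric_bags(X_rel, X_irr, n_bags=100))
print(len(bags), bags[0][0].shape)
```

A classifier would then be trained on each bag and the resulting decision values averaged or summed, so that no single sample of irrelevant images dominates the result.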

Relevance Feedback
The user labels n_l consumer photos as relevant or irrelevant.
– This information is used to further refine the retrieval results.
Challenge 1: n_l is usually small.
Challenge 2: cross-domain learning.
– The source classifier is trained on the web image domain.
– The user labels some personal photos, which come from the consumer photo domain.

Method 1: Cross-Domain Combination of Classifiers
Re-train the classifiers with data from both domains?
– Neither effective nor efficient.
A simple but effective method:
– Train an SVM on the consumer photo domain with the user-labeled photos;
– Convert the responses of the source classifier and the SVM classifier to probabilities, and add them up;
– Rank consumer photos by this sum.
Referred to as DS_S+SVM_T.
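The combination step could be sketched as follows. The slide does not say how decision values are converted to probabilities; a plain logistic sigmoid is assumed here, and the function names and toy scores are hypothetical.

```python
import numpy as np


def to_probability(decision_values):
    """Assumption: map decision values to (0, 1) with a logistic sigmoid."""
    values = np.asarray(decision_values, dtype=float)
    return 1.0 / (1.0 + np.exp(-values))


def combined_ranking(source_scores, target_scores):
    """Sum the two probabilities and return photo indices from best to worst."""
    total = to_probability(source_scores) + to_probability(target_scores)
    return np.argsort(-total), total


# Toy decision values for three consumer photos.
source_scores = [2.0, -1.0, 0.5]   # from the source (web-trained) classifier
target_scores = [-0.5, 0.0, 3.0]   # from the SVM trained on user-labeled photos
order, total = combined_ranking(source_scores, target_scores)
print(order, total)
```

Because neither classifier is re-trained on the other domain's data, the extra cost per feedback round is one small SVM fit plus a vector addition.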

Method 2: Cross-Domain Regularized Regression (CDRR)
Construct a linear regression function f_T(x):
– For labeled photos: f_T(x_i) ≈ y_i;
– For unlabeled photos: f_T(x_i) ≈ f_s(x_i), where f_s is the source classifier.

Design a target linear classifier f_T(x) = w^T x.
– User-labeled images x_1, …, x_l: f_T(x) should be the user’s label y(x).
– Other images: f_T(x) should be f_s(x).
– A regularizer controls the complexity of the target classifier f_T(x).
This problem can be solved with a least-squares solver.
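One way to read the three requirements above is as a ridge-style least-squares problem with a closed-form solution. The sketch below assumes the objective reg·||w||² + Σ_labeled (w·x_i − y_i)² + mu·Σ_unlabeled (w·x_i − f_s(x_i))²; the trade-off weights `reg` and `mu` and the function name `fit_cdrr` are hypothetical, and the exact objective used by the authors is not shown in this transcript.

```python
import numpy as np


def fit_cdrr(X_lab, y_lab, X_unl, fs_unl, reg=1.0, mu=1.0):
    """Solve the assumed regularized least-squares problem for the weight vector w.

    X_lab, y_lab: user-labeled photos and their labels.
    X_unl, fs_unl: unlabeled photos and the source classifier's outputs on them.
    """
    d = X_lab.shape[1]
    A = reg * np.eye(d) + X_lab.T @ X_lab + mu * (X_unl.T @ X_unl)
    b = X_lab.T @ y_lab + mu * (X_unl.T @ fs_unl)
    return np.linalg.solve(A, b)


# Sanity check on toy data: if the source outputs and the label all come from
# the same linear function, a nearly unregularized fit should recover it.
w_true = np.array([1.0, -2.0, 0.5])
X_unl = 2.0 * np.eye(3)
fs_unl = X_unl @ w_true
X_lab = np.array([[1.0, 1.0, 1.0]])
y_lab = X_lab @ w_true
w = fit_cdrr(X_lab, y_lab, X_unl, fs_unl, reg=1e-9, mu=1.0)
print(w)
```

The linear system has only as many unknowns as feature dimensions, so the solve stays cheap even when many unlabeled photos contribute to it.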

Hybrid Method
A combination of the two methods. For labeled consumer photos:
– Measure the average distance d_avg to their 30 nearest unlabeled neighbors in feature space;
– If d_avg < ε: use DS_S+SVM_T;
– Otherwise: use CDRR.
Reason:
– Consumer photos that are visually similar to user-labeled images should be influenced more by the user-labeled images.
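The switching rule could be sketched as below for a single labeled photo. Euclidean distance is an assumption, as are the function names; the slide does not say how the decisions for several labeled photos are aggregated, so this sketch stops at the per-photo choice.

```python
import numpy as np


def average_neighbor_distance(x_labeled, X_unlabeled, k=30):
    """Average Euclidean distance from one labeled photo to its k nearest unlabeled photos."""
    dists = np.linalg.norm(X_unlabeled - x_labeled, axis=1)
    k = min(k, len(dists))
    return float(np.sort(dists)[:k].mean())


def choose_method(x_labeled, X_unlabeled, eps, k=30):
    """Pick DS_S+SVM_T when the labeled photo has close unlabeled neighbors, else CDRR."""
    d_avg = average_neighbor_distance(x_labeled, X_unlabeled, k)
    return "DS_S+SVM_T" if d_avg < eps else "CDRR"


# Toy example with two unlabeled photos at distances 0 and 5 from the labeled one.
X_unlabeled = np.array([[0.0, 0.0], [3.0, 4.0]])
x_labeled = np.array([0.0, 0.0])
print(average_neighbor_distance(x_labeled, X_unlabeled, k=2))
print(choose_method(x_labeled, X_unlabeled, eps=3.0, k=2))
```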

Experimental Results

Dataset and Experimental Setup
Web image database:
– 1.3 million photos from photoSIG;
– Relatively professional photos.
Text descriptions for web images:
– Title, portfolio, and categories accompanying the web images;
– Remove the common high-frequency words;
– Remove the rarely used words;
– Finally, words in our vocabulary.

Dataset and Experimental Setup
Testing dataset #1: Kodak dataset
– Collected by Eastman Kodak Company from about 100 real users over a period of one year.
– 1358 images: the first keyframe from each video.
– 21 concepts: we merge “group_of_two” and “group_of_three_or_more” into one concept.

Dataset and Experimental Setup
Testing dataset #2: Corel dataset
– 4999 images, 192x128 or 128x192 pixels.
– 43 concepts: we remove all concepts with fewer than 100 images.

Visual Features
Grid-based color moments (225D):
– Three moments of three color channels from each block of a 5x5 grid.
Edge direction histogram (73D):
– 72 edge direction bins plus one non-edge bin.
Wavelet texture (128D).
Concatenate all three kinds of features:
– Normalize each dimension to mean 0 and standard deviation 1;
– Use the first 103 principal components.
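The normalization and projection steps could be sketched as below, starting from an already concatenated feature matrix. Feature extraction itself is not shown; random numbers stand in for real features, PCA is computed with a plain SVD, and the function name `standardize_and_project` is hypothetical.

```python
import numpy as np

# Dimensions stated on the slide: 3 moments x 3 channels x 25 blocks, 72 + 1 bins, 128.
FEATURE_DIMS = {
    "color_moments": 3 * 3 * 5 * 5,
    "edge_histogram": 72 + 1,
    "wavelet_texture": 128,
}
TOTAL_DIM = sum(FEATURE_DIMS.values())  # 426 dimensions after concatenation


def standardize_and_project(F, n_components=103):
    """Standardize each column, then keep the leading principal components."""
    mean = F.mean(axis=0)
    std = F.std(axis=0)
    std[std == 0] = 1.0                       # guard against constant columns
    Z = (F - mean) / std
    # PCA via SVD of the standardized (zero-mean) matrix.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    k = min(n_components, Vt.shape[0])
    return Z @ Vt[:k].T


# Placeholder features for 200 images (random values, not real image features).
rng = np.random.default_rng(0)
F = rng.normal(size=(200, TOTAL_DIM))
projected = standardize_and_project(F)
print(TOTAL_DIM, projected.shape)
```

In a real system the mean, standard deviation and projection matrix would be estimated once and reused for both web images and consumer photos, so that both live in the same feature space.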

Retrieval without Relevance Feedback For all concepts: –Average number of relevant images:

Retrieval without Relevance Feedback
– kNN: rank consumer photos by their average distance to their 300 nearest neighbors among the relevant web images.
– DS_S: decision stump ensemble.
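The kNN baseline could be sketched as below. Euclidean distance is assumed, the function name `knn_rank` is hypothetical, and the brute-force distance matrix is only practical for small inputs.

```python
import numpy as np


def knn_rank(X_photos, X_relevant, k=300):
    """Rank consumer photos by average distance to their k nearest relevant web images."""
    # Pairwise Euclidean distances, shape (n_photos, n_relevant).
    diffs = X_photos[:, None, :] - X_relevant[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    k = min(k, X_relevant.shape[0])
    avg = np.sort(dists, axis=1)[:, :k].mean(axis=1)
    return np.argsort(avg), avg   # smaller average distance ranks first


# Toy example: two relevant web images and three consumer photos.
X_relevant = np.array([[0.0, 0.0], [1.0, 0.0]])
X_photos = np.array([[5.0, 5.0], [0.0, 0.0], [2.0, 0.0]])
order, avg = knn_rank(X_photos, X_relevant, k=2)
print(order, avg)
```

Every query requires distances from each consumer photo to all relevant web images, which is why this baseline gets expensive as the collections grow.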

Retrieval without Relevance Feedback
Time cost:
– We use OpenMP to parallelize our method;
– With 8 threads, both methods reach interactive speed;
– However, kNN is expected to be much more time-consuming on large-scale datasets.

Retrieval with Relevance Feedback
In each round, the user labels at most one positive and one negative image among the top 40 results.
Methods for comparison:
– kNN_RF: add the user-labeled photos to the relevant image set and re-apply kNN;
– SVM_T: train an SVM on the user-labeled images in the target domain;
– A-SVM: Adaptive SVM;
– MR: Manifold Ranking based relevance feedback method.

Retrieval with Relevance Feedback
Setting of y(x) for CDRR:
– Positive: +1.0;
– Negative: -0.1.
Reason:
– The top-ranked negative images are not extremely negative;
– Positive: “what is”; negative: “what is not”.

Retrieval with Relevance Feedback On Corel dataset:

Retrieval with Relevance Feedback On Kodak dataset:

Retrieval with Relevance Feedback
Time cost:
– All methods except A-SVM achieve real-time speed.

System Demonstration

Query: Sunset

Query: Plane

The User is Providing The Relevance Feedback …

After feedback with 2 positive and 2 negative images…

Summary
Our goal: (quasi) real-time textual query based consumer photo retrieval.
Our method:
– Use web images and their surrounding text descriptions as an auxiliary database;
– Asymmetric bagging with decision stumps;
– Several simple but effective cross-domain learning methods to help relevance feedback.

Future Work
How to efficiently use more powerful source classifiers?
How to further improve the speed:
– Keep the training time within 1 second;
– Control the testing time when the consumer photo set is very large.

Thank you! Any questions?