Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search. Lu Jiang, Deyu Meng, Teruko Mitamura, Alexander G. Hauptmann.

Presentation transcript:

Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search Lu Jiang 1, Deyu Meng 2, Teruko Mitamura 1, Alexander G. Hauptmann 1 1 School of Computer Science, Carnegie Mellon University 2 School of Mathematics and Statistics, Xi'an Jiaotong University

People CMU Informedia Team Teruko Mitamura, Deyu Meng, Alexander G. Hauptmann

Acknowledgement This work was partially supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior National Business Center contract number D11PC. Deyu Meng was partially supported by the 973 Program of China ( CB329404) and the NSFC project ( ). The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DoI/NBC, or the U.S. Government.

Outline  Background  Related Work  Self-Paced Reranking (SPaR)  Experiment Results  Conclusions

Outline  Background  Related Work  Self-Paced Reranking (SPaR)  Experiment Results  Conclusions

Zero-Example Search Zero-Example Search (also known as 0Ex) is a multimedia search condition in which zero relevant examples are provided. An example: TRECVID Multimedia Event Detection (MED). The task is very challenging. – Detect everyday events in Internet videos: Birthday Party, Changing a vehicle tire, Wedding ceremony – Content-based search: no textual metadata (title/description) is available.

Zero-Example Search Event of Interest: Birthday Party ELAMP Prototype System

Zero-Example Search Event of Interest: Birthday Party ELAMP Prototype System Birthday Cake, Kids, Gift

Zero-Example Search Event of Interest: Birthday Party ELAMP Prototype System Birthday Cake, Kids, Gift Happy Birthday, Cheering

Zero-Example Search Event of Interest: Birthday Party ELAMP Prototype System Birthday Cake, Kids, Gift Happy Birthday, Cheering Birthday Party

Zero-Example Search Event of Interest: Birthday Party ELAMP Prototype System Birthday Cake, Kids, Gift Happy Birthday, Cheering Birthday Party 0Ex System

Reranking Intuition: initial ranked result is noisy. Refined by the multimodal info residing in the top ranked videos/images. Reranking
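The reranking loop described here can be sketched in a few lines. This is an illustrative assumption, not the paper's exact system: it uses a single modality and scikit-learn's logistic regression as the reranking model, taking top-ranked items as pseudo-positives and bottom-ranked items as pseudo-negatives.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def prf_rerank(features, init_scores, n_pos=10, n_neg=50, n_iter=3):
    """Classification-based pseudo-relevance-feedback reranking (sketch).

    features: (n_samples, n_dims) array of one modality's features.
    init_scores: initial retrieval scores (higher = more relevant).
    Returns the reranked sample indices, best first.
    """
    scores = np.asarray(init_scores, dtype=float)
    for _ in range(n_iter):
        order = np.argsort(-scores)                 # best first
        pos = order[:n_pos]                         # pseudo-positives: top ranked
        neg = order[-n_neg:]                        # pseudo-negatives: bottom ranked
        X = np.vstack([features[pos], features[neg]])
        y = np.array([1] * n_pos + [0] * n_neg)
        clf = LogisticRegression().fit(X, y)        # refit the reranking model
        scores = clf.predict_proba(features)[:, 1]  # refined relevance scores
    return np.argsort(-scores)
```

Each iteration refines the ranked list, which in turn changes which samples are treated as pseudo-positive in the next iteration.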

Outline  Background  Related Work  Self-Paced Reranking (SPaR)  Experiment Results  Conclusions

Related Work Categorization of reranking methods:
– Classification-based (Yan et al. 2003) (Hauptmann et al. 2008) (Jiang et al. 2014)
– Clustering-based (Hsu et al. 2007)
– LETOR (LEarning TO Rank)-based (Liu et al. 2008) (Tian et al. 2008) (Tian et al. 2011)
– Graph-based (Hsu et al. 2007) (Nie et al. 2012)
R. Yan, A. G. Hauptmann, and R. Jin. Multimedia search with pseudo-relevance feedback. In CIVR, 2003.
A. G. Hauptmann, M. G. Christel, and R. Yan. Video retrieval based on semantic concepts. Proceedings of the IEEE, 96(4):602–622, 2008.
L. Jiang, T. Mitamura, S.-I. Yu, and A. G. Hauptmann. Zero-example event search using multimodal pseudo relevance feedback. In ICMR, 2014.
W. H. Hsu, L. S. Kennedy, and S.-F. Chang. Video search reranking through random walk over document-level context graph. In ACM Multimedia, 2007.
Y. Liu, T. Mei, X.-S. Hua, J. Tang, X. Wu, and S. Li. Learning to video search rerank via pseudo preference feedback. In ICME, 2008.
X. Tian, Y. Lu, L. Yang, and Q. Tian. Learning to judge image search results. In ACM Multimedia, 2011.
X. Tian, L. Yang, J. Wang, Y. Yang, X. Wu, and X.-S. Hua. Bayesian video search reranking. In ACM Multimedia, 2008.
L. Nie, S. Yan, M. Wang, R. Hong, and T.-S. Chua. Harvesting visual concepts for image search with complex queries. In ACM Multimedia, 2012.

Generic Reranking Algorithm

Pseudo labels: assumed (hidden) labels for samples. In zero-example search, the ground-truth labels are unknown.

Generic Reranking Algorithm Iteration: 1

Generic Reranking Algorithm Iteration: 1 Iteration: 2

Intuition Existing methods assign equal weights to pseudo samples. Intuition: samples ranked at the top are generally more relevant than those ranked lower. Our approach: learn the weights together with the reranking model.

Generic Reranking Algorithm Questions: 1. Why does the reranking algorithm perform iteratively? 2. Does the process converge? If so, to where? 3. Does an arbitrarily predefined weighting scheme converge?

Outline  Background  Related Work  Self-Paced Reranking (SPaR)  Experiment Results  Conclusions

Self-paced Learning Curriculum learning (Bengio et al. 2009) and self-paced learning (Kumar et al. 2010) are recently proposed learning paradigms inspired by the learning process of humans and animals: samples are not learned in random order but organized in a meaningful order, from easy examples to gradually more complex ones. Prof. Koller Prof. Bengio Y. Bengio, J. Louradour, R. Collobert, and J. Weston. Curriculum learning. In ICML, 2009. M. P. Kumar, B. Packer, and D. Koller. Self-paced learning for latent variable models. In NIPS, pages 1189–1197, 2010.

Self-paced Learning From easy samples to complex samples. – Easy sample: smaller loss under the already-learned model. – Complex sample: bigger loss under the already-learned model.

Self-paced Learning In the context of reranking: easy samples are the top-ranked videos, which have smaller loss. (Figure: ranked list of iteration 1 through ranked list of iteration n, along the age axis.)

Self-paced Reranking (SPaR) We propose a novel framework named Self-Paced Reranking (SPaR), pronounced /'spä/. It is inspired by self-paced learning theory and formulates reranking as a concise optimization problem.

Self-paced Reranking (SPaR) The proposed model: Reranking models for each modality. The weight for each sample. The pseudo label.

Self-paced Reranking (SPaR) The proposed model: loss-function constraints regularizer The loss in the reranking model is discounted by a weight. Reranking models for each modality. The weight for each sample. The pseudo label.

Self-paced Reranking (SPaR) The proposed model: regularizer For example, the loss in the SVM model. Reranking models for each modality. The weight for each sample. The pseudo label.

Self-paced Reranking (SPaR) The proposed model: loss-function constraints is the total number of modalities. Reranking models for each modality. The weight for each sample. The pseudo label. is the self-paced function in self-paced learning; it is implemented as a regularizer and physically corresponds to the learning schemes that humans use to learn different tasks.

Self-paced Function The age parameter in self-paced learning physically corresponds to the age of the learner. We propose a definition that provides an axiom for self-paced learning: the self-paced function must be convex.

Self-paced Function We propose a definition that provides an axiom for self-paced learning: the weighting favors easy samples. The age parameter in self-paced learning physically corresponds to the age of the learner.

Self-paced Function We propose a definition that provides an axiom for self-paced learning: when the model is young, use fewer samples; when the model is mature, use more. The age parameter in self-paced learning physically corresponds to the age of the learner.

Self-paced Function Existing self-paced functions only support binary weighting (Kumar et al. 2010). We augment the weighting schemes and propose the following soft weightings. M. P. Kumar, B. Packer, and D. Koller. Self-paced learning for latent variable models. In NIPS, pages 1189–1197, 2010. Binary weighting Linear weighting Logarithmic weighting Mixture weighting

Self-paced Function Existing self-paced functions only support binary weighting (Kumar et al. 2010). Binary weighting Linear weighting Logarithmic weighting Mixture weighting
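The binary scheme of Kumar et al. (2010) and a linear soft-weighting scheme have simple closed-form weight updates; the sketch below shows these two. The exact logarithmic and mixture forms are given in the paper and are omitted here, and the threshold behavior shown is an assumption of the standard formulations, not a transcription of the paper's equations.

```python
import numpy as np

def binary_weights(losses, age):
    """Binary weighting (Kumar et al. 2010): a sample is included
    (weight 1) iff its loss falls below the age threshold, else 0."""
    losses = np.asarray(losses, dtype=float)
    return (losses < age).astype(float)

def linear_weights(losses, age):
    """Linear soft weighting: the weight decays linearly from 1 (zero
    loss) down to 0 (loss at or above the age threshold)."""
    losses = np.asarray(losses, dtype=float)
    return np.clip(1.0 - losses / age, 0.0, 1.0)
```

As the age parameter grows, the threshold rises, so more (harder) samples receive nonzero weight, matching the "young model uses fewer samples, mature model uses more" intuition above.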

Reranking in Optimization and Conventional Perspective CCM (the Cyclic Coordinate Method) is used to solve the problem: alternately optimize one block of variables while holding the others fixed.

Reranking in Optimization and Conventional Perspective The optimization perspective offers theoretical justifications; the conventional perspective offers practical lessons. Reranking is a self-paced learning process.

Reranking in Optimization and Conventional Perspective Q1: Why does the reranking algorithm perform iteratively? A: It is a self-paced learning process, mimicking how humans and animals learn (from easy to more complex examples). Q2: Does the process converge? If so, to where? A: Yes, to a local optimum. See the theorem in our paper. Q3: Does an arbitrarily predefined weighting scheme converge? A: No, but the weights given by the self-paced function guarantee convergence.

Outline  Background  Related Work  Self-Paced Reranking (SPaR)  Experiment Results  Conclusions

TRECVID Multimedia Event Detection Dataset: MED13Test (around 34,000 videos) with 20 predefined events. Tested on NIST's split (25,000 videos). Evaluated by Mean Average Precision. Four types of high-level features: – ASR, OCR, SIN, and ImageNet DCNN Two types of low-level features: – Dense trajectory and MFCC Configurations: – Mixture self-paced function – Starting values obtained by MMPRF – Setting the age parameter to include a certain number of samples.
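Mean Average Precision, the metric used in these experiments, can be computed as follows. This is a standard textbook implementation, not NIST's official scoring tool.

```python
import numpy as np

def average_precision(ranked_relevance):
    """AP for one query: ranked_relevance is a 1/0 relevance list in
    ranked order (best-ranked result first)."""
    rel = np.asarray(ranked_relevance, dtype=float)
    if rel.sum() == 0:
        return 0.0
    # Precision at each rank, counted only at ranks holding a relevant item.
    precision_at_k = np.cumsum(rel) / (np.arange(len(rel)) + 1)
    return float((precision_at_k * rel).sum() / rel.sum())

def mean_average_precision(per_query_relevance):
    """MAP: the mean of per-query average precisions."""
    return float(np.mean([average_precision(r) for r in per_query_relevance]))
```

For example, a ranked list with relevant items at ranks 1 and 3 has AP = (1/1 + 2/3) / 2 = 5/6.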

Results on MED13Test By far the best MAP reported on this dataset for the 0Ex task! Outperforms MMPRF on 15 of 20 events.

Comparison of top-ranked videos

TRECVID MED 2014 Very challenging task: – Search over MED14Eval Full (200K videos) – The ground-truth labels are unavailable. – Can only submit one run. – Ad-hoc queries (events) are unknown to the system. SPaR yields outstanding improvements under the TRECVID MED14 000Ex and 010Ex conditions! It takes no more than 60 seconds per query on a workstation. A cost-effective method!

Web Query Dataset Web images (353 queries over 71,478 images). Densely sampled SIFT features are extracted. Parameters are tuned on a validation set. The mixture self-paced function is used. SPaR also works for image reranking (single modality).

Discussions Two scenarios where SPaR fails: – The initial top-ranked videos are completely off-topic. – The features used in reranking are not discriminative for the queries. SPaR is sensitive to random starting values – initialize with existing reranking algorithms such as MMPRF/CPRF. Tune the age parameter using statistics collected from the ranked samples – as opposed to absolute values.
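The last point, tuning the age parameter from ranked-sample statistics rather than absolute loss values, can be sketched as choosing the threshold that admits exactly the k easiest samples under binary weighting. The helper below is an illustrative assumption, not the paper's code, and ignores the possibility of tied losses.

```python
import numpy as np

def age_for_top_k(losses, k, eps=1e-8):
    """Pick the age parameter as just above the k-th smallest loss, so the
    k easiest samples fall below the threshold (binary weighting). This
    adapts to the loss scale instead of hand-picking an absolute value."""
    losses = np.sort(np.asarray(losses, dtype=float))
    return float(losses[min(k, len(losses)) - 1] + eps)
```

With this helper, "let the model age" between iterations simply means calling it with a larger k.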

Outline  Motivation  Related Work  Jensen - Shannon Tiling  Experiment Results  Conclusions

Summary A few messages to take away from this talk: – Reranking follows the self-paced learning process. – SPaR is a novel and general framework with theoretical backgrounds for multimodal reranking. – SPaR achieves by far the best result on the Multimedia Event Detection zero-example search.

Relation to Existing Reranking Methods Learning to rank model Classifier Y. Liu, T. Mei, X.-S. Hua, J. Tang, X. Wu, and S. Li. Learning to video search rerank via pseudo preference feedback. In ICME, 2008. L. Jiang, T. Mitamura, S.-I. Yu, and A. G. Hauptmann. Zero-example event search using multimodal pseudo relevance feedback. In ICMR, 2014. (Jiang et al. 2014) (Liu et al. 2008)

Relation to Existing Reranking Methods Learning to rank model Classifier Single modality (Yan et al. 2003) Y. Liu, T. Mei, X.-S. Hua, J. Tang, X. Wu, and S. Li. Learning to video search rerank via pseudo preference feedback. In ICME, 2008. L. Jiang, T. Mitamura, S.-I. Yu, and A. G. Hauptmann. Zero-example event search using multimodal pseudo relevance feedback. In ICMR, 2014. R. Yan, A. G. Hauptmann, and R. Jin. Multimedia search with pseudo-relevance feedback. In CIVR, 2003. (Jiang et al. 2014) (Liu et al. 2008)

SPaR for Practitioners 1. Pick a self-paced function – Binary/Linear/Logarithmic/Mixture weighting. 2. Pick a favorite reranking model – SVM*/Logistic Regression/Learning to Rank. 3. Get reasonable starting values – Initializing by existing reranking algorithms. *weighted sample LibSVM

SPaR for Practitioners Iterate the following steps: – Train a reranking model using the pseudo samples. – Select pseudo-positive samples and their weights via the self-paced function; select some pseudo-negative samples randomly. – Increase the model age to include more positive samples in the next iteration (set it to include a certain number of examples).
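The practitioner recipe above might look like the following sketch, assuming binary self-paced weights and an SVM reranking model as suggested. The function name, the growing k_schedule (playing the role of the age parameter), and the random negative sampling are illustrative choices, not the paper's implementation.

```python
import numpy as np
from sklearn.svm import SVC

def spar_iterate(features, init_scores, n_neg=50, k_schedule=(10, 20, 40)):
    """Sketch of the SPaR practitioner loop with binary self-paced weights.

    Each round: take the current top-k as pseudo-positives (the model's
    "easy" samples), sample pseudo-negatives randomly from the rest, train
    a weighted SVM, then grow k so the next round sees more samples.
    """
    scores = np.asarray(init_scores, dtype=float)
    rng = np.random.default_rng(0)
    for k in k_schedule:                            # the model "ages" each round
        order = np.argsort(-scores)
        pos = order[:k]                             # easy pseudo-positives first
        neg = rng.choice(order[k:], size=min(n_neg, len(order) - k), replace=False)
        X = np.vstack([features[pos], features[neg]])
        y = np.array([1] * len(pos) + [0] * len(neg))
        w = np.ones(len(y))                         # binary weights; soft schemes would vary these
        clf = SVC().fit(X, y, sample_weight=w)
        scores = clf.decision_function(features)    # rescore everything
    return np.argsort(-scores)                      # final reranked indices
```

A soft weighting scheme would replace the all-ones `w` with weights computed from each pseudo-positive's loss under the current model.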

Cyclic Coordinate Algorithm The proposed model: Reranking models for each modality. The weight for each sample. The pseudo label. Algorithm (Cyclic Coordinate Method): 1. Fix the weights and pseudo labels; optimize the reranking models, using existing off-the-shelf algorithms. 2. Fix the models and weights; optimize the pseudo labels by enumerating binary labels. 3. Fix the models and pseudo labels; optimize the weights, selecting samples and their weights for the next iteration. 4. Increase the age parameter to include more samples.