ICIP 2004, Singapore, October 25-27
A Comparison of Continuous vs. Discrete Image Models for Probabilistic Image and Video Retrieval
Arjen P. de Vries and Thijs Westerveld

Theory

Generative Models… aka ‘Language Modelling’
A statistical model for generating data
–Probability distribution over samples in a given ‘language’
–Chain rule for a two-symbol sample from model M: P(x1 x2 | M) = P(x1 | M) P(x2 | M, x1)
(x1 and x2 stand in for the sample symbols, which were images on the original slide)
© Victor Lavrenko, Aug

… in Information Retrieval
Basic question:
–What is the likelihood that this document is relevant to this query?
P(rel|I,Q) = P(I,Q|rel) P(rel) / P(I,Q)
P(I,Q|rel) = P(Q|I,rel) P(I|rel)

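Substituting the second identity into the first makes the query-generation term explicit. The block below only combines the two equations on the slide. Reading P(Q|I,rel) as the query-generation likelihood and P(I|rel) as a prior over documents is the usual language-modelling interpretation and is stated here as an assumption.

```latex
\begin{align*}
P(\mathrm{rel} \mid I, Q)
  &= \frac{P(I, Q \mid \mathrm{rel})\, P(\mathrm{rel})}{P(I, Q)} \\
  &= \frac{P(Q \mid I, \mathrm{rel})\, P(I \mid \mathrm{rel})\, P(\mathrm{rel})}{P(I, Q)}
\end{align*}
```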
Retrieval (Query Generation) Models
[Figure: the query is scored against the model of each document: P(Q|M1), P(Q|M2), P(Q|M3), P(Q|M4); labels: Query, Docs]

‘Language Modelling’
Not just ‘English’, but also the language of
–author
–newspaper
–text document
–image
Shakespeare or Dickens? “Indeed the short and the long. Marry, ’tis a noble Lepidus.”
Guardian or Times?
[image] or [image]?

The Fundamental Problem
Usually, we don’t know the model M
–But we have a sample representative of that model
First estimate a model from the sample
Then compute the observation probability P(observation | M(sample))
© Victor Lavrenko, Aug. 2002

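Unigram Language Models
Urn metaphor: draws are independent, so the probability of a sequence is the product of the per-draw probabilities
P(x1 x2 x3 x4 | urn) ~ P(x1 | urn) P(x2 | urn) P(x3 | urn) P(x4 | urn) = 4/9 * 2/9 * 4/9 * 3/9
(x1 to x4 stand in for the symbols drawn on the original slide)
© Victor Lavrenko, Aug
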
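The urn computation can be reproduced with exact fractions. This is a minimal sketch: the urn contents (four "A", two "B", three "C") and the drawn sequence are hypothetical, chosen only so that the factors match 4/9 * 2/9 * 4/9 * 3/9. The helper names `estimate_unigram` and `sequence_probability` are illustrative.

```python
from collections import Counter
from fractions import Fraction

def estimate_unigram(sample):
    """Estimate P(event | urn) as the relative frequency in the sample."""
    counts = Counter(sample)
    total = sum(counts.values())
    return {event: Fraction(n, total) for event, n in counts.items()}

def sequence_probability(sequence, model):
    """Unigram assumption: multiply the per-event probabilities."""
    p = Fraction(1)
    for event in sequence:
        p *= model.get(event, Fraction(0))
    return p

# Hypothetical urn with 9 balls: 4 x "A", 2 x "B", 3 x "C".
urn = ["A"] * 4 + ["B"] * 2 + ["C"] * 3
model = estimate_unigram(urn)

# 4/9 * 2/9 * 4/9 * 3/9 = 96/6561 = 32/2187
p = sequence_probability(["A", "B", "A", "C"], model)
print(p, float(p))
```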
The Zero-frequency Problem
Suppose some event does not occur in our example
–Model may assign zero probability to that event
–And to any set of events involving the unseen event

Smoothing
Idea: shift part of the probability mass to unseen events
Interpolation with background model
–Reflects expected frequency of events
–Plays role of IDF (inverse document freq.)
–λ P(x|doc) + (1-λ) P(x|background)

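Interpolation with a background model can be sketched as below: an event that never occurs in the document gets zero probability from the document model alone, but a non-zero probability after mixing. The mixing weight name `lam`, the helper names and the toy token lists are assumptions made for the example.

```python
from collections import Counter

def mle(sample):
    """Maximum-likelihood estimate: relative frequency of each event."""
    counts = Counter(sample)
    total = len(sample)
    return {event: n / total for event, n in counts.items()}

def smoothed(event, doc_model, background_model, lam=0.5):
    """Interpolate: lam * P(x|doc) + (1 - lam) * P(x|background)."""
    return (lam * doc_model.get(event, 0.0)
            + (1 - lam) * background_model.get(event, 0.0))

# Toy data (hypothetical): one document and the collection that contains it.
document = ["blue", "blue", "yellow", "blue"]
collection = document + ["yellow", "red", "yellow", "red"]

doc_model = mle(document)
background = mle(collection)

# "red" is unseen in the document: zero before smoothing, non-zero after.
p_unsmoothed = doc_model.get("red", 0.0)
p_smoothed = smoothed("red", doc_model, background, lam=0.5)
print(p_unsmoothed, p_smoothed)
```

The smoothed values still sum to one over all events, because both mixed distributions do.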
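The IDF Role of Smoothing
[λ P(x|doc) + (1-λ) P(x|background)] / [(1-λ) P(x|background)] = λ P(x|doc) / [(1-λ) P(x|background)] + 1
–Ranking independent of the divisor (1-λ) P(x|background), which is the same for every document
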
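A small numerical check of this point: dividing every factor by the document-independent term shifts all log scores by the same constant, so the ranking is unchanged, and an event missing from a document contributes exactly zero in the divided form. The models, query and function names below are hypothetical toy values.

```python
from math import log

def log_score(query, doc_model, background, lam=0.5):
    """Sum over the query of log(lam*P(x|doc) + (1-lam)*P(x|background))."""
    return sum(log(lam * doc_model.get(x, 0.0) + (1 - lam) * background[x])
               for x in query)

def log_score_idf_form(query, doc_model, background, lam=0.5):
    """Same score after dividing each factor by (1-lam)*P(x|background)."""
    return sum(log(lam * doc_model.get(x, 0.0) / ((1 - lam) * background[x]) + 1)
               for x in query)

def ranking(score_fn, query, docs, background):
    """Document ids sorted by decreasing score."""
    return sorted(docs, key=lambda d: score_fn(query, docs[d], background),
                  reverse=True)

# Toy models (hypothetical).
background = {"blue": 0.5, "yellow": 0.3, "red": 0.2}
docs = {
    "d1": {"blue": 0.75, "yellow": 0.25},
    "d2": {"yellow": 0.5, "red": 0.5},
    "d3": {"blue": 0.25, "yellow": 0.25, "red": 0.5},
}
query = ["red", "yellow"]

rank_plain = ranking(log_score, query, docs, background)
rank_idf = ranking(log_score_idf_form, query, docs, background)
print(rank_plain, rank_idf)
```

In the divided form a rare background event (small P(x|background)) yields a larger contribution when a document does contain it, which is the IDF-like behaviour the slide refers to.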
Practice

Image Retrieval
Pixel level: no semantics
Pixel blocks/regions

Modelling Images
Compute local features
–E.g., blueness and yellowness

Discrete Model
[Two figure-only slides; surviving axis labels: yellow, blue]

Modelling Images
[Figure axes: blue, yellow]
Histogram also models empty regions in the feature space
Boundaries are hard

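Both remarks about the histogram can be seen in a small sketch of the binning step. It assumes each local feature is a (blueness, yellowness) pair scaled to [0, 1] and a regular grid of 4 x 4 cells. The scaling, the grid size and the names `bin_index` and `histogram_model` are illustrative assumptions, not the authors' settings.

```python
from collections import Counter

def bin_index(feature, bins_per_dim=4):
    """Map a feature vector with components in [0, 1] to a grid cell."""
    return tuple(min(int(v * bins_per_dim), bins_per_dim - 1) for v in feature)

def histogram_model(features, bins_per_dim=4):
    """Relative frequency of each occupied grid cell."""
    counts = Counter(bin_index(f, bins_per_dim) for f in features)
    total = len(features)
    return {cell: n / total for cell, n in counts.items()}

# Hypothetical (blueness, yellowness) samples from one image.
samples = [(0.10, 0.90), (0.15, 0.85), (0.20, 0.80), (0.90, 0.10)]
model = histogram_model(samples)

# Only 2 of the 16 cells are occupied; the rest of the grid is empty.
occupied = len(model)

# Hard boundaries: two nearly identical features land in different cells.
left = bin_index((0.249, 0.5))
right = bin_index((0.251, 0.5))
print(model, occupied, left, right)
```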
Continuous Model
Build Gaussian mixture model using expectation maximisation (EM)
2 components
–Centers, covariance
–Random initialisation
[Figure axes: blue, yellow; followed by one figure-only slide]

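The recipe on the slide (two components, each with a centre and a covariance, random initialisation, EM updates) can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: the function names, the fixed iteration count, the `reg` term, the choice of initial covariances and the synthetic (blueness, yellowness) samples are all assumptions made for the example.

```python
import math
import random

def gaussian_pdf(x, mean, cov):
    """Density of a 2-D Gaussian with a full (symmetric) 2x2 covariance."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # (x - mean)^T cov^-1 (x - mean), using the closed-form 2x2 inverse.
    quad = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

def fit_gmm(points, n_components=2, n_iter=50, seed=0, reg=1e-6):
    """Fit a Gaussian mixture with EM; returns (weights, means, covariances)."""
    rng = random.Random(seed)
    n = len(points)
    # Random initialisation: centres are randomly chosen data points.
    means = rng.sample(points, n_components)
    # Start every component from the per-dimension variance of all the data.
    gx = sum(p[0] for p in points) / n
    gy = sum(p[1] for p in points) / n
    vx = sum((p[0] - gx) ** 2 for p in points) / n + reg
    vy = sum((p[1] - gy) ** 2 for p in points) / n + reg
    covs = [((vx, 0.0), (0.0, vy))] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in points:
            joint = [w * gaussian_pdf(x, m, c)
                     for w, m, c in zip(weights, means, covs)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weight, centre and covariance per component.
        for k in range(n_components):
            r = [row[k] for row in resp]
            nk = sum(r)
            mx = sum(ri * p[0] for ri, p in zip(r, points)) / nk
            my = sum(ri * p[1] for ri, p in zip(r, points)) / nk
            sxx = sum(ri * (p[0] - mx) ** 2 for ri, p in zip(r, points)) / nk
            syy = sum(ri * (p[1] - my) ** 2 for ri, p in zip(r, points)) / nk
            sxy = sum(ri * (p[0] - mx) * (p[1] - my)
                      for ri, p in zip(r, points)) / nk
            weights[k] = nk / n
            means[k] = (mx, my)
            # reg keeps the covariance invertible if a component collapses.
            covs[k] = ((sxx + reg, sxy), (sxy, syy + reg))
    return weights, means, covs

def mixture_density(x, weights, means, covs):
    """P(x | image model) under the fitted mixture."""
    return sum(w * gaussian_pdf(x, m, c)
               for w, m, c in zip(weights, means, covs))

# Hypothetical (blueness, yellowness) samples from two kinds of image region.
data_rng = random.Random(1)
sky = [(data_rng.gauss(0.8, 0.05), data_rng.gauss(0.2, 0.05)) for _ in range(100)]
sand = [(data_rng.gauss(0.2, 0.05), data_rng.gauss(0.8, 0.05)) for _ in range(100)]
weights, means, covs = fit_gmm(sky + sand)
print(weights, means)
```

Every sample is visited in each EM pass, so fitting costs far more than a single binning pass over the same samples.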
Discrete vs. Continuous
Discrete Model
–Low indexing cost (binning)
–Low retrieval cost (inverted file)
–But… how to partition the indexing space?
Continuous Model
–High indexing cost (EM algorithm)
–High retrieval cost (access all data)
–But… less overfitting, better generalisation

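The low retrieval cost of the discrete model comes from the fact that only images sharing a bin with the query need to be touched. The sketch below assumes histograms keyed by grid cell and a smoothed score of the form log(1 + λ P(x|doc) / ((1-λ) P(x|background))), under which images without the bin contribute nothing. The data, the names `build_inverted_file` and `score_query`, and the weight `lam` are hypothetical.

```python
from collections import defaultdict
from math import log

def build_inverted_file(histograms):
    """Map each occupied bin to a postings list of (image id, P(bin|image))."""
    index = defaultdict(list)
    for image_id, hist in histograms.items():
        for cell, prob in hist.items():
            index[cell].append((image_id, prob))
    return index

def score_query(index, query_cells, background, lam=0.5):
    """Accumulate scores by walking only the postings of the query's bins."""
    scores = defaultdict(float)
    for cell in query_cells:
        for image_id, prob in index.get(cell, []):
            scores[image_id] += log(1 + lam * prob / ((1 - lam) * background[cell]))
    return dict(scores)

# Toy histograms and background distribution (hypothetical).
histograms = {
    "img1": {(0, 3): 0.75, (3, 0): 0.25},
    "img2": {(3, 0): 1.0},
    "img3": {(1, 1): 1.0},
}
background = {(0, 3): 0.25, (3, 0): 0.5, (1, 1): 0.25}

index = build_inverted_file(histograms)
scores = score_query(index, [(3, 0), (3, 0)], background)
print(scores)
```

Here `img3` is never visited, whereas a mixture model would have to be evaluated for every image in the collection.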
Experiments
TRECVID 2003 search task
–Discrete vs. Continuous
–Regions vs. full query examples
–All examples vs. designated only
Mean average precision

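Mean average precision can be computed as in the sketch below. The ranked lists and relevance judgements are made-up toy data, not TRECVID results, and the function names are illustrative.

```python
def average_precision(ranked, relevant):
    """Mean of precision@k over the ranks k at which a relevant item appears,
    normalised by the total number of relevant items."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """Average of per-query average precision; runs is a list of (ranked, relevant)."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Toy queries (hypothetical).
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),   # AP = (1/1 + 2/3) / 2 = 5/6
    (["x", "y", "z"], {"z"}),             # AP = (1/3) / 1 = 1/3
]
map_score = mean_average_precision(runs)   # (5/6 + 1/3) / 2 = 7/12
print(map_score)
```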
Results
Continuous Model significantly better on almost all queries
However, Discrete Model significantly better for a small number of highly focused queries (e.g., flames, airplane taking off)
–More analysis needed though

Conclusions
Language modelling approach to IR is also applicable to retrieval of other media
Discrete vs. Continuous Model
–Continuous Model almost always better
–Unfortunately, Discrete Model far easier to implement efficiently

Future Work
Improve sampling process
–Better texture representation?
–Overlapping, multi-scale image patches?
Improve Discrete Model
–Partitioning of feature space in grid cells
Compare the performance of the two models in an interactive setting with relevance feedback
–Higher quality per iteration vs. many iterations