Multimedia Retrieval. Outline Overview Indexing Multimedia Generative Models & MMIR –Probabilistic Retrieval –Language models, GMMs Experiments –Corel.

Slides:



Advertisements
Similar presentations
Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki
Advertisements

© Arjen P. de Vries Arjen P. de Vries Fascinating Relationships between Media and Text.
Chapter 5: Introduction to Information Retrieval
UCLA : GSE&IS : Department of Information StudiesJF : 276lec1.ppt : 5/2/2015 : 1 I N F S I N F O R M A T I O N R E T R I E V A L S Y S T E M S Week.
TRECVID 2004 Tzvetanka (‘Tzveta’) I. Ianeva Lioudmila (‘Mila’) Boldareva Thijs Westerveld Roberto Cornacchia Djoerd Hiemstra (the 1 and only) Arjen P.
Image Indexing and Retrieval using Moment Invariants Imran Ahmad School of Computer Science University of Windsor – Canada.
ICASSP, May Arjen P. de Vries Thijs Westerveld Tzvetanka I. Ianeva Combining Multiple Representations on the TRECVID Search Task.
Web- and Multimedia-based Information Systems. Assessment Presentation Programming Assignment.
Content Based Image Clustering and Image Retrieval Using Multiple Instance Learning Using Multiple Instance Learning Xin Chen Advisor: Chengcui Zhang Department.
Chapter 11 Beyond Bag of Words. Question Answering n Providing answers instead of ranked lists of documents n Older QA systems generated answers n Current.
Search and Retrieval: More on Term Weighting and Document Ranking Prof. Marti Hearst SIMS 202, Lecture 22.
Morris LeBlanc.  Why Image Retrieval is Hard?  Problems with Image Retrieval  Support Vector Machines  Active Learning  Image Processing ◦ Texture.
IR Challenges and Language Modeling. IR Achievements Search engines  Meta-search  Cross-lingual search  Factoid question answering  Filtering Statistical.
Database Management Systems, R. Ramakrishnan1 Computing Relevance, Similarity: The Vector Space Model Chapter 27, Part B Based on Larson and Hearst’s slides.
Multimedia and Text Indexing. Multimedia Data Management The need to query and analyze vast amounts of multimedia data (i.e., images, sound tracks, video.
Image Search Presented by: Samantha Mahindrakar Diti Gandhi.
Automatic Image Annotation and Retrieval using Cross-Media Relevance Models J. Jeon, V. Lavrenko and R. Manmathat Computer Science Department University.
Expectation Maximization Method Effective Image Retrieval Based on Hidden Concept Discovery in Image Database By Sanket Korgaonkar Masters Computer Science.
Supervised by Prof. LYU, Rung Tsong Michael Department of Computer Science & Engineering The Chinese University of Hong Kong Prepared by: Chan Pik Wah,
INFO 624 Week 3 Retrieval System Evaluation
Visual Information Retrieval Chapter 1 Introduction Alberto Del Bimbo Dipartimento di Sistemi e Informatica Universita di Firenze Firenze, Italy.
Presented by Zeehasham Rasheed
Ranking by Odds Ratio A Probability Model Approach let be a Boolean random variable: document d is relevant to query q otherwise Consider document d as.
ICME 2004 Tzvetanka I. Ianeva Arjen P. de Vries Thijs Westerveld A Dynamic Probabilistic Multimedia Retrieval Model.
Visual Information Systems visual information retrieval.
Pattern Recognition. Introduction. Definitions.. Recognition process. Recognition process relates input signal to the stored concepts about the object.
Information Retrieval
Overview of Search Engines
Information Retrieval in Practice
A Search Engine for Historical Manuscript Images Toni M. Rath, R. Manmatha and Victor Lavrenko Center for Intelligent Information Retrieval University.
Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica.
Multimedia Databases (MMDB)
Chapter 7 Web Content Mining Xxxxxx. Introduction Web-content mining techniques are used to discover useful information from content on the web – textual.
Content-Based Image Retrieval
Glasgow 02/02/04 NN k networks for content-based image retrieval Daniel Heesch.
Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.
MUMT611: Music Information Acquisition, Preservation, and Retrieval Presentation on Timbre Similarity Alexandre Savard March 2006.
1 Computing Relevance, Similarity: The Vector Space Model.
Information Retrieval Model Aj. Khuanlux MitsophonsiriCS.426 INFORMATION RETRIEVAL.
1 CSC 594 Topics in AI – Text Mining and Analytics Fall 2015/16 7. Topic Extraction.
Web Image Retrieval Re-Ranking with Relevance Model Wei-Hao Lin, Rong Jin, Alexander Hauptmann Language Technologies Institute School of Computer Science.
Collocations and Information Management Applications Gregor Erbach Saarland University Saarbrücken.
Lecture 1: Overview of IR Maya Ramanath. Who hasn’t used Google? Why did Google return these results first ? Can we improve on it? Is this a good result.
LANGUAGE MODELS FOR RELEVANCE FEEDBACK Lee Won Hee.
PSEUDO-RELEVANCE FEEDBACK FOR MULTIMEDIA RETRIEVAL Seo Seok Jun.
A Model for Learning the Semantics of Pictures V. Lavrenko, R. Manmatha, J. Jeon Center for Intelligent Information Retrieval Computer Science Department,
Personalized Interaction With Semantic Information Portals Eric Schwarzkopf DFKI
Chapter 23: Probabilistic Language Models April 13, 2004.
1 Modeling Long Distance Dependence in Language: Topic Mixtures Versus Dynamic Cache Models Rukmini.M Iyer, Mari Ostendorf.
Competence Centre on Information Extraction and Image Understanding for Earth Observation 29th March 2007 Category - based Semantic Search Engine 1 Mihai.
2005/12/021 Fast Image Retrieval Using Low Frequency DCT Coefficients Dept. of Computer Engineering Tatung University Presenter: Yo-Ping Huang ( 黃有評 )
Probabilistic Latent Query Analysis for Combining Multiple Retrieval Sources Rong Yan Alexander G. Hauptmann School of Computer Science Carnegie Mellon.
ICIP 2004, Singapore, October A Comparison of Continuous vs. Discrete Image Models for Probabilistic Image and Video Retrieval Arjen P. de Vries.
Conceptual structures in modern information retrieval Claudio Carpineto Fondazione Ugo Bordoni
Language Modeling Putting a curve to the bag of words Courtesy of Chris Jordan.
Digital Video Library Network Supervisor: Prof. Michael Lyu Student: Ma Chak Kei, Jacky.
Relevance Models and Answer Granularity for Question Answering W. Bruce Croft and James Allan CIIR University of Massachusetts, Amherst.
Chapter. 3: Retrieval Evaluation 1/2/2016Dr. Almetwally Mostafa 1.
Multimedia Retrieval. Outline Overview Indexing Multimedia Generative Models & MMIR –Probabilistic Retrieval –Language models, GMMs Experiments –Corel.
DISTRIBUTED INFORMATION RETRIEVAL Lee Won Hee.
Xiaoying Gao Computer Science Victoria University of Wellington COMP307 NLP 4 Information Retrieval.
Computational Intelligence: Methods and Applications Lecture 26 Density estimation, Expectation Maximization. Włodzisław Duch Dept. of Informatics, UMK.
Bayesian Extension to the Language Model for Ad Hoc Information Retrieval Hugo Zaragoza, Djoerd Hiemstra, Michael Tipping Microsoft Research Cambridge,
Visual Information Retrieval
(Note: a lot of input from Thijs Westerveld)
Introduction Multimedia initial focus
Multimedia Information Retrieval
Image Segmentation Techniques
Multimedia Information Retrieval
Information Retrieval and Web Design
Presentation transcript:

Multimedia Retrieval

Outline Overview Indexing Multimedia Generative Models & MMIR –Probabilistic Retrieval –Language models, GMMs Experiments –Corel experiments –TREC Video benchmark

Indexing Multimedia

A Wealth of Information Speech Audio Images Temporal composition Database

Associated Information id gender name country history picture Player Profile Biography

User Interaction Database Views results Gives examples video segments Poses a query query text feedback Evaluates

Indexing Multimedia Manually added descriptions –‘Metadata’ Analysis of associated data –Speech, captions, OCR, … Content-based retrieval –Approximate retrieval –Domain-specific techniques

Limitations of Metadata Vocabulary problem –Dark vs. somber Different people describe different aspects –Dark vs. evening

Limitations of Metadata Encoding Specificity Problem –A single person describes different aspects in different situations Many aspects of multimedia simply cannot be expressed unambiguously –Processes in left (analytic, verbal) vs. right brain (aesthetics, synthetic, nonverbal)

Approximate Retrieval Based on similarity –Find all objects that are similar to this one –Distance function –Representations capture some (syntactic) meaning of the object ‘Query by Example’ paradigm

N-dimensional space Feature extraction Ranking Display

Low-level Features

Query image

So, … Retrieval?!

IR is about satisfying vague information needs provided by users, (imprecisely specified in ambiguous natural language) by satisfying them approximately against information provided by authors (specified in the same ambiguous natural language) Smeaton

No ‘Exact’ Science! Evaluation is not done analytically, but experimentally –real users (specifying requests) –test collections (real document collections) –benchmarks (TREC: text retrieval conference) –Precision –Recall –...

Known Item

Query

Results …

Query

Results …

Semantic gap… raw multimedia data features concepts ?

Observation Automatic approaches are successful under two conditions: –the query example is derived from the same source as the target objects –a domain-specific detector is at hand

1. Generic Detectors

Retrieval Process Database Query Parsing Detector / Feature selection Filtering Ranking Query type Nouns Adjectives Camera operations People, Names Natural/physical objects Invariant color spaces

Parameterized detectors People detector Example Query text Topic 41 Results

Query Parsing Other examples of overhead zooming in views of canyons in the Western United States Nouns Adjectives The query type Find + names

Detectors Camera operations (pan, zoom, tilt, …) People (face based) Names (VideoOCR) Natural objects (color space selection) Physical objects (color space selection) Monologues (specifically designed) Press conferences (specifically designed) Interviews (specifically designed) Domain specific detectors FOCUSFOCUS The universe and everything

2. Domain knowledge

Player Segmentation Original image Initial segmentation Final segmentation

Show clips from tennis matches, starring Sampras, playing close to the net; Advanced Queries

3. Get to know your users

Mirror Approach Gather User’s Knowledge –Introduce semi-automatic processes for selection and combination of feature models Local Information –Relevance feedback from a user Global Information –Thesauri constructed from all users

N-dimensional space Feature extraction ClusteringRanking Concepts Display Thesauri

Low-level Features

Identify Groups

Representation Groups of feature vectors are conceptually equivalent to words in text retrieval So, techniques from text retrieval can now be applied to multimedia data as if these were text!

Query Formulation Clusters are internal representations, not suited for user interaction Use automatic query formulation based on global information (thesaurus) and local information (user feedback)

Interactive Query Process Select relevant clusters from thesaurus Search collection Improve results by adapting query –Remove clusters occuring in irrelevant images –Add clusters occuring in relevant images

Assign Semantics

Visual Thesaurus Correct cluster representing ‘Tree’, ‘Forest’ Glcm_47 ‘Incoherent’ cluster Fractal_23 Mis-labeled cluster Gabor_20

Learning Short-term: Adapt query to better reflect this user’s information need Long-term: Adapt thesaurus and clustering to improve system for all users

Thesaurus Only After Feedback

4. Nobody is unique!

Collaborative Filtering Also: social information filtering –Compare user judgments –Recommend differences between similar users People’s tastes are not randomly distributed You are what you buy (Amazon)

Collaborative Filtering Benefits over content-based approach –Overcomes problems with finding suitable features to represent e.g. art, music –Serendipity –Implicit mechanism for qualitative aspects like style Problems: large groups, broad domains

5. Ask for help

Query Articulation N-dimensional space Feature extraction How to articulate the query?

What is the query semantics ?

Details matter

Problem Statement Feature vectors capture ‘global’ aspects of the whole image Overall image characteristics dominate the feature-vectors Hypothesis: users are interested in details

Irrelevant Background QueryResult

Image Spots Image-spots articulate desired image details –Foreground/background colors –Colors forming ‘shapes’ –Enclosure of shapes by background colors Multi-spot queries define the spatial relations between a number of spots

Hist16HistSpotSpot+Hist Results Query Images

A: Simple Spot Query `Black sky’

B: Articulated Multi-Spot Query `Black sky’ above `Monochrome ground’

C: Histogram Search in `Black Sky’ images 2-4: 14:

Complicating Factors What are Good Feature Models? What are Good Ranking Functions? Queries are Subjective!

Probabilistic Approaches

Generative Models… A statistical model for generating data –Probability distribution over samples in a given ‘language’ M P ( | M )= P ( | M ) P ( | M, ) © Victor Lavrenko, Aug aka ‘Language Modelling’

Basic question: –What is the likelihood that this document is relevant to this query? P(rel|I,Q) = P(I,Q|rel)P(rel) / P(I,Q) … in Information Retrieval P(I,Q|rel) = P(Q|I,rel)P(I|rel)

‘Language Modelling’ Not just ‘English’ But also, the language of –author –newspaper –text document –image Hiemstra or Robertson? ‘Parsimonious language models explicitly address the relation between levels of language models that are typically used for smoothing.’

‘Language Modelling’ Guardian or Times?Not just ‘English’ But also, the language of –author –newspaper –text document –image

‘Language Modelling’ or ?Not just English! But also, the language of –author –newspaper –text document –image

Unigram and higher-order models Unigram Models N-gram Models Other Models –Grammar-based models, etc. –Mixture models = P ( ) P ( | ) P ( ) P ( ) P ( ) P ( ) P ( | ) P ( | ) P ( | ) © Victor Lavrenko, Aug. 2002

The fundamental problem Usually we don’t know the model M –But have a sample representative of that model First estimate a model from a sample Then compute the observation probability P ( | M ( ) ) M © Victor Lavrenko, Aug. 2002

Indexing: determine models Indexing –Estimate Gaussian Mixture Models from images using EM –Based on feature vector with colour, texture and position information from pixel blocks –Fixed number of components DocsModels

Retrieval: use query likelihood Query: Which of the models is most likely to generate these 24 samples?

Probabilistic Image Retrieval ?

Query Rank by P(Q|M) P(Q|M 1 ) P(Q|M 4 ) P(Q|M 3 ) P(Q|M 2 )

Topic Models Models P(Q|M 1 ) P(Q|M 4 ) P(Q|M 3 ) P(Q|M 2 ) Query P(D 1 |QM) P(D 2 |QM) P(D 3 |QM) P(D 4 |QM) Documents Query Model

Probabilistic Retrieval Model Text –Rank using probability of drawing query terms from document models Images –Rank using probability of drawing query blocks from document models Multi-modal –Rank using joint probability of drawing query samples from document models

Unigram Language Models (LM) –Urn metaphor Text Models P( ) ~ P ( ) P ( ) P ( ) P ( ) = 4/9 * 2/9 * 4/9 * 3/9 © Victor Lavrenko, Aug. 2002

Generative Models and IR Rank models (documents) by probability of generating the query Q: P( | ) = 4/9 * 2/9 * 4/9 * 3/9 = 96/9 P( | ) = 3/9 * 3/9 * 3/9 * 3/9 = 81/9 P( | ) = 2/9 * 3/9 * 2/9 * 4/9 = 48/9 P( | ) = 2/9 * 5/9 * 2/9 * 2/9 = 40/9

The Zero-frequency Problem Suppose some event not in our example –Model will assign zero probability to that event –And to any set of events involving the unseen event Happens frequently with language It is incorrect to infer zero probabilities –Especially when dealing with incomplete samples ?

Smoothing Idea: shift part of probability mass to unseen events Interpolation with background (General English) –Reflects expected frequency of events –Plays role of IDF – +(1- )

Hierarchical Language Model MNM Smoothed over multiple levels Alpha * P(T|Shot) + Beta * P(T|‘Scene’) + Gamma * P(T|Video) + (1–Alpha–Beta–Gamma) * P(T|Collection) Also common in XML retrieval –Element score smoothed with containing article

Image Models Urn metaphor not useful –Drawing pixels useless Pixels carry no semantics –Drawing pixel blocks not effective chances of drawing exact query blocks from document slim Use Gaussian Mixture Models (GMM) –Fixed number of Gaussian components/clusters/concepts

Key-frame representation Query model split colour channels Take samples Cr Cb Y DCT coefficients position EM algorithm

? Image Models Expectation-Maximisation (EM) algorithm –iteratively estimate component assignments re-estimate component parameters

Component 1Component 2Component 3 Expectation Maximization E M

animation Component 1Component 2Component 3 E M