Multimedia Information Retrieval


1 Multimedia Information Retrieval
Modern Information Retrieval Course Computer Engineering Department Sharif University of Technology Spring 2006

2 Outline
Introduction
Text-Based MMIR
Content-Based Retrieval
Multimedia IR Model
Image Retrieval
Audio Retrieval
Video Retrieval
Conclusions

3 Outline: Introduction

4 Support for a variety of data
Different kinds of media:
Image: graphs, …
Audio: music, speech, …
Video

5 MMIR Motivations
Content, content, and more content… How do we get what is needed?
Increasing availability of multimedia information
Difficult to find, select, filter, and manage AV content
More and more situations where it is necessary to have 'information about the content'

6 Key Issues in MMIR

7 Goals
We want to make multimedia content as searchable as text, because the value of content depends on how easy it is to find, filter, manage, and use.
This needs a content description method beyond simple text annotation.

8 MMIR Approaches
Text-Based MMIR
Content-Based MMIR

9 Outline: Text-Based MMIR

10 Text-Based Retrieval
Retrieval based on text associated with the file:
URL
Alt text: <img src=URL alt="picture of poodle">
Hyperlink text: <a href=URL>Sally the poodle</a>
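As an illustration of how the alt text and hyperlink text above might be harvested for indexing, here is a minimal sketch using Python's standard `html.parser`. The class name `AssociatedTextParser`, the helper `extract_associated_text`, and the sample page are assumptions made for this example, not part of the original slides.

```python
from html.parser import HTMLParser


class AssociatedTextParser(HTMLParser):
    """Collect text associated with images: alt text and hyperlink text."""

    def __init__(self):
        super().__init__()
        self.alt_texts = []    # (src, alt) pairs from <img> tags
        self.link_texts = []   # (href, text) pairs from <a> tags
        self._href = None      # href of the <a> tag currently open, if any
        self._buffer = []      # text pieces seen inside the open <a> tag

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("alt"):
            self.alt_texts.append((attrs.get("src"), attrs["alt"]))
        elif tag == "a" and "href" in attrs:
            self._href = attrs["href"]
            self._buffer = []

    def handle_data(self, data):
        if self._href is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.link_texts.append((self._href, "".join(self._buffer).strip()))
            self._href = None


def extract_associated_text(html):
    """Return (alt_texts, link_texts) found in an HTML string."""
    parser = AssociatedTextParser()
    parser.feed(html)
    parser.close()
    return parser.alt_texts, parser.link_texts


# Hypothetical page fragment modelled on the slide's two examples.
page = ('<img src="poodle.gif" alt="picture of poodle"> '
        '<a href="poodle.gif">Sally the poodle</a>')
alts, links = extract_associated_text(page)
```

The extracted strings would then be fed to an ordinary text index keyed by the image URL.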

11 Text-based Search Engines
Indexing is based on the text in the containing web page

12 Keyword-based System
Diagram labels: User, Information Need, Keyword, Automatic Annotation, Video Database
Automatic annotation includes filename, video title, caption, related web page

13 Why does this happen?
Most of these search engines are keyword-based
You have to express your idea in keywords
These keywords are expected to appear in the filename or the corresponding web page

14 Image: The Google Approach
How does image search work?
Google analyzes the text on the page adjacent to the image, the image caption and dozens of other factors to determine the image content. Google also uses sophisticated algorithms to remove duplicates and ensure that the highest quality images are presented first in your results.
Examples: Campanile tcd, Cliffs of Moher
Recall may not be great…

15 Google image search

16 Google Image Search

17 Problems with Text-Based
The text in the alt attribute has to be written manually
  Expensive
  Time-consuming
It is incomplete and subjective
Some features, such as texture or object shape, are difficult to describe in text

18 Therefore…
Unable to handle the semantic meaning of images
Unable to handle visual position
Unable to handle time information
Unable to use images as queries
…

19 So…
Better for simple concepts, e.g. a picture of a giraffe
Does not work for complex queries, e.g. a picture of a brick home with black shutters and white pillars, with a pickup truck in front of it

20 Outline: Content-Based Retrieval

21 Architecture for Multimedia Retrieval
Diagram labels: AV description; feature extraction (manual / automatic); storage; transmission; encoding (for transmission); decoding; conf. points; search / query; pull; browse; filter; push; human or machine

22 Query-retrieval matrix
Matrix of query types against document types
Query types: text, stills, sound, sketch, speech, examples, humming
Document types: text, video, images, speech, music, sketches, multimedia
Examples:
  Text query and text document: conventional text retrieval
  You roar and get a wildlife documentary
  Type "floods" and get BBC radio news
  Hum a tune and get a music piece

23 Main Components
Feature Extraction & Analysis
Description Schemes
Searching & Filtering
Examples:
  IBM's Query By Image Content (QBIC)
  Virage's VIR Image Engine (online)

24 Internal representation
Using attributes is not sufficient
Feature: information extracted from objects
A multimedia object is represented as a set of features
Features can be assigned manually, automatically, or using a hybrid approach

25 Features for MMIR
High-level features: words and phrases from text, speech recognition
Medium-level features: face detector, region classifiers, outdoor, etc.
Low-level features: Fourier transforms, wavelet decomposition, texture histograms, colour histograms, shape primitives, filter primitives

26 Internal representation
Values of some specific features are assigned to an object by comparing the object with some previously classified objects
Feature extraction cannot be precise
A weight is usually assigned to each feature value, representing the uncertainty of assigning that value to the feature
  Example: 80% sure that a shape is a square

27 Outline: Multimedia IR Model

28 MMIR Model’s Main Components
Query Language
Indexing and Searching

29 Query languages
In designing a multimedia query language, two main aspects require attention:
  How the user enters his/her request to the system
  Which conditions on multimedia objects can be specified in the user request

30 Request specification
Interfaces
  Browsing and navigation
  Specifying the conditions the objects of interest must satisfy, by means of queries
Queries can be specified in two different ways
  Using a specific query language
  Query by example, using actual data (an object example)

31 Conditions on multimedia data
Query predicates
Attribute predicates
  Concern the attributes for which an exact value is supplied for each object
  Exact-match retrieval
Structural predicates
  Concern the structure of multimedia objects
  Can be answered using metadata and information about the database schema
  Example: "Find all multimedia objects containing at least one image and a video clip"

32 Conditions on multimedia data
Semantic predicates
  Concern the semantic content of the required data, depending on the features that have been extracted and stored for each multimedia object
  Example: "Find all the red houses"
  Exact match cannot be applied
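To make the three kinds of predicate concrete, the following sketch shows one possible way to evaluate them over a toy collection. The `MultimediaObject` layout, the function names, and the choice to score semantic predicates by multiplying feature confidences are all assumptions for illustration; the slides do not prescribe a data model.

```python
from dataclasses import dataclass, field


@dataclass
class MultimediaObject:
    attributes: dict   # exact-valued attributes, e.g. title or author
    components: list   # media types the object contains
    features: dict = field(default_factory=dict)  # feature -> (value, confidence)


def attribute_predicate(obj, name, value):
    """Exact-match test on an attribute."""
    return obj.attributes.get(name) == value


def structural_predicate(obj, required):
    """True if the object contains every required media type."""
    return all(kind in obj.components for kind in required)


def semantic_score(obj, conditions):
    """Score in [0, 1]: product of confidences of the matching feature values.

    Exact match cannot be applied, so objects are ranked rather than filtered.
    """
    score = 1.0
    for feature, wanted in conditions.items():
        value, confidence = obj.features.get(feature, (None, 0.0))
        score *= confidence if value == wanted else 0.0
    return score


# Hypothetical collection; the confidences are made-up extraction weights.
db = {
    "doc1": MultimediaObject({"author": "Smith"}, ["image", "video"],
                             {"colour": ("red", 0.9), "object": ("house", 0.8)}),
    "doc2": MultimediaObject({"author": "Jones"}, ["image"],
                             {"colour": ("red", 0.6), "object": ("house", 0.5)}),
    "doc3": MultimediaObject({"author": "Smith"}, ["audio"],
                             {"colour": ("blue", 0.9), "object": ("car", 0.9)}),
}

by_smith = [k for k, o in db.items() if attribute_predicate(o, "author", "Smith")]
image_and_video = [k for k, o in db.items()
                   if structural_predicate(o, ["image", "video"])]
scores = {k: semantic_score(o, {"colour": "red", "object": "house"})
          for k, o in db.items()}
red_houses = sorted((k for k in scores if scores[k] > 0),
                    key=lambda k: scores[k], reverse=True)
```

Attribute and structural predicates return yes/no answers, whereas the semantic predicate yields a ranking, which reflects the uncertainty attached to extracted features.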

33 Indexing and searching
Searching similar patterns
Distance function
  Given two objects O1 and O2, the distance (= dissimilarity) of the two objects is denoted by D(O1, O2)
Similarity queries
  Whole match
  Sub-pattern match
  Nearest neighbors
  All pairs
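The query types listed above can be sketched with a brute-force scan, assuming each object has already been reduced to a fixed-length feature vector and that D is the Euclidean distance (the slides leave D unspecified). Function names and the toy database are illustrative; sub-pattern matching is omitted because it depends on how objects are segmented.

```python
import heapq
import math
from itertools import combinations


def distance(o1, o2):
    """D(O1, O2): Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(o1, o2)))


def range_query(db, query, eps):
    """Whole match: ids of all objects within distance eps of the query."""
    return [oid for oid, vec in db.items() if distance(vec, query) <= eps]


def nearest_neighbors(db, query, k):
    """Ids of the k objects closest to the query, nearest first."""
    return heapq.nsmallest(k, db, key=lambda oid: distance(db[oid], query))


def all_pairs(db, eps):
    """All pairs of objects that lie within distance eps of each other."""
    return [(a, b) for a, b in combinations(db, 2)
            if distance(db[a], db[b]) <= eps]


# Hypothetical two-dimensional feature vectors.
db = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (5.0, 5.0)}
close_to_origin = range_query(db, (0.0, 0.0), 1.5)
nearest = nearest_neighbors(db, (4.0, 4.0), 1)
pairs = all_pairs(db, 1.0)
```

A linear scan like this costs one distance computation per stored object, which is the motivation for the spatial access methods discussed next in the lecture.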

34 Spatial access methods
Map objects into points in f-dimensional (f-D) space, and use multiattribute access methods (also referred to as spatial access methods, or SAMs) to cluster them and to search for them
Methods
  R*-trees and the rest of the R-tree family
  Linear quadtrees
  Grid files
Linear quadtrees and grid files explode exponentially with the dimensionality

35 R-tree
Represent a spatial object by its minimum bounding rectangle (MBR)
Data rectangles are grouped to form parent nodes, which are recursively grouped
The MBR of a parent node completely contains the MBRs of its children
MBRs are allowed to overlap
Nodes of the tree correspond to disk pages
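The MBR properties above can be illustrated with a small hand-built tree and a window search. This is only a sketch of the containment and overlap ideas: it does not implement R-tree insertion, node splitting, or disk paging, and the names `MBR`, `Node`, `make_parent`, and `search` are assumptions for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MBR:
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains(self, other):
        return (self.xmin <= other.xmin and self.ymin <= other.ymin
                and self.xmax >= other.xmax and self.ymax >= other.ymax)

    def intersects(self, other):
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)


def enclose(rects):
    """Smallest rectangle that contains every rectangle in rects."""
    return MBR(min(r.xmin for r in rects), min(r.ymin for r in rects),
               max(r.xmax for r in rects), max(r.ymax for r in rects))


@dataclass
class Node:
    mbr: MBR
    children: tuple = ()     # child nodes; empty for a data entry
    payload: object = None   # object id stored at a data entry


def make_parent(children):
    """Group nodes under a parent whose MBR contains all child MBRs."""
    return Node(enclose([c.mbr for c in children]), tuple(children))


def search(node, window):
    """Return payloads of data entries whose MBR intersects the window."""
    if not node.mbr.intersects(window):
        return []            # prune the whole subtree
    if not node.children:
        return [node.payload]
    found = []
    for child in node.children:
        found.extend(search(child, window))
    return found


# Hypothetical data rectangles; "a" and "b" overlap, which is allowed.
a = Node(MBR(0, 0, 2, 2), payload="a")
b = Node(MBR(1, 1, 3, 3), payload="b")
c = Node(MBR(8, 8, 9, 9), payload="c")
d = Node(MBR(7, 6, 9, 8), payload="d")
left, right = make_parent([a, b]), make_parent([c, d])
root = make_parent([left, right])
hits = search(root, MBR(2.5, 2.5, 4, 4))
```

Because a parent's MBR completely contains its children, a subtree can be skipped as soon as its MBR misses the query window; overlapping MBRs mean several subtrees may still need to be visited.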

37 Outline: Image Retrieval

38 Visual Features ...
Colour
Shape
Texture

39 Histograms
Greyscale histogram of image A
Assuming 256 intensity levels, define $h_A(l)$ for $l = 1, \dots, 256$:
$h_A(l) = \#\{(i,j) \mid A(i,j) = l,\ i = 1, \dots, m,\ j = 1, \dots, n\}$
i.e. a count of the number of pixels at each level
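The definition above amounts to a single pass over the pixels. A minimal sketch, assuming the image is a list of rows of integer intensities; note that the code indexes levels from 0 to 255, whereas the slide numbers them from 1 to 256. The function name and sample image are illustrative.

```python
def grey_histogram(image, levels=256):
    """Count how many pixels take each intensity level (0 .. levels - 1)."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    return hist


# Hypothetical 2 x 3 image.
image = [[0, 0, 255],
         [1, 0, 255]]
hist = grey_histogram(image)
```

The histogram entries always sum to the number of pixels, m × n, which gives a quick sanity check.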

40 Colour Histogram
Describes the colours in an image and their percentages.
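One way to realise "colours and their percentages" is to quantise RGB space into a small number of bins and store the fraction of pixels in each. The sketch below also compares two histograms with histogram intersection, a common choice that the slide itself does not name. It assumes 8-bit RGB tuples, a non-empty pixel list, and a `bins_per_channel` that divides 256; all names are illustrative.

```python
def colour_histogram(pixels, bins_per_channel=4):
    """Fraction of pixels falling in each quantised RGB bin."""
    step = 256 // bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[index] += 1
    total = len(pixels)
    return [count / total for count in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the colour distributions are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical images given as flat pixel lists.
red_image = [(255, 0, 0)] * 4
mixed_image = [(255, 0, 0)] * 2 + [(0, 0, 255)] * 2
same = histogram_intersection(colour_histogram(red_image),
                              colour_histogram(red_image))
half = histogram_intersection(colour_histogram(red_image),
                              colour_histogram(mixed_image))
```

Since the histogram discards pixel positions, two very different pictures can receive a high similarity, a limitation the lecture returns to in its conclusions.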

41 Texture Matching
Texture characterizes small-scale regularity
  Color describes pixels, texture describes regions
Described by several types of features
  e.g., smoothness, periodicity, directionality
Perform weighted vector space matching
  Usually in combination with a color histogram

42 Texture Test Patterns

43 Image Retrieval using low level features
See IBM demos at: (video) Hermitage Museum

44 Berkeley Blobworld

45 Berkeley Blobworld

46 But…
Low-level features do not work in all cases

47 Solution: Regional Low-level Image Feature
Segmentation into objects
Extract low-level features from each region

48 Solution: High-level Image Feature
Objects: persons, roads, cars, skies…
Scenes: indoors, outdoors, cityscape, landscape, water, office, factory…
Events: parade, explosion, picnic, playing soccer…
Generated from low-level features

49 Outline: Audio Retrieval

50 Audio Genres
Important types of audio data
Speech-centered
  Radio programs
  Telephone conversations
  Recorded meetings
Music-centered
  Instrumental, vocal
Other sources
  Alarms, instrumentation, surveillance, …

51 Speech-based Documents
Radio/TV news retrieval.
Search archival radio/news broadcasts.
Video and audio.
Knowledge management: transfer of tacit knowledge to others.
Search audio archives of meetings, lectures, etc.

52 Preamble
Two utterances of the same words by the same person under the same conditions generate very different waveforms.
Loudness, pitch, brightness, bandwidth, harmonicity, and other sources of variation are all continuous variables; they play the role that color and texture play in images.

53 Detectable Speech Features
Content: phonemes, one-best word recognition, n-best
Identity: speaker identification, speaker segmentation
Language: language, dialect, accent
Other measurable parameters: time, duration, channel, environment

54 How Speech Recognition Works
Three stages
  What sounds were made? Convert from waveform to subword units (phonemes)
  How could the sounds be grouped into words? Identify the most probable word segmentation points
  Which of the possible words were spoken? Based on likelihood of possible multiword sequences
All three stages are learned from training data
  Using hill climbing to fit a "Hidden Markov Model"

55 Speech Recognition
Diagram stages: Phoneme Detection, Word Construction, Word Selection
Other diagram labels: phoneme n-grams, one-best phoneme transcription, N-best phoneme sequences, phoneme lattices, phoneme transcription dictionary, one-best word transcript, word n-gram language model, words

56 Music and audio analysis
Music is a large and extremely variable audio class.
The range of sounds is large, from music genres to animal cries to synthesizer samples.
Any of the above can and will occur in combination.

57 Audio retrieval-by-content
Requires some measure of audio similarity.
Most approaches to general audio retrieval take a perceptual approach, using measures such as loudness.
A neural net can map a sound clip to a text description; an obvious drawback is the subjective nature of audio description.

58 Sample system: Muscle Fish
Analyzes sound files for a specific set of psychoacoustic features.
This results in a vector of attributes that include loudness, pitch, bandwidth and harmonicity.
Given enough training samples, a Gaussian classifier can be constructed for classification, or the vectors can be used for retrieval.

59 Sample system: Muscle Fish (continued)
A Euclidean distance is used as a measure of similarity.
For retrieval, the distance is computed between a given sound example and all other sound examples (about 400 in the demonstration).
Sounds are ranked by distance, with the closer ones being more similar.
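The ranking step described here might look like the sketch below. It is not the Muscle Fish implementation: the feature values are invented, and the per-feature standardisation is an added assumption so that features measured in different units (for example decibels and hertz) contribute comparably to the Euclidean distance.

```python
import math
from statistics import mean, pstdev

# Order of the entries in each feature vector.
FEATURES = ("loudness", "pitch", "bandwidth", "harmonicity")


def feature_stats(library):
    """Per-feature (mean, standard deviation) over all sounds in the library."""
    columns = list(zip(*library.values()))
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def scaled(vector, stats):
    """Standardise a vector so every feature has comparable scale."""
    return [(x - m) / s for x, (m, s) in zip(vector, stats)]


def rank_sounds(query, library):
    """Sound names sorted by Euclidean distance to the query, closest first."""
    stats = feature_stats(library)
    q = scaled(query, stats)

    def dist(name):
        v = scaled(library[name], stats)
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, v)))

    return sorted(library, key=dist)


# Hypothetical feature vectors: (loudness dB, pitch Hz, bandwidth Hz, harmonicity).
library = {
    "bell": (60.0, 880.0, 2000.0, 0.9),
    "chime": (58.0, 900.0, 2200.0, 0.8),
    "drum": (80.0, 100.0, 5000.0, 0.1),
}
ranking = rank_sounds((60.0, 885.0, 2000.0, 0.9), library)
```

With only a few hundred sounds, as in the demonstration, scanning every stored vector for each query is entirely practical.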

60 Music and MIDI retrieval
Uses archives of MIDI files, which are score-like representations of music intended for musical synthesizers or sequencers.
Given a melodic query, the MIDI files can be searched for similar melodies.

61 Polyphonic Music Indexing Technique
n-grams encode music as text strings using pitch and onsets
Index the text words with a text search engine
Process the query in the same way
Application: e.g. Query by Humming

62 Monophonic pitch n-gramming
Example: musical strings with interval-only representation
Interval: [ ] [ ] [ ]
n-grams: ZGZB GZBZ ZBZb
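The interval-only n-gram idea can be sketched for monophonic melodies as follows. The encoding of each interval as a signed semitone count joined by underscores is an assumption for this example (the letter alphabet used on the slide is not recoverable from the transcript), and the tunes are invented MIDI note sequences.

```python
from collections import defaultdict


def intervals(pitches):
    """Semitone differences between consecutive notes (MIDI note numbers)."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def interval_ngrams(pitches, n=4):
    """Encode each run of n consecutive intervals as one text 'word'."""
    steps = intervals(pitches)
    return ["_".join(f"{s:+d}" for s in steps[i:i + n])
            for i in range(len(steps) - n + 1)]


def build_index(tunes, n=4):
    """Inverted index: n-gram word -> names of tunes containing it."""
    index = defaultdict(set)
    for name, pitches in tunes.items():
        for gram in interval_ngrams(pitches, n):
            index[gram].add(name)
    return index


def search(index, query_pitches, n=4):
    """Tunes ranked by the number of n-grams shared with the query."""
    scores = defaultdict(int)
    for gram in interval_ngrams(query_pitches, n):
        for name in index.get(gram, ()):
            scores[name] += 1
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical tunes as MIDI note numbers.
tunes = {
    "scale": [60, 62, 64, 65, 67, 69, 71, 72],
    "arpeggio": [60, 64, 67, 72, 67, 64, 60],
}
index = build_index(tunes)
# The query is the start of the scale, hummed a whole tone higher.
matches = search(index, [62, 64, 66, 67, 69])
```

Because only intervals are stored, a query hummed in a different key still produces the same words, which is what makes this representation attractive for Query by Humming.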

63 Outline: Video Retrieval

64 Application
Increasing demand for visual information retrieval
  Retrieve useful information from databases
  Sharing and distributing video data through computer networks
Example: BBC
  The BBC archive has 500k+ queries plus 1M new items … per year
  From the BBC …
    Police car with blue light flashing
    Government plan to improve reading standards
    Two shot of Kenneth Clarke and William Hague

65 Video Search
Active Research Area

66 Video Search: Features
Texture
  One of the earliest image features [Haralick et al., 1970s]
  Co-occurrence matrix
    Orientation and distance on gray-scale pixels
    Contrast, inverse difference moment, and entropy [Gotlieb & Kreyszig]
  Human visual texture properties: coarseness, contrast, directionality, line-likeness, regularity and roughness [Tamura et al.]
  Wavelet transforms [1990s]
    [Smith & Chang] extracted mean and variance from wavelet subbands
  Gabor filters
  And so on
Region Segmentation
  Partition image into regions
  Strong segmentation: object segmentation is difficult
  Weak segmentation: region segmentation based on some homogeneity criteria
Scene Segmentation
  Shot detection, scene detection
  Look for changes in color, texture, brightness
  Context-based scene segmentation applied to certain categories such as broadcast news
Color
  Robust to background
  Independent of size, orientation
  Color histogram [Swain & Ballard]
  "Sensitive to noise and sparse": cumulative histograms [Stricker & Orengo]
  Color moments
  Color sets: map RGB color space to Hue Saturation Value, and quantize [Smith & Chang]
  Color layout: local color features by dividing image into regions
  Color autocorrelograms
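As a concrete illustration of the co-occurrence texture features listed above, the sketch below builds a normalised grey-level co-occurrence matrix for one pixel offset and derives contrast, inverse difference moment, and entropy from it. It uses the commonly cited Haralick-style formulas with a base-2 logarithm for entropy; the sparse dictionary representation, the non-symmetric counting, and the function names are assumptions for this example.

```python
import math
from collections import Counter


def cooccurrence(image, dx=1, dy=0):
    """Normalised co-occurrence matrix for pixel pairs offset by (dx, dy).

    Returned as a sparse dict mapping (level_i, level_j) to probability.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[(image[y][x], image[ny][nx])] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}


def texture_features(p):
    """Return (contrast, inverse difference moment, entropy) of matrix p."""
    contrast = sum((i - j) ** 2 * v for (i, j), v in p.items())
    idm = sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items())
    entropy = -sum(v * math.log2(v) for v in p.values())
    return contrast, idm, entropy


# Hypothetical two-level images: a flat patch and a checkerboard.
flat = [[0, 0], [0, 0]]
checker = [[0, 1], [1, 0]]
flat_features = texture_features(cooccurrence(flat))
checker_features = texture_features(cooccurrence(checker))
```

The flat patch gives zero contrast and zero entropy, while the checkerboard gives higher contrast and entropy, matching the intuition that these measures separate smooth from busy textures.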

67 Video Search: Features
Face
  Face detection is highly reliable
    Neural networks [Rowley]
    Wavelet-based histograms of facial features [Schneiderman]
  Face recognition for video is still a challenging problem
    EigenFaces: extract eigenvectors and use as feature space
OCR
  OCR is a fairly successful technology
  Accurate, especially with good matching vocabularies
  Script recognition is still an open problem
ASR
  Automatic speech recognition is fairly accurate for medium to large vocabulary broadcast-type data
  Large number of available speech vendors
  Still open for free conversational speech in noisy conditions
Shape
  Outer boundary based vs. region based
  Fourier descriptors
  Moment invariants
  Finite Element Method (stiffness matrix: how each point is connected to others; eigenvectors of the matrix)
  Turning function based (similar to Fourier descriptor), convex/concave polygons [Arkin et al.]
  Wavelet transforms leverage multiresolution [Chuang & Kao]
  Chamfer matching for comparing 2 shapes (linear dimension rather than area)
  3-D object representations using similar invariant features
  Well-known edge detection algorithms

68 Video Structures
Image structure: absolute positioning, relative positioning
Object motion: translation, rotation
Camera motion: pan, zoom, perspective change
Shot transitions: cut, fade, dissolve, …

69 Typical Retrieval Framework
User: provides query information that represents his or her information need
Database: stores a large collection of video data
Goal: find the most relevant shots in the database
Shots: the "paragraphs" of video, typically 20 to 40 seconds long, and the basic unit of video retrieval

70 Bridging the Gap
Diagram labels: Video Database, User, Result

71 Automatically Structure Video Data
The first step for video retrieval: video "programmes" are structured into logical scenes and physical shots
If dealing with text, the structure is obvious: paragraph, section, topic, page, etc.
All text-based indexing, retrieval, linking, etc. builds upon this structure
Automatic shot boundary detection and selection of representative keyframes is usually the first step
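A simple form of shot boundary detection compares the grey-level histograms of consecutive frames and declares a cut where they differ sharply. The sketch below assumes frames are equal-sized lists of rows of 8-bit intensities; the threshold of 0.5, the 16-bin quantisation, and the choice of the middle frame as keyframe are illustrative assumptions, and gradual transitions such as fades and dissolves would need a more elaborate test.

```python
def histogram(frame, bins=16, levels=256):
    """Coarse grey-level histogram of one frame."""
    step = levels // bins
    hist = [0] * bins
    for row in frame:
        for value in row:
            hist[value // step] += 1
    return hist


def histogram_difference(h1, h2):
    """Normalised difference in [0, 1] between two equal-sized frames."""
    total = sum(h1)
    return sum(abs(a - b) for a, b in zip(h1, h2)) / (2 * total)


def detect_cuts(frames, threshold=0.5):
    """Indices i where frame i starts a new shot (hard cuts only)."""
    hists = [histogram(f) for f in frames]
    return [i for i in range(1, len(hists))
            if histogram_difference(hists[i - 1], hists[i]) > threshold]


def keyframes(frames, cuts):
    """Pick the middle frame of each shot as its representative keyframe."""
    starts = [0] + cuts
    ends = cuts + [len(frames)]
    return [(s + e - 1) // 2 for s, e in zip(starts, ends)]


# Hypothetical 2 x 2 frames: three dark frames followed by two bright ones.
dark = [[10, 12], [11, 13]]
bright = [[200, 210], [220, 230]]
frames = [dark, dark, dark, bright, bright]
cuts = detect_cuts(frames)
keys = keyframes(frames, cuts)
```

Each detected shot, together with its keyframe, then plays the role that a paragraph plays in text indexing.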

72 Typical automatic structuring of video
A video document
A set of shots
Keyframe browser combined with transcript or object-based search

73 Ideal solution
Diagram labels: Video Database, User, Information Need, Video Structure, Result
Understand the semantic meaning and retrieve

74 Ideal solution
However:
  Hard to represent a query in natural language, and for a computer to understand it
  Computers have no experience
  Other representation restrictions, such as position and time
Diagram labels: Video Database, User, Information Need, Video Structure, Result
Understand the semantic meaning and retrieve

75 Alternative Solution
Diagram labels: Video Database, User, Information Need, Video Structure, Result
Provide evidence of relevant information (text, image, audio)
Match and combine

76 Evidence-based Retrieval System
General framework for current video retrieval systems
Video retrieval based on evidence from both users and the database, including
  Text information
  Image information
  Motion information
  Audio information
Return a relevance score for each piece of evidence
Combine the scores
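The "score per piece of evidence, then combine" step could be realised in many ways. The sketch below uses one simple option, min-max normalisation followed by a weighted sum, which is an assumption rather than the method of any particular system; the shot ids, raw scores, and weights are invented, and each evidence source is assumed to score at least one shot.

```python
def normalise(scores):
    """Min-max scale one source's scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {shot: 0.0 for shot in scores}
    return {shot: (s - lo) / (hi - lo) for shot, s in scores.items()}


def combine(evidence, weights):
    """Rank shots by the weighted sum of their normalised evidence scores."""
    combined = {}
    for source, scores in evidence.items():
        for shot, s in normalise(scores).items():
            combined[shot] = combined.get(shot, 0.0) + weights[source] * s
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical relevance scores returned by a text matcher and an image matcher.
evidence = {
    "text": {"s1": 2.0, "s2": 8.0, "s3": 5.0},
    "image": {"s1": 0.9, "s2": 0.1, "s3": 0.8},
}
weights = {"text": 0.6, "image": 0.4}
ranking = combine(evidence, weights)
```

Normalising first matters because different matchers report scores on unrelated scales; here shot s3 ranks first because it does reasonably well on both kinds of evidence.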

77 Keyword-based System
Diagram labels: Video Database, User, Automatic Annotation, Keyword, Information Need, Video Structure
Automatic annotation includes filename, video title, caption, related web page

78 Keyword-based System
Diagram labels: Video Database, User, Automatic Annotation, Manual Annotation, Keyword, Information Need, Video Structure

79 Manual Annotation
Manually creating annotations/keywords for image/video data
Example: Gettyimage.com (image retrieval)
Pros: represents the semantic meaning of video
Cons:
  Time-consuming, labor-intensive
  Keywords are not enough to represent an information need

80 Speech and OCR transcription
Diagram labels: Video Database, User, Annotation, Keyword, Information Need, Video Structure, Speech Transcription, OCR Transcription

81 Query using speech/OCR information
Find pictures of Harry Hertz, Director of the National Quality Program, NIST
Speech: We’re looking for people that have a broad range of expertise that have business knowledge that have knowledge on quality management on quality improvement and in particular …
OCR: H,arry Hertz a Director aro 7 wa-,i,,ty Program ,Harry Hertz a Director

82 What do we lack?
Diagram labels: Video Database, User, Annotation, Keyword, Information Need, Video Structure, Speech Transcription, OCR Transcription, Image Information

83 Image-based Retrieval
Diagram labels: Video Database, User, Text Information, Keyword, Information Need, Video Structure, Image Feature, Query Images

84 Image-based Retrieval
Diagram labels: Video Database, User, Text Information, Keyword, Information Need, Video Structure, Image Feature (low-level feature, high-level feature), Query Images

85 More Evidence in Video Retrieval
Diagram labels: Video Database, User, Information Need, Video Structure
Evidence pairs (database side, query side): Text Information, Keyword; Image Information, Query Images; Motion Information, Motion; Audio Information, Audio

86 MPEG-7: The Objective
Standardize object-based description tools for various types of audiovisual information, allowing fast and efficient content searching, filtering and identification, and addressing a large range of applications.
New objective for MPEG:
  MPEG-1, -2 and -4 represent the content itself ('the bits')
  MPEG-7 should represent information about the content ('the bits about the bits')

87 This is the scope of MPEG-7
Diagram labels: Description creation, description, Description consumption
Not the description creation
Not the description consumption
Just the description!
The goal is to define the minimum that enables interoperability.

88 MPEG-7 Terminology: Descriptor
Descriptor (D): a Descriptor is a representation of a Feature. A Descriptor defines the syntax and the semantics of the Feature representation.
Examples:
Feature            Descriptor
Color              Histogram of Y, U, V components
Shape              ART moments
Motion             Motion field, coefficients of a model
Audio frequency    Average frequency components
Title              Text
Annotation         Text
Genre              Text, index in a thesaurus

89 Outline: Conclusions

90 Conclusions
Simple image retrieval is commercially available
  Color histograms, texture, limited shape information
Segmentation-based retrieval is still in the lab
  Keep an eye on the Berkeley group
Limited audio indexing is practical now
  Audio feature matching, answering machine detection

91 Conclusions
Multimedia IR
  Text: good solutions exist
  Video, image, sound: a lot of work to do

92 Conclusions
The goal of content-based video retrieval is to build a more intelligent video retrieval engine based on semantic meaning
Many applications in daily life
Combine evidence from different aspects
Hot research topic, few commercial systems
State-of-the-art performance is still unacceptable for normal users; there is room to improve

93 Conclusions
Problems with Content-Based MMIR
  Must have an example image
  Example image is 2-D
    Hence only that view of the object will be returned
    For example, if we give a side-view image of a horse, it will not return images from the front or behind
  Large amount of image data
  Similar colour histogram does not equal similar image
Usually the best results come from a combination of both text and content searching

94 Conclusions
Combination of multi-modal results
Different characteristics of multi-modal information
  Text-based information: better for middle- and high-level queries
  Image-based information: better for low- and middle-level queries
Combination of multi-modal information

95 Conclusions
Challenging research questions
Draws on computer vision, audio processing, natural language analysis, unstructured document analysis, information retrieval, information visualisation, computer human interaction, artificial intelligence

