Multi-Media Retrieval by Paul McGlade Modified by Shinta P.
What is Multi-Media Retrieval? The searching and retrieval of various multimedia types (image, video, web). Typically consists of a query search against a database, usually called either a digital library or a digital archive. Generally, multimedia databases also contain textual data types.
Tasks Multimedia systems must solve at least two different tasks: First, relevant items have to be identified. Second, they have to be presented in such a way that the user can relate them to each other and, what is often more complicated, to the query.
Problems Comparing multimedia data is more difficult than comparing textual data. Different types of querying raise different types of problems. The relevance of each aspect of the multimedia data must be weighted.
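The weighting of aspects mentioned above can be illustrated with a small sketch. It assumes each item is described by a few named feature vectors (the aspect names "color" and "texture", the function name weighted_distance, and the sample values are all hypothetical) and combines per-aspect Euclidean distances into one score. This is one simple way to weight aspects, not a method taken from the slides.

```python
def weighted_distance(features_a, features_b, weights):
    """Combine per-aspect distances into one score using relevance weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    score = 0.0
    for aspect, weight in weights.items():
        a, b = features_a[aspect], features_b[aspect]
        # Euclidean distance between the two feature vectors for this aspect.
        dist = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
        score += weight * dist
    # Normalize so the score does not depend on the scale of the weights.
    return score / total_weight


# Hypothetical feature vectors: the two items differ only in color.
query = {"color": [1.0, 0.0], "texture": [0.5, 0.5]}
candidate = {"color": [0.0, 0.0], "texture": [0.5, 0.5]}

color_heavy = weighted_distance(query, candidate, {"color": 3.0, "texture": 1.0})
texture_heavy = weighted_distance(query, candidate, {"color": 1.0, "texture": 3.0})
print(color_heavy, texture_heavy)
```

With these values the candidate looks further from the query when color is weighted heavily than when texture is, which is the effect the weighting is meant to control.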
Approaches / Solutions Different approaches are explored for the comparison process: text-based, region-based, object-based. Various solutions have been created: query formulation, MRML, image indexing.
Text-based Indexes images using keywords or descriptions. Advantages: Easier to design and implement. Can use the surrounding text in a web page. Disadvantages: Annotating images is often too expensive. A picture can sometimes require many words to describe. Surrounding text may not describe the picture.
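Keyword indexing of this kind can be sketched as a small inverted index. The captions, file names, and function names below are made up for illustration; a real system would also need stemming, stop words, and ranking.

```python
from collections import defaultdict


def build_index(captions):
    """Map each lowercase keyword to the set of image ids whose text contains it."""
    index = defaultdict(set)
    for image_id, text in captions.items():
        for word in text.lower().split():
            index[word.strip(".,;:!?")].add(image_id)
    return index


def search(index, query):
    """Return the ids of images whose text contains every query keyword."""
    words = [w.strip(".,;:!?") for w in query.lower().split()]
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical captions or surrounding text for three images.
captions = {
    "img1.jpg": "A tiger resting on grass.",
    "img2.jpg": "Horses on grass near buildings",
    "img3.jpg": "Company logo",
}
index = build_index(captions)
print(search(index, "tiger grass"))
```

The third caption shows the disadvantage listed above: if the text does not describe the picture, no keyword query will find it.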
Region-based Queries images using regions of the image. Advantages: Handles low-level queries. Many features can be extracted. Disadvantages: Cannot handle high-level queries.
Region-based (cont’d) [Figures: a good example and a bad example]
Object-based Extracts objects from images first. Advantages: Handles object-based queries. Reduces feature storage adaptively. Disadvantages: Object segmentation is very difficult. The user interface is complicated and not easily implemented.
Object-based (cont’d)
Blobworld Blobworld is a system for content-based image retrieval. By automatically segmenting each image into regions which roughly correspond to objects or parts of objects, we allow users to query for photographs based on the objects they contain. Blobworld Site
Query Formulation Formulates a query for comparison against a database. Query formulation examples: SIMILARITY: looks similar to a given image; OBJECT: contains a bike; OBJECT RELATIONSHIP: contains a dog near a person; MOOD: a happy picture; TIME/PLACE: Yosemite sunset.
MRML Multimedia Retrieval Markup Language. MRML’s goal is to unify access to multimedia retrieval. XML-based communication protocol. Specified to standardize access to multimedia retrieval software components.
MRML (cont’d) Code example:
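As an illustration, a query-by-example message in the spirit of MRML might be assembled as below. This is only a sketch: the element and attribute names (mrml, query-step, user-relevance-element-list, user-relevance-element, session-id, result-size, image-location, user-relevance) are assumptions about the general shape of MRML messages and should be checked against the MRML specification. The helper build_query and the example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_query(session_id, example_urls, result_size=10):
    """Build an MRML-style query-by-example message as an XML string.

    Element and attribute names are assumptions, not verified against the spec.
    """
    root = ET.Element("mrml", {"session-id": session_id})
    step = ET.SubElement(
        root,
        "query-step",
        {"session-id": session_id, "result-size": str(result_size)},
    )
    relevance_list = ET.SubElement(step, "user-relevance-element-list")
    for url in example_urls:
        # Each example image is marked as relevant (positive feedback).
        ET.SubElement(
            relevance_list,
            "user-relevance-element",
            {"image-location": url, "user-relevance": "1"},
        )
    return ET.tostring(root, encoding="unicode")


message = build_query(
    "s1",
    ["http://example.org/a.jpg", "http://example.org/b.jpg"],
)
print(message)
```

The point of the sketch is the protocol idea: the client describes its example images in XML, and any MRML-speaking server can answer, regardless of how it compares images internally.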
GIFT The GNU Image-Finding Tool is a content-based image retrieval system (CBIRS). Uses MRML. Enables the user to query by example on images. Relies purely on the content of the image. GIFT Site
Image Indexing A process that analyzes an image and selects aspects of the image to compare, in order to index the image with little user input. Segments the image into various regions and attaches words to each region.
Image Indexing (cont’d) Example 1: Computer predictions: male cloth female fashion environment people industry fire face man man-made. Manual category annotation: super model people female cloth. Example 2: Computer predictions: grass mare tiger horses cat buildings. Manual category annotation: cat grass tiger.
A-Lip The Automatic Linguistic Indexing of Pictures (ALIP) system selects among 600 trained concepts to annotate images automatically. An on-line, real-time image annotation demonstration is expected to be developed and made available later this year. When it is released, users will be able to submit their own images for automatic annotation. A-Lip Site
High-Level Tools Some technical approaches to image comparison: Wavelet comparisons. Fast image segmentation. IRM (Integrated Region Matching). Fuzzy matching.
SIMPLIcity Semantics-sensitive Integrated Matching for Picture Libraries. Combines low-level statistical semantic classification with image retrieval. Wavelet-based feature extraction for fast segmentation. Integrated Region Matching (IRM). SIMPLIcity Site
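The region matching idea can be sketched as a greedy assignment: the closest pair of regions is matched first, using as much of their significance weight as both can still give, and the image distance is the weighted sum of the matched region distances. The code below is a simplified illustration under that assumption, not SIMPLIcity's implementation. The one-dimensional features, the weights, and the names irm_distance and region_dist are hypothetical.

```python
def irm_distance(regions_a, regions_b, region_dist):
    """Greedy region matching between two images.

    Each image is a list of (feature_vector, significance) pairs whose
    significances are assumed to sum to 1.
    """
    # All region pairs, closest first.
    pairs = sorted(
        (region_dist(fa, fb), i, j)
        for i, (fa, _) in enumerate(regions_a)
        for j, (fb, _) in enumerate(regions_b)
    )
    left_a = [w for _, w in regions_a]
    left_b = [w for _, w in regions_b]
    total = 0.0
    for dist, i, j in pairs:
        # Match as much significance as both regions still have available.
        share = min(left_a[i], left_b[j])
        if share <= 0:
            continue
        total += share * dist
        left_a[i] -= share
        left_b[j] -= share
    return total


def region_dist(fa, fb):
    """Hypothetical region distance: absolute difference of one feature."""
    return abs(fa[0] - fb[0])


image_a = [([0.0], 0.5), ([1.0], 0.5)]  # two equally significant regions
image_b = [([0.0], 1.0)]                # one region covering the whole image

print(irm_distance(image_a, image_a, region_dist))
print(irm_distance(image_a, image_b, region_dist))
```

Because every region of one image may be matched to several regions of the other, a poor segmentation of one object into two regions does less damage than it would under one-to-one matching.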
Why Is Image Retrieval Difficult? Text Retrieval: A word is a unit and is easy to index. Words carry semantic meaning. Image Retrieval: The unit is the pixel, which is hard to index. Pixels have no meaning on their own. Pixels form patterns that represent objects, which makes segmentation difficult. The appearance of objects in an image depends on many factors.
Why Is Image Retrieval Difficult? (cont’d) Image Retrieval: The appearance of objects in an image depends on many factors: Viewpoint. Illumination. Shadows. Other complications (background, color variation, etc.).
Image Matching (Global Similarity) Color histogram. Texture characteristics (region).
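Global color histogram matching can be sketched in a few lines. The example assumes an image is given as a list of (r, g, b) tuples with values 0..255 and uses histogram intersection as the similarity measure. The bin count, the tiny sample images, and the function names are choices made for illustration only.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized RGB histogram of an iterable of (r, g, b) values in 0..255."""
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return [h / count for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical four-pixel images.
h_red = color_histogram([(255, 0, 0)] * 4)
h_mostly_red = color_histogram([(250, 10, 5)] * 3 + [(0, 0, 255)])
h_blue = color_histogram([(0, 0, 255)] * 4)

print(histogram_intersection(h_red, h_mostly_red))
print(histogram_intersection(h_red, h_blue))
```

A histogram ignores where colors appear in the image, which is why it measures global similarity only.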
Image Matching (Local Similarity) Query by example. Object segmentation. Matching: caption text; similarity (color, texture, shape); spatial arrangement (orientation, position); special techniques (e.g. face recognition).
Conclusion Since one query will return many false results, I believe more emphasis should be placed on the weighting of certain aspects of each image. Some ideas: Artistic tendencies could be taken into account when determining the relevance of an object in an image. A textual comparison of an image’s indexed words could help in determining how commonly certain objects are found together.
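The second idea can be sketched by counting how often pairs of indexed words appear on the same image. The annotation lists below are made up for illustration (they reuse words from the annotation examples earlier), and cooccurrence_counts is a hypothetical helper, not part of any system named in these slides.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(annotations):
    """Count how many images each unordered pair of index words shares."""
    counts = Counter()
    for words in annotations:
        # Sort so each pair has one canonical order; set() drops repeats.
        for pair in combinations(sorted(set(words)), 2):
            counts[pair] += 1
    return counts


# Hypothetical index words for three images.
annotations = [
    ["cat", "grass", "tiger"],
    ["grass", "horses", "buildings"],
    ["tiger", "grass"],
]
counts = cooccurrence_counts(annotations)
print(counts.most_common(3))
```

Pairs with high counts, such as grass and tiger here, could be given more weight when both appear in a query, while unusual pairs could be treated with more suspicion.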