Current Topics in Information Access: Image Retrieval Chad Carson EECS Department UC Berkeley SIMS 296a-3 December 2, 1998.


Outline
Background
–What do collections look like?
–What do users want?
Sample image retrieval systems
–Piction: retrieve photos (of people) based on captions
–WebSeek: Web search based (mostly) on text
–QBIC: retrieve general photos using content, text
–Blobworld: rely exclusively on image content
Discussion

Image collections are diverse

What are the tasks?
Large, diverse image collections are common
Text is sometimes unreliable or unavailable
Users want “things,” not “stuff”
–More than overall image properties
–Traditional object recognition won’t work
–Two choices: rely on text to identify objects, or look at regions (objects or parts of objects)

Large-scale systems: simple features
Mainly overall color and texture, plus text
Some segmentation
QBIC [Flickner et al. ’95]
WebSeek [Smith and Chang ’96]
Virage [Gupta and Jain ’97]
Photobook [Pentland et al. ’96]
ImageRover [Sclaroff et al. ’97]
WebSeer [Frankel et al. ’96]

Other approaches
Iconic matching [Jacobs et al. ’95]
–Match user sketch to coarse representation
–Look at overall image layout
Spatial relationships [Lipson et al. ’97]
–Relationships between superpixels for querying and classification
Region-based approach [Ma & Manjunath ’98]
–Semi-automatic segmentation

Classical object recognition won’t work
Match geometric object model to image points
Doesn’t apply to our domain
–Designed for fixed, geometric objects
–Relies on clean segmentation
Survey: Binford ’82

Sample image retrieval systems
Piction
–News photos of people
–Relies heavily on text
–Captions guide image understanding
WebSeek
–Index images and videos on the Web
–Relies heavily on text
–Some content-based searching

Sample image retrieval systems (cont’d.)
QBIC
–General image/video collections
–Use both content and textual information
–User labeling is important
Blobworld
–Relies exclusively on image content
–Find image regions corresponding to objects
–Querying based on regions’ color, texture, simple shape

Piction: retrieval based on captions
Collection: several hundred news photographs, mostly of people
Use captions to identify who’s in the picture and where, then find matching faces
Relies heavily on text

NLP derives constraints from caption
Determine four types of information:
–Object classes: Which proper-noun complexes (PNCs) are people?
–Who/what is in the picture: Which people are in the picture? Which aren’t?
–Spatial constraints: Where are these people in the picture?
–Significant visual characteristics: What do they look like (gender, hair color, glasses, etc.)?

Use constraints in image interpretation
Example: “Lillian Halasinski of Dunkirk stands with a portrait of Mother Mary Angela Truszkowka at a news conference in the Villa Maria Convent on Doat Street.”
→ First find frame, then look for faces inside and outside

Image understanding: locate and describe faces
Locate human faces by finding three contours (hairline, left contour of face, right contour of face)
Look at multiple scales
Confirm by looking for face-specific features
Neural net discriminates between male and female

Discussion
How well does the NLP module work?
–“Tom Smith and his wife Mary Jane prepare for the visit of President Clinton on Tuesday”
How well does the face finder work?
Does this system really need to use image content?
–How much would we get from captions alone?

WebSeek: search for images on the Web
Collection: 500,000 Web images and videos
Filename and surrounding text determine subject
Simple content-based techniques
–Global color histograms
–Color in localized regions

Collect and process images and videos
Retrieve image or video
Store visual features and textual information
Generate icon (motion icon) to represent image (video) compactly

Classify images based on associated text
Extract terms
–URL:
–Alt text:
–Hyperlink text: Sally the poodle
Map terms to subjects using a semi-automatically constructed key-term dictionary
Store terms, location in subject taxonomy
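The term-to-subject mapping described above can be sketched as follows. The dictionary entries, example URL, and function names are hypothetical stand-ins; WebSeek's actual key-term dictionary was semi-automatically constructed and far larger:

```python
import re

# toy key-term dictionary (hypothetical entries, for illustration only)
KEY_TERMS = {
    "poodle": "animals/dogs",
    "dog": "animals/dogs",
    "beach": "travel/beaches",
    "sunset": "nature/sky",
}

def extract_terms(url, alt_text="", link_text=""):
    """Split the URL path and any accompanying text into lowercase terms."""
    text = " ".join([url, alt_text, link_text]).lower()
    return re.findall(r"[a-z]+", text)

def classify(url, alt_text="", link_text=""):
    """Map extracted terms into subject-taxonomy nodes via the dictionary."""
    return sorted({KEY_TERMS[t] for t in extract_terms(url, alt_text, link_text)
                   if t in KEY_TERMS})

subjects = classify("http://example.com/pets/poodle1.gif",
                    link_text="Sally the poodle")
```

The dictionary lookup is what makes the classification cheap at Web scale: no image processing is needed to place an image in the taxonomy.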

Content-based information
Extract information from image
–Global color histogram
–Size, location, relationship of color regions
Automatically determine image type:
–Color photo, color graphic, gray image, B&W image
Use content-based information when browsing within category hierarchy
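A global color histogram of the kind these systems use takes only a few lines to compute; the bin count and the toy half-red, half-green image below are illustrative choices, not WebSeek's actual parameters:

```python
import numpy as np

def color_histogram(image, bins=8):
    """Quantize each RGB channel into `bins` levels, count pixels in the
    resulting bins**3 color cells, and normalize so the histogram sums to 1."""
    q = (image.astype(np.int64) * bins) // 256           # per-channel bin, 0..bins-1
    idx = (q[..., 0] * bins + q[..., 1]) * bins + q[..., 2]
    hist = np.bincount(idx.ravel(), minlength=bins ** 3).astype(float)
    return hist / hist.sum()

# toy example: a 4x4 image that is half pure red, half pure green
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[:, :2, 0] = 255
img[:, 2:, 1] = 255
hist = color_histogram(img, bins=4)
```

Because the histogram discards all spatial layout, it is cheap and robust but cannot distinguish a tiger from any other orange-and-black scene — which is exactly why the slides also mention size, location, and relationships of color regions.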

Discussion
How good is the text-based classification?
Can we always count on having text?
How much does the content-based information add in this case?

QBIC: image content plus annotation
Collections:
–General collections
–Video clips
–Fabric samples
–etc.
Retrieval based on color, texture, shape, sketch
Relies (sometimes) on human annotation of image or objects

Populating the database
Use any available text (title, subject, caption, etc.)
User selects and labels important regions
Store information about image content
Extract key frames and describe video content

Video processing
Detect shots
–Pan over skyline → pan over Bay → zoom to bridge
Generate representative frames
–Capture all the background from the shot in one image
Extract motion layers to facilitate segmentation
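Shot detection is commonly done by thresholding a frame-to-frame histogram distance; the slides don't say exactly which method QBIC used, so treat the bin count and threshold below as illustrative:

```python
import numpy as np

def detect_cuts(frames, bins=8, threshold=0.5):
    """Declare a shot boundary wherever the L1 distance between consecutive
    frames' gray-level histograms exceeds `threshold`.
    `frames` is a sequence of 2-D uint8 arrays (grayscale frames)."""
    def hist(f):
        h = np.bincount((f.astype(np.int64) * bins // 256).ravel(),
                        minlength=bins).astype(float)
        return h / h.sum()
    cuts = []
    prev = hist(frames[0])
    for i, f in enumerate(frames[1:], start=1):
        cur = hist(f)
        if np.abs(cur - prev).sum() > threshold:
            cuts.append(i)        # boundary between frame i-1 and frame i
        prev = cur
    return cuts

# toy sequence: three dark frames, then three bright frames -> one cut
dark = np.full((8, 8), 20, dtype=np.uint8)
bright = np.full((8, 8), 220, dtype=np.uint8)
cuts = detect_cuts([dark, dark, dark, bright, bright, bright])
```

Histogram differencing handles abrupt cuts well but blurs through gradual transitions (the pans and zooms the slide mentions), which is one reason motion analysis is also needed.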

Query based on text and content
Textual information: title, subject, object labels
Global image properties: color and texture
Object properties: color and shape
Sketch: use the sketch as a template to find matches

Discussion
QBIC’s shape description doesn’t capture the essence of objects, just the appearance of one view
How much does QBIC rely on textual information?

Blobworld: represent image regions
We want to find objects → look for coherent regions

Group pixels into regions
Expectation-Maximization (EM)
–Assign each pixel to a Gaussian in color/texture space → segmentation
Model selection: Minimum Description Length
–Prefer fewer Gaussians if performance is comparable
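A minimal EM loop for the per-pixel Gaussian assignment might look like this. It is a sketch, not Blobworld's implementation: it assumes spherical Gaussians, uses a deterministic farthest-point initialization, and omits the MDL model-selection step (which would compare description lengths across several values of k):

```python
import numpy as np

def em_segment(features, k, iters=30):
    """Fit k spherical Gaussians to per-pixel feature vectors by EM and
    return each pixel's most probable component (a hard segmentation)."""
    X = features.astype(float)
    n, d = X.shape
    # farthest-point initialization of the means (deterministic)
    mu = [X[0]]
    for _ in range(1, k):
        dists = np.min([((X - m) ** 2).sum(1) for m in mu], axis=0)
        mu.append(X[dists.argmax()])
    mu = np.array(mu)
    var = np.full(k, X.var() + 1e-6)    # one variance per component
    pi = np.full(k, 1.0 / k)            # mixing weights
    for _ in range(iters):
        # E-step: responsibilities r[i, j] = P(component j | pixel i)
        sq = ((X[:, None, :] - mu[None]) ** 2).sum(-1)             # (n, k)
        logp = np.log(pi) - 0.5 * d * np.log(2 * np.pi * var) - sq / (2 * var)
        logp -= logp.max(1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        nk = r.sum(0) + 1e-12
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
        var = (r * sq).sum(0) / (d * nk) + 1e-6
    logp = np.log(pi) - 0.5 * d * np.log(2 * np.pi * var) - sq / (2 * var)
    return logp.argmax(1)

# toy data: two tight color clusters -> two clean segments
X = np.vstack([np.full((10, 3), 10.0), np.full((10, 3), 200.0)])
labels = em_segment(X, k=2)
```

In a real image the feature vector would combine color and texture per pixel, and adjacent pixels with the same label would then be grouped into connected regions.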

Describe regions’ color, texture, shape
Color
–Color histogram within region
Texture
–Contrast and directionality → stripes vs. spots vs. smooth
Shape
–Fourier descriptors of contour
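Fourier descriptors of a closed contour can be sketched as below. The particular normalization here (drop the DC term, divide by the first harmonic, keep magnitudes) is one standard recipe, not necessarily the exact one Blobworld used:

```python
import numpy as np

def fourier_descriptors(contour, n_coeffs=8):
    """Treat (x, y) contour samples as complex numbers and take the DFT.
    Zeroing the DC term removes translation, dividing by |F[1]| removes
    scale, and keeping magnitudes removes rotation and start point."""
    z = contour[:, 0] + 1j * contour[:, 1]
    F = np.fft.fft(z)
    F[0] = 0.0
    return np.abs(F[1:n_coeffs + 1]) / np.abs(F[1])

# two circles that differ only in center and radius -> same descriptors
t = 2 * np.pi * np.arange(64) / 64
c1 = np.stack([np.cos(t), np.sin(t)], axis=1)
c2 = 3.0 * c1 + np.array([5.0, -2.0])
fd1 = fourier_descriptors(c1)
fd2 = fourier_descriptors(c2)
```

Because the descriptor is built from the 2-D outline, it shares the limitation noted for QBIC: it captures the appearance of one view of an object, not the object's 3-D shape.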

Querying: let user see the representation
Current systems act like black boxes
–User can’t see what the computer sees
–It’s not clear how parameters relate to the image
User should interact with the representation
–Helps in query formulation
–Makes results understandable
–Minimizes disappointment

Sample queries
Query in collection of 10,000 images
Shape gives lower precision, closer matches

Experimental results
Blobworld (two blobs) vs. global histograms (top 40 images)

Blobworld vs. global histograms
Distinctive objects (tigers, cheetahs, zebras):
–Blobworld does better
Distinctive scenes (airplanes):
–Global histograms do better
–Adding blob size would help (→ “large blue blob”)
Others:
–Both do somewhat poorly, but Blobworld has room to grow (shape, etc.)
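Global-histogram baselines of this kind are typically scored with histogram intersection (Swain and Ballard '91); that is one plausible choice, as the slides don't name the exact distance used in this comparison:

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms: sum of bin-wise minima
    (1.0 for identical histograms, 0.0 for disjoint ones)."""
    return float(np.minimum(h1, h2).sum())

# hypothetical 3-bin color histograms for illustration
h_tiger1 = np.array([0.6, 0.3, 0.1])
h_tiger2 = np.array([0.5, 0.4, 0.1])
h_sky    = np.array([0.0, 0.1, 0.9])
sim_close = histogram_intersection(h_tiger1, h_tiger2)
sim_far   = histogram_intersection(h_tiger1, h_sky)
```

Ranking a collection then just means sorting by this similarity to the query's histogram, which makes clear why the global baseline favors distinctive overall scenes (airplanes against sky) over distinctive objects embedded in varied backgrounds.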

Discussion
How much can we accomplish using image content alone?
–Today
–In the future

Overall Discussion
What are the tasks?
Who are the actual users?
What can we expect of the users?
How much can we (should we) rely on text?
How useful is content-based information?
How good will image understanding need to be before it’s generally useful?