Alexander Gelbukh www.Gelbukh.com Special Topics in Computer Science Advanced Topics in Information Retrieval Lecture 5 (book chapter 11): Multimedia.


Alexander Gelbukh www.Gelbukh.com Special Topics in Computer Science Advanced Topics in Information Retrieval Lecture 5 (book chapter 11): Multimedia IR: Models and Languages

Previous Chapter: Conclusions
- Inverted files seem to be the best option
- Other structures are good for specific cases
  - Genetic databases
- Sequential searching is an integral part of many indexing-based search techniques
  - Many methods to improve sequential searching
- Compression can be integrated with search

Previous Chapter: Research topics
- Perhaps, new details in the integration of compression and search
- "Linguistic" indexing: allowing linguistic variations
  - Search in plural or only singular
  - Search with or without synonyms

Motivation
- Applications: office, CAD, medical, Internet
- Example: an artist sings a melody and sees all the songs with a similar melody

What’s different
- Different from text IR:
  - Structure of data is more complex. Efficiency is an issue
  - Use of metadata
  - Characteristics of multimedia data
  - Operations to be performed
- Aspects:
  - Data modeling: extract and maintain the features of objects
  - Data retrieval: based not only on description but also on content

Retrieval process
- Query specification
  - fuzzy predicates: similar to
  - content predicates: images containing an apple
  - data type predicates: video, ...
- Query processing and optimization
  - The query is parsed, compiled, and optimized for order of execution
  - Problem: many data types, different processing for each
- Answer
  - Relevance: similarity to the query
- Iteration
  - The first answer is often of poor quality, so the query needs to be refined

Modeling

Data modeling
- To model is to simplify, in order to make manageable. “We will represent an image as...”
  - From the user’s point of view
  - From the system’s point of view (technically)
- A problem: very large storage size. Modeling is needed
- Objects are represented as feature vectors
  - Images / video: shape. House, car, ...
  - Sound: style. Music: merry, sad, ...
- Features are defined directly or by comparison
- The degree of certainty is stored
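
As a rough illustration of the feature-vector representation described above, the following Python sketch stores an object as a list of features, each with a degree of certainty. The names `Feature`, `MediaObject`, and `certainty_of`, as well as the sample values, are assumptions made for this example and do not come from any system named in the lecture.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """One extracted feature, e.g. shape=house, with a certainty in [0, 1]."""
    name: str
    value: str
    certainty: float


@dataclass
class MediaObject:
    """A multimedia object represented by its features (hypothetical model)."""
    object_id: str
    media_type: str  # "image", "video", "sound", ...
    features: list = field(default_factory=list)

    def certainty_of(self, name: str, value: str) -> float:
        """Return the stored certainty for this feature, or 0.0 if absent."""
        for f in self.features:
            if f.name == name and f.value == value:
                return f.certainty
        return 0.0


# Sample object: the analysis is fairly sure it shows a house, less sure about a car.
photo = MediaObject("img-1", "image", [
    Feature("shape", "house", 0.9),
    Feature("shape", "car", 0.3),
])
print(photo.certainty_of("shape", "house"))
```

Storing the certainty next to each feature lets a later query stage rank or filter objects instead of treating every detected feature as a hard fact.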

Multimedia support in commercial DBMSs (1999)
- Variable-length data. Non-standard
  - Different and usually very limited sets of operations
- SQL3: provides user-extensible data types
  - Object-oriented
  - Implemented partially in many systems
- Example: data blades of Informix
  - Content-based functions on text and images
  - E.g.: date = 1997 AND contains (car)
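
The condition `date = 1997 AND contains (car)` combines an ordinary attribute test with a content-based function. The Python sketch below mimics that combination over in-memory records. It is not Informix syntax or its API: the record layout and the `contains` helper are assumptions for illustration only, and the set of detected objects is presumed to have been produced by an earlier analysis step.

```python
def contains(record: dict, label: str) -> bool:
    """Content-based predicate: was an object with this label detected?"""
    return label in record["objects"]


# Hypothetical records: a conventional attribute plus content analysis results.
records = [
    {"id": 1, "date": 1997, "objects": {"car", "tree"}},
    {"id": 2, "date": 1997, "objects": {"house"}},
    {"id": 3, "date": 1995, "objects": {"car"}},
]

# Equivalent of: date = 1997 AND contains(car)
matches = [r["id"] for r in records if r["date"] == 1997 and contains(r, "car")]
print(matches)  # [1]
```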

Spatial data types
- Informix: 2D, 3D data blades
  - Boxes, vectors, ...
  - Operations: intersect, contains, center, ...
- Text: containWords, ...
- Supports querying images by content
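
To make the listed operations concrete, here is a minimal Python sketch of a 2D box type with `intersects`, `contains`, and `center`. The `Box` class and its field names are hypothetical; the code only illustrates what such operations compute for axis-aligned rectangles and does not reproduce any vendor's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned 2D rectangle (hypothetical spatial data type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def intersects(self, other: "Box") -> bool:
        """True if the two boxes share at least one point."""
        return (self.x_min <= other.x_max and other.x_min <= self.x_max
                and self.y_min <= other.y_max and other.y_min <= self.y_max)

    def contains(self, other: "Box") -> bool:
        """True if the other box lies entirely inside this one."""
        return (self.x_min <= other.x_min and other.x_max <= self.x_max
                and self.y_min <= other.y_min and other.y_max <= self.y_max)

    def center(self) -> tuple:
        """Return the midpoint of the box as (x, y)."""
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


frame = Box(0, 0, 10, 10)
logo = Box(2, 2, 4, 4)
caption = Box(8, 8, 12, 12)

print(frame.contains(logo))       # True
print(frame.intersects(caption))  # True
print(logo.center())              # (3.0, 3.0)
```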

Example: MULTOS
- Multimedia document server
- Documents are described by:
  - logical structure: title, introduction, chapter, ...
  - layout structure: pages, frames, ...
  - conceptual structure: allows content-based queries
- Documents with similar conceptual structures are grouped into conceptual types
  - Example: Generic_Letter

Example of conceptual structure...

...continued

Image data in MULTOS
- Analysis:
  - low level: detect objects and their positions
  - high level: image interpretation
- Result of analysis: a description of the objects found
  - their classes
  - certainty values
- Indices are used for fast access to this information
  - Object index: includes pointers to objects and certainty values
  - Cluster index, with fuzzy clusters of similar images
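
The object index can be pictured as a mapping from an object class to the images in which that class was detected, together with the certainty of each detection. The Python sketch below shows one way this could look. It is not the MULTOS implementation: `object_index`, `add_analysis_result`, and `lookup` are names invented for this example.

```python
from collections import defaultdict

# Object class -> list of (image id, certainty) pairs (assumed layout).
object_index = defaultdict(list)


def add_analysis_result(image_id, detected):
    """Record the classes found in one image, with their certainty values."""
    for object_class, certainty in detected.items():
        object_index[object_class].append((image_id, certainty))


def lookup(object_class, min_certainty=0.0):
    """Return images containing the class, most certain first."""
    hits = [(img, c) for img, c in object_index.get(object_class, [])
            if c >= min_certainty]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


add_analysis_result("img-1", {"apple": 0.9, "table": 0.6})
add_analysis_result("img-2", {"apple": 0.4})
add_analysis_result("img-3", {"car": 0.8})

print(lookup("apple", min_certainty=0.5))  # [('img-1', 0.9)]
```

Because certainty values sit in the index itself, a query such as "images containing an apple" can be answered and ranked without re-analyzing any image.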

Internet
- How does Google do it?
  - No image processing. Textual context!
  - File names, nearby words
  - Distance from the image to the words
- “give me images with flower in the file name or near the image”
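
The idea of retrieving images through their textual context alone can be sketched in a few lines of Python. This is not how any real search engine scores images: the function `context_score`, the tokenizer, and the 1/distance weighting are assumptions chosen to illustrate "file names, nearby words, distance from the image to the words".

```python
import re


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def context_score(query, file_name, words_before, words_after):
    """Score an image by its file name and by how close the query word appears."""
    query = query.lower()
    score = 0.0
    if query in tokenize(file_name):
        score += 1.0
    # words_before is in reading order, so its last word is closest to the image.
    for distance, word in enumerate(reversed(tokenize(words_before)), start=1):
        if word == query:
            score += 1.0 / distance
    for distance, word in enumerate(tokenize(words_after), start=1):
        if word == query:
            score += 1.0 / distance
    return score


print(context_score("flower", "red_flower.jpg", "a photo of a", "flower in our garden"))
print(context_score("flower", "img001.jpg", "", "the garden"))
```

No pixel of the image is inspected here; the score depends only on the surrounding text, which is the point of the slide.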

Languages

Query languages
- As a query, either a description of the object or an example object is submitted
  - “show me images similar to this one”
  - Similar in what respects?
- Exact match is inadequate. Additional means are needed
- Content is not a single feature

What defines a query language
- Interface: how to enter the query
- Types of conditions that can be specified
- Handling of uncertainty, proximity, weights

Interface
- Browsing and navigation
- Search: description or query by example
- Query by example: specify which features are important
  - Give me all houses with similar shape but different colors
- Libraries of examples can be provided

Conditions...
- Attribute predicates
  - structured content: the predefined types, extracted beforehand
  - Exact match. E.g.: size, type (video, audio, ...)
- Structural predicates
  - structure: title, sections, ...
  - metadata are used
  - Find objects containing an image and a video clip
- Semantic predicates
  - unrestricted content
  - Find all red houses: red = ?, house = ?
  - Fuzzy

... conditions
- Spatial predicates: contain, intersect, is contained in, is adjacent to, ...
- Temporal predicates: find audio where first politics and then economy is discussed
- Spatial and temporal predicates can be combined: find clips where the logo disappears and then a graph appears at the same place
- A predicate can be applied to a part of a document
  - As path expressions in OO databases
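
A temporal predicate such as "first politics and then economy is discussed" can be evaluated over time-stamped segments of a recording. The Python sketch below assumes that an earlier analysis step has already labeled each segment with a topic. The `Segment` type and the function `discussed_in_order` are hypothetical names for this example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A labeled stretch of an audio recording; times are in seconds."""
    topic: str
    start: float
    end: float


def discussed_in_order(segments, first_topic, then_topic):
    """True if some segment on first_topic ends before one on then_topic starts."""
    first_ends = [s.end for s in segments if s.topic == first_topic]
    then_starts = [s.start for s in segments if s.topic == then_topic]
    if not first_ends or not then_starts:
        return False
    # A qualifying pair exists exactly when the earliest end of the first topic
    # is not later than the latest start of the second topic.
    return min(first_ends) <= max(then_starts)


news = [Segment("politics", 0, 120), Segment("economy", 120, 300)]
print(discussed_in_order(news, "politics", "economy"))  # True
print(discussed_in_order(news, "economy", "politics"))  # False
```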

Uncertainty, proximity, weights
- Similarity function
- The user can assign importance weights to individual predicates in a complex query
- This gives ranking, as in text IR
  - The same models can be used, e.g., the probabilistic model
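
One simple way to turn user-assigned weights into a ranking is a weighted average of the per-predicate similarity values. The Python sketch below uses that scheme. It is only one possible choice, not the probabilistic model mentioned on the slide, and the names `combined_score` and `rank` and the sample numbers are assumptions.

```python
def combined_score(similarities, weights):
    """Weighted average of per-predicate similarity values in [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("at least one weight must be positive")
    weighted = sum(weights[name] * similarities.get(name, 0.0) for name in weights)
    return weighted / total_weight


def rank(candidates, weights):
    """Return object ids ordered by decreasing combined score."""
    return sorted(candidates,
                  key=lambda obj_id: combined_score(candidates[obj_id], weights),
                  reverse=True)


# Similarity of each candidate to the example object, per feature (sample values).
candidates = {
    "house-a": {"shape": 0.9, "color": 0.2},
    "house-b": {"shape": 0.5, "color": 0.9},
}

# The user cares three times more about shape than about color.
weights = {"shape": 3.0, "color": 1.0}
print(rank(candidates, weights))  # ['house-a', 'house-b']
```

Swapping the weights so that color dominates reverses the order, which shows how the same similarity values can produce different rankings for different users.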

Examples of query languages: SQL3
- Functions and stored procedures: user-defined data manipulation
- Active database support: the database reacts to events, not only to commands. This enforces integrity constraints
- Good news: rather standard
- Bad news: no ranking supported!
- Efforts to integrate SQL3 with IR techniques: SQL/MM Full-Text and other similar languages

... examples: MULTOS
- One of the design goals: easy navigation
- Paths are supported
- Identification of components by type, not by position
  - All images in the document, not the image in the 3rd chapter
- Types of predicates: on data attributes, on textual components, on images (image type, objects contained, ...)
- Example:

MULTOS example

Another example of MULTOS

Research topics
- How can the similarity function be defined?
  - What features do images (video, sound) have?
- How to better specify the importance of individual features? (Give me similar houses: similar = size? color? structure? architectural style?)
- How to determine the objects in an image?
- Integration with DBMSs and SQL for fast access and rich semantics
- Integration with XML
- Ranking: by similarity, taking into account history and profile

Conclusions
- Basically, images are handled as the text that describes them
  - Namely, feature vectors (or feature hierarchies)
  - Context can be used, when available, to determine features
- Also, queries by example are common
- From the point of view of DBMSs, integration with IR and multimedia-specific techniques is needed
  - Object-oriented technology is adequate

Thank you! Till ??, 6 pm