Multimedia Databases Prepared by Pradeep Konduri Instructor: Dr. Yingshu Li Georgia State University.

Slides:



Advertisements
Similar presentations
MediaView -- Towards a “Semantic” Multimedia Database Model
Advertisements

Three-Step Database Design
Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki
Image Retrieval: Current Techniques, Promising Directions, and Open Issues Yong Rui, Thomas Huang and Shih-Fu Chang Published in the Journal of Visual.
Image Information Retrieval Shaw-Ming Yang IST 497E 12/05/02.
1 Content-Based Retrieval (CBR) -in multimedia systems Presented by: Chao Cai Date: March 28, 2006 C SC 561.
Information Retrieval in Practice
Chapter 11 Beyond Bag of Words. Question Answering n Providing answers instead of ranked lists of documents n Older QA systems generated answers n Current.
Video Table-of-Contents: Construction and Matching Master of Philosophy 3 rd Term Presentation - Presented by Ng Chung Wing.
 2009 Qing Li Application of Object Orientation.
Image Search Presented by: Samantha Mahindrakar Diti Gandhi.
ADVISE: Advanced Digital Video Information Segmentation Engine
Supervised by Prof. LYU, Rung Tsong Michael Department of Computer Science & Engineering The Chinese University of Hong Kong Prepared by: Chan Pik Wah,
File Systems and Databases
Presentation Outline  Project Aims  Introduction of Digital Video Library  Introduction of Our Work  Considerations and Approach  Design and Implementation.
Visual Querying By Color Perceptive Regions Alberto del Bimbo, M. Mugnaini, P. Pala, and F. Turco University of Florence, Italy Pattern Recognition, 1998.
Visual Information Retrieval Chapter 1 Introduction Alberto Del Bimbo Dipartimento di Sistemi e Informatica Universita di Firenze Firenze, Italy.
Presented by Zeehasham Rasheed
CS292 Computational Vision and Language Visual Features - Colour and Texture.
Ranking by Odds Ratio A Probability Model Approach let be a Boolean random variable: document d is relevant to query q otherwise Consider document d as.
CH 11 Multimedia IR: Models and Languages
A fuzzy video content representation for video summarization and content-based retrieval Anastasios D. Doulamis, Nikolaos D. Doulamis, Stefanos D. Kollias.
Database System Concepts and Architecture Dr. Ali Obaidi.
Overview of Search Engines
OIL: An Ontology Infrastructure for the Semantic Web D. Fensel, F. van Harmelen, I. Horrocks, D. L. McGuinness, P. F. Patel-Schneider Presenter: Cristina.
Information Retrieval in Practice
Content-Based Video Retrieval System Presented by: Edmund Liang CSE 8337: Information Retrieval.
Data Mining Techniques
CONTI’2008, 5-6 June 2008, TIMISOARA 1 Towards a digital content management system Gheorghe Sebestyen-Pal, Tünde Bálint, Bogdan Moscaliuc, Agnes Sebestyen-Pal.
Multimedia and Time-series Data
Week 1 Lecture MSCD 600 Database Architecture Samuel ConnSamuel Conn, Asst. Professor Suggestions for using the Lecture Slides.
Multimedia Databases: Where are we?
Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.
Multimedia Databases (MMDB)
Chapter 7 Web Content Mining Xxxxxx. Introduction Web-content mining techniques are used to discover useful information from content on the web – textual.
Multimedia Information Retrieval
Image Retrieval Part I (Introduction). 2 Image Understanding Functions Image indexing similarity matching image retrieval (content-based method)
Information Systems & Semantic Web University of Koblenz ▪ Landau, Germany Semantic Web - Multimedia Annotation – Steffen Staab
Content-Based Image Retrieval
Event-Based Fusion of Distributed Multimedia Data Sources Vincent Oria Department of Computer Science New Jersey Institute of Technology Newark, NJ
Chapter 2 Architecture of a Search Engine. Search Engine Architecture n A software architecture consists of software components, the interfaces provided.
Texture. Texture is an innate property of all surfaces (clouds, trees, bricks, hair etc…). It refers to visual patterns of homogeneity and does not result.
Glasgow 02/02/04 NN k networks for content-based image retrieval Daniel Heesch.
COLOR HISTOGRAM AND DISCRETE COSINE TRANSFORM FOR COLOR IMAGE RETRIEVAL Presented by 2006/8.
Metadata. Generally speaking, metadata are data and information that describe and model data and information For example, a database schema is the metadata.
Understanding The Semantics of Media Chapter 8 Camilo A. Celis.
Content-Based Image Retrieval Using Fuzzy Cognition Concepts Presented by Tienwei Tsai Department of Computer Science and Engineering Tatung University.
2005/12/021 Content-Based Image Retrieval Using Grey Relational Analysis Dept. of Computer Engineering Tatung University Presenter: Tienwei Tsai ( 蔡殿偉.
Event retrieval in large video collections with circulant temporal encoding CVPR 2013 Oral.
2005/12/021 Fast Image Retrieval Using Low Frequency DCT Coefficients Dept. of Computer Engineering Tatung University Presenter: Yo-Ping Huang ( 黃有評 )
Data and Applications Security Developments and Directions Dr. Bhavani Thuraisingham The University of Texas at Dallas Lecture #15 Secure Multimedia Data.
MMDB-9 J. Teuhola Standardization: MPEG-7 “Multimedia Content Description Interface” Standard for describing multimedia content (metadata).
M4 / September Integrating multimodal descriptions to index large video collections M4 meeting – Munich Nicolas Moënne-Loccoz, Bruno Janvier,
Digital Video Library Network Supervisor: Prof. Michael Lyu Student: Ma Chak Kei, Jacky.
1/12/ Multimedia Data Mining. Multimedia data types any type of information medium that can be represented, processed, stored and transmitted over.
An MPEG-7 Based Semantic Album for Home Entertainment Presented by Chen-hsiu Huang 2003/08/12 Presented by Chen-hsiu Huang 2003/08/12.
VISUAL INFORMATION RETRIEVAL Presented by Dipti Vaidya.
Relevance Feedback in Image Retrieval System: A Survey Tao Huang Lin Luo Chengcui Zhang.
Content-Based Image Retrieval Using Color Space Transformation and Wavelet Transform Presented by Tienwei Tsai Department of Information Management Chihlee.
Introduction to MPEG  Moving Pictures Experts Group,  Geneva based working group under the ISO/IEC standards.  In charge of developing standards for.
MPEG 7 &MPEG 21.
Information Retrieval in Practice
Visual Information Retrieval
Search Engine Architecture
Introduction Multimedia initial focus
Multimedia Content-Based Retrieval
Multimedia Information Retrieval
Image Segmentation Techniques
Database System Concepts and Architecture
Presentation transcript:

Multimedia Databases Prepared by Pradeep Konduri Instructor: Dr. Yingshu Li Georgia State University

Plan of Attack Introduction Architecture Image Content Analysis Modeling Constructs Logical Implementation Real-World Applications Conclusion

Types of multimedia data Text: using a standard language (SGML, HTML) Graphics: encoded in CGM, postscript Images: bitmap, JPEG, MPEG Video: sequenced image data at specified rates Audio: aural recordings in a string of bits in digitized form

Nature of Multimedia Applications Repositories: central location for data maintained by DBMS, organized in storage levels Presentations: delivery of audio and video data, temporarily stored. Collaborative: complex design, analyzing data

Management Issues Modeling: complex objects, wide range of types Design: still in research Storage: representation, compression, buffering during I/O, mapping Queries: techniques need to be modified Performance: physical limitations, parallel processing

Research Problems Information Retrieval in Queries: Modeling the content of documents Multimedia/Hypermedia Data Modeling and Retrieval: Hyperlinks, Used in WWW Text Retrieval: Use of a thesaurus

Multimedia Database Applications Documentation and keeping Records Knowledge distribution Education and Training Marketing, Advertisement, Entertainment, Travel Real-time Control, Monitoring

A Generic Architecture of MMDBMS Media organization: organize the features for retrieval (i.e., indexing the features with effective structures) Media query processing: accommodated with indexing structure, efficient search algorithm with similarity function should be designed

Multimedia Database Architecture Multimedia Data Preprocessing System Database Processing MM Data Pre- processor Additional Information..... Multimedia DBMS Users Query Interface MM Data Instance..... Recognized components MM Data Instance MM Data Meta-Data

Document Database Architecture Document Processing System Database Processing DTD files DTD Parser..... DTD Manager Type Generator Multimedia DBMS Users Query Interface DTD Document content SGML/XML Documents XML or SGML Document Instance Parse Tree DTD C++ Types C++ Objects SGML/XML Parser Instance Generator

Image Database Architecture Image Processing System Database Processing Image Analysis and Pattern Recognition Image Annotation Multimedia DBMS Users Query Interface Image Content Description Image Syntactic Objects Semantic Objects..... Meta-Data

Video Database Architecture Video Processing System Database Processing Video Analysis and Pattern Recognition Video Annotation Multimedia DBMS Users Query Interface Video Key Frames Meta-Data..... Video Content Description Video

Image Content Analysis Image content analysis can be categorized in 2 groups: ◦ Low-level features: vectors in a multi- dimensional space  Color  Texture  Shape ◦ Mid- to high-level features: Try to infer semantics ◦ Semantic Gap

Image Content Analysis: Color Color space: ◦ Multidimensional space ◦ A dimension is a color component ◦ Examples of color space: RGB, HSV ◦ RGB space: A color is a linear combination of 3 primary colors (Red, Green and Blue) Color Quantization ◦ Used to reduce the color resolution of an image Three widely used color features ◦ Global color histogram ◦ Local color histogram ◦ Dominant color

Color Histograms Color histograms indicate color distribution without spatial information ◦ Color histogram distance metrics

Image Content Analysis: Texture Refers to visual patterns with properties of homogeneity that do not result from the presence of only a single color Examples of texture: Tree barks, clouds, water, bricks and fabrics Texture features: Contrast, uniformity, coarseness, roughness, frequency, density and directionality Two types of texture descriptors ◦ Statistical model-based  Explores the gray level spatial dependence of texture and extracts meaningful statistics as texture representation ◦ Transform-based  DCT transform, Fourier-Mellin transform, Polar Fourier transform, Gabor and wavelet transform

Image Content Analysis: Shape Object segmentation ◦ Approaches:  Global threshold-based approach  Region growing,  Split and merge approach,  Edge detection app ◦ Still a difficult problem in computer vision. Generally speaking it is difficult to achieve perfect segmentation

Salient Objects vs. Salient Points Original images Segmented images with region boundaries Extracted salient points Generic low-level description of images into salient objects and salient points

Modeling Images – Principles Support for multiple representations of an image Support for user-defined categorization of images Well-defined set of operations on images An image can have (semantic, functional, spatial) relationships with other images (or documents) which should be represented in the DBMS An image is composed of salient objects (meaningful image components)

Salient Object Modeling Multiple representations of a salient object (grid, vector) are allowed A salient object O is of a particular type which belongs to a user defined salient object types hierarchy An image component may have some (semantic, functional, spatial) relationships with other salient objects

“ Semantic Gap ” semantics-intensive multimedia systems & applications non-semantic multimedia data models requiremodel semantic meaning of the data raw data, primitive properties (size, format, etc) Semantic Gap

Semantic modeling of multimedia -- Why hard? Context-dependency ◦ Semantics is not a static and intrinsic property ◦ The semantics of an object often depends on:  the application/user who manipulate the object  the role that the object plays  other objects in the same “ context ” Van Gogh ’ s paintings flower Example:

Why hard? (cont.) Modality-independency ◦ Media objects of different modalities may suggest the similar/related semantic meanings. ◦ Example: Harry Potter has never been the star of a Quidditch team, scoring points while riding a broom far above the ground. He knows no spells, has never helped to hatch a dragon, and has never worn a cloak of invisibility. Query: Results: image videotext

MediaView – A “ Semantic Bridge ” An object-oriented view mechanism that bridges the semantic gap between multimedia systems and databases Core concept – media view (MV) ◦ a customized context for semantic interpretation of media objects (text docs, images, video, etc) ◦ collectively constitute the conceptual infrastructure of a multimedia system & application

Architecture MediaView Mechanism

Basic Concepts  A media view MVi is a virtual class that has a unique view name, a type description, and a set of objects associated with it.  A base class Ci is defined as a subclass of another base class Cj if and only if the following two conditions hold: (1) properties(Cj)  properties(Ci), and (2) extent(Ci)  extent(Cj). If Ci is the subclass of Cj, we also say that there is an is-a relationship from Ci to Cj. A base schema (BS) is a directed acylic graph G=(V, E), where V is a finite set of vertices and E is a finite set of edges as a binary relation defined on V×V. Each element in V corresponds to a base class Ci. Each edge in the form of e=  E represents an is-a relationship from Ci to Cj (or Ci is a subclass of Cj).

Basic Concepts A media view MVi is a subview of another media view MVj (or there is an is-a relationship from MVi to MVj) if and only if properties(MVj)  properties(MVi) and extent(MVi)  extent(MVj). A view schema (VS) is a directed acyclic graph G={V, E}, where a vertex in V corresponds to a media view MVi, and an edge e=  E represents an is-a relationship from MVi to MVj (or MVi is a subview of MVj).

Basic Concepts An example …

Basic Concepts Semantics-based data reorganization via media views

View Operators A set of operators that take media views and view instances as operands. Focus on the operators that are indispensable in supporting queries and navigation over multimedia objects.

View Operators type-level V-overlap syntax := v-overlap ( ) semantics true, if and only if (  o  O)(o  extent( ) and o  extent( )) Cross syntax{ }:= cross ( ) semantics{ } := {o  O | o  extent( ) and o  extent( )} Sum syntax{ }:= sum ( ) semantics{ } := {o  O | o  extent( ) or o  extent( )} Subtract syntax{ }:= subtract ( ) semantics{ }:= {o  O | o  extent( ) and o  extent( )}

View Operators instance-level Class syntax := class( ) semantics is a instance of components syntax{ } := components ( ) semantics { } := { o  O | o is a component (direct or indirect) of } i-overlap syntax := i-overlap (, ) semantics true, if and only if (  o  O) (o  components ( ) and o  components( ))

View Algebra Functions -- derivation of new MVs from existing MVs Heuristic Enumeration 1.Blind enumeration 2.Content-based enumeration 3.Semantics-based enumeration

View Algebra Algebra Operators ◦ select from src-MV where ◦ project from src-MV ◦ intersect (src-MV1, src-MV2) ◦ union (src-MV1, src-MV2) ◦ difference (src-MV1, src-MV2)

Comparison (vs. class) media viewobject class membership heterogeneous objectsuniform objects member acquisition dynamic inclusion/exclusion of existing objects of other classes creating new objects mapping one object can belong to multiple media views one object has exactly one class relationship inter-member semantic relationshipN/A

Comparison (vs. traditional object view) media viewobject view membership heterogeneous objectsuniform objects relationship inter-member semantic relationship N/A member properties instance-level properties (user-defined) inherited or derived properties (for view instances) global properties MV-level properties (user- defined) N/A

Logical Implementation MediaView Construction MediaView Customization MediaView Evolution

MediaViews Construction Work with CBIR systems to acquire the knowledge from queries ◦ Learn from previously performed queries ◦ A multi-system approach to support multi- modality of media objects Organize the semantics by following WordNet

Why WordNet? Different queries may greatly vary with the liberty of choosing query keywords We need an approach to organize those knowledge into a logic structure ◦ A simple “ context ” : a concept in WordNet ◦ Common media views: corresponds to simple contexts ◦ We provide all common media views, based on which users can build complex ones.

Navigating the Multimedia Database Navigating via semantic relationships of WordNet Semantic RelationshipExamples Synonymy (similar)pipe, tube Antonymy (opposite)fast, slow Hyponymy (subordinate)tree, plant Meronymy (part) chimney, house Troponomy (manner)march, walk Entailmentdrive, ride

Navigating the Multimedia Database

MediaViews Construction

“ Multi-dimensional ” Semantic Space “ IS-A ” relationship in thesaurus For example, “ Season ” has a 4-dimension semantic space [ “ spring ”, “ summer ”, “ autumn ”, “ winter ” ]

Encoding with Probabilistic Tree A Probabilistic Tree specifies the probability of one media object semantically matching a certain concept in thesaurus.

Evolution through Feedback A progressive approach ◦ MediaView is accumulated along with the processes of user interactions Two phases of feedback ◦ System-feedback ◦ User-feedback

Evolution through Feedback

Procedure: 1. Record each feedback performed by users. 2. For each CBIR system i involved, calculate its accuracy rate of retrieval. That is, simply divide the total number of retrieved results by the number of correct results according to user feedback. 3. Reset the value of to its accuracy rate respectively. 4. Wait for next session of user feedback.

Fuzzy Logic based Evolution Approach Due to the uncertainty of the semantics, can not make an absolute assertion that a media object is relevant or irrelevant to a “ context ” A media object in a database may be retrieved as a relevant result to a “ context ” several times: the more times a media object is retrieved, the more confidence it has to be considered as relevant to the context.

Fuzzy Logic based Evolution Approach For a media object e, a context c, - the accumulation of historial feedback information (from both system and user ’ s) - the adjustment of after each feedback session

Inverse Propagation of Feedback The drawback of up-down fashion of calculating the probability ◦ E.g. Whether a media object matches “ season ” can not leverage from that the media object was a match of “ spring ” Solution: propagate the confidence value of a media object being relevant to a concept along the hierarchical structure from bottom-up

Inverse Propagation of Feedback Procedure: 1. Wait for a feedback session. 2. For each positive feedback, namely, stating a concept C is relevant to a media object. Following the thesaurus, trace from C to the root concept Root in thesaurus. Assume the path is:. 3. Append Ci as also positive feedback to that media object, where i=1 to n.

MediaView Customization Two level MediaView Framework

MediaView Customization Dynamically construct complex-context- based media views based on simple ones ◦ An example complex context: “ the Grand Hall in City University ” Several user-level operators are devised to support more complex/advanced contexts, besides the basic operators

User-level Operators INHERIT_MV(N: mv-name, NS: set-of-mv- refs, VP: set-of-property-ref, MP: set-of- property-ref): mv-ref UNION_MV(N: mv-name, NS: set-of-mv- refs): mv-ref INTERSECTION_MV(N: mv-name, NS: set-of-mv-refs): mv-ref DIFFERENCE_MV(N1: mv-ref, N2: mv- ref): mv-ref

Build a MediaView in Run-time Example: find out info about "Van Gogh" ◦ Who is "Van Gogh"? ◦ What is his work? ◦ Know more about his whole life. ◦ Know more about his country. ◦ See his famous painting "sunflower"

Build a MediaView in Run-time Who is “ Van Gogh ” ? ◦ INHERIT_MV( “ V. Gogh “, { },name= ” Van Gogh ”,); What is his work? ◦ INTERSECTION_MV( “ work ”, {, vg}); Know more about his whole life. ◦ INTERSECTION_MV( “ life ”, {, vg}); Know more about his country. ◦ INTERSECTION_MV( “ country ”, {, vg}); See his famous painting “ sunflower ” ◦ Set sunflower = INTERSECTION_MV( “ sunflower ”, {, }); Set vg_sunflower = INTERSECTION_MV( “ vg_sunflower ”, {vg_work, sunflower});

Authoring Scenario Creates a new media view named after the subject ◦ All multimedia materials used in the document would be put into this MediaView for further reference. To collect the most relevant materials for authoring, the user performs the MediaView building process. ◦ Import suitable media objects by browsing media views Reference the manner and style of authoring, to find other media views with similar topics. ◦ Drag & Drop ◦ “ learning-from-references ”

Summary Types of multimedia data: Text, Audio, Video, Images. Management issues: Design, Storage, Modeling, Queries Image Content Analysis: Color, Texture, Shape MediaView – a semantic multimedia database modeling mechanism ◦ to bridge the semantic gap between conventional database and semantics-intensive multimedia applications A set of user-level operators to accommodate the specialization/generalization relationships among the media views

Summary (contd..) MediaView promises more effective access to the content of media databases ◦ Users could get the right stuff and tailor it to the context of their application easily. Providing the most relevant content from pre- learnt semantic links between media and context  high performance database browsing and multimedia authoring tools can enable more comprehensive applications to the user.  Users could customize specific media view according to their tasks, by using user-level operators

Further Issues The development and transition of MediaView to a fully-fledged multimedia database system supporting “ declarative ” queries Intensive and extensive performance studies Advanced semantic relations (eg. temporal and spatial ones) can also be incorporated in combining individual media views

Thank you! Q & A