A presentation by Modupe Omueti For CMPT 820:Multimedia Systems


The MPEG-7 Visual Standard for Content Description: An Overview
Thomas Sikora, Senior Member, IEEE
A presentation by Modupe Omueti for CMPT 820: Multimedia Systems, Spring 2005

Contents
  Introduction
  Scope
  Methodology
  Visual Descriptors
  Conclusion

Introduction
  Moving Picture Experts Group (MPEG)
  MPEG-1 for interactive video (1992)
  MPEG-2 for digital television (1994)
  MPEG-4 for multimedia with emphasis on visual objects (1998 v1, 1999 v2)
  MPEG-7 for multimedia content description (2001)

Trends
  Initially, few sources of audio, image, and video
  Increase in the volume of digitized audio, images, and video
  Still images, digital video

MPEG-7
  Formally named Multimedia Content Description Interface
  Supports some degree of interpretation of the information's meaning
  Interpretation can be passed on to or accessed by a device or computer code
  Not aimed at one application in particular

Scope
  Goals (Figure 1: Scope of MPEG-7)
    Standardized descriptions
    Meaningful descriptions
  Elements (Figure 2: MPEG-7 main elements)
    Description tools: visual descriptors and description schemes
    Description Definition Language
    System tools

Normative part of MPEG-7 standard Figure 1: Scope of MPEG-7

Figure 2: MPEG-7 main elements

Applications
  Digital libraries (image catalogue, film)
  Broadcast media selection (TV channels)
  Investigation services (human characteristics recognition, forensics)
  Multimedia editing (personalized electronic news service)
  Figure 3: Abstract representation

Figure 3: Abstract representation of possible applications using MPEG-7

Methodology
  Standard Development
    Specification for Technology Requirements
    Technology Request
    Proposal Evaluation
    Experimentation Model Definition
    Core Experiments

Visual Descriptors
  General visual descriptors
    Color, texture, shape, and motion features
  Domain specific
    Identification of human faces and face recognition

Visual Color Descriptors
  Color spaces (HSV, HMMD)
    Supports the above for normative purposes
    Also supports RGB and YCbCr color spaces
  Scalable color descriptor (Figure 4)
    Global color distribution of images in color histograms
    HSV space, uniformly quantized into 256 bins
    Haar transform used to encode the histogram
    Histogram bin values non-uniformly quantized
    Color coefficients or histogram bin values used for matching
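
The scalable color steps above (uniform HSV quantization followed by a Haar transform of the histogram) can be sketched in a few lines. This is only an illustrative sketch, not the normative extraction procedure: the function names are hypothetical, the 16 x 4 x 4 split of hue, saturation, and value is an assumption about how the bins are distributed, and the non-uniform quantization of bin values is omitted.

```python
import colorsys


def hsv_histogram(pixels, h_bins=16, s_bins=4, v_bins=4):
    """Count RGB pixels (0-255 per channel) in a uniformly quantized HSV histogram.

    The 16 x 4 x 4 bin split is an assumption made for this sketch.
    """
    hist = [0] * (h_bins * s_bins * v_bins)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1
    return hist


def haar_1d(values):
    """Full 1-D Haar decomposition using pairwise sums and differences.

    The input length must be a power of two. The first output value is the
    overall sum; the remaining values are differences, coarsest level first.
    """
    data = list(values)
    details = []
    while len(data) > 1:
        sums = [data[i] + data[i + 1] for i in range(0, len(data), 2)]
        diffs = [data[i] - data[i + 1] for i in range(0, len(data), 2)]
        details = diffs + details
        data = sums
    return data + details


# Example: three pixels (red, blue, near-black) give 256 coefficients.
coefficients = haar_1d(hsv_histogram([(255, 0, 0), (0, 0, 255), (10, 10, 10)]))
```

Keeping only the leading coefficients gives a coarser description of the same histogram, which is one way to read the word "scalable" in the descriptor's name.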

Visual Color Descriptors
  Dominant color descriptor
    Global and local spatial color distribution
    Colors clustered into a small number of representative colors
    Representative color, percentage, spatial coherency, variance
  Color layout descriptor
    Spatial distribution of color in an arbitrarily shaped region
  Color structure descriptor
    HMMD color space, local color feature, sliding window
    Histogram based on color appearance count
  Group of Frames/Group of Pictures (GoF/GoP)
    Scalable color descriptor (SCD) for a collection of similar images or video frames
    Average, median, and intersection histograms of the GoF or GoP
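
The three GoF/GoP aggregation modes named above can be illustrated bin by bin. In this sketch the function name and the dictionary layout are hypothetical, and the intersection histogram is assumed to be the bin-wise minimum across frames.

```python
from statistics import median


def gof_histograms(frame_histograms):
    """Combine per-frame histograms (equal length) into three summaries."""
    columns = list(zip(*frame_histograms))  # one tuple per histogram bin
    return {
        "average": [sum(c) / len(c) for c in columns],
        "median": [median(c) for c in columns],
        # Assumption: intersection means the bin-wise minimum over all frames.
        "intersection": [min(c) for c in columns],
    }


summary = gof_histograms([[4, 0, 2], [2, 2, 2], [0, 4, 2]])
```

The median is less sensitive than the average to a single outlier frame, while the intersection keeps only the color content present in every frame.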

Figure 4: Three color images and their MPEG-7 histogram color distribution, depicted using a simplified color histogram. Based on the color distribution, the two left images would be recognized as more similar compared to the one on the right.

Visual Texture Descriptors
  Texture features (Figure 5)
    Visual patterns (homogeneous or non-homogeneous)
    Multiple colors in images
    Multiple intensities in images
    Surface structural information

Figure 5: Examples of grayscale images with different textures. Using the MPEG-7 visual texture descriptors, the two images on the bottom would be rated as similar in texture, and as less similar in texture to the two images on the top.

Visual Texture Descriptors
  Homogeneous texture descriptor (Figure 6)
    Scale- and orientation-sensitive filters
    Mean and SD of frequency coefficients (RT-FT)
    Scale- and rotation-invariant description and matching
    2-D Gabor functions for filtering feature channels
  Non-homogeneous texture descriptor (Edge Histogram)
    Spatial distribution of edges
    Division of the image into 16 non-overlapping blocks of equal size
    Five edge categories: vertical, horizontal, 45°, 135°, and non-directional
    Rotation-sensitive and rotation-invariant matching
    Non-uniform quantization using 3 bits per bin; descriptor size of 240 bits (16 x 5 x 3)
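
The edge histogram layout described above (16 blocks, five edge categories per block, 80 bins in total) can be sketched as follows. This is a simplified illustration under stated assumptions: the 2 x 2 filter coefficients, the threshold value, and the use of raw 2 x 2 pixel groups are choices made for this sketch, and the 3-bit non-uniform quantization is left out.

```python
import math

# Assumed 2x2 filters, applied to (top-left, top-right, bottom-left, bottom-right).
EDGE_FILTERS = {
    "vertical": (1, -1, 1, -1),
    "horizontal": (1, 1, -1, -1),
    "diagonal_45": (math.sqrt(2), 0, 0, -math.sqrt(2)),
    "diagonal_135": (0, math.sqrt(2), -math.sqrt(2), 0),
    "non_directional": (2, -2, -2, 2),
}
EDGE_TYPES = list(EDGE_FILTERS)


def edge_histogram(image, threshold=10.0):
    """Return 80 values: for each of 16 blocks, the share of each edge type.

    `image` is a list of rows of gray values; height and width are assumed
    to be divisible by 8. `threshold` is a hypothetical edge-strength cutoff.
    """
    sub_h, sub_w = len(image) // 4, len(image[0]) // 4
    hist = []
    for by in range(4):
        for bx in range(4):
            counts = [0] * len(EDGE_TYPES)
            groups = 0
            for y in range(by * sub_h, (by + 1) * sub_h - 1, 2):
                for x in range(bx * sub_w, (bx + 1) * sub_w - 1, 2):
                    quad = (image[y][x], image[y][x + 1],
                            image[y + 1][x], image[y + 1][x + 1])
                    strengths = [
                        abs(sum(q * f for q, f in zip(quad, EDGE_FILTERS[name])))
                        for name in EDGE_TYPES
                    ]
                    best = max(range(len(EDGE_TYPES)), key=strengths.__getitem__)
                    if strengths[best] >= threshold:
                        counts[best] += 1  # groups below the cutoff count as no edge
                    groups += 1
            hist.extend(c / groups if groups else 0.0 for c in counts)
    return hist


# Example: alternating dark/bright columns produce only vertical edges.
stripes = [[0, 255] * 4 for _ in range(8)]
stripe_histogram = edge_histogram(stripes)
```

Quantizing each of the 80 values to 3 bits gives the 240-bit descriptor size quoted on the slide.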

Figure 6: Frequency layout for MPEG-7 Homogeneous Texture Descriptor frequency extraction. Energy and energy deviation values are extracted from this frequency division into 30 channels.

Visual Shape Descriptors
  Provide a powerful visual cue
  Invariant to scaling, rotation, and translation
  2-D or 3-D in nature
  For 2-D there are two categories
    Contour-based, which uses only the boundary information of objects
    Region-based, which uses the entire shape region

Visual Shape Descriptors
  3-D shape descriptor (shape spectrum)
    Based on a shape spectrum concept
    Histogram of a shape index
    Measures local convexity of each local 3-D surface
    Histograms with 100 bins are used, each quantized by 12 bits
  Region-based shape descriptor (ART) (Figure 7)
    Uses all pixels constituting a shape within a frame
    Region-based moments invariant to transformations
    Coefficients of ART basis functions quantized

Figure 7: Examples of various shapes that can be indexed using the MPEG-7 Region-Based Shape Descriptor. Images within any one of the sets (a)–(d) would be rated similar to each other and dissimilar to the ones in the remaining sets. For example, images in set (a) would be identified as similar to each other and dissimilar to the ones in sets (b), (c), or (d).

Types of Visual Shape Descriptors
  Contour-based shape descriptor (Figure 8)
    Curvature scale-space (CSS)
    Eccentricity and circularity values
    Robust to non-rigid motion, partial occlusion of the shape, and perspective transformations
  2-D/3-D shape descriptor
    Representation of 3-D objects using multiple 2-D snapshots
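
As a small illustration of the circularity value mentioned above, the sketch below computes it for a closed polygonal contour. It assumes one common definition, perimeter squared divided by area; the function name and the polygon representation are hypothetical, and the curvature scale-space analysis itself is not shown.

```python
import math


def circularity(contour):
    """Perimeter squared over area for a closed polygon given as (x, y) points.

    Assumed definition for this sketch; a circle gives the minimum value, 4*pi.
    """
    perimeter = 0.0
    twice_area = 0.0
    for i, (x0, y0) in enumerate(contour):
        x1, y1 = contour[(i + 1) % len(contour)]  # wrap around to close the contour
        perimeter += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0  # shoelace formula term
    area = abs(twice_area) / 2.0
    return perimeter ** 2 / area


square_value = circularity([(0, 0), (1, 0), (1, 1), (0, 1)])
```

Because the ratio is dimensionless, it does not change when the contour is scaled, rotated, or translated, which matches the invariance expected of shape descriptors.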

Figure 8: Examples of shapes that can be indexed using MPEG-7 Contour-Based Shape Descriptor.

Motion Descriptors for Video
  Motion activity descriptors
    Activity level and pace of motion in a scene
    Motion activity intensity descriptor
      SD of motion vector magnitude
      SDs quantized into five activity levels
    Optional features
      Motion direction
      Spatial distribution of motion activity
      Temporal distribution of motion activity
  Camera motion descriptor (Figure 9)
    Global motion parameters in time
    Zoom activity
    Translatory motion
    Motion similarity matching in particular time periods
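
The motion activity intensity computation above (standard deviation of motion vector magnitudes, quantized into five levels) can be sketched as follows. The threshold values are placeholders invented for this sketch, not the ones used by the standard, and the function name is hypothetical.

```python
import math
from statistics import pstdev


def motion_activity_level(motion_vectors, thresholds=(1.0, 3.0, 6.0, 10.0)):
    """Return (standard deviation, activity level from 1 to 5).

    `motion_vectors` is a list of (dx, dy) pairs for one frame or shot.
    `thresholds` are hypothetical cut points separating the five levels.
    """
    magnitudes = [math.hypot(dx, dy) for dx, dy in motion_vectors]
    sd = pstdev(magnitudes)
    level = 1 + sum(sd >= t for t in thresholds)
    return sd, level


calm = motion_activity_level([(1, 0), (1, 0), (1, 0)])
busy = motion_activity_level([(0, 0), (0, 20)])
```

A scene where everything moves uniformly, such as a steady pan, has a low standard deviation and therefore a low activity level, even if the vectors themselves are large.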

Figure 9: Camera model for MPEG-7 Camera Motion Descriptor. Perspective projection to image plane p and camera motion parameters. The (virtual) camera is located at O.

Motion Descriptors for Video
  Warping parameters
  Parametric motion descriptor
    Object description using 2-D parametric models
      Translations, rotations, scaling, and combinations of them
      Planar perspective models
      Quadratic models
    Arbitrary objects, defined as regions (groups of pixels) in the image over a specified time interval
    Global sprite or mosaic
  Motion trajectory
    Description for independently moving objects
    Object displacement over time
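
To make the idea of a 2-D parametric model concrete, the sketch below uses a six-parameter affine model, which covers translations, rotations, scaling, and their combinations. The parameter ordering (a1 to a6) and the function names are assumptions made for this illustration; the planar perspective and quadratic models add further terms that are not shown.

```python
def affine_motion(point, params):
    """Motion vector at `point` under a six-parameter affine model.

    params = (a1, a2, a3, a4, a5, a6) is a hypothetical ordering:
    vx = a1 + a2*x + a3*y and vy = a4 + a5*x + a6*y.
    """
    x, y = point
    a1, a2, a3, a4, a5, a6 = params
    return (a1 + a2 * x + a3 * y, a4 + a5 * x + a6 * y)


def warp(point, params):
    """Move `point` by the motion vector the model assigns to it."""
    vx, vy = affine_motion(point, params)
    return (point[0] + vx, point[1] + vy)


# Pure translation by (2, -1): only the constant terms are non-zero.
shifted = warp((3, 4), (2, 0, 0, -1, 0, 0))
# Uniform 1.5x enlargement about the origin.
scaled = warp((2, 4), (0, 0.5, 0, 0, 0, 0.5))
```

A handful of parameters describes the motion of every pixel in a region, which is what makes such models compact enough to store as a descriptor.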

Conclusion
  Identify, filter, and browse images using visual content
  Specification to allow interoperability and flexibility
  Other MPEG-7 standards
    Storage, access, and transmission of descriptors and description schemes in the system specification

Thank you