MPEG-7 Motion Descriptors

Reference
ISO/IEC JTC1/SC29/WG11 N4031
ISO/IEC JTC1/SC29/WG11 N4062
"MPEG-7 Visual Motion Descriptors," IEEE Transactions on Circuits and Systems for Video Technology.
"Video Indexing Using Descriptors of Spatial Distribution of Motion Activity," submitted to IEEE Transactions on Circuits and Systems for Video Technology.

Introduction
MPEG-7, formally named "Multimedia Content Description Interface", is a standard for describing features of multimedia content. With such descriptions, users can search, browse, and retrieve content more efficiently and effectively than with today's mainly text-based search engines. Here we describe tools and techniques for representing motion information in the context of MPEG-7.

Overview of MPEG-7 Motion Descriptors

Camera Motion
This descriptor characterizes 3-D camera motion parameters. It supports the following well-known basic camera operations: fixed, panning, tracking, tilting, booming, zooming, dollying, and rolling.

Motion Trajectory
Motion trajectory is a high-level feature, defined as the spatio-temporal localization of one representative point of an object. The descriptor is essentially a list of keypoints, along with a set of optional interpolating functions that describe the path of the object between keypoints.
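As a rough illustration only (not the normative MPEG-7 syntax), a trajectory can be stored as time-stamped keypoints and resampled with a first-order (linear) interpolating function; the struct and function names below are made up for this sketch.

  #include <stdio.h>

  /* One time-stamped keypoint of the object's representative point
     (hypothetical layout, not the normative descriptor syntax). */
  typedef struct {
      double t;      /* media time of the keypoint                      */
      double x, y;   /* position of the representative point at time t  */
  } KeyPoint;

  /* Linear interpolation between keypoints; the descriptor also allows
     higher-order interpolating functions, linear is the simplest case. */
  static void trajectory_at(const KeyPoint *kp, int n, double t,
                            double *x, double *y)
  {
      if (t <= kp[0].t)   { *x = kp[0].x;   *y = kp[0].y;   return; }
      if (t >= kp[n-1].t) { *x = kp[n-1].x; *y = kp[n-1].y; return; }
      for (int i = 1; i < n; i++) {
          if (t <= kp[i].t) {
              double a = (t - kp[i-1].t) / (kp[i].t - kp[i-1].t);
              *x = kp[i-1].x + a * (kp[i].x - kp[i-1].x);
              *y = kp[i-1].y + a * (kp[i].y - kp[i-1].y);
              return;
          }
      }
  }

  int main(void)
  {
      KeyPoint kp[] = { {0.0, 10, 20}, {1.0, 40, 25}, {2.0, 70, 60} };
      double x, y;
      trajectory_at(kp, 3, 1.5, &x, &y);
      printf("position at t = 1.5: (%.1f, %.1f)\n", x, y);  /* (55.0, 42.5) */
      return 0;
  }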

In surveillance, alarms can be triggered if an object has a trajectory identified as dangerous; in sports, specific actions can be recognized.

Parametric Motion
This descriptor addresses the motion of objects in video sequences, described by a 2-D parametric model of the motion field (vx, vy):

Translational model:
  vx(x, y) = a1
  vy(x, y) = a2
Rotation/scaling model:
  vx(x, y) = a1 + a3*x + a4*y
  vy(x, y) = a2 - a4*x + a3*y
Affine model:
  vx(x, y) = a1 + a3*x + a4*y
  vy(x, y) = a2 + a5*x + a6*y
Perspective model:
  vx(x, y) = (a1 + a3*x + a4*y) / (1 + a7*x + a8*y)
  vy(x, y) = (a2 + a5*x + a6*y) / (1 + a7*x + a8*y)
Quadratic model:
  vx(x, y) = a1 + a3*x + a4*y + a7*x*y + a9*x^2  + a10*y^2
  vy(x, y) = a2 + a5*x + a6*y + a8*x*y + a11*x^2 + a12*y^2
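As a minimal sketch of how one of these models is used, the fragment below evaluates the affine model at a pixel, with the parameter names following the slide (a1..a6); the struct and function names are made up for the sketch, and estimating the parameters (e.g., by least squares over a region's motion vectors) is outside the descriptor itself.

  #include <stdio.h>

  /* 2-D affine motion model from the slide:
     vx(x,y) = a1 + a3*x + a4*y,  vy(x,y) = a2 + a5*x + a6*y. */
  typedef struct { double a1, a2, a3, a4, a5, a6; } AffineModel;

  static void affine_flow(const AffineModel *m, double x, double y,
                          double *vx, double *vy)
  {
      *vx = m->a1 + m->a3 * x + m->a4 * y;
      *vy = m->a2 + m->a5 * x + m->a6 * y;
  }

  int main(void)
  {
      /* Pure rotation/scaling is the special case a5 = -a4, a6 = a3. */
      AffineModel m = { 1.0, -0.5, 0.01, 0.02, -0.02, 0.01 };
      double vx, vy;
      affine_flow(&m, 100, 50, &vx, &vy);
      printf("flow at (100, 50): (%.2f, %.2f)\n", vx, vy);
      return 0;
  }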

Motion Activity
Video content in general spans the gamut from high to low activity; therefore we need a descriptor that enables us to accurately express the activity of a given sequence/shot. The activity descriptor includes the following attributes:
Intensity of Activity
Direction of Activity (optional)
Spatial Distribution of Activity (optional)
Temporal Distribution of Activity (optional)

Intensity of Activity
Expressed by a 3-bit integer lying in the range 1~5. A high value indicates high activity, while a low value indicates low activity. For example, a still shot has a low intensity of activity, while a "fast break" basketball shot has a high intensity of activity.

Intensity is defined as the standard deviation of motion vector magnitudes, appropriately normalized by the frame resolution.

1 – very low activity
2 – low activity
3 – medium activity
4 – high activity
5 – very high activity

The standard deviation is quantized with thresholds scaled by the frame diagonal l = sqrt(w*w + h*h) and the frame rate F in frames/second:

  t1 = 0.257 * l / F
  t2 = 0.706 * l / F
  t3 = 1.280 * l / F
  t4 = 2.111 * l / F

  if      (std_dev < t1) intensity = 1;
  else if (std_dev < t2) intensity = 2;
  else if (std_dev < t3) intensity = 3;
  else if (std_dev < t4) intensity = 4;
  else                   intensity = 5;
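A minimal sketch of how the intensity value might be computed for one frame, assuming the motion vectors are already available (e.g., from compressed-domain macroblock data); the function names and array layout are made up for the sketch.

  #include <math.h>

  /* Quantize the standard deviation of motion-vector magnitudes into
     the 3-bit intensity value 1..5, using the thresholds above.
     w, h: frame size in pixels; F: frame rate in frames/second. */
  static int quantize_intensity(double std_dev, int w, int h, double F)
  {
      double l  = sqrt((double)w * w + (double)h * h);  /* frame diagonal */
      double t1 = 0.257 * l / F, t2 = 0.706 * l / F;
      double t3 = 1.280 * l / F, t4 = 2.111 * l / F;

      if (std_dev < t1) return 1;
      if (std_dev < t2) return 2;
      if (std_dev < t3) return 3;
      if (std_dev < t4) return 4;
      return 5;
  }

  /* Standard deviation of the motion-vector magnitudes of one frame. */
  static double mv_magnitude_stddev(const double *mvx, const double *mvy, int n)
  {
      double sum = 0.0, sum2 = 0.0;
      for (int i = 0; i < n; i++) {
          double m = sqrt(mvx[i] * mvx[i] + mvy[i] * mvy[i]);
          sum  += m;
          sum2 += m * m;
      }
      double mean = sum / n;
      return sqrt(sum2 / n - mean * mean);
  }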

Spatial Distribution of Activity
The descriptor indicates whether the activity is spread across many regions or restricted to one large region; it is an indication of the number and size of "active" regions in a frame. For example, a talking-head sequence has one large active region, while a shot of a busy street has many small active regions.

Thresholded motion vector magnitude matrix

The lengths of zero runs are recorded in raster-scan order over the thresholded motion vector magnitude matrix:
  short runs:  less than 1/3 of the frame width
  medium runs: between 1/3 and 2/3 of the frame width
  long runs:   more than 2/3 of the frame width
The descriptor element consists of three fields, Nsr, Nmr, and Nlr, which contain the numbers of short, medium, and long runs of zeros, respectively.
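A minimal sketch of the run-length counting, under the assumption that the thresholded matrix is given as an array of macroblock values (0 = below threshold) with widths measured in macroblocks, and that runs restart at each row boundary; names and layout are made up for the sketch.

  /* Count short, medium and long runs of zeros in raster-scan order
     over a thresholded motion-vector magnitude matrix.
     mat: w*h macroblock values (0 = below threshold, nonzero = active). */
  static void count_zero_runs(const int *mat, int w, int h,
                              int *Nsr, int *Nmr, int *Nlr)
  {
      *Nsr = *Nmr = *Nlr = 0;
      for (int row = 0; row < h; row++) {
          int run = 0;
          for (int col = 0; col <= w; col++) {
              /* the end of a row also closes any open run */
              if (col < w && mat[row * w + col] == 0) {
                  run++;
              } else if (run > 0) {
                  if      (3 * run < w)     (*Nsr)++;   /* run < w/3          */
                  else if (3 * run < 2 * w) (*Nmr)++;   /* w/3 <= run < 2w/3  */
                  else                      (*Nlr)++;   /* run >= 2w/3        */
                  run = 0;
              }
          }
      }
  }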

The dark area consists of macroblocks with non-zero values after thresholding; the remaining area consists of macroblocks that are zeroed out after thresholding.

Note that with smaller, widely spaced objects there are more long and medium run-lengths.

Direction of Activity
While a video shot may contain several objects with different activities, we can often identify a dominant direction. The dominant angle f_angle is quantized uniformly to 3 bits, i.e., to the eight directions 0, 45, 90, 135, 180, 225, 270, and 315 degrees:

  if      ((f_angle >= -22.5) && (f_angle <  22.5)) direction = 0;
  else if ((f_angle >=  22.5) && (f_angle <  67.5)) direction = 1;
  else if ((f_angle >=  67.5) && (f_angle < 112.5)) direction = 2;
  else if ((f_angle >= 112.5) && (f_angle < 157.5)) direction = 3;
  else if ((f_angle >= 157.5) && (f_angle < 202.5)) direction = 4;
  else if ((f_angle >= 202.5) && (f_angle < 247.5)) direction = 5;
  else if ((f_angle >= 247.5) && (f_angle < 292.5)) direction = 6;
  else if ((f_angle >= 292.5) && (f_angle < 337.5)) direction = 7;
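One simple way to obtain f_angle, used here purely as an illustration and not necessarily the normative MPEG-7 definition of dominant direction, is the angle of the average motion vector of the shot; the function name is made up for the sketch.

  #include <math.h>

  /* Angle (in degrees) of the average motion vector, shifted into the
     range [-22.5, 337.5) so it can feed the quantization table above. */
  static double dominant_angle_deg(const double *mvx, const double *mvy, int n)
  {
      const double PI = 3.14159265358979323846;
      double sx = 0.0, sy = 0.0;
      for (int i = 0; i < n; i++) {
          sx += mvx[i];
          sy += mvy[i];
      }
      double deg = atan2(sy, sx) * 180.0 / PI;   /* range (-180, 180] */
      if (deg < -22.5)
          deg += 360.0;
      return deg;
  }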

Temporal Distribution of Activity
Expresses the variation of activity over the duration of the video segment/shot. It is a histogram consisting of five bins, where bins N0, N1, N2, N3, and N4 correspond to intensity values 1, 2, 3, 4, and 5, respectively. Each value is the percentage of occurrences of the corresponding quantized intensity level.
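A minimal sketch of building this histogram from per-frame (or per-subshot) intensity values in 1..5; the function name and array layout are made up for the sketch.

  /* Five-bin temporal activity histogram of a segment, as percentages. */
  static void temporal_activity_histogram(const int *intensity, int n,
                                          double hist[5])
  {
      int count[5] = {0, 0, 0, 0, 0};
      for (int i = 0; i < n; i++)
          count[intensity[i] - 1]++;        /* N0..N4 <-> intensity 1..5 */
      for (int b = 0; b < 5; b++)
          hist[b] = 100.0 * count[b] / n;   /* percentage of occurrences */
  }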

Usage and Applications
Video browsing: the motion-activity intensity descriptor enables selection of the video segments of a program based on intensity of motion activity.
Content-based querying of video databases: we can use motion activity to separate the high- and low-motion parts of a video sequence and/or as a first-stage content filter.