
1 RATE-COVERAGE ANALYSIS AND OPTIMIZATION FOR JOINT AUDIO-VIDEO MULTIMEDIA RETRIEVAL
Authors: Guanghan Ning, Zhi Zhang, Xiaobo Ren, Haohong Wang, Zhihai (Henry) He

2 The Problem What is Joint Audio-Video Multimedia Retrieval?
ICASSP 2017: Rate-Coverage Analysis and Optimization for Joint Audio-Video Multimedia Retrieval 11/20/2018 2:09 PM

3 The Problem What problem is this paper trying to solve in Joint Audio-Video Multimedia Retrieval?
In content-based multimedia retrieval, video and audio data are often represented by feature vectors such as SIFT, SURF, and GLOH. Compared to the raw video data, the feature description is much smaller and much more efficient to store and retrieve. For example, the feature description of a typical video is about 10% of the original video size. Even so, with a massive number of videos to process, the total amount of feature data generated is still enormous.

4 Related Works
To address this issue, a number of methods have been developed to further compress the feature description and cut down the database overhead.
Manifold learning: explores correlation among the data and clusters similar features to reduce redundancy.
Descriptor compression: generates compact descriptors individually, reducing the overall demand for storage space.
The two types of approaches are complementary and can be used jointly to minimize the fingerprint size of multimedia databases.

5 Our Approach How do we approach the problem?
We focus on how to balance query accuracy against fingerprint size, and on how to allocate fingerprint bits to video frames and audio segments to maximize query accuracy. We introduce a novel concept called coverage, which is highly correlated with query accuracy. We are then able to construct a rate-coverage model and formulate the joint audio-video fingerprint bit rate allocation as an optimization problem that can be solved by dynamic programming.

6 Idea Behind Our Approach
Preserving only a subset of the overall joint video-audio descriptors
Given an arbitrary storage budget, this model aims to optimize the retrieval accuracy while preserving only a subset of the overall joint video-audio descriptors, thereby reducing the overall audio-video fingerprint size. We propose a dynamic programming method to solve this rate-coverage optimization problem.

7 System Overview

8 System Overview Feature Extraction Details

9 System Overview Feature Extraction Details

10 System Overview Feature Extraction Details

11 Problem Formulation Some Important Concepts and Notations
Representatives: the subset of fingerprints chosen to represent the original video frames and audio segments (to be stored in the server database)
C: the coverage
Bv: the constant number of bits for the feature of each video frame
Ba: the constant number of bits for the feature of each audio segment
Nv: the number of video representatives chosen for the database
Na: the number of audio representatives chosen for the database
R: the total data rate of the database, R = Bv × Nv + Ba × Na
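The rate formula above can be checked with a few lines of Python. This is only an illustrative sketch: the function names, the `budget` parameter, and the toy numbers in the usage note are assumptions made for the example and do not come from the slides.

```python
from typing import Iterator, Tuple


def total_rate(bv: int, nv: int, ba: int, na: int) -> int:
    """Total database rate in bits: R = Bv * Nv + Ba * Na."""
    return bv * nv + ba * na


def feasible_allocations(
    bv: int, ba: int, budget: int, max_nv: int, max_na: int
) -> Iterator[Tuple[int, int]]:
    """Yield every (Nv, Na) pair whose total rate fits within `budget` bits.

    `budget`, `max_nv` and `max_na` are hypothetical inputs: the storage
    budget and the number of available video frames and audio segments.
    """
    for nv in range(max_nv + 1):
        for na in range(max_na + 1):
            if total_rate(bv, nv, ba, na) <= budget:
                yield nv, na
```

For instance, with Bv = 512 and Ba = 128 bits, keeping 10 video and 20 audio representatives gives `total_rate(512, 10, 128, 20) == 7680` bits.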

12 Problem Formulation Formulate into an optimization problem
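Using the notation from the previous slide, the problem can plausibly be written as a budget-constrained maximization. This is a sketch, not necessarily the authors' exact formulation: the symbol R_max (the storage budget) is an assumption, and so is treating the coverage C as a function of the chosen representatives through Nv and Na.

```latex
\max_{N_v,\, N_a} \; C(N_v, N_a)
\quad \text{subject to} \quad
B_v N_v + B_a N_a \le R_{\max}
```

The constraint is the rate definition R = Bv × Nv + Ba × Na bounded by the budget; the objective is coverage, which stands in for query accuracy.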

13 Problem Formulation A Dynamic Programming Solution
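As a rough illustration of how dynamic programming can allocate a bit budget, the sketch below uses a standard 0/1 knapsack recursion. It is not the authors' algorithm. It assumes that each candidate representative has a known bit cost (Bv for a video frame, Ba for an audio segment) and a precomputed coverage gain, and that gains simply add up; real coverage gains are likely to interact. The names `allocate_bits`, `costs`, and `gains` are hypothetical.

```python
from typing import List, Sequence, Tuple


def allocate_bits(
    costs: Sequence[int], gains: Sequence[float], budget: int
) -> Tuple[float, List[int]]:
    """Pick candidates maximizing summed coverage gain within `budget` bits.

    Assumption: gains are additive and independent (a simplification).
    Returns (best total gain, sorted indices of the chosen candidates).
    """
    n = len(costs)
    # best[i][r] = max gain using only the first i candidates within r bits
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost, gain = costs[i - 1], gains[i - 1]
        for r in range(budget + 1):
            best[i][r] = best[i - 1][r]  # option 1: skip candidate i-1
            if cost <= r:
                take = best[i - 1][r - cost] + gain  # option 2: keep it
                if take > best[i][r]:
                    best[i][r] = take

    # Backtrack to recover which candidates were kept.
    chosen: List[int] = []
    r = budget
    for i in range(n, 0, -1):
        if best[i][r] != best[i - 1][r]:
            chosen.append(i - 1)
            r -= costs[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


if __name__ == "__main__":
    # Toy data: two video candidates (Bv = 4 bits), two audio (Ba = 2 bits).
    costs = [4, 4, 2, 2]
    gains = [0.5, 0.25, 0.125, 0.25]
    print(allocate_bits(costs, gains, budget=8))
```

In this toy case the recursion keeps one video representative and both audio representatives (Nv = 1, Na = 2), which uses the full 8-bit budget. The table has (n + 1) × (budget + 1) entries, so in practice the bit costs would need to be expressed in coarse units to keep it small.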

14 Coverage-Accuracy Relationship
Reason Behind Optimizing Coverage Instead of Accuracy

15 Results

16 THANK YOU! ICIP2015: Scene Text Detection Based on Component-Level Fusion and Region-Level Verification 11/20/2018 2:09 PM

