Tomohiko TAKAHASHI, Masaru SUGANO, Keiichiro HOASHI, and Sei NAITO International Conference on Multimedia and Expo 2011 Arbitrary Product Detection from.

Presentation transcript:

Tomohiko TAKAHASHI, Masaru SUGANO, Keiichiro HOASHI, and Sei NAITO International Conference on Multimedia and Expo 2011 Arbitrary Product Detection from Advertisement Video by Using Object Independent Features

Outline: Introduction; Structure; Implement; Preliminary Experiment; Experiment Result; Conclusions

Introduction: Automatic annotation of TV content is essential for letting viewers search efficiently through large-scale TV content data. Hundreds of objects appear in a typical TV program, and annotating all of them is unrealistic and inefficient. Extracting the important objects from TV content is therefore indispensable for practical applications. This paper focuses on extracting the advertised product, as the important object, from TV advertisement video.

Introduction: 1) The producer intentionally emphasizes the product, so there is a large difference between the product and its neighboring area. 2) The target product is filmed in sharp focus, so the product area shows more detailed visual features than the neighboring area.

Structure: Every 10th frame of the input video is extracted as a representative frame. The method is developed mainly for MPEG-2 compressed video.
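The frame sampling described above can be sketched as follows. This is a minimal illustration only: the function name `representative_frame_indices` is hypothetical, and starting the sampling at frame 0 is an assumption, since the slide states only the interval of 10 frames.

```python
def representative_frame_indices(total_frames, step=10):
    """Return the indices of every `step`-th frame.

    Assumption: sampling starts at frame 0; the slide only gives the interval.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(0, total_frames, step))


if __name__ == "__main__":
    # A 45-frame clip yields five representative frames.
    print(representative_frame_indices(45))
```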

Face and Telop Detection: Uses the method in [2], with a face extractor based on Haar-like features. (A telop is text superimposed on the screen.) [2] M. Naito, et al. "High-level feature extraction experiments for TRECVID 2001", Proc. of TRECVID 2007

Feature Point Density

DCT AC Energy Acquisition
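The transcript does not preserve how the DCT AC energy is computed, so the following is only a sketch under stated assumptions: the energy of a block is taken to be the sum of squared DCT coefficients excluding the DC term, and the DCT is recomputed from pixel values with an orthonormal DCT-II. For MPEG-2 input the coefficients could instead be read from the compressed stream. The names `dct2` and `ac_energy` are hypothetical.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # basis[k][i] = alpha(k) * cos((2i + 1) * k * pi / (2n))
    basis = [[alpha(k) * math.cos((2 * i + 1) * k * math.pi / (2 * n))
              for i in range(n)] for k in range(n)]
    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            coeffs[u][v] = sum(basis[u][y] * basis[v][x] * block[y][x]
                               for y in range(n) for x in range(n))
    return coeffs


def ac_energy(block):
    """Sum of squared DCT coefficients, excluding the DC term at (0, 0).

    Assumption: squared-magnitude energy; the paper's definition may differ.
    """
    coeffs = dct2(block)
    total = sum(c * c for row in coeffs for c in row)
    return total - coeffs[0][0] ** 2


if __name__ == "__main__":
    flat = [[128] * 8 for _ in range(8)]
    checker = [[((x + y) % 2) * 100 for x in range(8)] for y in range(8)]
    # A flat block has no detail; a checkerboard has a lot.
    print(ac_energy(flat), ac_energy(checker))
```

A flat block has an AC energy of (numerically) zero, while a finely textured block scores high, which matches the slide's observation that a sharply focused product shows more detail than its surroundings.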

Edge Detection: Edges are detected in the representative frames with Canny's filter. Approximate rectangles are then extracted from each frame by the Douglas-Peucker method [7]. [7] D. Douglas and T. Peucker, "Algorithms for the reduction of the number of points required to represent a digitized line or its caricature", The Canadian Cartographer 10(2), 1973

Color Histogram Acquisition: The color histogram is used to evaluate the importance of each product candidate area.
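The Douglas-Peucker simplification mentioned under Edge Detection can be sketched as below. It reduces a noisy contour to a few vertices, which is what makes a rectangle-like outline recognizable. This is a generic textbook version, not the authors' code: the function names and the tolerance `epsilon` are hypothetical, and the distance is measured to the chord segment rather than the infinite line (a common variant).

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Simplify an open polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= epsilon:
        return [first, last]
    # Split at the farthest point and simplify both halves recursively.
    left = douglas_peucker(points[:index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right


if __name__ == "__main__":
    contour = [(0, 0), (2, 0.1), (4, 0), (4.1, 2), (4, 4)]
    print(douglas_peucker(contour, 0.5))
```

In the example, the two slightly displaced points are dropped and only the corner of the L-shaped contour survives.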

Area Segmentation: The frame is divided into a grid, and cells with a high feature point density are marked (shown as shaded cells). Candidate areas are detected by grouping neighboring high-density cells. Candidate areas based on DCT AC energy are detected in the same manner.
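One way to read the grouping step is as connected-component labeling on the grid. The sketch below makes several assumptions not stated on the slide: cells count as "high density" when they reach a fixed `threshold`, neighbors are 4-connected, and each candidate area is reported as a bounding box. The function name `candidate_areas` is hypothetical.

```python
from collections import deque


def candidate_areas(density, threshold):
    """Group 4-connected grid cells with density >= threshold.

    Returns one bounding box (row_min, col_min, row_max, col_max) per group,
    in the order the groups are first met in a row-by-row scan.
    """
    if not density:
        return []
    rows, cols = len(density), len(density[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or density[r][c] < threshold:
                continue
            # Breadth-first search over the neighboring high-density cells.
            seen[r][c] = True
            queue = deque([(r, c)])
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and density[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes


if __name__ == "__main__":
    grid = [
        [0, 5, 6, 0, 0],
        [0, 7, 0, 0, 0],
        [0, 0, 0, 4, 9],
        [0, 0, 0, 0, 8],
    ]
    print(candidate_areas(grid, threshold=4))
```

The same function can be applied to a grid of DCT AC energy values, which mirrors the slide's remark that both features are segmented in the same manner.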

Important Object Evaluation (1/2): Two differences are evaluated: the feature point density difference diff_fp_all and the DCT AC energy difference diff_AC_all.

Important Object Evaluation (2/2): The color difference diff_color is calculated as the difference between the candidate area and its marginal areas.
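The exact formula for the color difference is not preserved in this transcript. As a hedged illustration only, the sketch below compares a candidate area with its margin using normalized joint RGB histograms and an L1 distance. The bin count, the distance measure, and the names `color_histogram` and `histogram_difference` are all assumptions, not the authors' definition.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of (r, g, b) pixels with values 0-255."""
    if 256 % bins_per_channel != 0:
        raise ValueError("bins_per_channel must divide 256")
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[idx] += 1.0
        count += 1
    if count == 0:
        return hist
    return [h / count for h in hist]


def histogram_difference(area_pixels, margin_pixels, bins_per_channel=4):
    """L1 distance between the two normalized histograms, in the range [0, 2]."""
    a = color_histogram(area_pixels, bins_per_channel)
    m = color_histogram(margin_pixels, bins_per_channel)
    return sum(abs(x - y) for x, y in zip(a, m))


if __name__ == "__main__":
    red, blue = (255, 0, 0), (0, 0, 255)
    # A red product on a blue background gives the maximum difference.
    print(histogram_difference([red, red], [blue, blue]))
```

A value near 0 means the candidate area blends into its surroundings; a value near 2 means the colors barely overlap, which fits the slide's premise that the product is made to stand out from its neighboring area.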

Score Weighting: Because an important object is shot at a proper location and size within the frame, we adopt the one-third rule (rule of thirds) [8]; the most important frames are full shots [8]. The object importance value is accumulated over the duration of the object's appearance. [8] D. Arijon, "Grammar of the film language", Focal Press Ltd., 1976
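The slide names the one-third rule and the accumulation over time but gives no formula, so the following is purely an illustrative sketch. The weight function `thirds_weight` (1.0 on a rule-of-thirds intersection, falling linearly to 0.0 at a frame corner) is an invented stand-in, and `accumulate_importance` is a hypothetical helper; object size and shot type, which the slide also mentions, are ignored here.

```python
import math


def thirds_weight(cx, cy, width, height):
    """Hypothetical position weight in [0, 1] based on the rule of thirds."""
    points = [(width * i / 3.0, height * j / 3.0) for i in (1, 2) for j in (1, 2)]
    nearest = min(math.hypot(cx - px, cy - py) for px, py in points)
    # A frame corner is the point farthest from its nearest intersection.
    farthest = math.hypot(width / 3.0, height / 3.0)
    return 1.0 - nearest / farthest


def accumulate_importance(observations, width, height):
    """Sum weighted per-frame scores for each object.

    observations: iterable of (object_id, center_x, center_y, frame_score),
    one entry per representative frame in which the object appears.
    """
    totals = {}
    for object_id, cx, cy, score in observations:
        weight = thirds_weight(cx, cy, width, height)
        totals[object_id] = totals.get(object_id, 0.0) + weight * score
    return totals


if __name__ == "__main__":
    frames = [("bottle", 240, 180, 2.0), ("bottle", 480, 360, 3.0), ("logo", 0, 0, 5.0)]
    print(accumulate_importance(frames, 720, 540))
```

Because the scores are summed per frame, an object that stays on screen longer accumulates a larger total, which is the behavior the slide describes.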

Preliminary Experiment: Four test sets were used: 30 frames showing 30 different types of products from advertisement movies (Test Set A), 30 frames showing an actor or actress (Test Set B), 30 frames containing telops (Test Set C), and 30 frames containing non-important objects (Test Set D). The ratio diff_fp_all : diff_DCT_all : diff_color is approximately 4 : 1 : 8; thus, to normalize each parameter, we defined t1 = 2.0, t2 = 8.0, and t3 = 1.0.
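The choice of t1, t2, and t3 can be checked with a little arithmetic: multiplying the observed 4 : 1 : 8 ratio by the weights 2.0, 8.0, and 1.0 brings all three terms to the same scale of 8. The sketch below verifies this. The weighted-sum form in `importance_score` is an assumption, since the transcript does not show how the three differences are combined.

```python
# Approximate ratio of the three differences reported on the slide.
RATIO = {"fp": 4.0, "dct": 1.0, "color": 8.0}
# Normalization weights t1, t2, t3 from the slide.
WEIGHTS = {"fp": 2.0, "dct": 8.0, "color": 1.0}


def scaled_ratio():
    """Each difference multiplied by its weight; all entries come out equal."""
    return {name: RATIO[name] * WEIGHTS[name] for name in RATIO}


def importance_score(diff_fp, diff_dct, diff_color, t1=2.0, t2=8.0, t3=1.0):
    """Assumed combination: a weighted sum of the three differences."""
    return t1 * diff_fp + t2 * diff_dct + t3 * diff_color


if __name__ == "__main__":
    print(scaled_ratio())
    print(importance_score(4.0, 1.0, 8.0))
```

With these weights no single feature dominates the score merely because its raw values are numerically larger.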

Experimental Result (1/3): The data set consists of 132 advertisement videos: 57 are from Japanese terrestrial channels and the others are free content. Each is seconds long and its resolution is 720x540. Ground truth: The products vary widely, including handbags, sweets, coffee, jewelry, medicine, bicycles, cosmetics, TV sets, and so on. If an advertisement video includes two products, we select the top two objects ranked by the proposed method.

Experimental Result (2/3): If the difference between the size of the detected area and that of the ground truth is smaller than 50% of the ground truth, the detection is counted as correct. This work achieved almost the same accuracy as [6]. [6] T. Takahashi, M. Sugano, S. Sakazawa, "Automatic Thumbnail Extraction for DVR Based on Product Technique Estimation", IEEE Trans. on CE, Vol. 56, No. 2, May 2010
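The correctness criterion above translates directly into a small check. The function name `is_correct_detection` is hypothetical, sizes are assumed to be areas in pixels, and the comparison is strict ("smaller than"), as the slide words it. Note that the criterion as stated compares sizes only; any position or overlap requirement is not described in the transcript.

```python
def is_correct_detection(detected_size, truth_size):
    """True if the size difference is below 50% of the ground-truth size."""
    if truth_size <= 0:
        raise ValueError("truth_size must be positive")
    return abs(detected_size - truth_size) < 0.5 * truth_size


if __name__ == "__main__":
    # A detection 20% larger than the ground truth is accepted.
    print(is_correct_detection(1200, 1000))
    # A detection 60% smaller than the ground truth is rejected.
    print(is_correct_detection(400, 1000))
```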

Experimental Result (3/3): More than 40% of the false detections were false positives. More accurate human body detection should be applied. When a neighboring object contains many feature points, multiple objects may form one large candidate area. One possible solution to this issue is to introduce a graph cut algorithm.

Conclusions: Evaluated a novel method to extract important objects from TV content. Applied the proposed method to product detection from advertisement video. In the experiment, our method achieved an F-measure of 79.4.
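For reference, an F-measure can be computed from detection counts as sketched below. This assumes the balanced F1 form (the harmonic mean of precision and recall); the slide does not say which variant was used, and the counts in the example are made up for illustration and are not the paper's results.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Balanced F-measure (F1) from detection counts; 0.0 when undefined."""
    detected = true_positives + false_positives
    relevant = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / relevant if relevant else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 8 correct detections, 2 false alarms, 2 misses.
    print(round(f_measure(8, 2, 2), 3))
```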