Video Shot Boundary Detection at RMIT University
Timo Volkmer, Saied Tahaghoghi, and Hugh E. Williams
School of Computer Science & IT, RMIT University



Overview
- Our general approach
- The moving query window
- Details of the approach
- How we measure frame similarity
- Improvements for 2004 cut detection
- Detection of gradual transitions
- Evaluation
- Experimental results
- Conclusions

The Moving Query Window
- A moving query window consists of two equal-sized half windows surrounding a current frame
- The moving query window is advanced through the video frame by frame
- Cut detection and gradual transition detection are performed in separate decision stages during a single pass
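The windowing described above can be sketched as a small generator. This is only an illustration: the function name `moving_query_windows` and the `half_window` parameter are hypothetical, and skipping positions without a full half window on each side is an assumption, since the slides do not say how the start and end of a video are handled.

```python
from typing import Iterator, List, Sequence, Tuple


def moving_query_windows(
    frames: Sequence, half_window: int
) -> Iterator[Tuple[int, List, List]]:
    """Yield (current_index, pre_frames, post_frames) for each window position.

    Assumption: only positions with a full half window on both sides are
    yielded; the treatment of video boundaries is not given in the slides.
    """
    for i in range(half_window, len(frames) - half_window):
        pre = list(frames[i - half_window:i])            # frames before the current frame
        post = list(frames[i + 1:i + 1 + half_window])   # frames after the current frame
        yield i, pre, post


if __name__ == "__main__":
    # Integers stand in for decoded frames.
    for index, pre, post in moving_query_windows(list(range(7)), half_window=2):
        print(index, pre, post)
```

Both decision stages can then consume the same `(pre, post)` pair at each position, which is what allows a single pass over the video.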

Frame feature representation
- We use one-dimensional, localised histograms with 4x4 regions in the HSV colour space (16 bins per colour component)
- A colour histogram represents each frame region
- Corresponding regions are compared
- Different weights can be applied to each region during comparison
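A minimal sketch of this representation follows, under stated assumptions. A frame is taken to be a list of rows of 8-bit `(r, g, b)` tuples, the three 16-bin component histograms of a region are stored concatenated, and the Manhattan distance is used to compare corresponding regions. The slides do not name the distance measure or the storage layout, so those choices, like the function names, are illustrative only.

```python
import colorsys

GRID = 4    # 4x4 regions per frame
BINS = 16   # bins per colour component (H, S, V)


def regional_hsv_histograms(frame):
    """Return 16 region histograms, each 3 * BINS long (H bins, S bins, V bins)."""
    height, width = len(frame), len(frame[0])
    hists = [[0] * (3 * BINS) for _ in range(GRID * GRID)]
    for y, row in enumerate(frame):
        for x, (r, g, b) in enumerate(row):
            # Row-major region index in the 4x4 grid.
            region = (y * GRID // height) * GRID + (x * GRID // width)
            hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            for component, value in enumerate(hsv):
                bin_index = min(int(value * BINS), BINS - 1)  # clamp value == 1.0
                hists[region][component * BINS + bin_index] += 1
    return hists


def frame_distance(hists_a, hists_b, weights=None):
    """Weighted sum of per-region Manhattan distances (assumed measure)."""
    if weights is None:
        weights = [1.0] * len(hists_a)
    total = 0.0
    for weight, ha, hb in zip(weights, hists_a, hists_b):
        total += weight * sum(abs(p - q) for p, q in zip(ha, hb))
    return total


if __name__ == "__main__":
    red = regional_hsv_histograms([[(255, 0, 0)] * 8 for _ in range(8)])
    black = regional_hsv_histograms([[(0, 0, 0)] * 8 for _ in range(8)])
    print(frame_distance(red, red), frame_distance(red, black))
```

Passing a `weights` list is where per-region weighting would enter; a weight of zero removes a region from the comparison entirely.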

Cut detection
- We disregard the four central regions of each frame to avoid the effect of rapid activity (that is, their weight is 0)
- Using the remaining regions, each frame in the moving window is ranked by decreasing similarity to the current frame
- Frame similarity is the sum of the inter-region similarities
- The number of pre-frames that are ranked in the top half of the rankings is monitored
- When a cut is passed, the number of top-ranked pre-frames (usually) rises to a maximum and falls to a minimum within a few frames
- We have determined an optimum window size and optimum thresholds that are effective for all our training sets
- Our cut detection is (now) parameter free
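The ranking step can be sketched as follows. The inputs are distances from the current frame to each pre-frame and post-frame (a smaller distance means a more similar frame). The slides do not give the actual thresholds or the exact rise-and-fall rule, so `upper`, `lower`, and `max_gap` are hypothetical parameters, and the row-major region indexing behind `CUT_WEIGHTS` is an assumption.

```python
# Zero weight for the four central regions of a 4x4 grid (row-major indices
# 5, 6, 9, 10 under the assumed layout); all other regions have weight 1.
CUT_WEIGHTS = [0.0 if i in (5, 6, 9, 10) else 1.0 for i in range(16)]


def pre_frames_in_top_half(pre_distances, post_distances):
    """Count pre-frames among the most similar half of the window's frames."""
    labelled = [(d, "pre") for d in pre_distances]
    labelled += [(d, "post") for d in post_distances]
    labelled.sort(key=lambda item: item[0])   # most similar (smallest distance) first
    top_half = labelled[: len(labelled) // 2]
    return sum(1 for _, side in top_half if side == "pre")


def detect_cuts(counts, upper, lower, max_gap=4):
    """Report positions where the count falls to `lower` or below within
    `max_gap` positions of having been at `upper` or above.

    Assumption: this rise-then-fall rule and its parameters are illustrative;
    the slides state only that the count rises to a maximum and falls to a
    minimum within a few frames.
    """
    cuts = []
    last_high = None
    for i, count in enumerate(counts):
        if count >= upper:
            last_high = i
        elif count <= lower and last_high is not None and i - last_high <= max_gap:
            cuts.append(i)
            last_high = None
    return cuts


if __name__ == "__main__":
    print(pre_frames_in_top_half([1, 2, 1, 2], [50, 60, 55, 52]))
    print(detect_cuts([2, 2, 3, 4, 0, 1, 2, 2], upper=4, lower=0))
```

Just before a cut, every pre-frame belongs to the same shot as the current frame, so pre-frames fill the top half. Just after it, the post-frames do, which produces the sharp drop the detector looks for.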

Gradual transition detection
- Pre-frames and post-frames are combined into two distinct sets of frames
- The average distance of each set to the current frame is computed
- We use all frame regions (with identical weights)
- The ratio between the pre-frame set distance and the post-frame set distance, the PrePostRatio, is monitored
- The end of most gradual transitions is indicated by a peak in the PrePostRatio curve
- We maintain a moving average PrePostRatio for calculating a dynamic threshold to detect transitions
- As a final decision step, we require a minimum difference between the last frame of the previous shot and the first frame of the new shot
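A rough sketch of the PrePostRatio and a moving-average threshold is given below. Several details are assumptions rather than the authors' method: the threshold is taken to be a level factor times the mean of the most recent ratios, a peak is taken to be a local maximum above that threshold, and a small `eps` guards against division by zero. The final decision step is not included.

```python
from collections import deque


def pre_post_ratio(pre_distances, post_distances, eps=1e-9):
    """Average pre-frame distance divided by average post-frame distance."""
    pre_avg = sum(pre_distances) / len(pre_distances)
    post_avg = sum(post_distances) / len(post_distances)
    return pre_avg / (post_avg + eps)   # eps is an assumed guard, not from the slides


def find_ratio_peaks(ratios, history_size, level_factor):
    """Return indices where the ratio is a local maximum above the dynamic threshold.

    Assumption: threshold = level_factor * mean(last `history_size` ratios).
    """
    history = deque(maxlen=history_size)
    peaks = []
    for i, ratio in enumerate(ratios):
        if len(history) == history_size:
            threshold = level_factor * (sum(history) / history_size)
            is_local_max = (
                0 < i < len(ratios) - 1
                and ratios[i - 1] < ratio >= ratios[i + 1]
            )
            if ratio > threshold and is_local_max:
                peaks.append(i)
        history.append(ratio)
    return peaks


if __name__ == "__main__":
    curve = [1.0, 1.0, 1.0, 1.0, 0.5, 2.0, 6.0, 2.0, 1.0, 1.0]
    print(find_ratio_peaks(curve, history_size=4, level_factor=3.0))
```

In this sketch, `history_size` and `level_factor` play the roles of the history buffer size and threshold level factor that the slides list as training parameters.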

PrePostRatio in detail
- A schematised dissolve between a shot A and a shot B
- The PrePostRatio is usually minimal at the beginning of a gradual transition and rises to a maximum at the end of the transition

PrePostRatio curve example
- The curve shows two short gradual transitions and two cuts within a range of 1000 frames

Training and Evaluation
- We have trained on the TRECVID 2003 shot boundary test set
- The main parameters for gradual transition detection are:
- The query window size
- The size of the history buffer for dynamic thresholding
- A threshold level factor
- Results are discussed on the next slides. (We achieve similar or better results on the 2002 and 2001 test sets in blind runs.)

Results at TRECVID 2004
[Table: recall and precision for all transitions, cuts, and gradual transitions, one row per SysID for ten rmit runs; the run identifiers and numeric values are missing from this transcript]

Overall results

Frame recall and precision for gradual transitions

Discussion
- Cut detection is highly effective: this year, recall is 94% and precision is 92%
- Improvements from 2003 are due to ignoring the centre region
- Gradual transition detection has improved significantly since 2003: recall is now between 68% and 85%, precision between 67% and 84%
- A high detection threshold favours precision; a low one favours recall
- A short detection threshold history length was found to be preferable
- The final decision step reduces false positives
- For television news, we are able to use a fixed moving query window size of 24 frames
- We experimented with a simple ASR technique in 10 additional runs, which removed detected transitions that coincided with spoken words. Ad hoc, very unsuccessful…

Conclusions
- Disregarding the focus area of frames for cut detection has improved our results by 3% in recall and 9% in precision
- Our parameter-free ranking scheme is highly effective in cut detection on a wide variety of footage
- Our gradual transition detection method is relatively simple and needs only a few parameters
- The additional final decision step reduces false positives and improves results significantly
- The use of localised histograms and more dynamic thresholding also improved results in gradual transition detection
- Our approach is computationally inexpensive, simple to implement, and effective
- 15,500 seconds to process the video (around 4 hours, 18 minutes)

Questions?