Local Affine Feature Tracking in Films/Sitcoms Chunhui Gu CS 294-6 Final Presentation Dec. 13, 2006.

Presentation transcript:

Local Affine Feature Tracking in Films/Sitcoms
Chunhui Gu
CS 294-6 Final Presentation, Dec. 13, 2006

Objective
Automatically detect and track local affine features in film/sitcom frame sequences.
– Current dataset: Sex and the City
– Why sitcom?
  – Simple daily environment
  – Few or no special effects
  – Repeated scenes

Outline
Preprocessing
Tracking Algorithm
– Pairwise local matching
– Robust features
Feature Matching across Shots
Results
– Feature matching vs. baseline color histogram
– Time complexity
– When does tracking fail?

Preprocessing
Frame Extraction → Shot Detection → MSER Interest Point Detection → SIFT Feature Extraction
[Figure: frames of the (i-1)'th shot and the i'th shot]
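The slides do not say how shot detection works beyond the B = 16 color-histogram bins listed in the complexity table. The sketch below is one common approach under that assumption: compare histograms of consecutive frames and declare a boundary when they differ strongly. The function names, the single intensity channel (instead of color), the L1 distance, and the threshold of 0.5 are all illustrative choices, not the author's implementation.

```python
from typing import List, Sequence


def intensity_histogram(frame: Sequence[int], bins: int = 16) -> List[float]:
    """Normalized histogram of 8-bit intensities (0..255) using `bins` bins."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // 256] += 1
    total = float(len(frame)) or 1.0  # avoid dividing by zero on an empty frame
    return [c / total for c in counts]


def detect_shot_boundaries(
    frames: Sequence[Sequence[int]], bins: int = 16, threshold: float = 0.5
) -> List[int]:
    """Return every index i at which frame i starts a new shot.

    Assumption: a boundary is declared when the L1 distance between the
    histograms of frame i-1 and frame i exceeds `threshold` (hypothetical value).
    """
    boundaries = []
    previous = None
    for index, frame in enumerate(frames):
        current = intensity_histogram(frame, bins)
        if previous is not None:
            distance = sum(abs(a - b) for a, b in zip(previous, current))
            if distance > threshold:
                boundaries.append(index)
        previous = current
    return boundaries


if __name__ == "__main__":
    # Two dark frames followed by two bright frames: one cut, at index 2.
    toy_frames = [[10] * 100, [12] * 100, [200] * 100, [205] * 100]
    print(detect_shot_boundaries(toy_frames))
```

Each frame is flattened to a list of pixel intensities here only to keep the example dependency-free; a real pipeline would build per-channel color histograms from decoded video frames.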

Tracking Algorithm
Basic: Pairwise Matching
[Figure: Frame i and Frame j = i+1]
Thresholding on both minimum distance and ratio
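The pairwise matching step with thresholds on both minimum distance and ratio can be sketched as below. This is a minimal illustration, assuming Euclidean distance between descriptors and a nearest-to-second-nearest ratio test; the threshold values (0.4 and 0.8) and the function names are hypothetical, and the spatial local window mentioned later in the slides is omitted for brevity.

```python
import math
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def euclidean(a: Descriptor, b: Descriptor) -> float:
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_features(
    desc_i: Sequence[Descriptor],
    desc_j: Sequence[Descriptor],
    max_distance: float = 0.4,  # hypothetical threshold on the minimum distance
    max_ratio: float = 0.8,     # hypothetical threshold on nearest / second nearest
) -> List[Tuple[int, int]]:
    """Match each descriptor of frame i to its nearest neighbour in frame j.

    A match is kept only if the nearest distance is small enough AND the
    nearest neighbour is clearly better than the second nearest.
    """
    matches = []
    for idx_i, d in enumerate(desc_i):
        ranked = sorted(
            (euclidean(d, other), idx_j) for idx_j, other in enumerate(desc_j)
        )
        if not ranked:
            continue
        best_dist, best_j = ranked[0]
        if best_dist > max_distance:
            continue  # fails the minimum-distance threshold
        if len(ranked) > 1:
            second_dist = ranked[1][0]
            if second_dist == 0.0 or best_dist / second_dist > max_ratio:
                continue  # ambiguous: fails the ratio threshold
        matches.append((idx_i, best_j))
    return matches


if __name__ == "__main__":
    frame_i = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    frame_j = [(0.0, 0.1), (1.0, 0.9), (3.0, 3.0)]
    print(match_features(frame_i, frame_j))
```

The ratio test rejects a feature whose two best candidates are almost equally close, which is the usual way to suppress matches on repeated texture.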

Tracking Algorithm
Problem of Pairwise Matching
– Sensitive to occlusion and feature misdetection
Solutions:
– Use multiple overlapping windows
– Backward Matching
  – Match features in current frame to features in all previous frames within the shot
  – Pruning process (reduce computation time)
Select a proportion of features that have longer tracking length as robust features
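A minimal sketch of backward matching and robust-feature selection follows. It assumes that each track remembers the descriptor from the last frame in which it was seen, so a feature that disappears for a few frames (occlusion or misdetection) can rejoin its track later. The data layout, the distance threshold, and the 50% proportion are hypothetical, and the pruning step and overlapping windows from the slide are not modeled.

```python
import math
from typing import Dict, List, Sequence, Tuple

Descriptor = Tuple[float, ...]


def _dist(a: Descriptor, b: Descriptor) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def track_shot(
    frames: Sequence[Sequence[Descriptor]], max_distance: float = 0.4
) -> List[Dict]:
    """Build tracks over one shot.

    Each track is a dict with the latest 'descriptor' and the list of
    'frames' (indices) in which the feature was observed.
    """
    tracks: List[Dict] = []
    for frame_index, descriptors in enumerate(frames):
        claimed = set()  # tracks already used (or created) in this frame
        for d in descriptors:
            # Backward matching: candidates are tracks from ANY earlier frame
            # of the shot, not only the immediately preceding frame.
            candidates = [
                (_dist(d, t["descriptor"]), k)
                for k, t in enumerate(tracks)
                if k not in claimed
            ]
            best = min(candidates, default=None)
            if best is not None and best[0] <= max_distance:
                k = best[1]
                tracks[k]["descriptor"] = d
                tracks[k]["frames"].append(frame_index)
                claimed.add(k)
            else:
                tracks.append({"descriptor": d, "frames": [frame_index]})
                claimed.add(len(tracks) - 1)
    return tracks


def select_robust(tracks: List[Dict], proportion: float = 0.5) -> List[Dict]:
    """Keep the given proportion of tracks with the longest tracking length."""
    if not tracks:
        return []
    keep = max(1, int(len(tracks) * proportion))
    ranked = sorted(tracks, key=lambda t: len(t["frames"]), reverse=True)
    return ranked[:keep]


if __name__ == "__main__":
    # The second feature is missing in frame 1 and reappears in frame 2.
    shot = [[(0.0, 0.0), (5.0, 5.0)], [(0.1, 0.0)], [(0.1, 0.1), (5.0, 5.1)]]
    for track in track_shot(shot):
        print(track["frames"])
```

In the toy shot the second feature keeps a single track across the gap, which is the behavior pairwise matching between adjacent frames alone would not give.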

Shot grouping/Scene Retrieval
[Figure: consecutive shots grouped into a scene; surviving labels: Shot 56, Scene 5]

Inter-Shot Matching
[Figure: Shot I and Shot J]
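The slides give no formula for inter-shot matching, only that it compares robust features of shot pairs at O(S^2*T^2) cost. The sketch below assumes one plausible score: the fraction of robust features in shot I whose nearest descriptor in shot J lies within a distance threshold. The score definition, the threshold, and the function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Descriptor = Tuple[float, ...]


def _dist(a: Descriptor, b: Descriptor) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def shot_similarity(
    robust_i: Sequence[Descriptor],
    robust_j: Sequence[Descriptor],
    max_distance: float = 0.4,  # hypothetical threshold
) -> float:
    """Fraction of robust features of shot I that find a close match in shot J."""
    if not robust_i or not robust_j:
        return 0.0
    matched = 0
    for a in robust_i:
        nearest = min(_dist(a, b) for b in robust_j)
        if nearest <= max_distance:
            matched += 1
    return matched / len(robust_i)


def similarity_table(
    shots: Sequence[Sequence[Descriptor]], max_distance: float = 0.4
) -> List[List[float]]:
    """All-pairs table: S*S shot pairs, each comparing up to T*T descriptors."""
    return [
        [shot_similarity(si, sj, max_distance) for sj in shots] for si in shots
    ]


if __name__ == "__main__":
    shots = [
        [(0.0, 0.0), (1.0, 1.0)],  # shot 0
        [(0.0, 0.1), (1.0, 1.1)],  # shot 1: same scene, slightly shifted
        [(9.0, 9.0)],              # shot 2: unrelated scene
    ]
    for row in similarity_table(shots):
        print(row)
```

Shots whose mutual scores are high could then be grouped into one scene; how the grouping threshold was chosen in the original work is not stated in the slides.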

“Confusion Table”

ROC

When Does Tracking Fail?
Tracked feature falls outside the local window
– Rare under continuous tracking
– Happens when occlusion occurs
Same feature splits into two or more groups
– Long occlusion
– Multiple matches in a single frame
[Figure: Frame i and Frame j = i+1]

Computation Complexity
Everything except the MSER and SIFT algorithms is implemented in Matlab (slow…)

Step                    Complexity     Time
Frame Extraction        O(N)           ~0.3 s/frame
Shot Detection          O(N*f(B))      ~0.07 s/frame (B=16)
MSER Detection          O(N)           ~0.3 s/frame
SIFT Detection          O(N)           ~0.9 s/frame
Feature Tracking        O(N*F*W*L)     ~0.5 s/frame
Matching across shots   O(S^2*T^2)     ~1 s/shot pair

N: # of frames (30,000)
B: # of bins for color histogram (16)
F: average # of features per frame (400)
W: local window size (15)
L: tracking length (20)
T: average # of robust trackers per shot (300)
S: # of shots (35)
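Plugging the listed parameter values into the complexity expressions gives a feel for the scale. The arithmetic below assumes the per-frame timings simply add up and that every frame passes through every per-frame stage; the slides do not state a total running time, so the totals are derived estimates rather than reported results.

```python
# Parameter values as listed on the slide.
N, B, F, W, L, T, S = 30_000, 16, 400, 15, 20, 300, 35

# Approximate per-frame timings from the table, in seconds.
per_frame_seconds = {
    "frame extraction": 0.3,
    "shot detection": 0.07,
    "MSER detection": 0.3,
    "SIFT detection": 0.9,
    "feature tracking": 0.5,
}

# Assumption: the stages run sequentially, so their timings add up.
per_frame_total = sum(per_frame_seconds.values())   # about 2.07 s per frame
frame_stage_hours = N * per_frame_total / 3600      # about 17.25 hours

# Operation counts implied by the big-O expressions (constants ignored).
tracking_operations = N * F * W * L                 # 3.6e9
inter_shot_comparisons = S**2 * T**2                # about 1.1e8

# At ~1 s per shot pair, matching all unordered pairs of 35 shots.
unordered_shot_pairs = S * (S - 1) // 2             # 595 pairs, about 10 minutes

if __name__ == "__main__":
    print(f"per-frame stages: {frame_stage_hours:.2f} h")
    print(f"tracking operations: {tracking_operations:,}")
    print(f"descriptor comparisons across shots: {inter_shot_comparisons:,}")
    print(f"unordered shot pairs: {unordered_shot_pairs}")
```

Under these assumptions the per-frame stages dominate: roughly 17 hours for 30,000 frames, against about 10 minutes for matching across shots.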

Conclusion
We implemented local affine feature tracking on the sitcom “Sex and the City”. The tracking method is robust to occlusion and feature misdetection. We have no quantitative precision/recall curve because ground truth is hard to obtain, but the demonstration suggests that precision is almost perfect, with good recall. We also show one successful application: using robust features to associate similar shots for scene retrieval.

Future Work
Implement the algorithm in real time (C/C++)
Search for unique shots in films/sitcoms
Separate indoor scenes from outdoor scenes
Determine the context of the scene

Acknowledgement