Multi-view Synchronization of Human Actions and Dynamic Scenes Emilie Dexter, Patrick Pérez, Ivan Laptev INRIA Rennes - Bretagne Atlantique

Slides:



Advertisements
Similar presentations
3D Model Matching with Viewpoint-Invariant Patches(VIP) Reporter :鄒嘉恆 Date : 10/06/2009.
Advertisements

Learning Trajectory Patterns by Clustering: Comparative Evaluation Group D.
MASKS © 2004 Invitation to 3D vision Lecture 7 Step-by-Step Model Buidling.
Patch to the Future: Unsupervised Visual Prediction
Image and video descriptors
Activity Recognition Aneeq Zia. Agenda What is activity recognition Typical methods used for action recognition “Evaluation of local spatio-temporal features.
Database-Based Hand Pose Estimation CSE 6367 – Computer Vision Vassilis Athitsos University of Texas at Arlington.
EVENTS: INRIA Work Review Nov 18 th, Madrid.
Adviser : Ming-Yuan Shieh Student ID : M Student : Chung-Chieh Lien VIDEO OBJECT SEGMENTATION AND ITS SALIENT MOTION DETECTION USING ADAPTIVE BACKGROUND.
Watching Unlabeled Video Helps Learn New Human Actions from Very Few Labeled Snapshots Chao-Yeh Chen and Kristen Grauman University of Texas at Austin.
Optimization & Learning for Registration of Moving Dynamic Textures Junzhou Huang 1, Xiaolei Huang 2, Dimitris Metaxas 1 Rutgers University 1, Lehigh University.
74 th EAGE Conference & Exhibition incorporating SPE EUROPEC 2012 Automated seismic-to-well ties? Roberto H. Herrera and Mirko van der Baan University.
Matching with Invariant Features
Local Descriptors for Spatio-Temporal Recognition
Shape and Dynamics in Human Movement Analysis Ashok Veeraraghavan.
X From Video - Seminar By Randa Khayr Eli Shechtman, Yaron Caspi & Michal Irani.
Active Calibration of Cameras: Theory and Implementation Anup Basu Sung Huh CPSC 643 Individual Presentation II March 4 th,
HMM-BASED PATTERN DETECTION. Outline  Markov Process  Hidden Markov Models Elements Basic Problems Evaluation Optimization Training Implementation 2-D.
Motion based Correspondence for Distributed 3D tracking of multiple dim objects Ashok Veeraraghavan.
Automatic Panoramic Image Stitching using Local Features Matthew Brown and David Lowe, University of British Columbia.
Feature matching and tracking Class 5 Read Section 4.1 of course notes Read Shi and Tomasi’s paper on.
Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.
Automatic Matching of Multi-View Images
Learning the space of time warping functions for Activity Recognition Function-Space of an Activity Ashok Veeraraghavan Rama Chellappa Amit K. Roy-Chowdhury.
Miguel Lourenço, João P. Barreto, Abed Malti Institute for Systems and Robotics, Faculty of Science and Technology University of Coimbra, Portugal Feature.
Jacinto C. Nascimento, Member, IEEE, and Jorge S. Marques
Hand Signals Recognition from Video Using 3D Motion Capture Archive Tai-Peng Tian Stan Sclaroff Computer Science Department B OSTON U NIVERSITY I. Introduction.
Image Subtraction for Real Time Moving Object Extraction Shahbe Mat Desa, Qussay A. Salih, CGIV’04.
Lecture 12: Structure from motion CS6670: Computer Vision Noah Snavely.
Overview Introduction to local features
Recognizing Action at a Distance A.A. Efros, A.C. Berg, G. Mori, J. Malik UC Berkeley.
Jitter Camera: High Resolution Video from a Low Resolution Detector Moshe Ben-Ezra, Assaf Zomet and Shree K. Nayar IEEE CVPR Conference June 2004, Washington.
IRISA / INRIA Rennes Computational Vision and Active Perception Laboratory (CVAP) KTH (Royal Institute of Technology)
Problem Statement A pair of images or videos in which one is close to the exact duplicate of the other, but different in conditions related to capture,
Online Tracking of Outdoor Lighting Variations for Augmented Reality with Moving Cameras Yanli Liu 1,2 and Xavier Granier 2,3,4 1: College of Computer.
Similarity measuress Laboratory of Image Analysis for Computer Vision and Multimedia Università di Modena e Reggio Emilia,
Characterizing activity in video shots based on salient points Nicolas Moënne-Loccoz Viper group Computer vision & multimedia laboratory University of.
Overview Harris interest points Comparing interest points (SSD, ZNCC, SIFT) Scale & affine invariant interest points Evaluation and comparison of different.
Marcin Marszałek, Ivan Laptev, Cordelia Schmid Computer Vision and Pattern Recognition, CVPR Actions in Context.
Tzu ming Su Advisor : S.J.Wang MOTION DETAIL PRESERVING OPTICAL FLOW ESTIMATION 2013/1/28 L. Xu, J. Jia, and Y. Matsushita. Motion detail preserving optical.
3D SLAM for Omni-directional Camera
Periodic Motion Detection via Approximate Sequence Alignment Ivan Laptev*, Serge Belongie**, Patrick Perez* *IRISA/INRIA, Rennes, France **Univ. of California,
Reconstructing 3D mesh from video image sequences supervisor : Mgr. Martin Samuelčik by Martin Bujňák specifications Master thesis
Dynamic 3D Scene Analysis from a Moving Vehicle Young Ki Baik (CV Lab.) (Wed)
K. Selçuk Candan, Maria Luisa Sapino Xiaolan Wang, Rosaria Rossini
Correspondence-Free Determination of the Affine Fundamental Matrix (Tue) Young Ki Baik, Computer Vision Lab.
Dynamic Refraction Stereo 7. Contributions Refractive disparity optimization gives stable reconstructions regardless of surface shape Require no geometric.
Computer Vision Lecture #10 Hossam Abdelmunim 1 & Aly A. Farag 2 1 Computer & Systems Engineering Department, Ain Shams University, Cairo, Egypt 2 Electerical.
Communication Systems Group Technische Universität Berlin S. Knorr A Geometric Segmentation Approach for the 3D Reconstruction of Dynamic Scenes in 2D.
Raquel A. Romano 1 Scientific Computing Seminar May 12, 2004 Projective Geometry for Computer Vision Projective Geometry for Computer Vision Raquel A.
CVPR2013 Poster Detecting and Naming Actors in Movies using Generative Appearance Models.
Reconstruction the 3D world out of two frames, based on camera pinhole model : 1. Calculating the Fundamental Matrix for each pair of frames 2. Estimating.
Joint Tracking of Features and Edges STAN BIRCHFIELD AND SHRINIVAS PUNDLIK CLEMSON UNIVERSITY ABSTRACT LUCAS-KANADE AND HORN-SCHUNCK JOINT TRACKING OF.
Overview Introduction to local features Harris interest points + SSD, ZNCC, SIFT Scale & affine invariant interest point detectors Evaluation and comparison.
3D reconstruction from uncalibrated images
Features, Feature descriptors, Matching Jana Kosecka George Mason University.
Demosaicking for Multispectral Filter Array (MSFA)
Lecture 9 Feature Extraction and Motion Estimation Slides by: Michael Black Clark F. Olson Jean Ponce.
Tracking Groups of People for Video Surveillance Xinzhen(Elaine) Wang Advisor: Dr.Longin Latecki.
ICCV 2007 Optimization & Learning for Registration of Moving Dynamic Textures Junzhou Huang 1, Xiaolei Huang 2, Dimitris Metaxas 1 Rutgers University 1,
ENTERFACE 08 Project 9 “ Tracking-dependent and interactive video projection ” Mid-term presentation August 19th, 2008.
Learning Image Statistics for Bayesian Tracking Hedvig Sidenbladh KTH, Sweden Michael Black Brown University, RI, USA
Reconstruction of a Scene with Multiple Linearly Moving Objects Mei Han and Takeo Kanade CISC 849.
Signal and Image Processing Lab
C. Canton1, J.R. Casas1, A.M.Tekalp2, M.Pardàs1
Lecture 26 Hand Pose Estimation Using a Database of Hand Images
L-infinity minimization in geometric vision problems.
Presented by Omer Shakil
Combining Geometric- and View-Based Approaches for Articulated Pose Estimation David Demirdjian MIT Computer Science and Artificial Intelligence Laboratory.
Liyuan Li, Jerry Kah Eng Hoe, Xinguo Yu, Li Dong, and Xinqi Chu
Presentation transcript:

Multi-view Synchronization of Human Actions and Dynamic Scenes Emilie Dexter, Patrick Pérez, Ivan Laptev INRIA Rennes - Bretagne Atlantique

E. DEXTER - BMVC September 2009 Introduction Purpose –Synchronization of image sequences of the same dynamic event or similar dynamic events –Generally hard problem Unknown camera position or motion Unknown temporal offset or time warping –Applications 3D dynamic reconstruction Analysis of multi-view dynamic scenes

Emilie DEXTER - BMVC September 2009 Outline Temporal descriptor for image sequences Proposed method Experimental Results

Emilie DEXTER - BMVC September 2009 Temporal description of image sequences Inputs: –Sequence of T images denoted I={I 1, …,I T } –Features extracted from each image: real point trajectories –Dominant motion estimation and compensation Self-similarity matrix definition (SSM): SSM for a golf swing action where i and j are the instant indexes, k is the trajectory index and N ij the number of trajectories that span the time intervals between frames I i and I j.

Emilie DEXTER - BMVC September 2009 Temporal description of image sequences SSM Characteristics: –Stable across views –Exhibit dynamics of the scenes SSMs of bend action seen from side and top views SSMs of bend and jump actions both seen from side view

Emilie DEXTER - BMVC September 2009 Temporal description of image sequences SSM description: –Matrix as an image: HOG-based description along diagonal Point scale automatic selection –Sequence of local descriptors H = (h 1,…,h T ) T

Emilie DEXTER - BMVC September 2009 Synchronization Method Common framework for: –A same scene seen from different views –Similar dynamic events as actions Proposed Framework Do for each sequence Extract input data : trajectories Build SSM Compute descriptor H i end Apply DTW between H 1 et H 2

Emilie DEXTER - BMVC September 2009 Fixed vs. Adaptive Size of Local Descriptor Evaluation based on synchronization error estimation: the average distance of points on the estimated warp to the ground truth Descriptors : –[6] = fixed and identical size of the support for both sequences [Junejo et al. ECCV08] –[6]* = fixed size of the support but different for each sequence –Proposed = automatic adapted size

Emilie DEXTER - BMVC September 2009 Fixed vs. Adaptive Size of Local Descriptor

Emilie DEXTER - BMVC September 2009 Natural Sequences – static cameras Estimation (red) and Gr. Truth (blue)

Emilie DEXTER - BMVC September 2009 Unconstrained Natural Sequences Estimation (red) and Gr. Truth (blue)

Emilie DEXTER - BMVC September 2009 Drinking action from Coffee and Cigarettes

Emilie DEXTER - BMVC September 2009 Smoking action from Coffee and Cigarettes

Emilie DEXTER - BMVC September 2009 Conclusion Automatic method for synchronizing image sequences –Generic solution (few constraints) –Descriptors stable under view-changes –DTW can handle non-linear time warping Comparison with others state-of-art methods Address higher level problem: matching of sequences

Emilie DEXTER - BMVC September 2009 Thank you

Emilie DEXTER - BMVC September 2009 Previous Works and Characteristics –Caspi and Irani [PAMI 2002] Linear time warping Correspondences between views –Rao et al. [ICIP 2003] Correspondences between trajectories across views –Tuytelaars and Van Gool [CVPR 2004] Tracked point manually chosen –Wolf and Zomet [IJCV 2006] Time-shift assumption

Emilie DEXTER - BMVC September 2009 Adaptive size descriptor For each diagonal point : Scale detection : maximizing the normalized Laplacian (  i) Size choice : 2  i