Download presentation
Presentation is loading. Please wait.
Published byLeon Hensley Modified over 9 years ago
1
Spatio-temporal constraints for recognizing 3D objects in videos http://slipguru.disi.unige.it Nicoletta Noceti Università degli Studi di Genova
2
2 Outline of the presentation 3D object recognition View based approaches Local descriptors for object recognition Our approach Spatio temporal models for 3D objects recognition Modeling sequences Representation of video sequences w.r.t. the model 2-stage matching procedure Recognizing objects: experiments and results
3
3 Object recognition Localisation means to determine the pose of each object relative to a sensor Categorization means recognising the class to which an object belongs instead of recognising that particular object The goal of recognition systems is to identify which objects are present in a scene Unlike ”merely” perceiving a shape, recognising it involves memory, that is accessing at representations of shapes seen in the past [Wittgenstein73]
4
4 Object recognition
5
5 View based object recognition View based approaches to 3D object recognition gained attention as a way to deal with appearance variation [Murase et al. 95, Pontil et al. 98] no explicit model is required Local approaches produce relatively compact descriptions of the image content and do not suffer from the presence of cluttered background and occlusions [Mikolajczyk et al. 03 ] Local object models are often inspired by text categorization [Cristianini et al. 02] Many view based local approach to recognition have been proposed [Leibe et al. 04, Csurka et al. 04]
6
6 Our approach to recognition We observe an object from slightly different viewpoints and exploit local features distinctive in space and stable in time to perform recognition Our approach shares some similarities with codebook methods but our method extends this concept also in the temporal domain
7
7 Our approach to recognition View-based recognition systems do not need explicit computation of 3D object models Local approaches produce compact descriptions and do not suffer from cluttered background and occlusions Spatial constraints improve quality of recognition [Ferrari et al. 06] Biological vision systems gather information by means of motion to include important cues for depth perception and object recognition [Stringer et al.06]
8
8 Outline of the presentation 3D object recognition View based approaches Local descriptors for object recognition Our approach Spatio temporal models for 3D objects recognition Modeling sequences Representation of video sequences w.r.t. the model 2-stage matching procedure Recognizing objects: experiments and results
9
9 Drawbacks of locality His eyes would dart from one thing to another, picking up tiny features, individual features, as they had done with my face. A striking brightness, a colour, a shape would arrest his attention and elicit comment – but in no case did he get the scene-as-a-whole. He failed to see the whole, seeing only details, which he spotted like blips on a radar screen. He never entered into relation with the picture as a whole - never faced, so to speak, its physiognomy. He had no sense whatever of a landscape or a scene. ”The Man Who Mistook His Wife For A Hat: And Other Clinical Tales”, by Oliver Sacks, 1970
10
10 Ideas Obtain a 3D object recognition method based on a compact description of image sequences Exploit spatial information on proximity of features appearing contemporaneously Exploit temporal continuity both on training and test E. Delponte, N. Noceti, F. Odone and A. Verri Spatio temporal constraints for matching view-based descriptions of 3D objects In WIAMIS 2007
11
11 Recognizing objects with ST models Video sequences Keypoints detection and description Keypoints tracking Cleaning procedure Building the spatio temporal model 2-stage matching procedure Object recognition Spatio-temporal model for training Spatio-temporal model for test
12
12 From sequence to spatio-temporal model
13
13 For each image of the sequence extract Harris corners assign them a scale and a principal direction assign them a SIFT descriptor Tracking of keypoints with Kalman filter cleaning procedure based on length of trajectories and robustness of descriptors Computation of time invariant features From sequence to spatio-temporal model Video sequences Keypoints detection and description Keypoints tracking Cleaning procedure Building the spatio temporal model
14
14 From sequence to spatio-temporal model
15
15 Time invariant feature We obtain a set of time-invariant features: a spatial appearance descriptor, that is the average of all SIFT vectors of its trajectory a temporal descriptor, that contains information on when the feature first appeared in the sequence and on when it was last observed
16
16 The spatio-temporal model The collection of time- invariant features constitutes a spatio- temporal model that we use to train our system We emphasise the temporal coherence of the model and we exploit features appearing simultaneously
17
17 Matching spatio-temporal models 2-stage matching procedure Object recognition Spatio-temporal model for training Spatio-temporal model for test
18
18 Matching of sequence models For each video sequence we compute its spatio-temporal model Given a test sequence, we perform a two stage matching procedure by exploiting spatial and temporal coherence of time- invariant features we compute a first set of matches we reinforce the procedure by analising spatial and temporal matches neighborhood
19
19 Matching of sequence models
20
20 Outline of the presentation 3D object recognition View based approaches Local descriptors for object recognition Our approach Spatio temporal models for 3D objects recognition Modeling sequences Representation of video sequences w.r.t. the model 2-stage matching procedure Recognizing objects: experiments and results
21
21 Experiments and results Matching assessment Illumination, scale and background changes Changes in motion Increasing the number of objects Object recognition on a 20 objects dataset Recognition on a video streaming
22
22 3D objects
23
23 Matching assessment Matches obtained on sequences with simple changes compared w.r.t. ST models of 4 objects
24
24 Changing motion Matches obtained w.r.t. ST models of 4 objects Matches obtained in the first step of matching
25
25 Models of 3D objects Test sequences Matching assessment Models of 3D objects Test sequences
26
26 Recognizing 20 objects Book: Bambi: + + Dewey: О О X Dewey: Sully: X X Box: О О Donald: Dewey:++ Scrooge:О О Book: Bambi: + + Dewey: О О X Dewey: Sully: X X Box: О О Donald: Dewey:++ Scrooge:О О Book: Bambi: + + Dewey: О О X Dewey: Sully: X X Box: О О Donald: Dewey:++ Scrooge:О О Book: Bambi: + + Dewey: О О X Dewey: Sully: X X Box: О О Donald: Dewey:++ Scrooge:О О Number of experiments: 840 TP=51FN=13 FP=11TN=765
27
27 Recognition on a video stream
28
28 Conclusion and future work We exploited the compactness and the expressiveness of local image descriptions to address the problem of 3D object recognition We devised a system based on the use of spatial and temporal information and we have proved how the model of a 3D object benefit of both these information The system could benefit from adding information on the image context [Tor03]
29
Thanks for your attention!
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.