1
Modeling Scene and Object Contexts for Human Action Retrieval with Few Examples
Yu-Gang Jiang, Zhenguo Li, Shih-Fu Chang
IEEE Transactions on CSVT, 2011
2
Outline
Context-based Action Retrieval Framework
Experiment Result
Conclusion
3
Framework
A. Video Representation and Negative Sample Selection
B. Obtaining Action Context
– 1. Scene Recognition
– 2. Object Recognition
C. Estimating Action-Scene-Object Relationship
D. Incorporating Multiple Contextual Cues
4
Context-Based Action Retrieval Framework
8
A. Video Representation and Negative Sample Selection
Use the bag-of-features framework
Use k-means clustering to generate 4000 visual words
Quantize each video clip into two 4000-D histograms of visual words
Apply Local and Global Consistency (LGC) [27]
Pick negative samples after propagation
[27] D. Zhou, O. Bousquet, T. Lal, J. Weston, and B. Scholkopf, “Learning with local and global consistency,” in Proc. Neural Inform. Process. Syst., 2004, pp. 321–328.
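The quantization and propagation steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the codebook has already been learned by k-means, uses hard nearest-word assignment, builds a Gaussian affinity graph, and runs the iterative LGC update from [27]. The function names, the parameters `alpha`, `sigma`, and `iters`, and the rule of taking the lowest-scoring unlabeled clips as negatives are all assumptions made for the example.

```python
import math

def quantize(descriptors, codebook):
    """Hard-assign each local descriptor to its nearest visual word
    and return the word-count histogram for one clip."""
    hist = [0] * len(codebook)
    for d in descriptors:
        dists = [sum((a - b) ** 2 for a, b in zip(d, w)) for w in codebook]
        hist[dists.index(min(dists))] += 1
    return hist

def lgc_scores(features, positives, alpha=0.9, sigma=1.0, iters=200):
    """Propagate positive labels over a Gaussian affinity graph with the
    iterative LGC update f <- alpha * S f + (1 - alpha) * y."""
    n = len(features)
    # Affinity matrix W with a zero diagonal.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                W[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    # Guard against an all-zero row caused by numerical underflow.
    deg = [sum(row) or 1e-12 for row in W]
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    y = [1.0 if i in positives else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(iters):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f

def pick_negatives(scores, positives, k):
    """Assumed selection rule: the k unlabeled clips with the lowest scores."""
    candidates = [i for i in range(len(scores)) if i not in positives]
    return sorted(candidates, key=lambda i: scores[i])[:k]
```

With a handful of positive clips, `pick_negatives(lgc_scores(histograms, positives), positives, k)` would return the clips least connected to the positives in feature space, which is one plausible reading of "pick negative samples after propagation".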
9
Context-Based Action Retrieval Framework
10
B. Scene Recognition
Train a separate classifier on each of the two bag-of-features representations and average their probability predictions
The scene models are learned with SVMs
Adopt 10 scene classes: House, Road, Bedroom, Car Interior, Hotel, Kitchen, Living Room, Office, Restaurant, Shop
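Averaging the two classifiers' probability outputs (late fusion) might look like the sketch below. The probability values would come from the two SVM scene models. The dictionary layout and the function names `average_fusion` and `top_scene` are assumptions for the example, not details from the paper.

```python
def average_fusion(probs_a, probs_b):
    """Average two per-scene probability dictionaries (late fusion)."""
    if probs_a.keys() != probs_b.keys():
        raise ValueError("both classifiers must score the same scene classes")
    return {scene: (probs_a[scene] + probs_b[scene]) / 2 for scene in probs_a}

def top_scene(probs):
    """Return the scene class with the highest fused probability."""
    return max(probs, key=probs.get)
```

For example, if one classifier gives Kitchen 0.75 and Office 0.25 while the other gives 0.25 and 0.75, the fused prediction is 0.5 for each class.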
11
B. Object Recognition
The object detector covers only three classes: person, chair, and car
Define actions
– Track objects based on location and box size
– Discard isolated detections
Compute the average spatial distance between different types of objects
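One simplified way to realize the filtering and distance steps on this slide is sketched below. It is not the authors' tracker. A detection is kept only if a same-class detection with a nearby centre and similar box size exists in an adjacent frame. The detection tuple layout `(frame, label, (x, y, w, h))`, the thresholds `max_shift` and `max_scale`, and all function names are assumptions for illustration.

```python
import math
from itertools import combinations

def center(box):
    """Centre point of an (x, y, w, h) box."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2)

def has_neighbor(det, others, max_shift=30.0, max_scale=1.5):
    """True if a same-label detection in an adjacent frame has a
    nearby centre and a similar box area (boxes assumed non-empty)."""
    frame, label, box = det
    cx, cy = center(box)
    area = box[2] * box[3]
    for f2, l2, b2 in others:
        if l2 != label or abs(f2 - frame) != 1:
            continue
        ox, oy = center(b2)
        ratio = (b2[2] * b2[3]) / area
        if math.hypot(cx - ox, cy - oy) <= max_shift and 1 / max_scale <= ratio <= max_scale:
            return True
    return False

def drop_isolated(dets):
    """Discard detections with no support in adjacent frames."""
    return [d for d in dets if has_neighbor(d, dets)]

def mean_distance(dets, label_a, label_b):
    """Average centre distance between detections of two object
    types that appear in the same frame; None if they never co-occur."""
    dists = []
    for (f1, l1, b1), (f2, l2, b2) in combinations(dets, 2):
        if f1 == f2 and {l1, l2} == {label_a, label_b}:
            (x1, y1), (x2, y2) = center(b1), center(b2)
            dists.append(math.hypot(x1 - x2, y1 - y2))
    return sum(dists) / len(dists) if dists else None
```

The resulting person-car or person-chair distances could then serve as object-context features for a clip.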
12
B. Object Recognition
13
Context-Based Action Retrieval Framework
14
C. Estimating Action-Scene-Object Relationship
Define a context-based inference score that should
– Distinguish samples in P from samples in N well
– Produce similar scores for two samples that are close to each other
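The two requirements above match the usual shape of a graph-regularized objective. As a generic illustration only (the slide does not give the paper's exact formulation), one could write a fitting term plus a smoothness term. Here the score $f_i$, the label $y_i$ ($+1$ for samples in P, $-1$ for samples in N), the affinity $W_{ij}$, and the trade-off weight $\lambda$ are all assumed notation:

```latex
\min_{f} \;
\underbrace{\sum_{i \in P \cup N} (f_i - y_i)^2}_{\text{separate } P \text{ from } N}
\; + \;
\lambda \underbrace{\sum_{i,j} W_{ij}\,(f_i - f_j)^2}_{\text{close samples get similar scores}}
```

The first term penalizes scores that disagree with the known labels. The second penalizes score differences between samples with high affinity.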
15
C. Estimating Action-Scene-Object Relationship
[Figure: m contextual cues and n training samples; labels c and F]
16
C. Estimating Action-Scene-Object Relationship
Constraint 1
Constraint 2
17
Context-Based Action Retrieval Framework
18
D. Incorporating Multiple Contextual Cues
Given an action a and a test sample x
Action classes: AnswerPhone, DriveCar, Eat, Kiss, GetOutCar, HandShake, FightPerson, HugPerson, Run, SitDown, SitUp, StandUp
19
Experiment Results
Mean average precision (mAP)
Retrieval Performance by Raw Features
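For reference, mean average precision over a set of action queries can be computed as in this sketch. It assumes each ranked list contains every relevant clip for its action, so average precision is normalized by the number of relevant clips found in the list. The function names are illustrative.

```python
def average_precision(ranked_relevance):
    """AP for one ranked list of 0/1 relevance flags, best match first."""
    hits = 0
    total = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0

def mean_average_precision(per_action_lists):
    """Mean of the per-action AP values."""
    return sum(average_precision(r) for r in per_action_lists) / len(per_action_lists)
```

A ranking `[1, 0, 1, 0]` has precision 1/1 at the first hit and 2/3 at the second, so its AP is (1 + 2/3) / 2.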
20
Experiment Results
Scene vs. Object
21
Experiment Results
Scene vs. Object
22
Experiment Results
Comparison to the state of the art
– SVM learning
– Movie script-mining
23
Conclusion
An algorithm based on the semi-supervised learning paradigm is used to model action-scene-object dependency from limited samples
The algorithm can be applied to other types of action videos