Retrieving Actions in Group Contexts Tian Lan, Yang Wang, Greg Mori, Stephen Robinovitch Simon Fraser University Sept. 11, 2010
Outline Contextual Representation of Actions – Action Retrieval as Ranking – Results and Future Work
Nursing Home Fall analysis in nursing home surveillance videos – the expected system automatically ranks videos according to their relevance to the fall action
Action-Action Context Context: what are other people doing?
Actions in Group Context Motivation – human actions are rarely performed in isolation; the actions of individuals in a group can serve as context for each other. Goal – explore the benefit of contextual information for action retrieval in challenging real-world applications
Action Context Descriptor [Figure: focal person (action) + context; diagram labels τ, z]
Action Context Descriptor Feature descriptor (e.g. HOG by Dalal & Triggs) → multi-class SVM → action class scores (one per class) → max → action class score
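The pipeline on this slide can be illustrated with a minimal sketch. It assumes the per-person action class scores have already been produced by a multi-class SVM, and it treats every other person in the frame as context. The function name `action_context_descriptor`, the zero vector used when nobody else is present, and the omission of any spatial or temporal sub-regions (the τ and z labels) are simplifying assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def action_context_descriptor(scores, focal):
    """Concatenate the focal person's class scores with a max-pooled context.

    scores: (num_people, num_classes) per-person action class scores,
            assumed to come from a multi-class SVM on e.g. HOG features.
    focal:  row index of the focal person.
    """
    scores = np.asarray(scores, dtype=float)
    action = scores[focal]
    # Context = everyone except the focal person (simplifying assumption).
    others = np.delete(scores, focal, axis=0)
    if others.shape[0] == 0:
        # Assumption: with no other people, the context part is all zeros.
        context = np.zeros_like(action)
    else:
        # Max over people, taken separately for each action class.
        context = others.max(axis=0)
    return np.concatenate([action, context])

# Three people, three action classes (made-up scores).
scores = np.array([[0.9, 0.1, 0.0],
                   [0.2, 0.7, 0.1],
                   [0.1, 0.3, 0.6]])
descriptor = action_context_descriptor(scores, focal=0)
```

The first half of `descriptor` describes what the focal person is doing; the second half summarizes the strongest evidence for each action among the other people.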
Classification or Retrieval Previous Work – Most work in human action understanding focuses on action classification.
Classification or Retrieval Most surveillance tasks are typical retrieval tasks – retrieve a small video segment containing a particular action from thousands of hours of video. The “action of interest” is a rare event – extremely imbalanced classes
Action Retrieval Query: fall – rank videos according to their relevance to falls
Learning Input: document-rank pairs (x_i, y_i) Optimization: Joachims, KDD 06
Ranking SVM Ranking function h(x) [Figure: examples projected onto h(x) and ordered as Rank r1, Rank r2, Rank r3]
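One common way to learn a linear ranking function h(x) = w·x is to apply a hinge loss to every pair of examples whose ranks differ. The sketch below minimizes 0.5·||w||² + C·Σ max(0, 1 − w·(x_i − x_j)) over pairs with y_i > y_j using plain subgradient descent. The solver, the hyperparameters, and the toy data are assumptions chosen for brevity; this is not the training algorithm from Joachims' paper, nor the setup used in the talk.

```python
import numpy as np

def train_rank_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Learn w so that w.x_i > w.x_j whenever y[i] > y[j] (pairwise hinge loss)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Every ordered pair (i, j) where example i should rank above example j.
    pairs = [(i, j) for i in range(len(y)) for j in range(len(y)) if y[i] > y[j]]
    if not pairs:
        raise ValueError("need at least two examples with different ranks")
    diffs = np.array([X[i] - X[j] for i, j in pairs])
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        margins = diffs @ w
        violated = diffs[margins < 1.0]          # pairs inside the margin
        grad = w - C * violated.sum(axis=0)      # subgradient of the objective
        w -= lr * grad
    return w

# Toy data: 2 = very relevant, 1 = relevant, 0 = irrelevant (made-up features).
X = np.array([[3.0, 0.1], [2.0, 0.5], [1.0, 0.2], [0.0, 0.9]])
y = np.array([2, 1, 1, 0])
w = train_rank_svm(X, y)
h = X @ w                     # ranking scores h(x)
order = np.argsort(-h)        # indices from most to least relevant
```

Sorting by `h` then gives the retrieval order; only the relative order of the scores matters, not their absolute values.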
Action Retrieval - training [Figure: training examples labeled from irrelevant to very relevant]
Dataset Nursing Home Dataset – 5 action categories (per person): walking, standing, sitting, bending and falling – 18 video clips – Query: fall Collective Activity Dataset (Choi et al., VS 09) – 5 action categories (per person): crossing, waiting, queuing, walking, talking – 44 video clips – Query: each of the five actions
Nursing Home Dataset
Collective Activity Dataset
System Overview Video → Person Detector (pedestrian detection by Felzenszwalb et al.; background subtraction) → Person Descriptor (HOG by Dalal & Triggs; LST by Loy et al. at CVPR 09) → Rank SVM [diagram labels: u, v]
Baselines Context vs No Context – Action Context Descriptor – Original feature descriptors, e.g. HOG (Dalal & Triggs at CVPR 05), LST (Loy et al. at CVPR 09) RankSVM vs SVM Methods – Context + RankSVM (our method) – Context + SVM – No Context + RankSVM – No Context + SVM
Retrieval Results Nursing Home Dataset
Retrieval Results Collective Activity Dataset
Action Classification [10] Choi et al. in VS. 09 Collective Activity Dataset
Conclusion – A new contextual feature descriptor to represent actions: the action context (AC) descriptor – The problem is formulated as a retrieval task
Future Work Contextual Feature Descriptors – How to encode only useful context? Rank-SVM loss – optimize the NDCG score
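For reference, NDCG (normalized discounted cumulative gain) scores a ranked list by rewarding relevant items near the top. The sketch below uses one common variant, with gain 2^rel − 1 and a log2 position discount; the choice of variant and the function names `dcg` and `ndcg` are assumptions, since the slides do not specify them.

```python
import math

def dcg(relevances, k=None):
    """Discounted cumulative gain of relevance labels listed in ranked order."""
    rels = relevances[:k] if k is not None else relevances
    # Position 0 is the top of the ranking, so the discount is log2(pos + 2).
    return sum((2 ** r - 1) / math.log2(pos + 2) for pos, r in enumerate(rels))

def ndcg(relevances, k=None):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance labels in the order the system ranked the videos (made-up values).
perfect = ndcg([2, 1, 0])    # most relevant first
reversed_ = ndcg([0, 1, 2])  # most relevant last
```

A perfect ordering scores 1.0, and pushing relevant videos further down the list lowers the score, which is why NDCG suits retrieval tasks where only the top results are inspected.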
Thank you!