Retrieving Actions in Group Contexts Tian Lan, Yang Wang, Greg Mori, Stephen Robinovitch Simon Fraser University Sept. 11, 2010.

Presentation transcript:

Retrieving Actions in Group Contexts Tian Lan, Yang Wang, Greg Mori, Stephen Robinovitch Simon Fraser University Sept. 11, 2010

Outline: Contextual Representation of Actions; Action Retrieval as Ranking; Results and Future Work

Nursing Home: Fall analysis in nursing home surveillance videos. The goal is a system that automatically ranks the videos by their relevance to the fall action.

Action-Action Context. Context: what are other people doing?

Actions in Group Context. Motivation: human actions are rarely performed in isolation; the actions of individuals in a group can serve as context for each other. Goal: explore the benefit of contextual information for action retrieval in challenging real-world applications.

Action Context Descriptor (figure: focal person and context; labels: τ, action, z)

Action Context Descriptor: feature descriptor (e.g. HOG by Dalal & Triggs) → multi-class SVM → action class scores → max over the action class scores
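The pipeline on this slide (per-person feature descriptor, multi-class SVM scores, max over scores) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the per-person action class scores are already computed, and that the descriptor concatenates the focal person's scores with a per-class max over the people in each context region. The function name `action_context_descriptor`, the `context_groups` argument, and the zero-fill for empty regions are hypothetical choices made for this example.

```python
import numpy as np

def action_context_descriptor(scores, focal, context_groups):
    """Build a context-augmented descriptor for one person (illustrative sketch).

    scores:         (n_people, n_classes) action class scores, e.g. from a multi-class SVM
    focal:          row index of the focal person
    context_groups: list of lists of row indices, one list per context region (assumed layout)
    """
    scores = np.asarray(scores, dtype=float)
    n_classes = scores.shape[1]
    parts = [scores[focal]]  # the focal person's own action scores
    for group in context_groups:
        members = [j for j in group if j != focal]
        if members:
            # per-class max over everyone in this context region
            parts.append(scores[members].max(axis=0))
        else:
            # assumption: an empty region contributes zeros
            parts.append(np.zeros(n_classes))
    return np.concatenate(parts)
```

With K action classes and M context regions, the resulting vector has (1 + M) * K entries, which could then be fed to a ranker or classifier.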

Outline: Contextual Representation of Actions; Action Retrieval as Ranking; Results and Future Work

Classification or Retrieval Previous Work – Most work in human action understanding focuses on action classification.

Classification or Retrieval. Most surveillance tasks are typical retrieval tasks: retrieve a short video segment that contains a particular action from thousands of hours of video. The "action of interest" is a rare event, so the classes are extremely imbalanced.

Action Retrieval. Query: fall. Rank the videos according to their relevance to falls.

Learning. Input: document-rank pairs (x_i, y_i). Optimization: as in Joachims, KDD 06.
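The optimization problem on this slide did not survive in the transcript. For orientation only, the commonly used pairwise ranking SVM objective over inputs (x_i, y_i) is written out here. This is the textbook form, stated as an assumption; the slide may show a different but related variant (for example the formulation used in the cited KDD 06 paper). The symbols w, C, and ξ_ij are standard notation, not taken from the slide.

```latex
\min_{w,\; \xi \ge 0} \quad \frac{1}{2}\,\lVert w \rVert^{2} \;+\; C \sum_{(i,j)\,:\, y_i > y_j} \xi_{ij}
\qquad \text{s.t.} \quad w^{\top} x_i \;\ge\; w^{\top} x_j + 1 - \xi_{ij}
\quad \text{for all } (i,j) \text{ with } y_i > y_j
```

Each constraint asks that an item with a higher rank label score at least a margin of 1 above an item with a lower rank label, with slack ξ_ij absorbing violations.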

Ranking SVM: a ranking function h(x) scores each item; items are ordered by rank (r1, r2, r3).
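To make the idea of a learned ranking function h(x) = w·x concrete, here is a small sketch that trains a linear pairwise ranker with plain subgradient descent on a hinge loss over preference pairs. It is an illustration under assumptions, not the solver used in the talk: the function names, the learning rate, and the toy data are all hypothetical, and a dedicated ranking SVM solver would be used in practice.

```python
import numpy as np

def pairwise_differences(X, y):
    """Return x_i - x_j for every pair where item i should rank above item j."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = len(y)
    diffs = [X[i] - X[j] for i in range(n) for j in range(n) if y[i] > y[j]]
    if not diffs:
        raise ValueError("need at least two items with different rank labels")
    return np.array(diffs)

def train_rank_svm(X, y, C=1.0, lr=0.01, epochs=500):
    """Subgradient descent on 0.5*||w||^2 + C * sum(max(0, 1 - w.(x_i - x_j)))."""
    D = pairwise_differences(X, y)
    w = np.zeros(D.shape[1])
    for _ in range(epochs):
        violated = D[D @ w < 1.0]            # pairs inside the margin
        grad = w - C * violated.sum(axis=0)  # subgradient of the objective
        w -= lr * grad
    return w

# Toy usage: higher labels mean more relevant (hypothetical data).
X = np.array([[3.0, 1.0], [2.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
y = np.array([3, 2, 1, 0])
w = train_rank_svm(X, y)
ranking = np.argsort(-(X @ w))  # indices sorted from most to least relevant
```

At retrieval time only the ordering of h(x) matters, not its absolute value, which is why the loss is defined over pairs rather than individual labels.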

Action Retrieval, training: examples are labeled on a scale from irrelevant to very relevant.

Outline: Contextual Representation of Actions; Action Retrieval as Ranking; Results and Future Work

Dataset
Nursing Home Dataset: 5 action categories (per person): walking, standing, sitting, bending and falling. 18 video clips. Query: fall.
Collective Activity Dataset (Choi et al., VS 09): 5 action categories (per person): crossing, waiting, queuing, walking, talking. 44 video clips. Query: each of the five actions.

Nursing Home Dataset

Collective Activity Dataset

System Overview: Video → Person Detector (pedestrian detection by Felzenszwalb et al.; background subtraction) → Person Descriptor (HOG by Dalal & Triggs; LST by Loy et al., CVPR 09) → Rank SVM. (Diagram labels: u, v.)

Baselines
Context vs No Context: Action Context Descriptor vs original feature descriptors, e.g. HOG (Dalal & Triggs, CVPR 05), LST (Loy et al., CVPR 09)
RankSVM vs SVM
Methods: Context + RankSVM (our method); Context + SVM; No Context + RankSVM; No Context + SVM

Retrieval Results Nursing Home Dataset

Retrieval Results Collective Activity Dataset

Action Classification (Collective Activity Dataset). [10] Choi et al., VS 09.

Conclusion: A new contextual feature descriptor to represent actions, the action context (AC) descriptor. The problem is formulated as a retrieval task.

Future Work. Contextual feature descriptors: how to encode only useful context? Rank-SVM loss: optimize the NDCG score.
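For readers unfamiliar with the NDCG score mentioned above, here is a minimal sketch of how it is commonly computed for a ranked list of graded relevance labels. It uses one widespread variant (gain 2^rel - 1, log2 position discount); other variants exist, and the slides do not say which one the authors have in mind. The function names `dcg` and `ndcg` are illustrative.

```python
import numpy as np

def dcg(relevances):
    """Discounted cumulative gain of relevance labels given in ranked order."""
    rel = np.asarray(relevances, dtype=float)
    positions = np.arange(1, rel.size + 1)
    return float(np.sum((2.0 ** rel - 1.0) / np.log2(positions + 1)))

def ndcg(ranked_relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    rel = np.asarray(ranked_relevances, dtype=float)
    ideal = np.sort(rel)[::-1]  # best possible ordering of the same labels
    if k is not None:
        rel, ideal = rel[:k], ideal[:k]
    best = dcg(ideal)
    return dcg(rel) / best if best > 0 else 0.0
```

Because the discount grows with position, NDCG rewards placing the most relevant clips near the top of the list, which matches the retrieval setting where only the first few results are likely to be inspected.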

Thank you!