Human Action Recognition by Learning Bases of Action Attributes and Parts. Bangpeng Yao, Xiaoye Jiang, Aditya Khosla, Andy Lai Lin, Leonidas Guibas, and Li Fei-Fei.

Presentation transcript:

Human Action Recognition by Learning Bases of Action Attributes and Parts Bangpeng Yao, Xiaoye Jiang, Aditya Khosla, Andy Lai Lin, Leonidas Guibas, and Li Fei-Fei Stanford University

Outline:
  Introduction
  Action Bases
  Learning the Dual-Sparse Action Bases and Reconstruction Coefficients
  Experiments

Introduction
Human action recognition in still images:
  A general image classification problem
  Human-object interaction
  Parts + Attributes
Contributions:
  Represent each image using a sparse set of action bases that are meaningful to the content of the image.
  Learn these bases effectively from far-from-perfect detections of action attributes and parts, without meticulous human labeling.

Action Bases: attributes and parts
  Attributes: verbs, learned by discriminative classifiers.
  Parts: object parts and poselets, detected by pre-trained object detectors and poselet detectors.
  Each image is represented by a vector of the normalized confidence scores obtained from these classifiers and detectors.
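The per-image vector described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the slides do not say how the confidence scores are normalized, so the logistic squashing used here is an assumption, and the attribute and part names are hypothetical.

```python
import math


def squash(score):
    """Map a raw classifier/detector score to (0, 1).

    Assumption: a logistic function stands in for the unspecified
    normalization mentioned on the slide.
    """
    return 1.0 / (1.0 + math.exp(-score))


def image_vector(attribute_scores, part_scores):
    """Concatenate normalized attribute and part scores into one vector.

    Both arguments map a (hypothetical) name to a raw confidence score.
    Names are sorted so every image uses the same dimension order.
    """
    names = sorted(attribute_scores) + sorted(part_scores)
    raw = [attribute_scores[n] for n in sorted(attribute_scores)]
    raw += [part_scores[n] for n in sorted(part_scores)]
    return names, [squash(s) for s in raw]


# Hypothetical scores for one image.
names, vec = image_vector(
    {"riding": 2.1, "sitting": -0.4},
    {"bike": 1.7, "poselet_torso": 0.0},
)
```

Sorting the names is only a convenience to keep dimensions aligned across images; any fixed ordering would serve.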

Action Bases: High-order interactions of image attributes and parts are used to represent each image, and SVMs are trained for action classification.
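To make the idea of representing an image by a few bases concrete, the sketch below computes sparse reconstruction coefficients of one attribute/part vector over a fixed set of bases. It is an illustration under stated assumptions: the slides do not name a solver, so ISTA (iterative soft-thresholding) for an L1-regularized least-squares problem is swapped in here, and the bases, dimension names, and `lam` value are hypothetical. The resulting coefficient vector is the kind of feature an SVM could then be trained on.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t (proximal operator of t * |x|)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sparse_code(a, bases, lam=0.2, n_iter=200):
    """Approximately minimize 0.5 * ||a - sum_j w[j] * bases[j]||^2 + lam * ||w||_1.

    a      -- attribute/part vector of one image
    bases  -- list of basis vectors, each the same length as a
    Uses ISTA with step size 1/L, where L is the squared Frobenius norm
    of the bases (an upper bound on the Lipschitz constant).
    """
    dim, n_bases = len(a), len(bases)
    lipschitz = sum(v * v for b in bases for v in b) or 1.0
    w = [0.0] * n_bases
    for _ in range(n_iter):
        recon = [sum(w[j] * bases[j][k] for j in range(n_bases)) for k in range(dim)]
        resid = [recon[k] - a[k] for k in range(dim)]
        grad = [sum(bases[j][k] * resid[k] for k in range(dim)) for j in range(n_bases)]
        w = [soft_threshold(w[j] - grad[j] / lipschitz, lam / lipschitz)
             for j in range(n_bases)]
    return w


# Hypothetical dimensions: [riding, bike, phone, sitting].
bases = [
    [1.0, 1.0, 0.0, 0.0],  # "riding" co-occurring with "bike"
    [0.0, 0.0, 1.0, 0.0],  # "phone"
    [0.0, 0.0, 0.0, 1.0],  # "sitting"
]
a = [0.9, 0.8, 0.0, 0.05]
w = sparse_code(a, bases)
```

In this toy case only the first basis receives a non-zero coefficient, so the weak "sitting" response is treated as detector noise and dropped.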

Dual-sparsity Learning
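This transcript carries no formula for the slide. One way to write a dual-sparse learning objective is given here as an assumption for illustration, not as the authors' exact formulation. The symbols are introduced here: a_i is the attribute/part vector of image i, Φ is the matrix whose columns Φ_j are the action bases, w_i are the reconstruction coefficients, and λ and γ are sparsity parameters.

```latex
\min_{\Phi,\,\{w_i\}} \; \sum_{i=1}^{N} \left( \tfrac{1}{2} \, \lVert a_i - \Phi w_i \rVert_2^2 + \lambda \, \lVert w_i \rVert_1 \right)
\quad \text{s.t.} \quad \lVert \Phi_j \rVert_2 \le 1, \;\; \lVert \Phi_j \rVert_1 \le \gamma \quad \text{for all } j
```

"Dual" refers to sparsity on both sides: the L1 penalty on w_i keeps only a few bases active per image, and the L1 bound on each Φ_j keeps each basis supported on only a few attributes and parts.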

Experiments:
  PASCAL actions
  Stanford 40 actions

PASCAL

Stanford 40 actions