Recognizing Human Actions by Attributes. CVPR 2011. Jingen Liu, Benjamin Kuipers, Silvio Savarese. Dept. of Electrical Engineering and Computer Science, University of Michigan.

Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University of Michigan

Outline ◦ Introduction ◦ Our Contributions ◦ Attribute-Based Action Representation ◦ Learning Data-Driven Attributes ◦ Knowledge Transfer Across Classes ◦ Experiments and Discussion

Introduction ◦ Traditional approaches for human action recognition ◦ Example: the action golf-swinging ◦ Human actions are better described by action attributes

Manually specified attributes ◦ Subjective ◦ Two problems ◦ Completeness: addressed with data-driven attributes ◦ Intra-class variability: addressed with latent variables in a latent SVM

Our Contributions ◦ Action attributes can be used to improve human action recognition ◦ Manually-specified attributes are treated as latent variables ◦ The framework integrates manually-specified and data-driven attributes

◦ Attributes are useful for recognizing novel action classes without training examples ◦ Attributes significantly boost traditional action classification

Attribute-Based Action Representation ◦ Previous works represent actions with low-level features ◦ Here, an action attribute space is defined

Example ◦ Five attributes: “translation of torso”, “updown torso motion”, “arm motion”, “arm over shoulder motion”, “leg motion” ◦ The action class “walking” is represented by the binary vector {1, 0, 1, 0, 1}
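The binary-vector example above can be written out as a small sketch. The attribute names and the "walking" vector come from the slide; the helper name `describe` is a hypothetical choice made only for illustration.

```python
# Attribute names as listed on the slide, in the same order.
ATTRIBUTES = [
    "translation of torso",
    "updown torso motion",
    "arm motion",
    "arm over shoulder motion",
    "leg motion",
]

def describe(vector):
    """Return the names of the attributes switched on in a binary vector."""
    if len(vector) != len(ATTRIBUTES):
        raise ValueError("expected one entry per attribute")
    return [name for name, bit in zip(ATTRIBUTES, vector) if bit == 1]

# The "walking" vector from the slide.
walking = [1, 0, 1, 0, 1]
active = describe(walking)
```

Read this way, "walking" switches on torso translation, arm motion, and leg motion, and leaves the other two attributes off.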

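By introducing the attribute layer between the low-level features and the action class labels, the classifier f, which maps x to a class label, can be decomposed into a mapping from features to attributes followed by a mapping from attributes to class labels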

Attributes as Latent Variables ◦ We want to learn a classification model for recognizing an unknown action x ◦ Attributes are treated as latent variables: each attribute in the space is a latent variable ai ∈ [0, 1]

Goal: learn a classifier fw to predict the class label of a new video x

Raw feature: x ◦ Class label: y ◦ Attributes: a ◦ Weight for each feature: w

The feature-class term provides the score measuring how well the raw feature x matches the action class y

The attribute term provides the score of an individual attribute and indicates the presence of that attribute in the video x

The pairwise term captures the co-occurrence of a pair of attributes aj and ak

The parameter vector w is learned from a training dataset
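The three terms and the latent attributes can be put together in a toy scoring sketch. The slides do not give the exact form of each term, so everything below is an assumption made for illustration: attributes are restricted to binary values so they can be enumerated, the attribute weights are class-specific, and all weights are made-up numbers rather than learned parameters.

```python
from itertools import product

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def score(x, y, a, w_class, w_attr, w_pair):
    """Sum of a feature-class term, per-attribute terms, and pairwise terms."""
    n = len(a)
    s = dot(w_class[y], x)                                  # feature vs. class
    s += sum(a[j] * dot(w_attr[y][j], x) for j in range(n))  # single attributes
    s += sum(w_pair[j][k] * a[j] * a[k]                      # co-occurrence
             for j in range(n) for k in range(j + 1, n))
    return s

def predict(x, classes, n_attr, w_class, w_attr, w_pair):
    """Maximize the score jointly over class labels and latent attributes."""
    best = None
    for y in classes:
        for a in product((0, 1), repeat=n_attr):
            s = score(x, y, a, w_class, w_attr, w_pair)
            if best is None or s > best[0]:
                best = (s, y, a)
    return best

# Hypothetical toy weights for two classes and two attributes.
x = [1.0, 0.0]
w_class = {"walk": [0.5, 0.0], "wave": [0.2, 0.0]}
w_attr = {"walk": [[1.0, 0.0], [-1.0, 0.0]],
          "wave": [[-1.0, 0.0], [0.5, 0.0]]}
w_pair = [[0.0, 0.25], [0.0, 0.0]]
best_score, best_label, best_attrs = predict(
    x, ["walk", "wave"], 2, w_class, w_attr, w_pair)
```

Brute-force enumeration is only workable for a handful of attributes; it is used here to keep the maximization over the latent variables explicit.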

Learning Data-Driven Attributes ◦ Manual specification of attributes is subjective ◦ Data-driven attributes are learned from the training data to complement them

The Mutual Information (MI) ◦ A good measure of the quality of a grouping ◦ Given two random variables X ∈ X = {x1, x2,..., xn} and Y ∈ Y = {y1, y2,..., ym}, where X represents a set of visual-words and Y is a set of action videos ◦ MI(X; Y) = Σy Σx p(x, y) log( p(x, y) / (p(x) p(y)) )

◦ Given a set of features, we wish to obtain a set of clusters ◦ The quality of the clustering is measured by the loss of MI
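The loss of MI caused by grouping two visual-words can be sketched as follows. The co-occurrence counts are made up for illustration, and the helper names are hypothetical; only the standard definition of mutual information is used.

```python
from math import log

def mutual_information(joint):
    """MI between rows (visual-words) and columns (videos) of a count table."""
    total = sum(sum(row) for row in joint)
    px = [sum(row) / total for row in joint]
    py = [sum(col) / total for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, count in enumerate(row):
            if count > 0:
                p = count / total
                mi += p * log(p / (px[i] * py[j]))
    return mi

def merge_rows(joint, i, j):
    """Group visual-words i and j into one cluster by adding their counts."""
    merged = [a + b for a, b in zip(joint[i], joint[j])]
    rest = [row for k, row in enumerate(joint) if k not in (i, j)]
    return rest + [merged]

def mi_loss(joint, i, j):
    """Loss of MI caused by merging visual-words i and j."""
    return mutual_information(joint) - mutual_information(merge_rows(joint, i, j))

# Toy counts: words 0 and 1 occur in the same video, word 2 in another.
counts = [[4, 0], [4, 0], [0, 8]]
loss_similar = mi_loss(counts, 0, 1)    # words with identical behaviour
loss_different = mi_loss(counts, 0, 2)  # words with different behaviour
```

Merging words that occur in the same videos loses essentially no information, while merging words with different distributions does, which is why a low MI loss indicates a good grouping.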

◦ The discovery of data-driven attributes is integrated into the latent SVM framework ◦ h ∈ H, where H is the data-driven attribute space

Knowledge Transfer Across Classes ◦ Transfer knowledge from known classes (with training examples) to a novel class (without training examples) ◦ Use this knowledge to recognize instances of the novel class
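One common way to realize this kind of transfer is to describe each novel class by an attribute signature and assign a video to the class whose signature is closest to its predicted attribute scores. The slides do not spell out the transfer rule, so the nearest-signature rule below is an assumption, and the "waving" signature and the attribute scores are invented for illustration.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest_class(attribute_scores, signatures):
    """Pick the class whose attribute signature is closest to the scores."""
    return min(signatures,
               key=lambda name: squared_distance(attribute_scores, signatures[name]))

# Attribute signatures for classes that have no training videos.
signatures = {
    "walking": [1, 0, 1, 0, 1],  # vector from the earlier slide
    "waving": [0, 0, 1, 1, 0],   # hypothetical signature
}

# Hypothetical attribute scores predicted for a test video by
# attribute classifiers trained on the known classes.
scores = [0.9, 0.1, 0.8, 0.2, 0.7]
label = nearest_class(scores, signatures)
```

The attribute classifiers are trained only on known classes, so the novel class needs nothing beyond its attribute description.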

Experiments and Discussion ◦ Datasets and Action Attributes ◦ Experimental Results ◦ Experiments on Olympic Sports Dataset

Datasets and Action Attributes. UIUC dataset ◦ 532 videos of 14 actions, such as walk, hand-clap, jump-forward … Combining existing datasets into a larger one ◦ KTH dataset: six classes and about 2,300 videos ◦ Weizmann dataset: 10 classes and about 100 videos ◦ UIUC dataset. Olympic Sports dataset ◦ Collected from YouTube; contains realistic human actions

Experimental Results ◦ Recognizing novel action classes ◦ Attributes boosting traditional action recognition

Recognizing novel action classes ◦ The leave-two-classes-out cross-validation strategy is used in experiments on the UIUC dataset ◦ Each run leaves two classes out as novel classes (|Z| = 2)
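The leave-two-classes-out protocol can be sketched by enumerating class pairs. The slides do not say whether every pair is used as a run; the sketch assumes so, and the class names are placeholders because only a few of the 14 UIUC actions are named.

```python
from itertools import combinations

N_CLASSES = 14  # number of UIUC action classes stated on the slides
classes = [f"class_{i}" for i in range(N_CLASSES)]  # placeholder names

def leave_two_out_splits(all_classes):
    """Return (known, novel) splits where each run holds out two classes."""
    splits = []
    for novel in combinations(all_classes, 2):
        known = [c for c in all_classes if c not in novel]
        splits.append((known, list(novel)))
    return splits

splits = leave_two_out_splits(classes)
```

Under this assumption there is one run per unordered pair of classes, each with 12 known classes for training and 2 novel classes for testing.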

The average accuracy of leave-two-classes-out cross-validation on the UIUC dataset for recognizing novel action classes.

Divide the UIUC dataset into two disjoint sets ◦ Y: training set, contains 10 action classes ◦ Z: testing set, contains four classes ◦ The testing and training classes share some common attributes

Example (a)

Attributes boosting traditional action recognition ◦ The proposed framework is used to show that action attributes improve the performance of traditional action recognition ◦ The results demonstrate a significant improvement with the use of manually-specified attributes.

To further demonstrate the correlation between manually-specified and data-driven attributes, a dissimilarity map is constructed from the training data

Dissimilarity between 100 data-driven attributes (rows) and 34 manually-specified attributes (columns). Colder colors indicate lower values.

The effect of removing a set of human-specified attributes ◦ Some specified attributes (e.g., the human-specified attribute set a = {1, 8, 9, 10, 11}, shown as columns) are more correlated with data-driven attributes.

◦ “Specified attributes” means using only this type of attribute for recognition ◦ “B” indicates the performance before attribute removal ◦ “A” indicates the performance after removing the attributes ◦ “Mixed Attributes” means using both manually-specified and data-driven attributes for recognition

Using manually-specified attributes only ◦ Removing the human-specified attribute set a = {1, 8, 9, 10, 11} reduces the performance from 72% to 64%

Using both manually-specified and data-driven attributes ◦ Removing the human-specified attribute set a = {1, 8, 9, 10, 11} does not cause an obvious performance decrease

Experiments on Olympic Sports Dataset ◦ The Olympic Sports dataset, which contains 16 action classes and about 781 videos, is used both for recognizing novel action classes and for traditional training-based recognition

The performance of recognizing novel testing classes ◦ Five cases ◦ In each case, 4 classes are used for testing and 12 classes for training

THANK YOU !