Attribute Learning for Understanding Unstructured Social Activity


Attribute Learning for Understanding Unstructured Social Activity
Yanwei Fu, Timothy M. Hospedales, Tao Xiang, and Shaogang Gong
School of EECS, Queen Mary University of London, UK
Presented by Amr El-Labban, VGG Reading Group, Dec 5th 2012

Contributions
- Unstructured social activity attribute (USAA) dataset
- Semi-latent attribute space
- Topic model based attribute learning

Objective
- Automatic classification of unstructured group social activity
- Use an attribute-based approach: start with sparse, user-defined attributes, add latent ones, learn jointly

Dataset
- 1500 videos, 8 classes
- 69 visual/audio attributes, manually labelled (weak labelling)
- SIFT, STIP and MFCC features used
- Data available (features, attributes, YouTube IDs)

Classification
- Standard classification: F : X^d → Z
- Attribute based: F = S(L(·)), where L : X^d → Y^p and S : Y^p → Z
- Map raw data to an intermediate, lower-dimensional attribute space, then to classes
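The two-stage mapping above can be sketched with linear stand-ins (all names and sizes here are illustrative, not from the paper):

```python
import numpy as np

# L maps raw features X (d-dim) to an attribute space Y (p-dim);
# S maps attributes to class scores Z. F is their composition.
d, p, n_classes = 10, 4, 3
rng = np.random.default_rng(0)

W_L = rng.normal(size=(d, p))           # stand-in for the learned attribute mapping L
W_S = rng.normal(size=(p, n_classes))   # stand-in for the attribute-to-class mapping S

def L(x):
    return x @ W_L      # X^d -> Y^p

def S(y):
    return y @ W_S      # Y^p -> Z

x = rng.normal(size=(1, d))
z = S(L(x))             # F = S(L(.)): one score per class
print(z.shape)          # (1, 3)
```

The point of the composition is that Y^p is much smaller than X^d, so S has far fewer parameters to learn than a direct classifier on raw features.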

Semi-Latent Attribute Space Space consisting of: User defined attributes Discriminative latent attributes Non-discriminative (background) latent attributes

Topic modelling
P(x_i | d_j) = Σ_k P(x_i | y_k) P(y_k | d_j), i.e. P(x|d) = P(x|y) P(y|d)
- x – low level features ('words')
- y – attributes ('topics')
- d – 'documents'
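The factorisation P(x|d) = Σ_k P(x|y_k) P(y_k|d) is just a matrix product of the two conditionals, which a toy example (illustrative sizes) makes concrete:

```python
import numpy as np

# Toy sizes: 5 words, 3 topics, 2 documents.
rng = np.random.default_rng(1)
P_x_given_y = rng.dirichlet(np.ones(5), size=3).T   # (5 words, 3 topics)
P_y_given_d = rng.dirichlet(np.ones(3), size=2).T   # (3 topics, 2 docs)

# P(x|d) = P(x|y) @ P(y|d): marginalise out the topic index k.
P_x_given_d = P_x_given_y @ P_y_given_d             # (5 words, 2 docs)

# Each column is still a valid distribution over words.
print(P_x_given_d.sum(axis=0))                      # ~[1. 1.]
```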

Latent Dirichlet Allocation
- x – low level features
- y – attributes (user defined and latent)
- θ – attribute distribution
- φ – word distribution
- α, β – Dirichlet parameters

Aside: Dirichlet distribution
- Distribution over multinomial distributions
- Parameterised by α
[Figure: example densities for α = (6,2,2), (3,7,5), (2,3,4), (6,2,6)]

Aside: Dirichlet distribution
Important things to know:
- α_0 = Σ_i α_i
- E[X_i] = α_i / α_0 – the peak is closer to larger α values
- Var(X_i) = α_i (α_0 − α_i) / (α_0² (α_0 + 1)) – large α gives small variance
- α < 1 gives sparser distributions
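The mean and variance formulas above are easy to verify empirically by sampling:

```python
import numpy as np

alpha = np.array([6.0, 2.0, 2.0])
a0 = alpha.sum()

# Closed-form moments from the slide
mean = alpha / a0                                   # E[X_i] = alpha_i / alpha_0
var = alpha * (a0 - alpha) / (a0**2 * (a0 + 1))     # Var(X_i)

# Monte Carlo check
rng = np.random.default_rng(2)
samples = rng.dirichlet(alpha, size=100_000)
print(np.allclose(samples.mean(axis=0), mean, atol=1e-2))  # True
print(np.allclose(samples.var(axis=0), var, atol=1e-2))    # True
```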

Latent Dirichlet Allocation
Generative model:
- For each topic: choose φ ~ Dir(β)
- For each document: choose θ ~ Dir(α)
- For each word in the document: choose y ~ Multinomial(θ), then x ~ Multinomial(φ_y)
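The generative process can be written down directly (a minimal sketch with toy sizes, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(3)
n_topics, vocab_size, doc_len = 4, 20, 50
alpha, beta = 0.5, 0.1

# Per-topic word distributions phi_k ~ Dir(beta)
phi = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)
# Per-document topic distribution theta ~ Dir(alpha)
theta = rng.dirichlet(np.full(n_topics, alpha))

words = []
for _ in range(doc_len):
    y = rng.choice(n_topics, p=theta)       # choose attribute/topic y ~ Mult(theta)
    x = rng.choice(vocab_size, p=phi[y])    # choose word x ~ Mult(phi_y)
    words.append(x)

print(len(words))  # 50
```

In the paper's setting a "document" is a video clip, "words" are quantised low-level features, and "topics" are the (user-defined or latent) attributes.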

Latent Dirichlet Allocation
P(D | α, β) = Π_{k=1}^K P(φ_k | β) Π_{m=1}^M P(θ_m | α) Π_{n=1}^N P(y_{m,n} | θ_m) P(x_{m,n} | φ_{y_{m,n}})

Latent Dirichlet Allocation
- EM to learn Dirichlet parameters α, β
- Approximate inference for the posterior P(θ, y | x, α, β)

SLAS
- User defined part: per-instance prior on α, set to zero when the attribute isn't present in the ground truth
- Latent part, first half – "class conditional": one α per class, all but one constrained to zero
- Latent part, second half – "background": unconstrained
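The structure of the per-instance α prior can be sketched concretely (sizes and the 0/1 prior values here are illustrative assumptions, not taken from the paper):

```python
import numpy as np

# Topic layout: user-defined | class-conditional | background
n_ud, n_classes, n_bg = 5, 3, 2
ud_groundtruth = np.array([1, 0, 1, 1, 0])  # attribute annotation for one instance
instance_class = 1                          # this instance's class label

alpha = np.zeros(n_ud + n_classes + n_bg)
alpha[:n_ud] = ud_groundtruth * 1.0    # zeroed where the attribute is absent
alpha[n_ud + instance_class] = 1.0     # only this instance's class topic is active
alpha[n_ud + n_classes:] = 1.0         # background topics unconstrained

print(alpha)  # [1. 0. 1. 1. 0. 0. 1. 0. 1. 1.]
```

Setting a topic's α entry to zero forbids the model from assigning any mass to that topic for this instance, which is how the user-defined annotations and class labels constrain the latent space during training.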

Classification
- Use the SLAS posterior to map from raw data to attributes
- Use a standard classifier (logistic regression) from attributes to classes
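The second stage is ordinary multi-class logistic regression on the inferred attribute profiles; a toy sketch (synthetic data standing in for SLAS posteriors):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n, p = 200, 8
# Pretend these rows are attribute profiles inferred by the SLAS posterior.
attrs = rng.random((n, p))
# Toy linear class rule so the problem is learnable.
labels = (attrs[:, 0] + attrs[:, 1] > 1.0).astype(int)

clf = LogisticRegression().fit(attrs, labels)
acc = clf.score(attrs, labels)
print(acc > 0.9)  # True
```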

N-shot transfer learning
- Split data into two partitions: source and target
- Learn attribute models on source data
- Use N examples from the target to learn the attribute-class mapping

Zero-shot learning
- Detect novel class
- Manually defined attribute-class "prototype"
- Improve with a self-training algorithm:
  - Infer attributes for the novel data
  - NN matching in the user-defined space against the prototype
  - For each novel class: find the top K matches, train a new prototype in the full attribute space (mean of the top K)
  - NN matching in the full space
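The self-training loop amounts to nearest-neighbour matching in two spaces; a sketch under illustrative names and sizes (Euclidean distance is an assumption here, not stated on the slide):

```python
import numpy as np

def self_train_prototypes(ud_prototypes, ud_attrs, full_attrs, k=5):
    """ud_prototypes: (C, p_ud) manually defined class prototypes (user-defined space).
    ud_attrs / full_attrs: attributes inferred for the novel data, in the
    user-defined and full (user-defined + latent) spaces respectively."""
    new_prototypes = []
    for proto in ud_prototypes:
        # NN matching in the user-defined space against the prototype
        dists = np.linalg.norm(ud_attrs - proto, axis=1)
        top_k = np.argsort(dists)[:k]
        # New prototype = mean of the top-K matches in the full attribute space
        new_prototypes.append(full_attrs[top_k].mean(axis=0))
    return np.stack(new_prototypes)

def classify(full_attrs, prototypes):
    # Final NN matching in the full space
    d = np.linalg.norm(full_attrs[:, None, :] - prototypes[None, :, :], axis=2)
    return d.argmin(axis=1)

rng = np.random.default_rng(5)
full = rng.random((30, 10))                              # 30 novel instances, 10-dim full space
protos = self_train_prototypes(rng.random((2, 6)),       # 2 classes, 6 user-defined attributes
                               full[:, :6], full, k=5)
print(classify(full, protos).shape)  # (30,)
```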

Experiments
Compare three models:
- Direct: KNN or SVM on raw data
- SVM-UD+LR: SVM maps raw data to attributes, LR maps attributes to classes
- SLAS+LR: SLAS maps raw data to attributes, LR learns classes from the user-defined and class-conditional attributes

MASSIVE HACK
"The UD part of the SLAS topic profile is estimating the same thing as the SVM attribute classifiers, however the latter are slightly more reliable due to being discriminatively optimised. As input to LR, we therefore actually use the SVM attribute classifier outputs in conjunction with the latent part of our topic profile."

Results – classification
- SLAS+LR does better as the number of training examples and user-defined attributes decreases
- Copes with 25% of the attribute bits being wrong

Results – classification
- KNN and SVM show vertical bands in the confusion matrix: consistent misclassification

Results – N-shot transfer learning
- Vary the number of user-defined attributes
- SVM+LR cannot cope with zero attributes

Results – Zero-shot transfer learning
Two cases:
- Continuous prototype: mean attribute profile
- Binary prototype: thresholded mean
- Also tested without the background latent attributes (SLAS(NF))

Conclusion
- Augmenting SVM and user-defined attributes with latent ones definitely helps
- Experimental hacks make it hard to say how good the model really is…