Group Norm for Learning Latent Structural SVMs
Daozheng Chen (UMD, College Park), Dhruv Batra (TTI Chicago), Bill Freeman (MIT), Micah K. Johnson (GelSight, Inc.)


Overview
- Data with complete annotation is rarely available.
- Latent variable models capture the interaction between observed data and latent variables.
- Parameter estimation involves a difficult non-convex optimization problem.
Our goal: estimate the model parameters and learn the complexity of the latent variable space.
Our approach: an ℓ1-ℓ2 group norm regularizer for estimating the parameters of a latent-variable model.
[Figure: fully trained person models on the PASCAL VOC 2007 data (Felzenszwalb et al. [8]); each row is a component of the model, showing root filters, part filters, and part displacements.]

Latent Structural SVM
Notation: features x, labels y, latent variables h, joint feature vector Φ(x, y, h), weight vector w.
Prediction rule: (ŷ, ĥ) = argmax over (y, h) of wᵀΦ(x, y, h).
Learning objective:
  min_w  R(w) + C Σ_i [ max over (ŷ, ĥ) of { wᵀΦ(x_i, ŷ, ĥ) + Δ(y_i, ŷ, ĥ) } − max over h of wᵀΦ(x_i, y_i, h) ],
where Δ is the task loss and R(w) is the regularizer; the standard latent structural SVM uses (1/2)‖w‖², which is replaced here by the group norm.
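
As a concrete illustration of the prediction rule, here is a minimal Python sketch that scores every (label, latent) pair and returns the argmax. The names joint_feature, labels, and latent_vals are hypothetical placeholders for the task-specific pieces, not the authors' code.

```python
import numpy as np

def predict(w, x, labels, latent_vals, joint_feature):
    """Latent SSVM prediction: argmax over (y, h) of w . Phi(x, y, h).

    `labels`, `latent_vals`, and `joint_feature` are placeholders for the
    task-specific label space, latent space, and joint feature map Phi.
    """
    best_score, best_pair = -np.inf, None
    for y in labels:                 # enumerate candidate labels
        for h in latent_vals:        # enumerate latent assignments (e.g. rotation angles)
            score = w @ joint_feature(x, y, h)
            if score > best_score:
                best_score, best_pair = score, (y, h)
    return best_pair
```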

Key Contribution: Group Norm Regularization
Digit recognition setting: images x with rotation as the latent variable; each candidate rotation angle owns one group (block) of the feature and weight vectors.
Regularizer: the ℓ1-ℓ2 group norm R(w) = Σ_g ‖w_g‖₂, the sum of the ℓ2 norms of the per-group weight blocks w_g.
- At the group level, the norm behaves like an ℓ1 norm and induces group sparsity.
- Within each group, the norm behaves like an ℓ2 norm and does not promote sparsity.
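
A small sketch of the ℓ1-ℓ2 group norm itself, assuming the weight vector is partitioned into one block per latent value; the partition and toy numbers below are illustrative only.

```python
import numpy as np

def group_norm(w, groups):
    """l1-l2 group norm: sum over groups of the l2 norm of each weight block.

    `groups` is a list of index arrays, one per latent value (e.g. one per
    rotation angle in the digit experiment); this partition is an assumption
    made for illustration.
    """
    return sum(np.linalg.norm(w[g]) for g in groups)

# Toy example: 3 latent values, 4 weights each; the zeroed group contributes nothing.
w = np.concatenate([np.zeros(4), np.ones(4), 2 * np.ones(4)])
groups = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
print(group_norm(w, groups))  # 0 + 2 + 4 = 6
```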

Optimization: Alternating Coordinate and Subgradient Descent
Minimize an upper bound of the learning objective by alternating two steps:
- With w fixed, minimize with respect to the latent variables (impute a latent value for each training example).
- With the latent variables fixed, minimize with respect to w using the subgradient method: the problem is convex once the latent variables are fixed, and only one subgradient step is performed per outer iteration.
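
The following is a self-contained toy sketch of this alternating scheme, not the authors' implementation: the data, feature map, 0/1 task loss, and step size are all assumptions chosen only to keep the example runnable.

```python
import numpy as np

# Toy setup: binary labels, a small discrete latent space, one weight group per latent value.
rng = np.random.default_rng(0)
latent_vals = [0, 1, 2]          # e.g. a few candidate rotation angles
dim = 4                          # per-group feature dimension
labels = [-1, +1]

def joint_feature(x, y, h):
    """Phi(x, y, h): place y*x in the weight block belonging to latent value h."""
    phi = np.zeros(dim * len(latent_vals))
    phi[h * dim:(h + 1) * dim] = y * x
    return phi

def group_norm_subgrad(w):
    """Subgradient of the l1-l2 group norm sum_g ||w_g||_2."""
    g = np.zeros_like(w)
    for h in range(len(latent_vals)):
        blk = slice(h * dim, (h + 1) * dim)
        nrm = np.linalg.norm(w[blk])
        if nrm > 0:
            g[blk] = w[blk] / nrm
    return g

def loss_augmented_argmax(w, x, y_true):
    """max over (y, h) of w . Phi(x, y, h) + 0/1 loss."""
    return max((w @ joint_feature(x, y, h) + (y != y_true), y, h)
               for y in labels for h in latent_vals)

# Toy data: the label is correlated with the first feature.
X = rng.normal(size=(20, dim))
Y = np.where(X[:, 0] > 0, 1, -1)

w = np.zeros(dim * len(latent_vals))
C, eta = 1.0, 0.05
for it in range(50):
    # Step 1: with w fixed, impute the best latent value for each training example.
    H = [max(latent_vals, key=lambda h: w @ joint_feature(x, y, h)) for x, y in zip(X, Y)]
    # Step 2: with the latents fixed, take one subgradient step on the convex upper bound.
    sub = group_norm_subgrad(w)
    for x, y, h_star in zip(X, Y, H):
        _, y_hat, h_hat = loss_augmented_argmax(w, x, y)
        sub += C * (joint_feature(x, y_hat, h_hat) - joint_feature(x, y, h_star))
    w -= eta * sub
```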

Experiment: Digit Recognition
- MNIST data, closely following the setup of Kumar et al. [10].
- Binary classification on four difficult digit pairs: (1,7), (2,7), (3,8), (8,9).
- Digit images are rotated by angles uniformly distributed from -60° to 60°; the latent rotation is discretized into the candidate angles -60°, -48°, -36°, -24°, -12°, 0°, 12°, 24°, 36°, 48°, 60°.
- PCA reduces each image to a 10-dimensional feature vector.
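
A sketch of this data preparation, assuming standard SciPy/scikit-learn tools; `images` stands in for the real MNIST digits of one pair, and the exact preprocessing in the paper may differ.

```python
import numpy as np
from scipy.ndimage import rotate
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
images = rng.random((100, 28, 28))                  # placeholder for real 28x28 MNIST digits

angles = rng.uniform(-60, 60, size=len(images))     # unknown true rotation per image
rotated = np.stack([rotate(im, a, reshape=False) for im, a in zip(images, angles)])

pca = PCA(n_components=10)                          # 10-dimensional PCA features
features = pca.fit_transform(rotated.reshape(len(rotated), -1))

latent_grid = np.arange(-60, 61, 12)                # candidate angles treated as latent values
```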

Experiment: Results
[Figure: magnitude of each angle's weight group across the candidate angles (-60°, -48°, ...); only a few angles carry the highest magnitude.]
- The group norm selects only a few angles per digit.
- Random sampling needs a higher number of angles per digit to give similar accuracy.
- Running time increases linearly as the budget increases.