Semantic Kernel Forests from Multiple Taxonomies Sung Ju Hwang (University of Texas at Austin), Fei Sha (University of Southern California), and Kristen Grauman (University of Texas at Austin).


Semantic Kernel Forests from Multiple Taxonomies Sung Ju Hwang (University of Texas at Austin), Fei Sha (University of Southern California), and Kristen Grauman (University of Texas at Austin)

Limitation of status quo recognition

Until recently, most categorization methods relied solely on category labels, treating each instance as an isolated entity.

[Figure: semantic space vs. visual world, with Cat, Dog, Wolf, and Zebra treated as isolated points]

Limitation of status quo recognition

However, semantic entities exist in relation to others. Larger and finer-grained datasets yield more meaningful relations. How can we exploit such relations for improved categorization?

[Figure: semantic space with relations: Cat, Dog, Wolf, and Zebra related through Canine, Pet, and Wild groupings; similar vs. dissimilar pairs in the visual world]

[Fergus10] Semantic Label Sharing for Learning with Many Categories, R. Fergus, H. Bernal, Y. Weiss, A. Torralba, ECCV 2010
[Zhao11] Large Scale Category Structure Aware Image Classification, B. Zhao, L. Fei-Fei, E. P. Xing, NIPS 2011

Motivation

Our focus: semantic taxonomies. But there are potentially two snags:
1) Only partial alignment between the taxonomy and the visual distribution.
2) No single 'optimal' taxonomy.

What information should we exploit from multiple taxonomies, and how do we leverage it?

[Figure: three taxonomies over Dalmatian, Wolf, Siamese Cat, and Leopard: Biological (Canine vs. Feline), Appearance (Spotted vs. Pointy Corner texture), Habitat (Domestic vs. Wild tameness)]

Idea

Exploit multiple semantic taxonomies for visual feature learning.
- Taxonomies provide human merge/split criteria.
- Each taxonomy provides complementary information.

How do we then
1. learn granularity- and view-specific features on each taxonomy, and
2. combine the learned features across taxonomies for object recognition?

[Figure: the Biological, Appearance, and Habitat taxonomies with the visual cues each isolates: dog-like shape vs. cat face; spots vs. pointy corners; indoor setting/person vs. woods]

Overview

Goal: learn and combine features across multiple taxonomies.
1. Learn view- and granularity-specific features on each taxonomy.
2. Optimally combine the learned features in a categorization model.

[Figure: pipeline from per-taxonomy features (dog-like shape, cat face, spots, pointy corners, indoor setting/person, woods) into a combined categorization model]

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011

Tree of Metrics

How do we learn granularity- and view-specific features?

Intuition: features useful for discriminating the superclasses are less useful for subcategory discrimination.
- Exploit the parent-child relationship to isolate features at each node.

[Figure: taxonomy with Carnivore at the root, Canine (Dalmatian, Wolf) and Feline (Domestic cat: Siamese cat, Persian cat; Big cat) below]

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011

Tree of Metrics

Given a taxonomy, we learn a metric for each internal (superclass) node n to discriminate between its subclasses. We approach the feature learning problem as hierarchical metric learning with disjoint regularization.

[Figure: margin constraint among points x_i, x_j, x_l under the Canine/Feline node; learned metric diagonals visualized, where lighter elements have higher value]

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011
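The margin illustration can be made concrete with a small sketch: under a node's metric M, a point x_i should end up closer to a same-subclass point x_j than to a sibling-subclass point x_l by a margin. This is a standard large-margin triplet loss written for illustration; the exact loss in [Hwang11] may differ, and all names here are illustrative.

```python
import numpy as np

def mahalanobis_sq(M, x, y):
    """Squared Mahalanobis distance d_M(x, y)^2 = (x - y)^T M (x - y)."""
    d = x - y
    return float(d @ M @ d)

def triplet_hinge_loss(M, triplets, margin=1.0):
    """Large-margin loss over triplets (x_i, x_j, x_l), where x_j shares
    x_i's subclass under this node and x_l belongs to a sibling subclass.
    Penalize whenever x_l is not at least `margin` farther away than x_j."""
    loss = 0.0
    for x_i, x_j, x_l in triplets:
        loss += max(0.0, margin
                    + mahalanobis_sq(M, x_i, x_j)
                    - mahalanobis_sq(M, x_i, x_l))
    return loss
```

Minimizing such a loss per internal node, subject to the regularizers described next, yields one metric per superclass node of the taxonomy.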


Tree of Metrics

Further, we learn all metrics simultaneously with two regularizers:
- a sparsity-based regularizer to identify informative features, and
- a disjoint regularizer to learn features exclusive to each granularity.

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011

Regularization Terms to Learn Compact, Discriminative Metrics

How can we select a few informative features at each node?

Sparsity regularization: minimize the sum of the diagonal entries of each metric. This induces competition between features within a single metric.

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011

Regularization Terms to Learn Compact, Discriminative Metrics

How can we regularize each metric to use features disjoint from its ancestors'?

Disjoint regularization: enforce that two metrics do not both have a large value for the same feature at the same time. This induces competition between ancestors and descendants.

Both regularizers are convex.

[Figure: ancestor vs. descendant metric diagonals competing for the same features]

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011
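As a rough numpy sketch (assuming diagonal metrics for readability): the sparsity term is the sum of traces, and the disjoint term below penalizes a feature that is large in both an ancestor's and a descendant's metric. The product form is only an illustration of that intuition; the actual disjoint regularizer in [Hwang11] is convex, which this product is not.

```python
import numpy as np

def sparsity_reg(metrics):
    """Sparsity term: sum of diagonal entries across all node metrics.
    Drives per-node competition, so only a few features stay large."""
    return sum(np.trace(M) for M in metrics.values())

def disjoint_reg(metrics, ancestors):
    """Illustrative disjoint term: for every (ancestor, descendant) pair,
    penalize features that are large in BOTH metrics at once."""
    total = 0.0
    for node, M in metrics.items():
        for anc in ancestors[node]:
            total += float(np.sum(np.diag(metrics[anc]) * np.diag(M)))
    return total
```

With these terms added to the per-node discriminative losses, features used high in the tree are discouraged from reappearing in the metrics of descendant nodes.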

Overview

Goal: learn and combine features across multiple taxonomies.
1. Learn view- and granularity-specific features on each taxonomy.
2. Optimally combine the learned features in a categorization model.

[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forests from Multiple Taxonomies, NIPS 2012

Semantic Kernel Forest

From multiple Trees of Metrics, we obtain a semantic kernel forest: a set of non-linear, view- and granularity-specific feature spaces. For each node, we compute an RBF kernel on the distance under that node's learned metric.

[Figure: the Biological, Appearance, and Habitat taxonomies, with each internal node contributing one kernel]

[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forests from Multiple Taxonomies, NIPS 2012
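The construction on this slide, an RBF kernel on the learned Mahalanobis distance, can be sketched directly in numpy; gamma is a bandwidth hyperparameter, and one such kernel is computed per internal node of each taxonomy.

```python
import numpy as np

def semantic_rbf_kernel(X, M, gamma=1.0):
    """K[a, b] = exp(-gamma * (x_a - x_b)^T M (x_a - x_b)).
    One kernel per taxonomy node yields the semantic kernel forest."""
    # Pairwise squared Mahalanobis distances under the learned metric M.
    diffs = X[:, None, :] - X[None, :, :]            # shape (n, n, d)
    d2 = np.einsum('abi,ij,abj->ab', diffs, M, diffs)
    return np.exp(-gamma * d2)
```

With M equal to the identity this reduces to the ordinary RBF kernel; the learned, node-specific M is what makes each kernel view- and granularity-specific.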

Semantic Kernel Forest

How do we combine the learned kernel forest for optimal discrimination? We obtain a class-specific kernel by linearly combining the kernels on the tree paths via multiple kernel learning (MKL). Only a small fraction of the kernels is relevant to each class: O(T log N) for T taxonomies over N classes.

[Figure: root-to-leaf kernel paths in the Biological, Appearance, and Habitat taxonomies]
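The path restriction can be sketched as follows; the tree encoding (a child-to-parent map) and all names are assumptions of this example. For each class, only the kernels of internal nodes on its root-to-leaf path in each taxonomy enter the MKL combination, so with roughly balanced trees the kernel count stays O(T log N).

```python
def path_kernels(leaf, parent, node_kernels):
    """Collect kernels of the internal nodes on the path from `leaf` up to
    the root of one taxonomy; `parent` maps each node to its parent."""
    kernels = []
    node = parent.get(leaf)          # the leaf class itself has no kernel
    while node is not None:
        kernels.append(node_kernels[node])
        node = parent.get(node)
    return kernels

def combined_kernel(kernels, betas):
    """Class-specific kernel: the linear combination that MKL learns."""
    return sum(b * K for b, K in zip(betas, kernels))
```

Running `path_kernels` over every taxonomy and concatenating the results gives the small, class-specific kernel set whose weights `betas` are then learned by MKL.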

Proposed Sparse Hierarchical Regularization

Are all kernels equal? The usual L1 regularization selects a few useful kernels, but multiple taxonomies may provide some redundant kernels, leading to an interleaved selection of kernels across taxonomies.

[Figure: redundant kernels across the Biological (Canine) and Habitat (Tameness) taxonomies]

Proposed Sparse Hierarchical Regularization

Multiple taxonomies provide redundant kernels.

Hierarchical regularization: the weight of a node must be larger than its children's.
- Implicitly enforces hierarchical structure among the kernels.
- Higher-level kernels discriminate between more categories.

[Figure: parent weight exceeding child weights in the Biological and Habitat taxonomies]
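A soft version of "a node's weight must be larger than its children's" can be written as a hinge penalty over parent-child edges. This is one illustrative encoding of the constraint stated on the slide, not necessarily the exact regularization term used in [Hwang12]:

```python
def hierarchical_reg(beta, children):
    """Penalize any child kernel weight that exceeds its parent's weight.
    beta: node -> MKL weight; children: node -> list of child nodes."""
    return sum(max(0.0, beta[c] - beta[p])
               for p, kids in children.items() for c in kids)
```

Because higher-level kernels separate more categories at once, biasing weight toward them this way curbs the redundant, low-level kernels that multiple taxonomies tend to supply.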

Optimization for Semantic Kernel Forest

We minimize the sum of the MKL objective and the regularization terms (sparsity regularization + hierarchical regularization). The objective is nonsmooth due to the hierarchical regularization term, so we use projected subgradient descent to optimize it.

[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forests from Multiple Taxonomies, NIPS 2012
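A single projected-subgradient step is easy to sketch: move along any subgradient of the nonsmooth objective, then project the kernel weights back onto the feasible set, here taken to be the nonnegative orthant since MKL combination weights must stay nonnegative. Step size and names are illustrative.

```python
import numpy as np

def projected_subgradient_step(beta, subgrad, lr):
    """One iteration: subgradient move, then projection onto beta >= 0."""
    return np.maximum(0.0, beta - lr * subgrad)
```

Iterating this step with a diminishing step size is the standard recipe for nonsmooth convex objectives of this kind.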

Datasets

AWA-10
- 6,180 images
- 10 animal classes
- Fine-grained
- Taxonomies: (a) WordNet (b) Appearance (c) Behavior (d) Habitat

ImageNet-20
- …,957 images
- 20 non-animal classes
- Coarser-grained
- Taxonomies: (a) WordNet (b) Visual (c) Attributes

Taxonomies constructed on different attribute groups.

Multiclass Classification Results

We compare to three baselines:
- Raw feature kernel: RBF kernel computed on the original image features.
- Raw feature kernel + MKL: MKL over RBF kernels with different bandwidths.
- Perturbed semantic kernel tree: semantic kernel forest on a randomly permuted taxonomy.

[Table: multiclass accuracy (%, mean ± std) of the three baselines on AWA-4, AWA-10, and ImageNet-20]

Multiclass Classification Results

Semantic kernel tree (ToM) > perturbed kernel tree: the semantic taxonomy gives more meaningful groupings/splits for object categorization.
- Semantic kernel tree + Avg: averaged semantic kernel tree on a single taxonomy.
- Semantic kernel tree + MKL: MKL on a single taxonomy with only the sparsity regularization.
- Semantic kernel tree + MKL-H: with both the sparsity and the hierarchical regularization.

[Table: multiclass accuracy (%, mean ± std) on AWA-4, AWA-10, and ImageNet-20 for the baselines and the single-taxonomy semantic kernel tree variants]

Multiclass Classification Results

Multiple taxonomies > a single taxonomy: each taxonomy provides complementary information.
- Semantic kernel forest + MKL: MKL with kernels learned on multiple taxonomies, with only the sparsity regularization.
- Semantic kernel forest + MKL-H: with both the sparsity and the hierarchical regularization.

[Table: multiclass accuracy (%, mean ± std) on AWA-4, AWA-10, and ImageNet-20, adding the multi-taxonomy semantic kernel forest variants]

Multiclass Classification Results

Hierarchical regularizer > standard L1 regularization.
- The regularizer's effect is minimal on the single-taxonomy semantic kernel tree, which lacks redundancy.
- It pays to consider the structure of the feature spaces.

[Table: multiclass accuracy (%, mean ± std) on AWA-4, AWA-10, and ImageNet-20 for all methods]

Confusion matrices on 4 animal classes

Blue: low confusion; red: high confusion. Each taxonomy is suboptimal on its own, but provides complementary information that can be optimally leveraged with MKL.

[Figure: per-taxonomy confusion matrices over Dalmatian, Wolf, Siamese Cat, and Leopard for the Biological (Canine/Feline), Appearance (Spotted/Pointy Ear), and Habitat (Domestic/Wild) taxonomies]

Effect of hierarchical regularization

Sparsity regularization only: 34.33. Adding the hierarchical regularizer improves accuracy: it avoids overfitting through the implicit structure enforced among the kernels.

[Figure: kernel weights selected at lower vs. higher levels of the WordNet, Appearance, Behavior, and Habitat taxonomies, e.g. procyonid/feline, even-toed/aquatic, carnivore/placental, cat-rat/hairless, raccoon-rat/land-aquatic, predator/prey, jungle/non-jungle, aquatic/land]

Summary

Key message: semantic taxonomies for visual feature learning.
- Tree of Metrics: exploits disjoint sparsity between parent and child classes in a taxonomy.
- Semantic Kernel Forest: leverages complementary information from multiple semantic taxonomies.
- Novel regularizers that exploit category relations:
  - a disjoint regularizer that exploits the parent-child relationship to learn disjoint features;
  - a hierarchical regularizer that favors upper-level kernels.

Intuition: features compete between parent and child, and complement each other across different semantic views.
Learning methods: disjoint and hierarchical regularizers for competing features; MKL with the hierarchical regularizer for complementary features.

[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011
[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forests from Multiple Taxonomies, NIPS 2012


Per-class results

A single taxonomy often improves performance on some classes at the expense of others; each individual taxonomy is suboptimal.
- Habitat: better for h. whale, worse for panda.
- WordNet: better for panda, worse for h. whale.
- All: better for both.

The semantic kernel forest takes the best of both through the learned combination.
