CVPR 2014 Orientational Pyramid Matching for Recognizing Indoor Scenes

Slides:

Advertisements

Similar presentations

Recognizing Human Actions by Attributes CVPR2011 Jingen Liu, Benjamin Kuipers, Silvio Savarese Dept. of Electrical Engineering and Computer Science University.

Advertisements

Aggregating local image descriptors into compact codes

Three things everyone should know to improve object retrieval

Presented by Xinyu Chang

Foreground Focus: Finding Meaningful Features in Unlabeled Images Yong Jae Lee and Kristen Grauman University of Texas at Austin.

Multi-layer Orthogonal Codebook for Image Classification Presented by Xia Li.

CS395: Visual Recognition Spatial Pyramid Matching Heath Vinicombe The University of Texas at Austin 21 st September 2012.

1 Part 1: Classical Image Classification Methods Kai Yu Dept. of Media Analytics NEC Laboratories America Andrew Ng Computer Science Dept. Stanford University.

Intelligent Systems Lab. Recognizing Human actions from Still Images with Latent Poses Authors: Weilong Yang, Yang Wang, and Greg Mori Simon Fraser University,

Ziming Zhang *, Ze-Nian Li, Mark Drew School of Computing Science, Simon Fraser University, Vancouver, B.C., Canada {zza27, li, Learning.

Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,

Image alignment Image from

Bag of Features Approach: recent work, using geometric information.

Robust and large-scale alignment Image from

Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.

1 Image Recognition - I. Global appearance patterns Slides by K. Grauman, B. Leibe.

Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.

Spatial Pyramid Pooling in Deep Convolutional

Student: Kylie Gorman Mentor: Yang Zhang COLOR-ATTRIBUTES- RELATED IMAGE RETRIEVAL.

Multiple Object Class Detection with a Generative Model K. Mikolajczyk, B. Leibe and B. Schiele Carolina Galleguillos.

Global and Efficient Self-Similarity for Object Classification and Detection CVPR 2010 Thomas Deselaers and Vittorio Ferrari.

Review: Intro to recognition Recognition tasks Machine learning approach: training, testing, generalization Example classifiers Nearest neighbor Linear.

Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,

Exercise Session 10 – Image Categorization

Real-time Action Recognition by Spatiotemporal Semantic and Structural Forest Tsz-Ho Yu, Tae-Kyun Kim and Roberto Cipolla Machine Intelligence Laboratory,

Multiclass object recognition

Computer vision.

Watch, Listen and Learn Sonal Gupta, Joohyun Kim, Kristen Grauman and Raymond Mooney -Pratiksha Shah.

Marcin Marszałek, Ivan Laptev, Cordelia Schmid Computer Vision and Pattern Recognition, CVPR Actions in Context.

Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,

A Statistical Approach to Speed Up Ranking/Re-Ranking Hong-Ming Chen Advisor: Professor Shih-Fu Chang.

Svetlana Lazebnik, Cordelia Schmid, Jean Ponce

Yao, B., and Fei-fei, L. IEEE Transactions on PAMI(2012)

Classifying Images with Visual/Textual Cues By Steven Kappes and Yan Cao.

Efficient Subwindow Search: A Branch and Bound Framework for Object Localization ‘PAMI09 Beyond Sliding Windows: Object Localization by Efficient Subwindow.

UNBIASED LOOK AT DATASET BIAS Antonio Torralba Massachusetts Institute of Technology Alexei A. Efros Carnegie Mellon University CVPR 2011.

Locality-constrained Linear Coding for Image Classification

P ROBING THE L OCAL -F EATURE S PACE OF I NTEREST P OINTS Wei-Ting Lee, Hwann-Tzong Chen Department of Computer Science National Tsing Hua University,

Optimal Dimensionality of Metric Space for kNN Classification Wei Zhang, Xiangyang Xue, Zichen Sun Yuefei Guo, and Hong Lu Dept. of Computer Science &

Hierarchical Matching with Side Information for Image Classification

Recognition Using Visual Phrases

Classifying Covert Photographs CVPR 2012 POSTER. Outline  Introduction  Combine Image Features and Attributes  Experiment  Conclusion.

Goggle Gist on the Google Phone A Content-based image retrieval system for the Google phone Manu Viswanathan Chin-Kai Chang Ji Hyun Moon.

1.Learn appearance based models for concepts 2.Compute posterior probabilities or Semantic Multinomial (SMN) under appearance models. -But, suffers from.

SUN Database: Large-scale Scene Recognition from Abbey to Zoo Jianxiong Xiao *James Haysy Krista A. Ehinger Aude Oliva Antonio Torralba Massachusetts Institute.

776 Computer Vision Jan-Michael Frahm Spring 2012.

NICTA SML Seminar, May 26, 2011 Modeling spatial layout for image classification Jakob Verbeek 1 Joint work with Josip Krapac 1 & Frédéric Jurie 2 1: LEAR.

Another Example: Circle Detection

Compact Bilinear Pooling

Lecture 07 13/12/2011 Shai Avidan הבהרה: החומר המחייב הוא החומר הנלמד בכיתה ולא זה המופיע / לא מופיע במצגת.

Bag-of-Visual-Words Based Feature Extraction

An Additive Latent Feature Model

Learning Mid-Level Features For Recognition

Nonparametric Semantic Segmentation

Part-Based Room Categorization for Household Service Robots

Paper Presentation: Shape and Matching

ICCV Hierarchical Part Matching for Fine-Grained Image Classification

Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT

By Suren Manvelyan, Crocodile (nile crocodile?) By Suren Manvelyan,

Cheng-Ming Huang, Wen-Hung Liao Department of Computer Science

ICMR Image Classification and Retrieval are ONE (Online NN Estimation)

Speaker: Lingxi Xie Authors: Lingxi Xie, Qi Tian, Bo Zhang

IEEE ICIP Feature Normalization for Part-Based Image Classification

CS 1674: Intro to Computer Vision Scene Recognition

Data-driven Depth Inference from a Single Still Image

Fine-Grained Visual Categorization

KFC: Keypoints, Features and Correspondences

Objects as Attributes for Scene Classification

Outline Background Motivation Proposed Model Experimental Results

A Graph-Matching Kernel for Object Categorization

Presentation transcript:

CVPR 2014 Orientational Pyramid Matching for Recognizing Indoor Scenes Speaker: Lingxi Xie Authors: Lingxi Xie, Jingdong Wang, Baining Guo, Bo Zhang, Qi Tian State Key Laboratory of Intelligent Technology and Systems Department of Computer Science and Technology Tsinghua University http://www.tsinghua.edu.cn

Outline Introduction The Bag-of-Feature Model Orientational Pyramid Matching Experimental Results Analysis Conclusions 11/21/2018 CVPR 2014 - Presentation

Outline Introduction The Bag-of-Feature Model Orientational Pyramid Matching Experimental Results Analysis Conclusions 11/21/2018 CVPR 2014 - Presentation

Scene Recognition A basic task towards image understanding The Scale is Getting Larger! UIUC Sport-8 Dataset: 8 Categories, 1573 Images Scene-15 Dataset: 15 Categories, 4485 Images MIT Indoor-67 Dataset: 67 Categories, 15620 Images SUN-397 Dataset: 397 Categories, 100K+ Images Different from Object Recognition Prefer Global Features to Local Features 11/21/2018 CVPR 2014 - Presentation

Outline Introduction The Bag-of-Feature Model Orientational Pyramid Matching Experimental Results Analysis Conclusions 11/21/2018 CVPR 2014 - Presentation

Image-level Vector Compact Feature Codes Visual Vocabulary Spatial Pooling: Sum Pooling/Max Pooling, Spatial Pyramid Matching [Lazebnik, CVPR06] Geometric Phrase Pooling [Xie, ACMMM12] Compact Feature Codes Hard/Soft/Sparse Coding methods: Vector Quantization ScSPM encoding [Yang, CVPR09] LLC encoding [Wang, CVPR10] Visual Vocabulary Clustering methods: K-Means Hierarchical K-Means [Nister, CVPR06] Approximate K-Means [Philbin, CVPR07] Image Descriptors Gradient-based local descriptors: SIFT [Lowe, IJCV04] HOG[Dalal, CVPR05] Raw Image 11/21/2018 CVPR 2014 - Presentation

Image-level Vector Compact Feature Codes Visual Vocabulary Spatial Pooling: Sum Pooling/Max Pooling, Spatial Pyramid Matching [Lazebnik, CVPR06] Geometric Phrase Pooling [Xie, ACMMM12] Compact Feature Codes Hard/Soft/Sparse Coding methods: Vector Quantization ScSPM encoding [Yang, CVPR09] LLC encoding [Wang, CVPR10] Visual Vocabulary Clustering methods: K-Means Hierarchical K-Means [Nister, CVPR06] Approximate K-Means [Philbin, CVPR07] Image Descriptors Gradient-based local descriptors: SIFT [Lowe, IJCV04] HOG[Dalal, CVPR05] Raw Image 11/21/2018 CVPR 2014 - Presentation

Spatial Pyramid Matching (SPM) = Part 1 [Lazebnik, CVPR06] = Part 2 = Part 3 = Part 4 = Part 5 11/21/2018 CVPR 2014 - Presentation

Outline Introduction The Bag-of-Feature Model Orientational Pyramid Matching Experimental Results Analysis Conclusions 11/21/2018 CVPR 2014 - Presentation

What’s the Main Difference between the Confusing Pairs? Examples Motivation classroom meeting room What’s the Main Difference between the Confusing Pairs? Examples The Orientation of Chairs! inside bus inside subway 11/21/2018 CVPR 2014 - Presentation

Spatial Distribution of Chairs are Very Confusing! Motivation 0.0 upper-left upper-right lower-left lower-right 0.2 0.4 0.6 0.8 1.0 spatial histogram classroom meeting room 0.0 upper-left upper-right lower-left lower-right 0.2 0.4 0.6 0.8 1.0 spatial histogram inside bus inside subway Spatial Distribution of Chairs are Very Confusing! 11/21/2018 CVPR 2014 - Presentation

Orientational Distribution of Chairs are More Dipersive! Motivation 0.0 front left back right 0.2 0.4 0.6 0.8 1.0 orient. histogram classroom meeting room 0.0 front left back right 0.2 0.4 0.6 0.8 1.0 orient. histogram inside bus inside subway Orientational Distribution of Chairs are More Dipersive! 11/21/2018 CVPR 2014 - Presentation

Spatial Pooling Revisited space y x 11/21/2018 CVPR 2014 - Presentation

Orientational Pooling space ϕ θ 11/21/2018 CVPR 2014 - Presentation

Orientational Pooling space ϕ θ 11/21/2018 CVPR 2014 - Presentation

Spatial vs. Orientational space orientational space y ϕ x θ 11/21/2018 CVPR 2014 - Presentation

How to Estimate 3D Orientation? Still an Open Problem Very Challenging, Especially for Small Patches Our Solution: Data-driven Approach Illumination might Help! SIFT Feature: Capturing the Gradients? Very Rough, Far from Perfect Possible Solutions Geometric: Vanishing Points, Line Detection, ... 3D Modeling: Kinect, ... 11/21/2018 CVPR 2014 - Presentation

Training Image Annotation Conti-nued Planes Plane Orien-tation Land- mark Points d = (x, y, z) 11/21/2018 CVPR 2014 - Presentation

Training Patch Collection Non-Planar Patch Planar Patch 11/21/2018 CVPR 2014 - Presentation

Orientation Assignment Planar Patches Non-Planar Patches New Patch Planar Patch! Find K (5) NN Patches 11/21/2018 CVPR 2014 - Presentation

Orientation Assignment Planar Patches Non-Planar Patches New Patch Non-Planar Patch! Find K (5) NN Patches 11/21/2018 CVPR 2014 - Presentation

Statistics 100,000 Patches are Collected In the Testing Process 50000 Planar 50000 Non-Planar In the Testing Process K is set as 101 About 50% Patches are Classified as Non-Planar These Patches are Simply Ignored (not used in Pooling) 11/21/2018 CVPR 2014 - Presentation

Outline Introduction The Bag-of-Feature Model Orientational Pyramid Matching Experimental Results Analysis Conclusions 11/21/2018 CVPR 2014 - Presentation

Datasets MIT Indoor-67 Dataset SUN-397 Dataset Currently Largest Indoor Scene Dataset 67 Categories, 15620 Images 80 trainings, 20 testings, per Category SUN-397 Dataset A Large-scale Scene Recognition Task 397 Categories, More than 100K Images 50 trainings, 50 testings, per Category 11/21/2018 CVPR 2014 - Presentation

MIT Indoor-67 Comparison Algorithm Accuracy Kobayashi et.al, CVPR13 58.91 Juneja et.al, CVPR13 63.10 Ours (SPM) 61.22 Ours (OPM) 51.45 Ours (SPM+OPM) 63.48 11/21/2018 CVPR 2014 - Presentation

SUN-397 Comparison Algorithm Accuracy Xiao et.al, CVPR10 38.0 Sanchez et.al, IJCV13 43.2 Ours (SPM) 43.58 Ours (OPM) 34.61 Ours (SPM+OPM) 45.91 11/21/2018 CVPR 2014 - Presentation

Caltech101 Comparison Algorithm Accuracy Chatfield et.al, BMVC11 77.78 Jia et.al, CVPR12 75.3 Ours (SPM) 80.73 Ours (OPM) 65.59 Ours (SPM+OPM) 81.75 11/21/2018 CVPR 2014 - Presentation

Outline Introduction The Bag-of-Feature Model Orientational Pyramid Matching Experimental Results Analysis Conclusions 11/21/2018 CVPR 2014 - Presentation

Further Analysis On the MIT Indoor-67 Dataset Observing the Confusion Matrix Two Comparisons Confusing Pairs Better Discriminated by SPM or OPM A Confusing Group Better Classified after Combining SPM and OPM 11/21/2018 CVPR 2014 - Presentation

Confusing Pairs Category Pairs better Discriminated by SPM game room vs. garage (+4.78%) restaurant kitchen vs. studio music (+4.64%) library vs. clothing store (+4.62%) Category Pairs better Discriminated by OPM bookstore vs. library (+6.69%) auditorium vs. concert hall (+5.35%) computer room vs. office (+4.40%) 11/21/2018 CVPR 2014 - Presentation

are More Discriminative! Pairs better with SPM game room garage tables OPM SPM +4.78% restr. kitchen studio music cabinets Spatial Distribution are More Discriminative! OPM SPM +4.64% library clothin. store shelves OPM SPM +4.62% 11/21/2018 CVPR 2014 - Presentation

Orientational Distribution are More Discriminative! Pairs better with OPM bookstore library shelves OPM SPM +6.69% auditorium concerthall chairs Orientational Distribution are More Discriminative! OPM SPM +5.35% comptr.room office computers OPM SPM +4.40% 11/21/2018 CVPR 2014 - Presentation

Confusing Group A Category Group with Confusing Concepts bedroom, children room, computer room, hospital room, living room, meeting room, office, waiting room SPM can not Discriminate Very Well SPM+OPM Works Much Better! 11/21/2018 CVPR 2014 - Presentation

Decreased Off-diagonal Element Increased On-diagonal Element Confusing Group bedroom classroom compt. room hosptl. room living room meetn. room office waitn. room bedroom classroom compt. room hosptl. room living room meetn. room office waitn. room bedroom 44.6 1.1 2.7 14.6 23.4 2.8 4.8 6.0 bedroom 52.7 0.6 1.2 11.7 22.6 1.8 4.1 5.3 classroom 0.9 71.8 8.5 2.1 1.8 7.9 3.6 3.3 classroom 0.9 80.9 5.8 0.6 3.0 4.8 0.6 3.3 compt. room 2.6 15.6 44.1 9.4 5.9 6.8 8.2 7.4 compt. room 5.1 19.1 44.1 4.1 3.5 3.2 10.3 4.1 hosptl. room 11.0 1.4 5.2 61.4 3.3 3.3 8.1 6.2 hosptl. room 6.2 1.4 2.4 71.4 0.5 7.1 5.2 5.7 living room 18.2 4.6 3.4 5.0 51.0 3.6 8.8 5.5 living room 17.1 2.5 2.6 4.7 58.2 2.6 6.8 5.3 meetn. room 3.6 12.2 9.1 11.0 5.5 45.5 3.9 9.3 meetn. room 1.4 10.8 4.6 6.7 6.9 59.5 3.3 6.9 office 5.9 9.0 19.3 8.6 16.2 8.3 27.2 5.5 office 5.5 6.9 14.8 6.9 15.2 6.2 36.2 8.3 waitn. room 8.6 5.6 10.0 13.1 10.3 9.4 8.9 34.6 waitn. room 5.4 3.9 5.9 10.6 8.9 9.3 8.7 47.3 Decreased Off-diagonal Element Increased On-diagonal Element 11/21/2018 CVPR 2014 - Presentation

Summarization Orientational Features are Useful OPM Provides Complementary Information SPM+OPM Achieves State-of-the-Art Accuracy 11/21/2018 CVPR 2014 - Presentation

Outline Introduction The Bag-of-Feature Model Orientational Pyramid Matching Experimental Results Analysis Conclusions 11/21/2018 CVPR 2014 - Presentation

Main Contributions A Novel Proposal to Scene Recognition Orientational Features are Useful! Spatial and Orientational Clues are Complementary A Neat Algorithm for Orientational Features Data-driven Method for Orientation Assignment Pyramid Matching for Orientational Pooling Might be Cooperated with Multiple Algorithms 11/21/2018 CVPR 2014 - Presentation

Conclusions and Future Work Orientation Prediction is Very Interesting Very Challenging in 2D Images Data-driven Method is NOT Adequate Possible Solutions? 3D Image Modeling Geometric Methods Vanishing Points Line Detection 11/21/2018 CVPR 2014 - Presentation

Thank you! Questions please? 11/21/2018 CVPR 2014 - Presentation