Speaker: Lingxi Xie Authors: Lingxi Xie, Qi Tian, Bo Zhang

Slides:

Advertisements

Similar presentations

Semantic Contours from Inverse Detectors Bharath Hariharan et.al. (ICCV-11)

Advertisements

Complex Networks for Representation and Characterization of Images For CS790g Project Bingdong Li 9/23/2009.

Context-based object-class recognition and retrieval by generalized correlograms by J. Amores, N. Sebe and P. Radeva Discussion led by Qi An Duke University.

DONG XU, MEMBER, IEEE, AND SHIH-FU CHANG, FELLOW, IEEE Video Event Recognition Using Kernel Methods with Multilevel Temporal Alignment.

Capturing Human Insight for Visual Learning Kristen Grauman Department of Computer Science University of Texas at Austin Work with Sudheendra Vijayanarasimhan,

Multi-layer Orthogonal Codebook for Image Classification Presented by Xia Li.

CS395: Visual Recognition Spatial Pyramid Matching Heath Vinicombe The University of Texas at Austin 21 st September 2012.

Patch to the Future: Unsupervised Visual Prediction

1 Part 1: Classical Image Classification Methods Kai Yu Dept. of Media Analytics NEC Laboratories America Andrew Ng Computer Science Dept. Stanford University.

1 Building a Dictionary of Image Fragments Zicheng Liao Ali Farhadi Yang Wang Ian Endres David Forsyth Department of Computer Science, University of Illinois.

Global spatial layout: spatial pyramid matching Spatial weighting the features Beyond bags of features: Adding spatial information.

Object-centric spatial pooling for image classification Olga Russakovsky, Yuanqing Lin, Kai Yu, Li Fei-Fei ECCV 2012.

Groups of Adjacent Contour Segments for Object Detection Vittorio Ferrari Loic Fevrier Frederic Jurie Cordelia Schmid.

Ghunhui Gu, Joseph J. Lim, Pablo Arbeláez, Jitendra Malik University of California at Berkeley Berkeley, CA

São Paulo Advanced School of Computing (SP-ASC’10). São Paulo, Brazil, July 12-17, 2010 Looking at People Using Partial Least Squares William Robson Schwartz.

Recognition: A machine learning approach

Recognition using Regions CVPR Outline Introduction Overview of the Approach Experimental Results Conclusion.

Statistical Recognition Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Kristen Grauman.

1 Image Recognition - I. Global appearance patterns Slides by K. Grauman, B. Leibe.

Object Recognition: Conceptual Issues Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and K. Grauman.

Object Recognition: Conceptual Issues Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and K. Grauman.

Agenda Introduction Bag-of-words models Visual words with spatial location Part-based models Discriminative methods Segmentation and recognition Recognition-based.

Local Features and Kernels for Classification of Object Categories J. Zhang --- QMUL UK (INRIA till July 2005) with M. Marszalek and C. Schmid --- INRIA.

5/30/2006EE 148, Spring Visual Categorization with Bags of Keypoints Gabriella Csurka Christopher R. Dance Lixin Fan Jutta Willamowski Cedric Bray.

Spatial Pyramid Pooling in Deep Convolutional

Multiple Object Class Detection with a Generative Model K. Mikolajczyk, B. Leibe and B. Schiele Carolina Galleguillos.

Multiclass object recognition

Bag-of-Words based Image Classification Joost van de Weijer.

Computer vision.

A Thousand Words in a Scene P. Quelhas, F. Monay, J. Odobez, D. Gatica-Perez and T. Tuytelaars PAMI, Sept

Learning Visual Bits with Direct Feature Selection Joel Jurik 1 and Rahul Sukthankar 2,3 1 University of Central Florida 2 Intel Research Pittsburgh 3.

Object Bank Presenter ： Liu Changyu Advisor ： Prof. Alex Hauptmann Interest ： Multimedia Analysis April 4 th, 2013.

Computer Vision CS 776 Spring 2014 Recognition Machine Learning Prof. Alex Berg.

1 Action Classification: An Integration of Randomization and Discrimination in A Dense Feature Representation Computer Science Department, Stanford University.

Svetlana Lazebnik, Cordelia Schmid, Jean Ponce

Group Sparse Coding Samy Bengio, Fernando Pereira, Yoram Singer, Dennis Strelow Google Mountain View, CA (NIPS2009) Presented by Miao Liu July

Deformable Part Model Presenter ： Liu Changyu Advisor ： Prof. Alex Hauptmann Interest ： Multimedia Analysis April 11 st, 2013.

Beyond Sliding Windows: Object Localization by Efficient Subwindow Search The best paper prize at CVPR 2008.

Efficient Subwindow Search: A Branch and Bound Framework for Object Localization ‘PAMI09 Beyond Sliding Windows: Object Localization by Efficient Subwindow.

Locality-constrained Linear Coding for Image Classification

Efficient Visual Object Tracking with Online Nearest Neighbor Classifier Many slides adapt from Steve Gu.

Object detection, deep learning, and R-CNNs

Hierarchical Matching with Side Information for Image Classification

Shiliang Zhang1, Qi Tian2, Gang Hua3, Qingming Huang4, Shipeng Li2 1Key Lab of Intelli. Info. Process., Inst. of Comput. Tech., CAS, Beijing , China.

Recognition Using Visual Phrases

Goggle Gist on the Google Phone A Content-based image retrieval system for the Google phone Manu Viswanathan Chin-Kai Chang Ji Hyun Moon.

1.Learn appearance based models for concepts 2.Compute posterior probabilities or Semantic Multinomial (SMN) under appearance models. -But, suffers from.

Object Recognition by Discriminative Combinations of Line Segments and Ellipses Alex Chia ^˚ Susanto Rahardja ^ Deepu Rajan ˚ Maylor Leung ˚ ^ Institute.

Parsing Natural Scenes and Natural Language with Recursive Neural Networks INTERNATIONAL CONFERENCE ON MACHINE LEARNING (ICML 2011) RICHARD SOCHER CLIFF.

Facial Smile Detection Based on Deep Learning Features Authors: Kaihao Zhang, Yongzhen Huang, Hong Wu and Liang Wang Center for Research on Intelligent.

Compact Bilinear Pooling

Object detection with deformable part-based models

Bag-of-Visual-Words Based Feature Extraction

Multimedia Content-Based Retrieval

Learning Mid-Level Features For Recognition

Recognizing Deformable Shapes

ICCV Hierarchical Part Matching for Fine-Grained Image Classification

Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT

By Suren Manvelyan, Crocodile (nile crocodile?) By Suren Manvelyan,

Cheng-Ming Huang, Wen-Hung Liao Department of Computer Science

HOGgles Visualizing Object Detection Features

IEEE ICIP Feature Normalization for Part-Based Image Classification

CVPR 2014 Orientational Pyramid Matching for Recognizing Indoor Scenes

“The Truth About Cats And Dogs”

Fine-Grained Visual Categorization

Faster R-CNN By Anthony Martinez.

Outline Background Motivation Proposed Model Experimental Results

Y2Seq2Seq: Cross-Modal Representation Learning for 3D Shape and Text by Joint Reconstruction and Prediction of View and Word Sequences 1, Zhizhong.

Motivation It can effectively mine multi-modal knowledge with structured textural and visual relationships from web automatically. We propose BC-DNN method.

Presentation transcript:

ACM Multimedia 2012 Spatial Pooling of Heterogeneous Features for Image Applications Speaker: Lingxi Xie Authors: Lingxi Xie, Qi Tian, Bo Zhang State Key Laboratory of Intelligent Technology and Systems Department of Computer Science and Technology Tsinghua University http://www.tsinghua.edu.cn

ACM Multimedia 2012 - Oral Presentation Outline Introduction The Bag-of-Features Framework Our Proposed Framework Analysis Experimental Results Conclusions 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Outline Introduction The Bag-of-Features Framework Our Proposed Framework Analysis Experimental Results Conclusions 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Image Applications Image Classification Image Retrieval Scene Understanding Image Recommendation Image Tagging ...... 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Image Representation Important Step for Various Applications The Main Task Represent Images with High-Dimensional Vectors Find the Semantics on the Images The Bag-of-Features (BoF) Framework The Most Widely-Used Algorithm A Statistical method Very Successful in Recent Years 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Outline Introduction The Bag-of-Features Framework Our Proposed Framework Analysis Experimental Results Conclusions 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Raw Image Data 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation SIFT Descriptor Gray scale: 128D Color image: 384D Texture Descriptors SIFT Raw Image Data 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Visual Vocabulary K-Mns Texture Descriptors SIFT Raw Image Data The Feature Space 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation The Codebook 1 2 3 4 5 6 7 8 9 A B C D E F Code Word 1 2 3 4 5 6 7 8 9 A B C D E F Gray scale: 128D Color image: 384D B E Visual Vocabulary 3 C 7 K-Mns 1 A 8 5 6 9 4 Texture Descriptors 2 D F SIFT Raw Image Data The Feature Space 11/17/2018 ACM Multimedia 2012 - Oral Presentation

LLC K-Mns SIFT The Feature Space Visual Word Same Dimension with Codebook Size (16) 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F Visual Words B LLC E Visual Vocabulary 3 C 7 K-Mns 1 A 8 5 6 9 4 Texture Descriptors 2 D F SIFT Raw Image Data The Feature Space 11/17/2018 ACM Multimedia 2012 - Oral Presentation

LLC K-Mns SIFT The Feature Space Visual Word Same Dimension with Codebook Size (16) 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F Visual Words B LLC E Visual Vocabulary 3 C 7 K-Mns 1 A 8 5 6 9 4 Texture Descriptors 2 D F SIFT Raw Image Data The Feature Space 11/17/2018 ACM Multimedia 2012 - Oral Presentation

LLC K-Mns SIFT The Feature Space Visual Word 1 2 3 4 5 6 7 8 9 A B C D E F Same Dimension with Codebook Size (16) 1 2 3 4 5 6 7 8 9 A B C D E F Visual Words B LLC E Visual Vocabulary 3 C 7 K-Mns 1 A 8 5 6 9 4 Texture Descriptors 2 D F SIFT Raw Image Data The Feature Space 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation The Feature Space B F 2 4 5 E D 6 A 9 1 C 7 8 3 Visual Words LLC Visual Vocabulary K-Mns Texture Descriptors SIFT Raw Image Data 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Pooled Vectors MAX Visual Words LLC Visual Vocabulary K-Mns Texture Descriptors SIFT Raw Image Data 11/17/2018 ACM Multimedia 2012 - Oral Presentation

MAX LLC K-Mns SIFT Pooled Vector 1 2 3 4 5 6 7 8 9 A B C D E F Pooled Vectors MAX 1 2 3 4 5 6 7 8 9 A B C D E F Visual Words 1 2 3 4 5 6 7 8 9 A B C D E F Pooled Vector LLC Visual Vocabulary Same Dimension with Codebook Size (16) K-Mns Other Feature Codes are Ignored Here 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F Texture Descriptors 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F SIFT Raw Image Data 11/17/2018 ACM Multimedia 2012 - Oral Presentation

SPM MAX LLC K-Mns SIFT Feature Super-Vectors Pooled Vectors 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F MAX Visual Words 1 2 3 4 5 6 7 8 9 A B C D E F LLC Visual Vocabulary K-Mns 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F Texture Descriptors SIFT Raw Image Data 11/17/2018 ACM Multimedia 2012 - Oral Presentation

SPM MAX LLC K-Mns SIFT Feature Super-Vector Feature Super-Vectors SPM 1 2 3 4 5 6 7 8 9 A B C D E F Pooled Vectors 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F MAX 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F Visual Words 1 2 3 4 5 6 7 8 9 A B C D E F LLC 1 2 3 4 5 6 7 8 9 A B C D E F Visual Vocabulary K-Mns 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F Feature Super-Vector 1 2 3 4 5 6 7 8 9 A B C D E F Texture Descriptors Codebook Size (16) times Number of Regions (5) SIFT Raw Image Data 11/17/2018 ACM Multimedia 2012 - Oral Presentation

Shortcomings of BoF Framework The Poor Description of SIFT Descriptors Synonymy and Polysemy The Lack of Using Spatial Information Global Structure: Image Division Local Structure: Visual Phrase Difficulty of Locating Interesting Objects Noises from Background Clutters 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Outline Introduction The Bag-of-Features Framework Our Proposed Framework Analysis Experimental Results Conclusions 11/17/2018 ACM Multimedia 2012 - Oral Presentation

#1: The Fused Descriptors K-Mns Feature Super-Vectors SPM Pooled Vectors MAX Visual Words LLC SIFT Descriptor Edge-SIFT Descriptor Fused Descriptors Visual Vocabulary #1: The Fused Descriptors K-Mns Texture Descriptor Fused Descriptors Shape Descriptors Shape Descriptor Texture Descriptors Same Dimension SIFT SIFT Raw Image Data Edgemap 11/17/2018 ACM Multimedia 2012 - Oral Presentation

SPM MAX GPP LLC K-Mns SIFT SIFT Visual Phrase Visual Word Feature Super-Vectors SPM Pooled Vectors MAX Phrase Vectors GPP Central Word Visual Words LLC Side Word Visual Word Visual Vocabulary K-Mns Fused Descriptors Shape Descriptors Visual Phrase Texture Descriptors (Word Group) SIFT SIFT Raw Image Data Edgemap 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation 1 2 3 4 5 6 7 8 9 A B C D E F Central Word Side Word 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F Phrase Vector for 1st Word Pair 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation 1 2 3 4 5 6 7 8 9 A B C D E F Central Word Side Word 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F Phrase Vector for 2nd Word Pair 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation 1 2 3 4 5 6 7 8 9 A B C D E F Central Word Side Word 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F Phrase Vector 1 2 3 4 5 6 7 8 9 A B C D E F for 3rd Word Pair 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation 1 2 3 4 5 6 7 8 9 A B C D E F …… 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F Phrase Vector for the Visual Phrase 11/17/2018 ACM Multimedia 2012 - Oral Presentation

SPM MAX GPP LLC #2: The GPP Algorithm K-Mns SIFT SIFT Visual Phrase Feature Super-Vectors SPM Pooled Vectors MAX Phrase Vectors GPP Visual Words LLC #2: The GPP Algorithm Visual Word Visual Vocabulary K-Mns Fused Descriptors Shape Descriptors Visual Phrase Texture Descriptors SIFT SIFT Raw Image Data Edgemap 11/17/2018 ACM Multimedia 2012 - Oral Presentation

#3: The Spatial Weighting GPP Feature Super-Vectors SPM Weighted Vectors Pooled Vectors Weighting Matrix MAX Blur Phrase Vectors #3: The Spatial Weighting GPP Visual Words LLC Visual Vocabulary K-Mns Fused Descriptors Shape Descriptors Texture Descriptors SIFT SIFT Raw Image Data Edgemap 11/17/2018 ACM Multimedia 2012 - Oral Presentation

SPM MAX Blur GPP LLC K-Mns SIFT SIFT The Improved Bag-of-Features Feature Super-Vectors SPM Weighted Vectors Weighting Matrix MAX Blur The Improved Phrase Vectors GPP Visual Words Bag-of-Features LLC Visual Vocabulary Framework K-Mns Fused Descriptors Shape Descriptors Texture Descriptors SIFT SIFT Raw Image Data Edgemap 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Outline Introduction The Bag-of-Features Framework Our Proposed Framework Analysis Experimental Results Conclusions 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Analysis The Beneficiations Heterogeneous Descriptors Geometric Phrase Pooling The Complexity of Geometric Phrase Pooling Time Complexity Space Sparsity 11/17/2018 ACM Multimedia 2012 - Oral Presentation

Using Both Descriptors! Why Multiple Descriptors? Better on SIFT Better using Texture Better on Edge-SIFT Better using Shape wild cat anchor How to Classify All of Them? water lily butterfly Using Both Descriptors! crocodile wrench 11/17/2018 ACM Multimedia 2012 - Oral Presentation

Confused using Shape Features Confused using Texture Features Perfect Discrimination with Both Descriptors! 11/17/2018 ACM Multimedia 2012 - Oral Presentation

Why Geometric Phrase Pooling (GPP)? Central Word Side Word 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F MAX 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F MAX and GPP No Difference GPP 11/17/2018 ACM Multimedia 2012 - Oral Presentation ACM Multimedia 2012 - Oral Presentation

Why Geometric Phrase Pooling (GPP)? Central Word Side Word 1 2 3 4 5 6 7 8 9 A B C D E F 1 2 3 4 5 6 7 8 9 A B C D E F MAX 1 2 3 4 5 6 7 8 9 A B C D E F Enhancing Overlapping Dimensions 1 2 3 4 5 6 7 8 9 A B C D E F GPP 11/17/2018 ACM Multimedia 2012 - Oral Presentation

The Feature Space Why Enhancing Overlapping Dimensions? Coding Phase B E 3 C 7 1 A Neighborhood in 8 5 6 Euclidean Space 9 4 Feature Space 2 D F The Feature Space 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Time / Image Time Complexity (s) 0.6 GPP 0.5 0.4 0.3 LLC 0.2 0.1 256 512 1024 2048 Codebook Size 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Feature Sparsity Non-zero Dimensions More Time (%) Denser Features 50 Better Results 40 30 20 GPP 10 LLC 256 512 1024 2048 Codebook Size 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Outline Introduction The Bag-of-Features Framework Our Proposed Framework Analysis Experimental Results Conclusions 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Experimental Results Image Classification The Caltech101 Dataset The Caltech256 Dataset Image Retrieval The Pascal VOC 2007 Challenge Scene Understanding The 15-Scene Dataset 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation The Caltech101 Dataset accordion car side trilobite leopard motorbike anchor butterfly pyramid cougar body pigeon wild cat ant octopus schooner ketch 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation The Caltech101 Dataset #training 5 10 15 20 30 SPM[2007] 56.4 64.6 ScSPM[2009] 67.0 73.2 LLC[2010] 51.15 59.77 65.43 67.74 73.44 MFea[2011] 75.7 RFrst[2007] 81.3 GPP 61.90 71.75 76.03 78.53 82.45 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation The Caltech256 Dataset lawn mower saturn tower pisa guitar pick desk globe basket- loop bat frog golf ball hot dog conch elk kayak rifle socks 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation The Caltech256 Dataset #training 5 15 30 45 60 Baseline 28.3 34.1 64.6 ScSPM[2009] 27.73 34.02 37.46 40.14 LLC[2010] 34.36 41.19 45.31 47.68 RFrst[2007] 44.0 GPP 26.12 36.35 45.07 48.02 50.33 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation The Pascal VOC 2007 Dataset person dining table sofa person horse potted plant tv monitor sofa chair car bus motorbike sheep aeroplane car aeroplane cat boat sheep bike bottle sheep dog bird boat 11/17/2018 ACM Multimedia 2012 - Oral Presentation

The Pascal VOC 2007 Challenge category plane bicycle bird boat bottle LLC 67.47 55.29 40.68 58.56 21.29 GPP 72.29 56.33 45.41 61.26 26.24 category bus car Cat chair cow LLC 44.10 69.43 46.73 51.50 31.21 GPP 53.77 73.56 52.18 54.19 40.78 category din.tab. dog horse m.bike person LLC 35.06 39.00 72.41 53.98 79.18 GPP 47.40 41.58 74.38 57.52 83.02 category p.plant sheep sofa train tv.mon. LLC 18.77 33.14 44.73 66.59 40.96 GPP 26.03 37.51 52.30 69.51 47.50 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation The 15-scene Dataset bedroom suburb industrial kitchen living-room coast forest highway inside-city mountain opn-country street tall-building office store 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation The 15-Scene Datset #training 10 20 30 50 100 SPM[2007] 81.4 ScSPM[2009] 80.4 LLC[2010] 66.97 72.44 75.78 78.84 82.34 GPP 70.67 76.12 78.74 81.72 85.13 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Outline Introduction The Bag-of-Features Framework Our Proposed Framework Analysis Experimental Results Conclusions 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Main Contributions Extraction of Heterogeneous Descriptors Introduction of Shape Description A Simple and Efficient Method for Fusion Construction and Pooling of Visual Phrases A Mid-Level Structure for Image Representation An Efficient Pooling Algorithm Spatial Weighting A Step towards Regions-of-Interest Detection 11/17/2018 ACM Multimedia 2012 - Oral Presentation

Conclusions and Future Works An Improved Version of BoF Framework Extraction of Heterogeneous Descriptors Construction and Pooling of Visual Phrases Spatial Weighting Open Problems Learning to Describe: Selection of Descriptors Better Local Structures: Advanced Visual Phrases Deep Mining in Edgemaps: Geometric Algorithms 11/17/2018 ACM Multimedia 2012 - Oral Presentation

ACM Multimedia 2012 - Oral Presentation Thank you! Questions please? 11/17/2018 ACM Multimedia 2012 - Oral Presentation