ICCV Hierarchical Part Matching for Fine-Grained Image Classification

1 ICCV 2013 Hierarchical Part Matching for Fine-Grained Image Classification
Speaker: Lingxi Xie
Authors: Lingxi Xie, Qi Tian, Richang Hong, Shuicheng Yan, Bo Zhang
State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University

2 Outline
Introduction
The Bag-of-Feature Model
Hierarchical Part Matching
Experimental Results
Conclusions
9/19/2018 ICCV Presentation

3 Outline
Introduction
The Bag-of-Feature Model
Hierarchical Part Matching
Experimental Results
Conclusions

4 Image Classification
A basic task towards image understanding
General vs. Fine-Grained

5 Outline
Introduction
The Bag-of-Feature Model
Hierarchical Part Matching
Experimental Results
Conclusions

6 Image-level Vector / Compact Feature Codes / Visual Vocabulary / Image Descriptors / Raw Image
Image-level Vector. Spatial pooling: Sum Pooling/Max Pooling, Spatial Pyramid Matching [Lazebnik, CVPR06], Geometric Phrase Pooling [Xie, ACMMM12]
Compact Feature Codes. Hard/soft/sparse coding methods: Vector Quantization, ScSPM encoding [Yang, CVPR09], LLC encoding [Wang, CVPR10]
Visual Vocabulary. Clustering methods: K-Means, Hierarchical K-Means [Nister, CVPR06], Approximate K-Means [Philbin, CVPR07]
Image Descriptors. Gradient-based local descriptors: SIFT [Lowe, IJCV04], HOG [Dalal, CVPR05]
Raw Image
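
The coding and pooling stages named on this slide can be illustrated with a minimal sketch. It assumes hard vector quantization (each descriptor is assigned to its nearest visual word) followed by sum or max pooling over all codes. The function names, the toy 2-D descriptors, and the three-word vocabulary are hypothetical; real pipelines use 128-D SIFT descriptors and a much larger vocabulary.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def hard_assign(descriptor, vocabulary):
    """Vector quantization: one-hot code marking the nearest visual word."""
    nearest = min(range(len(vocabulary)),
                  key=lambda k: dist(descriptor, vocabulary[k]))
    return [1.0 if k == nearest else 0.0 for k in range(len(vocabulary))]

def sum_pool(codes):
    """Sum pooling: per-word count over all descriptors (a histogram)."""
    return [sum(column) for column in zip(*codes)]

def max_pool(codes):
    """Max pooling: strongest response of each visual word."""
    return [max(column) for column in zip(*codes)]

# Toy data (assumed for illustration only).
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8)]

codes = [hard_assign(d, vocabulary) for d in descriptors]
histogram = sum_pool(codes)  # how many descriptors fell on each word
presence = max_pool(codes)   # whether each word occurred at all
```

With hard codes, sum pooling gives a word-count histogram and max pooling only records presence; soft or sparse codes (ScSPM, LLC) would replace `hard_assign` while leaving the pooling functions unchanged.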

8 Spatial Pyramid Matching (SPM) [Lazebnik, CVPR06]
[Figure legend: Part 1, Part 2, Part 3, Part 4, Part 5]
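
As a rough illustration of region-wise pooling in SPM, the sketch below max-pools feature codes inside every cell of a 1x1 grid and a 2x2 grid and concatenates the results, giving five regions. Reading the five "Parts" in the figure legend as one whole-image region plus four quadrants is an assumption, as are the function name, the normalized coordinates, and the toy codes.

```python
def spatial_pyramid_max_pool(positions, codes, levels=(1, 2)):
    """Max-pool codes inside each cell of each grid level, then concatenate.

    positions: (x, y) pairs with coordinates normalized to [0, 1).
    codes: non-negative feature codes, one per position, all of equal length.
    """
    dim = len(codes[0])
    pooled = []
    for grid in levels:
        for row in range(grid):
            for col in range(grid):
                cell = [0.0] * dim  # a cell with no descriptors pools to zeros
                for (x, y), code in zip(positions, codes):
                    if int(x * grid) == col and int(y * grid) == row:
                        cell = [max(a, b) for a, b in zip(cell, code)]
                pooled.extend(cell)
    return pooled

# Toy data (assumed): three descriptors with 2-D codes.
positions = [(0.2, 0.2), (0.7, 0.3), (0.6, 0.8)]
codes = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

vector = spatial_pyramid_max_pool(positions, codes)  # 5 regions x 2 dims
```

The regions here are fixed rectangles that ignore image content, which is the property the part-based regions on the following slide are meant to replace.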

9 Hierarchical Part Matching (HPM) [Xie, ICCV13]
[Figure legend: Part 1, Part 2, Part 3, Part 4, Part 5]

10 Outline
Introduction
The Bag-of-Feature Model
Hierarchical Part Matching
Experimental Results
Conclusions

11 Model Overview
Key Modules: Foreground Inference, Part Segmentation, Hierarchical Structure Learning, Geometric Phrase Pooling

12 Foreground Inference
[Figure labels: bounding box; landmarks: crown, forehead, left eye, beak, nape, right eye, throat, right wing, right leg, breast, back, belly, left wing, tail, left leg]

13 Foreground Inference
Grab-Cut Algorithm
[Figure labels: definite background, possible]

14 Part Segmentation
Ultrametric Contour Map

15 Part Segmentation
edge response matrix

16 Part Segmentation
[Figure: edge response values 0.50, 0.00, 0.85, 0.15]

17 Part Segmentation
step penalty = 0.01
[Figure: grid of edge response values (0.00, 0.15, 0.50, 0.85), each shown again with the step penalty added (0.01, 0.16, 0.51, 0.86)]

18 Part Segmentation

19 Part Segmentation
[Figure labels: back, breast, left wing, belly]

20 Part Segmentation
Shortest-Path Algorithm
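
The shortest-path step can be sketched with Dijkstra's algorithm on a 4-connected pixel grid, where entering a cell costs its edge response plus a constant step penalty (0.01 on the earlier slide). This is only an illustrative reading of the slides: the transcript does not state the exact graph or cost the authors use, and the function name, grid, start, and goal below are hypothetical.

```python
import heapq

def cheapest_path_cost(edge_response, start, goal, step_penalty=0.01):
    """Dijkstra on a 4-connected grid.

    Moving into cell (r, c) costs edge_response[r][c] + step_penalty,
    so paths avoid strong edges and the penalty discourages long detours.
    """
    rows, cols = len(edge_response), len(edge_response[0])
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        cost, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), float("inf")):
            continue  # stale heap entry
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + edge_response[nr][nc] + step_penalty
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(heap, (new_cost, (nr, nc)))
    return float("inf")

# Toy edge response matrix (assumed), reusing values seen on the slides.
grid = [
    [0.00, 0.85, 0.00],
    [0.00, 0.50, 0.00],
    [0.00, 0.00, 0.00],
]
cost = cheapest_path_cost(grid, (0, 0), (0, 2))
```

In this toy grid the direct route crosses the 0.85 edge, so the cheapest path takes the six-step detour through zero-response cells and pays only the step penalty on each move.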

21 Part Segmentation
[Figure labels: forehead, left eye, beak, nape, throat, back, breast, left wing, belly, tail, left leg]

22 Hierarchical Structure Learning
Discovering Mid-Level Parts
Part Distance

23 Hierarchical Structure Learning
Discovering Mid-Level Parts
Cost Function when Merging Parts

24 Hierarchical Structure Learning
Discovering Mid-Level Parts
Hierarchical Structure Learning (HSL) Algorithm

25 Hierarchical Structure Learning
beak + crown + forehead + eyes = head
nape + throat = neck
back + belly + breast + tail = body

26 Hierarchical Structure Learning
μ: controlling the complexity of the model!
[Figure annotation: best choice]

27 Geometric Phrase Pooling
[Figure labels: tail, visual words, side words, visual phrase, central word]

28 ACM Multimedia 2012 - Oral Presentation
[Figure: visual words labeled 1-9 and A-F, with a central word and its side words. Panel title: Phrase Vector for 1st Word Pair]
9/19/2018 ACM Multimedia Oral Presentation

29 ACM Multimedia 2012 - Oral Presentation
[Figure: visual words labeled 1-9 and A-F, with a central word and its side words. Panel title: Phrase Vector for 2nd Word Pair]

30 ACM Multimedia 2012 - Oral Presentation
[Figure: visual words labeled 1-9 and A-F, repeated for each word pair. Panel title: Phrase Vector for the Visual Phrase]

31 Model Summarization
Foreground Inference and Part Segmentation
  Accurate Segmentation, Better Representation
Hierarchical Structure Learning
  Discovering High-level Semantic Parts
Geometric Phrase Pooling
  Capturing Geometric Information

32 Outline
Introduction
The Bag-of-Feature Model
Hierarchical Part Matching
Experimental Results
Conclusions

33 Dataset and Annotations
Caltech-UCSD Bird Dataset
  200 Bird Categories
  11788 Images (at least 55 per Category)
  Accuracy by Category (5, 10, 20, 30 trainings)
Manual Annotation by Web Users
  At Most 15 Landmarks per Image
  Beak, Crown, Forehead, Nape, Throat, Left/Right Eyes; Belly, Breast, Back, Tail, Left/Right Wings/Legs.

34 Part Segmentation
#training     5      10     20     30
Baseline      13.64  20.25  28.36  33.63
+FG Inf.      19.25  27.66  37.08  43.06
+Part Seg.    28.55  40.46  52.52  58.09

35 Hierarchical Structure Learning
#training     5      10     20     30
No Struct.    28.55  40.46  52.52  58.09
Struct. #1    29.29  41.62  53.36  59.24
Struct. #2    29.75  42.03  53.55  59.32
Struct. #3    30.33  42.66  53.94  59.86
Struct. #4    27.38  38.64  50.22  56.11

36 Hierarchical Structure Learning
μ: controlling the complexity of the model!
[Figure annotation: best choice]

37 Geometric Phrase Pooling
#training     5      10     20     30
No Phrase     30.33  42.66  53.94  59.86
GPP (5,5)     31.69  43.80  55.26  60.80
GPP (5,10)    32.23  45.10  56.11  61.93
GPP (5,20)    34.13  47.29  58.60  64.01
GPP (5,40)    36.09  48.87  60.56  65.62

38 Comparison
#training                5      10     20     30
Wah et al., TechRep11    10.05 (single value; its column is not recoverable from the transcript)
Zhang et al., CVPR12     24.21 (single value; its column is not recoverable from the transcript)
Wang et al., CVPR10      13.64  20.25  28.36  33.63
Xie et al., ACMMM12      15.34  22.91  31.01  36.17
Ours                     36.09  48.87  60.56  65.62

39 Summarization
All the Components Help!
The HUGE Improvement Mainly Comes from Part Segmentation
Comparison Directly with Previous Methods without Using Landmarks is NOT Fair
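
The claim that most of the improvement comes from part segmentation can be checked against the accuracies reported on the preceding result slides for 30 training images per category. The short calculation below chains the stages in the order the tables imply (the "No Struct." row equals "+Part Seg.", and "No Phrase" equals "Struct. #3"); the variable names are illustrative.

```python
# Accuracies at 30 training images, copied from the result tables.
stages = [
    ("Baseline", 33.63),
    ("+FG Inf.", 43.06),
    ("+Part Seg.", 58.09),
    ("+Struct. #3", 59.86),
    ("+GPP (5,40)", 65.62),
]

# Gain contributed by each stage over the stage before it.
gains = {
    name: round(acc - prev_acc, 2)
    for (_, prev_acc), (name, acc) in zip(stages, stages[1:])
}
largest = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name:<12} +{gain:.2f}")
```

Part segmentation adds about 15 points, compared with roughly 9 for foreground inference, 6 for geometric phrase pooling, and 2 for structure learning.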

40 Updates after Publication
Baseline Performance might be Much Better
  Using Fisher Kernels [Perronnin, ECCV10]
  Using Deep Features [Donahue, ICML14]
Automatically Detected Parts Work Well
  Landmark Detection [Berg, CVPR13]
  Symbiotic Localization [Chai, ICCV13]
  Geometric Segmentation [Gavves, ICCV13]
State-of-the-Art ~80%/~70% w/o using Landmarks (30 trainings).

41 Outline
Introduction
The Bag-of-Feature Model
Hierarchical Part Matching
Experimental Results
Conclusions

42 Main Contributions
An Important Clue to Fine-Grained Problem
  Part Information is Crucial
  Automatic Annotation is Required
A Complete Flowchart for Part Representation
  Starting from Landmarks
  Foreground Inference and Segmentation
  Hierarchical Structure Learning
  Transplantable to Other Datasets

43 Conclusions and Future Work
Fine-Grained Classification is More Difficult
  Surprising Inter-class Similarity
  Discovering Parts is Very Important!
  BUT, Annotation is still Expensive and Unrealistic
Alternative Methods?
  Template Matching [Yang, NIPS12]
  Co-Segmentation and Localization [Chai, ICCV13]
  Geometric Shape Alignment [Gavves, ICCV13]

44 Thank you! Questions please?

