CS395: Visual Recognition – Spatial Pyramid Matching. Heath Vinicombe, The University of Texas at Austin, 21st September 2012.


1 CS395: Visual Recognition – Spatial Pyramid Matching. Heath Vinicombe, The University of Texas at Austin, 21st September 2012

2 Goal Given a number of categorized images, can we recognize the category of a test image? Method: ‘Spatial Pyramid Matching’ (SPM) – Lazebnik, Schmid and Ponce – Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. Example images: Drunk Panda, Drunk Polar Bear

3 Outline: SPM Method, Datasets, Results, Analysis, Conclusions, Discussion

4 Method - Summary: Extract Features → Compile Vocabulary → Generate Histograms → Compare Histograms → Kernel Matrix → Learning Algorithm

5 Method – Feature Extraction Dense SIFT descriptor – 8 x 8 pixel grid, each patch 16 x 16 (overlapping) – Advantage over sparse features for natural scenes – Matlab code from Lazebnik [1] – ~ 80s for 500 images – [1] http://www.cs.illinois.edu/homes/slazebni/research/SpatialPyramid.zip
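As a rough illustration of the dense sampling on this slide, the Python sketch below lists the top-left corners of 16 x 16 patches placed every 8 pixels. The function name `dense_grid` and the border handling (only patches that fit fully inside the image are kept) are assumptions made for illustration; Lazebnik's Matlab code may treat image borders differently.

```python
def dense_grid(width, height, step=8, patch=16):
    """Return (x, y) top-left corners of patches that fit fully inside the image.

    step=8 and patch=16 follow the slide; because step < patch,
    neighbouring patches overlap by half their width.
    """
    xs = range(0, width - patch + 1, step)
    ys = range(0, height - patch + 1, step)
    return [(x, y) for y in ys for x in xs]


# Hypothetical 300 x 200 pixel image: one SIFT descriptor would be
# computed at each of these patch locations.
corners = dense_grid(300, 200)
print(len(corners), "patches; first:", corners[0], "last:", corners[-1])
```

Every image yields a descriptor at every grid location regardless of content, which is the property that distinguishes dense from sparse (interest-point) features.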

6 Method – Vocab Generation K-Means Clustering – 100 image subset of training data – 200 word vocabulary – ~ 130s
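The vocabulary step can be sketched in a few lines of plain Python. This is a minimal k-means under stated assumptions, not the code used in the demo: the toy descriptors are 2-D instead of 128-D SIFT vectors, `k` is 2 instead of 200, and the helper names `kmeans` and `nearest` are hypothetical.

```python
import random


def nearest(point, centers):
    """Index of the closest center (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centers[i])),
    )


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: returns k cluster centers, i.e. the visual words."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise from k distinct data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers


# Toy "descriptors": two well-separated clumps.
descriptors = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
               (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
vocabulary = kmeans(descriptors, k=2)
# Quantisation: each descriptor is replaced by the index of its nearest word.
words = [nearest(d, vocabulary) for d in descriptors]
print(vocabulary, words)
```

The same `nearest` lookup is what turns an image's descriptors into the word indices that are counted in the histograms.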

7 Method – Pyramid Matching Kernel Matrix
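The slide gives only the title, so the following is a hedged sketch of the pyramid match kernel as defined in the Lazebnik, Schmid and Ponce paper: histogram intersection at each grid resolution, weighted 1/2^L at level 0 and 1/2^(L-l+1) at level l. The function names, the `(x, y, word)` feature format with coordinates normalised to [0, 1), and the choice `levels=2` (grids of 1x1, 2x2 and 4x4 cells, assumed here to correspond to the "3 Level" pyramid) are illustrative assumptions.

```python
def pyramid_histograms(features, vocab_size, levels=2):
    """One flat histogram per level; features are (x, y, word), x and y in [0, 1)."""
    hists = []
    for level in range(levels + 1):
        cells = 2 ** level  # grid is cells x cells at this level
        hist = [0] * (cells * cells * vocab_size)
        for x, y, word in features:
            cx = min(int(x * cells), cells - 1)
            cy = min(int(y * cells), cells - 1)
            hist[(cy * cells + cx) * vocab_size + word] += 1
        hists.append(hist)
    return hists


def intersection(h1, h2):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def pyramid_match(feats_a, feats_b, vocab_size, levels=2):
    """Weighted sum of per-level intersections; finer levels weigh more."""
    ha = pyramid_histograms(feats_a, vocab_size, levels)
    hb = pyramid_histograms(feats_b, vocab_size, levels)
    score = intersection(ha[0], hb[0]) / 2 ** levels
    for level in range(1, levels + 1):
        score += intersection(ha[level], hb[level]) / 2 ** (levels - level + 1)
    return score


def kernel_matrix(images, vocab_size, levels=2):
    """Pairwise kernel values, the precomputed input to the SVM."""
    return [[pyramid_match(a, b, vocab_size, levels) for b in images] for a in images]


# Toy images with a two-word vocabulary; img_b is img_a rotated by 180 degrees,
# assuming (unrealistically) that the word labels survive the rotation.
img_a = [(0.1, 0.1, 0), (0.9, 0.9, 1)]
img_b = [(0.9, 0.9, 0), (0.1, 0.1, 1)]
print(kernel_matrix([img_a, img_b], vocab_size=2))
print(pyramid_match(img_a, img_b, vocab_size=2, levels=0))
```

With `levels=0` the kernel reduces to a plain bag-of-words intersection, which ignores where the features fall; with more levels, features that move to a different cell stop matching at the finer resolutions.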

8 Method - Learning Algorithm SVM – One vs All – Precomputed kernel is input – Spider learning library collection for Matlab [1] – ~ 2s – [1] http://people.kyb.tuebingen.mpg.de/spider/main.html
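The one-vs-all scheme itself is independent of the SVM library. The Python sketch below shows only the label construction and the final decision rule; it does not train an SVM and does not reflect the Spider interface, which the slides do not show. The function names and the example scores are hypothetical.

```python
def one_vs_all_labels(labels, classes):
    """For each class, +1 for its own training examples and -1 for all others.

    One binary classifier is then trained per class on these targets.
    """
    return {c: [1 if y == c else -1 for y in labels] for c in classes}


def predict(scores):
    """Pick the class whose binary classifier gives the highest decision value."""
    return max(scores, key=scores.get)


train_labels = ["llama", "kangaroo", "llama"]
targets = one_vs_all_labels(train_labels, ["kangaroo", "llama"])
print(targets)

# Hypothetical decision values of the per-class classifiers for one test image.
print(predict({"llama": 0.2, "kangaroo": 0.7}))
```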

9 Summary of Runtimes
Component                Time (s)
SIFT Extraction          80
Vocab Generation         130
Pyramid Matching Kernel  50
SVM                      2

10 Dataset - Details Caltech 101 image database [1] – 101 classes, 50-800 images per class – This demo: 10 classes, 50 training per class, 20 test per class – [1] http://www.vision.caltech.edu/Image_Datasets/Caltech101/

11 Dataset - Classes Kangaroo Llama

12 Dataset - Classes Menorah Chandelier

13 Dataset - Classes Airplane Helicopter

14 Dataset - Classes Electric Guitar Grand Piano

15 Dataset - Classes Sunflower Bonsai

16 Results – Success Rate 86% classification rate on test images (guessing = 10%) – 100% for Electric Guitar – 65-70% for Llamas and Kangaroos

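17 Results – Confusion Matrix (rows: actual class; columns: predicted class; values in %)
Column order: Airplane, Bonsai, Chandelier, Electric Guitar, Grand Piano, Helicopter, Kangaroo, Llama, Menorah, Sunflower
Airplane:         90   0   0   0   0  10   0   0   0   0
Bonsai:            0  70   5   5   0  10   ?   0   0   0
Chandelier:        0   0  95   0   0   0   0   5   0   0
Electric Guitar:   0   0   0 100   0   0   0   0   0   0
Grand Piano:       0   0   5   0  90   0   0   5   0   0
Helicopter:        0   0   0   0   0  95   0   0   0   5
Kangaroo:          0   0   0   0   0   0  65  25   0  10
Llama:             0   0   0   0   0   0  30  70   0   0
Menorah:           0   0  10   0   0   0   0   0  90   0
Sunflower:         0   0   0   0   5   0   0   0   0  95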
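The 86% headline figure can be checked against the diagonal of this confusion matrix. Because every class has the same number of test images (20), the overall accuracy is simply the mean of the per-class rates. The short Python check below uses the diagonal values read from the matrix; the variable names are illustrative.

```python
# Diagonal of the confusion matrix: percentage of each class classified correctly.
correct_pct = {
    "Airplane": 90, "Bonsai": 70, "Chandelier": 95, "Electric Guitar": 100,
    "Grand Piano": 90, "Helicopter": 95, "Kangaroo": 65, "Llama": 70,
    "Menorah": 90, "Sunflower": 95,
}

# Equal class sizes, so the unweighted mean equals the overall accuracy.
overall = sum(correct_pct.values()) / len(correct_pct)
print(f"{overall:.1f}%")  # 86.0%
```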

18 Results – Score Matrix
Column order: Airplane, Bonsai, Chandelier, Electric Guitar, Grand Piano, Helicopter, Kangaroo, Llama, Menorah, Sunflower
Airplane:         98  60  39  56  66  83  18  25  34  22
Bonsai:           19  92  51   ?  31  53  58  56  30  60
Chandelier:       13  52  94  52  40  36  44  58  55  56
Electric Guitar:  24  58  56  95  60  59  20  32  37  60
Grand Piano:      38  48  57  75  96  47  19  31  49  40
Helicopter:       545843674294373933
Kangaroo:         5615046164891854157
Llama:            7655240185387943847
Menorah:          19  54  70  54  55  37  33  36  95  47
Sunflower:        864 63502546434294

19 Results – Examples of misclassified – Llamas classified as Llamas – Kangaroos classified as Kangaroos – Llamas classified as Kangaroos – Kangaroos classified as Llamas

20 Results – 180 deg Rotation Test images rotated 180 degrees – Previous support vectors – 55% accuracy

21 Results – Confusion Matrix (180 deg) (rows: actual class; columns: predicted class; values in %)
Column order: Airplane, Bonsai, Chandelier, Electric Guitar, Grand Piano, Helicopter, Kangaroo, Llama, Menorah, Sunflower
Airplane:         75   0   0   5   5  15   0   0   0   0
Bonsai:            0  20  25   0   5  15  25  10   0   0
Chandelier:       0 5550505155
Electric Guitar:  510 505500015
Grand Piano:       0   0  10   5  80   0   0   5   0   0
Helicopter:        0  10   0   0   0  85   0   0   0   5
Kangaroo:          0   0   5   0   0   0  55  25   0  15
Llama:             0  10   0   0   0   5  40  45   0   0
Menorah:           0   0   5   5   0  20   0   0  55  15
Sunflower:         0   0  10   0   5   0   0   0   0  85

22 Results – 90 deg Rotation Test images rotated 90 degrees – Previous support vectors – 31% accuracy

23 Results – Confusion Matrix (90 deg) (rows: actual class; columns: predicted class; values in %)
Column order: Airplane, Bonsai, Chandelier, Electric Guitar, Grand Piano, Helicopter, Kangaroo, Llama, Menorah, Sunflower
Airplane:          0   0  95   5   0   0   0   0   0   0
Bonsai:            0  10  35   5   0   0  25  15   0  10
Chandelier:        0  30  25  20   0  15   0   5   0   5
Electric Guitar:  005020000015
Grand Piano:       0   0  60  10  30   0   0   0   0   0
Helicopter:        0   0  75   0   0   5  10   0   5   5
Kangaroo:         00550060150
Llama:             0   5   0   0   0   0  35  60   0   0
Menorah:          003515 05510
Sunflower:         0   0   0   0   5   0   0   0   0  95

24 Results – Questions Raised Why are some classes more affected by rotation? Why does 90 deg have greater effect than 180 deg? Why are so many Aeroplanes classified as Chandeliers?

25 Analysis – Questions Raised Why are some classes more affected by rotation? Why does 90 deg have greater effect than 180 deg? Why are so many Aeroplanes classified as Chandeliers?

26 Analysis – Effect of Rotation

27 Analysis – Questions Raised Why are some classes more affected by rotation? Why does 90 deg have greater effect than 180 deg? Why are so many Aeroplanes classified as Chandeliers?

28 Analysis – Symmetry Many images have vertical symmetry

29 Analysis – Questions Raised Why are some classes more affected by rotation? Why does 90 deg have greater effect than 180 deg? Why are so many Aeroplanes classified as Chandeliers?

30 Analysis – Aeroplane/Chandelier results Without rotation, 90% of Aeroplanes correctly classified – With 90 deg rotation, 95% of Aeroplanes incorrectly classified as Chandeliers

31 Analysis – Vocabulary Comparison of Aeroplane and Chandelier – Red dots = most common shared feature – Large histogram overlap of airplanes and chandeliers despite little visual similarity

32 Analysis – Comparison of 3L Pyramid and BoW The Bag of Words classifier is effectively a 0-level pyramid that does not use spatial information.
Orientation compared to training   3 Level   Bag of Words (0 Level)
0 degrees                          86%       76.5%
180 degrees                        55%       73.5%
90 degrees                         31%       29.5%

33 Conclusions 86% classification accuracy achieved – Runtime on the order of a few minutes – SPM is sensitive to rotation, especially 90 deg – SPM performs better than BoW for correctly oriented images – Dense SIFT features are sensitive to changes in image size

34 Discussion Points Test examples outside training classes? What explains the higher accuracy compared to Lazebnik paper? How to improve the accuracy of SPM and BoW for 90 deg rotations? Could colour information be used as features?

