1
Inverse Compositional Spatial Transformer Networks
Chen-Hsuan Lin, Simon Lucey
Carnegie Mellon University
2
[Figure: classifying an image of a cat]
How to achieve spatial invariance within convolutional neural networks?
3
1. Data augmentation: the network tolerates geometric perturbations.
2. Spatial pooling: spatial details are destroyed for higher-level abstractions.
4
We could tolerate spatial variations implicitly:
1. Data augmentation: the network tolerates geometric perturbations, but only inefficiently.
2. Spatial pooling: spatial details are destroyed for higher-level abstractions.
How might we resolve spatial variations explicitly instead?
5
Spatial Transformer Networks
[Figure: STN pipeline: a geometric predictor estimates warp parameters p from the input I_in; a grid generator and sampler then warp I_in into I_out]
Spatial Transformer Networks resolve spatial variations instead of tolerating them.
[1] Jaderberg et al. "Spatial Transformer Networks." NIPS 2015.
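To make the pipeline concrete, here is a minimal sketch of an STN forward pass, assuming an affine warp and a PyTorch-style implementation; the GeometricPredictor module and its layer sizes are hypothetical illustrations, not the architecture used in these slides.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GeometricPredictor(nn.Module):
    # Hypothetical CNN that regresses a 2x3 affine warp from the input image.
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 7), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 7), nn.ReLU(),
        )
        self.head = nn.Linear(16, 6)
        # Initialize to the identity transform so training starts from "no warp".
        nn.init.zeros_(self.head.weight)
        self.head.bias.data = torch.tensor([1., 0., 0., 0., 1., 0.])

    def forward(self, x):
        h = self.features(x).mean(dim=(2, 3))   # global average pooling
        return self.head(h).view(-1, 2, 3)      # warp parameters p as a 2x3 matrix

class SpatialTransformer(nn.Module):
    # Geometric predictor -> grid generator -> sampler.
    def __init__(self):
        super().__init__()
        self.predictor = GeometricPredictor()

    def forward(self, I_in):
        p = self.predictor(I_in)                                    # predict warp parameters
        grid = F.affine_grid(p, I_in.size(), align_corners=False)   # grid generator
        I_out = F.grid_sample(I_in, grid, align_corners=False)      # sampler
        return I_out, p

In the standard STN, only the warped output I_out is fed to the downstream classifier; the predicted parameters p are discarded, which leads to the two issues discussed next.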
6
[Figure: STN pipeline: only the warped output I_out is passed on, without the warp parameters p]
1. Boundary effect: original geometric information is not preserved in the network.
7
2. Single transformation
[Figure: STN pipeline with a difference image between the input and the warped output]
1. Boundary effect: original geometric information is not preserved in the network.
2. Single transformation: predicting large displacements from appearance is difficult (appearance is more correlated in close spatial proximity).
8
The Lucas-Kanade (LK) Algorithm
[Figure: the source image I is iteratively warped toward the template image T; each iteration refines the warp p by an update Δp]
The Lucas-Kanade (LK) algorithm solves for the warp update Δp iteratively.
[2] Lucas & Kanade. "An iterative image registration technique with an application to stereo vision." IJCAI 1981.
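In symbols (with W(x; p) the parametric warp, I the source image, and T the template, following the standard formulation of [2]), each LK iteration linearizes the registration objective around the current warp and solves for an additive update:

\min_{\Delta p} \sum_x \big[ I(W(x;\, p + \Delta p)) - T(x) \big]^2
\;\approx\;
\min_{\Delta p} \sum_x \Big[ I(W(x; p)) + \nabla I \tfrac{\partial W}{\partial p} \Delta p - T(x) \Big]^2

\Delta p = H^{-1} \sum_x \Big[ \nabla I \tfrac{\partial W}{\partial p} \Big]^{\top} \big[ T(x) - I(W(x; p)) \big],
\qquad
H = \sum_x \Big[ \nabla I \tfrac{\partial W}{\partial p} \Big]^{\top} \Big[ \nabla I \tfrac{\partial W}{\partial p} \Big],
\qquad
p \leftarrow p + \Delta p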
9
The Lucas-Kanade (LK) Algorithm
[Figure: a single LK iteration between the source image I and the template image T]
The Lucas-Kanade (LK) algorithm needs to recompute its linear models at every iteration (they depend on the warp state p).
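The cost of this formulation is visible in the update above: the steepest-descent images and the Hessian are evaluated at the current warp,

J(x; p) = \nabla I\big(W(x; p)\big)\, \frac{\partial W(x; p)}{\partial p},
\qquad
H(p) = \sum_x J(x; p)^{\top} J(x; p),

so both must be recomputed after every update of p.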
10
Inverse Compositional LK
[Figure: IC-LK solves for the update Δp on the template image T and inversely composes it into the warp p of the source image I]
Inverse Compositional LK solves for Δp on the template image and inversely composes it into the source image's warp.
[3] Baker & Matthews. "Lucas-Kanade 20 Years On: A Unifying Framework." IJCV 2004.
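Written out (in the notation of [3]), each IC-LK iteration instead solves for the update on the template and folds its inverse into the current warp:

\min_{\Delta p} \sum_x \big[ T(W(x; \Delta p)) - I(W(x; p)) \big]^2,
\qquad
W(x; p) \leftarrow W(x; p) \circ W(x; \Delta p)^{-1}

Because the linearization is now taken around the template at the identity warp, the steepest-descent images \nabla T\, \partial W / \partial p \,|_{p=0} and the Hessian no longer depend on p and can be precomputed once.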
11
Inverse Compositional LK
[Figure: a single linear model, built on the template image T, drives every iteration]
Inverse Compositional LK requires only a single linear model for successful iterative alignment.
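Below is a minimal sketch of the IC-LK loop, restricted to a pure translation warp W(x; p) = x + p so that it stays self-contained; it assumes NumPy and SciPy and is an illustration rather than the implementation used in the paper (affine or homography warps need the corresponding Jacobian and composition rule, but the precomputed linear model works the same way).

import numpy as np
from scipy.ndimage import shift as nd_shift

def ic_lk_translation(I, T, p=(0.0, 0.0), num_iters=50):
    # Inverse Compositional LK for a pure translation warp W(x; p) = x + p.
    # The linear model (template gradients and Hessian) is built once on T
    # and reused for every iteration; only the residual is recomputed.
    gy, gx = np.gradient(T)                          # template gradients
    J = np.stack([gx.ravel(), gy.ravel()], axis=1)   # steepest-descent images (N, 2)
    H_inv = np.linalg.inv(J.T @ J)                   # precomputed 2x2 Hessian inverse

    p = np.asarray(p, dtype=float).copy()            # p = (p_x, p_y)
    for _ in range(num_iters):
        # I(W(x; p)): sample the source image at the translated coordinates.
        I_warped = nd_shift(I, shift=(-p[1], -p[0]), order=1)
        error = (I_warped - T).ravel()               # residual at the current warp
        dp = H_inv @ (J.T @ error)                   # update solved on the template side
        p -= dp                                      # inverse composition: for translations this is p - dp
        if np.linalg.norm(dp) < 1e-6:
            break
    return p

For example, if I is a copy of T shifted by a few pixels, ic_lk_translation(I, T) recovers that shift in a handful of iterations while inverting the Hessian only once.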
12
Spatial Transformer Networks
[Figure: compositional STN: the geometric predictor outputs a warp update Δp, which is composed with the incoming parameters p_in to give p_out; the warp is applied to the original input I_in]
1. Boundary effect: geometry is preserved, because warps are composed and applied to the original input.
2. Single transformation: iterative alignment is now possible, mirroring Inverse Compositional LK, which requires only a single model for successful alignment.
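In other words, the geometric predictor now emits a warp update rather than a warped image: the update is composed with the incoming warp parameters, and sampling always happens from the original input. With one reasonable composition convention (an assumption, not a quote from the slides), this reads:

p_{\text{out}} = p_{\text{in}} \circ \Delta p,
\qquad
I_{\text{out}}(x) = I_{\text{in}}\big( W(x;\, p_{\text{out}}) \big)

so no boundary information is discarded between transformations.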
13
Inverse Compositional Spatial Transformer Networks
[Figure: IC-STN: a recurrent geometric predictor repeatedly estimates Δp from the warped input, composes it with p_in, and a single final warp produces I_out]
1. Geometry is preserved.
2. Allows iterative alignment (LK).
3. Recurrent spatial transformations.
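A minimal sketch of the IC-STN forward pass, assuming affine warps kept in 2x3 form and composed via homogeneous 3x3 matrices in PyTorch; the predictor argument is a shared-weight geometric predictor such as the hypothetical GeometricPredictor sketched earlier.

import torch
import torch.nn.functional as F

def to_3x3(p):
    # Append the row [0, 0, 1] to a batch of 2x3 affine matrices.
    bottom = torch.tensor([0., 0., 1.], device=p.device).expand(p.size(0), 1, 3)
    return torch.cat([p, bottom], dim=1)

def warp(I, p):
    # Warp image batch I with 2x3 affine parameters p (grid generator + sampler).
    grid = F.affine_grid(p, I.size(), align_corners=False)
    return F.grid_sample(I, grid, align_corners=False)

def ic_stn_forward(predictor, I_in, p_in, num_steps):
    # IC-STN: recurrently predict warp updates with shared weights, compose them,
    # and always resample from the ORIGINAL input, with a single final warp at the end.
    p = p_in                                             # running 2x3 warp parameters
    for _ in range(num_steps):
        I_warped = warp(I_in, p)                         # warp the original input
        dp = predictor(I_warped)                         # geometric predictor -> 2x3 update
        p = torch.bmm(to_3x3(p), to_3x3(dp))[:, :2, :]   # compose the two warps
    I_out = warp(I_in, p)                                # single warp fed to the classifier
    return I_out, p

Because the running warp p travels alongside the image and the final resampling happens only once from I_in, no pixels are permanently cropped away between steps, and each recurrent step only has to predict a small residual update.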
14
Experimental settings
[Figure: compared models: a plain CNN classifier on I_in(p_in), an STN (predict p, warp once, then classify), and IC-STN-(N) (N recurrent warp updates composed before a single warp and the classifier)]

CNN baseline: classifier conv(3×3)×3, conv(3×3)×6, (max-pooling), conv(3×3)×9, conv(3×3)×12, FC(48), FC(10); 39,079 learnable parameters.
STN / IC-STN: geometric predictor conv(7×7)×4, conv(7×7)×8, (max-pooling), FC(48), FC(8); classifier conv(9×9)×3, FC(10); 39,048 learnable parameters.
15
Handwritten digit classification (MNIST)
[Figure: visualization of the learned transformations (STN vs. the four IC-STN-4 steps), mean appearance, variance heatmaps, and a classification-error bar chart for original and perturbed MNIST]

Classification error on perturbed MNIST: CNN 6.597%, STN 4.944%, IC-STN-1 3.687%, IC-STN-2 1.905%, IC-STN-4 1.230%.
16
Traffic sign classification (GTSRB)
[Figure: visualization of the learned transformations and mean appearances for perturbed speed-limit classes (30, 50, 80), with a classification-error bar chart]

Classification error on perturbed GTSRB: CNN 8.287%, STN 6.495%, IC-STN-1 5.011%, IC-STN-2 4.122%, IC-STN-4 3.184%.
17
Inverse Compositional Spatial Transformer Networks
Summary:
1. Theoretical connection between STNs and the IC-LK algorithm.
2. Tolerating spatial variations requires a large increase in model capacity.
3. Resolving spatial variations is more efficient: predict iterative, small geometric updates.
Thank you! Please come to Poster #11 for discussions!
Chen-Hsuan Lin, Simon Lucey