1
LSUN Semantic Segmentation Extended PSPNet
Yi ZHANG, Hengshuang ZHAO, Jianping SHI
2
Pyramid Scene Parsing Network
PSPNet with ResNet-101
H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. Pyramid Scene Parsing Network. In CVPR, 2017.
3
Pyramid Scene Parsing Network
Details
Auxiliary loss with weight 0.4.
Images resized to 1000 pixels on the short side.
Random mirror and random resize between 0.5 and 2 for data augmentation.
Crop size 713, batch size 16.
mIoU 48.52% without PSP
mIoU 49.76% with PSP
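The training settings above can be sketched in a few lines. This is only an illustration: the weight 0.4, the scale range 0.5 to 2, and the crop size 713 come from the slide, while the uniform scale sampling, the mirror probability of 0.5, the padding rule, and the names `total_loss` and `sample_augmentation` are assumptions made for the example.

```python
import random

AUX_LOSS_WEIGHT = 0.4      # from the slide
CROP_SIZE = 713            # from the slide
SCALE_RANGE = (0.5, 2.0)   # from the slide


def total_loss(main_loss, aux_loss, aux_weight=AUX_LOSS_WEIGHT):
    """Combine the main loss with the down-weighted auxiliary loss."""
    return main_loss + aux_weight * aux_loss


def sample_augmentation(height, width, rng):
    """Draw one set of augmentation parameters for an image of the given size.

    Assumptions (not stated on the slide): uniform scale sampling,
    mirror probability 0.5, and padding up to the crop size when the
    resized image is smaller than the crop.
    """
    scale = rng.uniform(*SCALE_RANGE)
    new_h = max(1, round(height * scale))
    new_w = max(1, round(width * scale))
    padded_h = max(new_h, CROP_SIZE)
    padded_w = max(new_w, CROP_SIZE)
    return {
        "scale": scale,
        "mirror": rng.random() < 0.5,
        "resized": (new_h, new_w),
        "padded": (padded_h, padded_w),
        # Top-left corner of a CROP_SIZE x CROP_SIZE window inside the padded image.
        "crop_top": rng.randint(0, padded_h - CROP_SIZE),
        "crop_left": rng.randint(0, padded_w - CROP_SIZE),
    }
```

Calling `sample_augmentation(1000, 1500, random.Random(0))` returns a dictionary of parameters that a data loader could then apply to the image and its label map together.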
4
Hybrid Dilated Convolution
Details
For the res4b module, every 4 blocks are grouped together and their dilation rates are set to 1, 2, 5, and 9.
For the last 3 blocks, the dilation rates are 1, 2, and 5.
For the res5b module, the dilation rates are set to 5, 9, and 17.
Improves mIoU by 0.52 points, from 49.76% to 50.28%.
P. Wang, P. Chen, Y. Yuan, D. Liu, Z. Huang, X. Hou, and G. Cottrell. Understanding convolution for semantic segmentation. arXiv preprint, 2017.
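The dilation assignment above can be written as a repeating cycle. The sketch below assumes 23 blocks in res4b (a count that makes the "last 3 blocks" rule fall out of the cycle, but is not stated on the slide) and uses a 1-D, 3-tap simplification to check whether a stack of dilated convolutions leaves holes in its receptive field. The function names are hypothetical.

```python
def hdc_dilation_rates(num_blocks, cycle=(1, 2, 5, 9)):
    """Assign dilation rates block by block, repeating the cycle."""
    return [cycle[i % len(cycle)] for i in range(num_blocks)]


def reachable_offsets(rates):
    """1-D input offsets reachable by stacking 3-tap convolutions with these dilations."""
    offsets = {0}
    for r in rates:
        offsets = {o + d for o in offsets for d in (-r, 0, r)}
    return offsets


def has_no_holes(rates):
    """True if every offset between the extremes is reachable (no gridding)."""
    offsets = reachable_offsets(rates)
    return offsets == set(range(min(offsets), max(offsets) + 1))


# Assumed block count for res4b; the trailing three rates follow from the cycle.
res4b_rates = hdc_dilation_rates(23)
```

With this simplification, the group (1, 2, 5, 9) covers every offset in its receptive field, whereas a constant rate such as (2, 2, 2) only reaches even offsets.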
5
HDC-PSPNet-WeightedLoss
Training data is imbalanced.
Rare classes with lower performance get larger weights.
Improves mIoU by 1.22 points, from 50.28% to 51.50%.

ID   Class          HDC-PSPNet   HDC-PSPNet-WeightedLoss
     Mean IoU       50.28%       51.50%
1    Bird           0.00%        18.76%
10   Curb Cut       19.56%       21.37%
11   Parking        23.12%       26.51%
23   Other Rider    0.89%        0.93%
38   CCTV Camera    2.41%        19.24%
41   Mailbox        6.10%        18.86%
43   Phone Booth    9.18%        15.55%
44   Pothole        4.72%        11.75%
57   Caravan        0.31%        0.88%
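The slide does not say how the class weights were chosen, so the following is only one possible sketch of a class-weighted loss. The inverse log-frequency formula and its constant 1.02 are assumptions borrowed for illustration, and both function names are hypothetical.

```python
import math


def class_weights_from_counts(pixel_counts, smoothing=1.02):
    """Give rarer classes larger weights (inverse log-frequency, an assumed scheme)."""
    total = sum(pixel_counts)
    return [1.0 / math.log(smoothing + count / total) for count in pixel_counts]


def weighted_cross_entropy(probs, labels, weights, eps=1e-12):
    """Weighted mean of per-pixel cross-entropy.

    probs:   one list of class probabilities per pixel
    labels:  ground-truth class index per pixel
    weights: one weight per class
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for pixel_probs, label in zip(probs, labels):
        w = weights[label]
        weighted_sum += -w * math.log(max(pixel_probs[label], eps))
        weight_total += w
    return weighted_sum / weight_total
```

Normalizing by the sum of weights keeps the loss on the same scale as the unweighted version, so the learning rate does not need to change when the weights do.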
6
Cityscapes Pretrain
Improvement is minor: mIoU 51.59%, a gain of 0.09 points.
7
Final Result

Table 1: Single-scale test results of a single model on validation data
Model                                          mIoU
PSPNet                                         49.76%
HDC-PSPNet                                     50.28%
HDC-PSPNet-WeightedLoss                        51.50%
HDC-PSPNet-WeightedLoss-CityscapesPretrain     51.59%

Table 2: Test results of HDC-PSPNet-WeightedLoss-CityscapesPretrain on validation data (six scales for multi-scale test: 0.5, 0.75, 1.0, 1.25, 1.5, 1.75)
Scale           mIoU
Single-Scale    51.59%
Multi-Scale     53.51%

Table 3: Multi-scale test results of the ensemble model on validation data
Model                                          mIoU
HDC-PSPNet-WeightedLoss-CityscapesPretrain     53.51%
4-model ensemble                               53.85%
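For readers unfamiliar with the metric and the multi-scale test, here is a minimal sketch. It assumes that the per-scale class probabilities have already been resized back to the original resolution and that they are fused by simple averaging. The slides do not state the fusion rule, so that choice, the skipping of absent classes, and the function names are assumptions.

```python
SCALES = (0.5, 0.75, 1.0, 1.25, 1.5, 1.75)  # test scales listed in Table 2


def fuse_scales(per_scale_probs):
    """Average class probabilities over scales and take the argmax per pixel.

    per_scale_probs[s][i][c] is the probability of class c at pixel i
    predicted at scale s (already resized to the original resolution).
    """
    num_scales = len(per_scale_probs)
    labels = []
    for i in range(len(per_scale_probs[0])):
        num_classes = len(per_scale_probs[0][i])
        mean = [
            sum(scale_probs[i][c] for scale_probs in per_scale_probs) / num_scales
            for c in range(num_classes)
        ]
        labels.append(max(range(num_classes), key=mean.__getitem__))
    return labels


def mean_iou(confusion):
    """Mean intersection-over-union from a confusion matrix.

    confusion[t][p] counts pixels of true class t predicted as class p.
    Classes that never occur (in truth or prediction) are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(n)) - tp
        if tp + fp + fn > 0:
            ious.append(tp / (tp + fp + fn))
    return sum(ious) / len(ious)
```

An ensemble of several models could be fused the same way by treating each model's output as one more entry in `per_scale_probs`.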
8
Visual Results
15
LSUN Instance Segmentation Mask Instance Segmentation
Shu LIU, Lu QI, Haifang QIN, Jianping SHI and Jiaya JIA
16
Features of MVD
20,000 images
37 classes with instance labels
Varying image scales, from 554 to 4,901
Varying number of instances per image, from 0 to 389
Large range of instance sizes, from 1 to 3,166
Large variation of street views across the world
17
Our Insights
Varying image size: resize to the same size
Small objects: optimized RPN
Scale vs deeper model: scale matters
Long tail label distribution: more data helps
18
Optimized RPN
RPN
Pretrained ResNet-50 with FPN structure
Default hyperparameters as in the FPN paper
Recall at IoU 0.5
Drawback
Performs poorly on small objects
Improvement
Use smaller anchors (recall at IoU 0.5)
More anchors (recall at IoU 0.5)
T. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie. Feature Pyramid Networks for Object Detection. In CVPR, 2017.
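To make "smaller anchors" and "more anchors" concrete, the sketch below enumerates anchor shapes per pyramid level. The default sizes (32 to 512) and aspect ratios (0.5, 1, 2) follow the FPN paper. The halved sizes and the three scales per level are hypothetical values chosen for illustration, not the settings used in this work, and `anchor_shapes` is an illustrative helper rather than any library's API.

```python
import math

FPN_DEFAULT_SIZES = (32, 64, 128, 256, 512)   # one base size per pyramid level
FPN_DEFAULT_RATIOS = (0.5, 1.0, 2.0)          # height / width


def anchor_shapes(base_sizes, aspect_ratios, scales_per_level=(1.0,)):
    """Return {level: [(width, height), ...]} for every size, scale, and ratio."""
    shapes = {}
    for level, size in enumerate(base_sizes):
        level_shapes = []
        for scale in scales_per_level:
            area = (size * scale) ** 2
            for ratio in aspect_ratios:
                width = math.sqrt(area / ratio)
                height = width * ratio
                level_shapes.append((width, height))
        shapes[level] = level_shapes
    return shapes


default = anchor_shapes(FPN_DEFAULT_SIZES, FPN_DEFAULT_RATIOS)

# Hypothetical "smaller anchors": halve every base size.
smaller = anchor_shapes(tuple(s // 2 for s in FPN_DEFAULT_SIZES), FPN_DEFAULT_RATIOS)

# Hypothetical "more anchors": three scales per level instead of one.
more = anchor_shapes(
    FPN_DEFAULT_SIZES, FPN_DEFAULT_RATIOS,
    scales_per_level=(1.0, 2 ** (1 / 3), 2 ** (2 / 3)),
)
```

Smaller base sizes let the finest pyramid level match objects that the default 32-pixel anchors would miss, and extra scales fill the gaps between adjacent levels.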
19
Scale matters
Smaller model but larger image size
ResNet-50 vs ResNet-101
Max size: 1900 vs 1500
RPN: 82.9 vs 72.1, IoU 0.5
FRCNN: 39.8 vs 38.5, IoU 0.5
ResNet-50 vs Inception ResNet-50
39.8 vs [value missing], IoU 0.5
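The "max size" setting can be read as a cap on the longer image side. The sketch below assumes every image is rescaled so that its longer side equals the max size, keeping the aspect ratio. The slides do not spell out the exact resizing rule, so this rule and the function name are assumptions.

```python
def resize_to_max_size(height, width, max_size):
    """Return (new_height, new_width) with the longer side equal to max_size."""
    longest = max(height, width)
    new_h = max(1, round(height * max_size / longest))
    new_w = max(1, round(width * max_size / longest))
    return new_h, new_w


# The two settings compared on the slide.
resnet50_input = resize_to_max_size(2000, 4000, 1900)
resnet101_input = resize_to_max_size(2000, 4000, 1500)
```

Under this rule the ResNet-50 setting sees roughly 1.6 times as many pixels as the ResNet-101 setting for the same source image, which is the trade-off the slide describes.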
20
Long tail label distribution
287,016 poles vs 127 caravans
21
Long tail label distribution
Get more data
Pretrain on MS COCO
39.8 -> 41.1 at IoU 0.5 with ResNet-50
22
Ensemble Models
2 ResNet-50 models pretrained on COCO (same initialization but trained with different step sizes)
2 Inception 50 models pretrained on ImageNet (same initialization but trained with different step sizes)
Improvements
Bbox: [values missing]
Mask: [value missing] -> 23.7 AP
K. He, G. Gkioxari, P. Dollár, and R. Girshick. Mask R-CNN. arXiv preprint, 2017.
23
Others
Overfitting
Check the loss curve and validation performance carefully.
This time, dropout is not helpful.
The poly learning-rate strategy does not converge as well as the step strategy.
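For reference, the two learning-rate strategies compared above are commonly defined as follows. The base rate, power, step size, and decay factor used here are placeholder values, not the settings used in this work.

```python
def poly_lr(base_lr, iteration, max_iter, power=0.9):
    """Poly policy: decay smoothly to zero at max_iter."""
    return base_lr * (1.0 - iteration / max_iter) ** power


def step_lr(base_lr, iteration, step_size, gamma=0.1):
    """Step policy: multiply by gamma every step_size iterations."""
    return base_lr * gamma ** (iteration // step_size)


# Placeholder schedule: 100 iterations, sampled every 25.
poly_curve = [poly_lr(0.01, it, 100) for it in range(0, 101, 25)]
step_curve = [step_lr(0.01, it, 40) for it in range(0, 101, 25)]
```

The poly policy lowers the rate at every iteration, while the step policy holds it constant between drops, which is one plausible reason the two can converge differently.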
24
Summary
25
Visual Results
29
Thanks & Questions