ICLR, 2019 Jiahe Li 2019.1.24
Outline: Introduction, Experiments, Conclusions
Introduction: Texture versus Shape Classification of a standard ResNet-50
Data Sets: Original (160 natural colour images of objects, 10 per category, with white background); Greyscale; Silhouette; Edges; Texture. Four networks: AlexNet, GoogLeNet, VGG-16, ResNet-50.
Results
Data Sets: Cue conflict. Images were generated using iterative style transfer (Gatys et al., 2016) between an image from the Texture data set (as style) and an image from the Original data set (as content). We generated a total of 1280 cue conflict images (80 per category), which allows for presentation to human observers within a single experimental session.
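A minimal sketch of how responses on cue conflict images could be scored. It assumes a shape-bias measure defined as the fraction of shape decisions among all decisions that match either the shape or the texture category; the slide itself does not spell out the measure, and the function name `shape_bias` and the example labels are hypothetical.

```python
def shape_bias(decisions):
    """Fraction of shape decisions among shape-or-texture decisions.

    decisions: iterable of (predicted, shape_label, texture_label) tuples,
    one per cue conflict image. This tuple layout is an assumption made
    for illustration only.
    """
    shape_hits = 0
    texture_hits = 0
    for predicted, shape_label, texture_label in decisions:
        if shape_label == texture_label:
            continue  # no conflict between the cues, so skip the image
        if predicted == shape_label:
            shape_hits += 1
        elif predicted == texture_label:
            texture_hits += 1
        # predictions matching neither cue are ignored
    total = shape_hits + texture_hits
    if total == 0:
        return None  # undefined when no decision matched either cue
    return shape_hits / total


# Hypothetical responses: (prediction, shape category, texture category)
example = [
    ("cat", "cat", "elephant"),
    ("elephant", "cat", "elephant"),
    ("elephant", "cat", "elephant"),
    ("dog", "cat", "elephant"),
    ("clock", "bottle", "clock"),
]
bias = shape_bias(example)
```

Under this definition a value near 1 indicates decisions driven by shape, and a value near 0 indicates decisions driven by texture.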
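Stylized-ImageNet (SIN): built with AdaIN style transfer (Huang & Belongie, 2017) rather than an iterative approach, for two reasons. First, training on SIN and testing on cue conflict images then use different stylization techniques. Second, stylizing all of ImageNet would take prohibitively long with an iterative approach.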
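The core AdaIN operation can be sketched as follows. This is only the normalisation step from Huang & Belongie (2017), which shifts the per-channel mean and standard deviation of the content features to those of the style features. The encoder and decoder networks are omitted, and the array shapes, the `eps` constant, and the random inputs are assumptions for illustration.

```python
import numpy as np


def adain(content_feat, style_feat, eps=1e-5):
    """Adaptive instance normalisation on feature maps of shape (C, H, W).

    Each content channel is normalised to zero mean and unit variance,
    then rescaled and shifted with the statistics of the matching
    style channel.
    """
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True) + eps  # avoid division by zero
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True)
    return s_std * (content_feat - c_mean) / c_std + s_mean


# Random stand-ins for encoder features (assumed shapes, not real data)
rng = np.random.default_rng(0)
content = rng.normal(0.0, 1.0, size=(4, 8, 8))
style = rng.normal(3.0, 2.0, size=(4, 8, 8))
stylized = adain(content, style)
```

Because this is a single feed-forward operation on feature statistics, it avoids the per-image optimisation loop of iterative style transfer, which is what makes stylizing a data set the size of ImageNet feasible.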
Overcoming the texture bias of CNNs AlexNet: 195, VGG-16: 212, ResNet-50: 299
Robustness and Accuracy of Shape-based Representation
Robert Geirhos et al., Generalisation in humans and deep neural networks, NeurIPS 2018
Conclusions: texture bias of CNNs trained on ImageNet (IN); a step towards more plausible models of human visual object recognition; emergent benefits of shape-based representations (accuracy and robustness).