Multi-Modal Multi-Scale Deep Learning for Large-Scale Image Annotation


1 Multi-Modal Multi-Scale Deep Learning for Large-Scale Image Annotation
Source: IEEE Transactions on Image Processing, Vol. 28, No. 4, pp , April 2019.
Authors: Yu-Lei Niu, Zhi-Wu Lu, Ji-Rong Wen, Tao Xiang, and Shih-Fu Chang
Speaker: Chih-Lung Chen
Date: 2019/05/23

2 Outline
Introduction
Preliminaries
Proposed scheme
Experiments
Conclusions

3 Introduction (1/2) Cat Dog ? Cat ? Dog Annotation Application

4 Introduction (2/2) Single label vs. Top-k label
Ground truth: (person, water, mountain, reflection, sky, leaf); (thunder, cloud, tree); (flower)
Top-3: (person, water, mountain); (thunder, cloud, tree); (flower, fire, sky)
Proposed: (person, water, mountain, reflection, sky, leaf); (thunder, cloud, tree); (flower)

5 Preliminaries (1/5) - NN
Neural network (NN): Input → NN → Output
Examples: "How are you?" → "I'm fine."; (image) → Cat

6 Preliminaries (2/5) - NN
Basic classifier: y = wx + b (Input → Output: Cat)
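
The basic classifier y = wx + b on this slide can be illustrated with a minimal sketch. The function names, the two-feature input, and all weight and bias values below are hypothetical and chosen only for illustration; they do not come from the slides or the paper.

```python
def linear_score(w, x, b):
    """Return y = w.x + b for one feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(weights, biases, x, labels):
    """Score x against one (w, b) pair per class and return the best label."""
    scores = [linear_score(w, x, b) for w, b in zip(weights, biases)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores


# Hypothetical two-class, two-feature setup (values are assumptions).
labels = ["Cat", "Dog"]
weights = [[1.0, -1.0], [-1.0, 1.0]]
biases = [0.0, 0.5]

label, scores = classify(weights, biases, [2.0, 0.5], labels)
print(label, scores)
```

Each class gets its own weight vector and bias, and the predicted label is simply the class with the highest score.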

7 Preliminaries (3/5) - CNN
Convolutional neural network

8 Preliminaries (4/5) - CNN
[Figure: a filter (neuron) applied to an image; the numeric values are not recoverable from the transcript]

9 Preliminaries (5/5) - CNN
[Figure: convolution example, continued; the numeric values are not recoverable from the transcript]
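
The sliding-filter operation that the two CNN slides illustrate can be sketched in a few lines. This is a generic "valid" 2D convolution (strictly, cross-correlation, as used in most CNN libraries); the 4x4 image and the 3x3 diagonal filter are hypothetical values, not the ones shown in the original figures.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide kernel over image (no padding) and return the feature map."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(0, ih - kh + 1, stride):
        row = []
        for c in range(0, iw - kw + 1, stride):
            # One neuron: weighted sum of the patch under the filter.
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical input: a 4x4 image containing a diagonal line.
image = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
# Hypothetical filter that responds to diagonal patterns.
kernel = [
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[3, -2], [-2, 3]]
```

The filter responds most strongly (3) where the patch matches its diagonal pattern and negatively (-2) where the diagonal is shifted.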

10 Proposed scheme (1/2) - MS-CNN
Multi-scale convolutional neural network
Convolution blocks: Conv_1, Conv_2, Conv_3, Conv_4, Conv_5
Fusion layers: Fusion_1, Fusion_2, Fusion_3, Fusion_4

11 Proposed scheme (2/2)
Visual feature extraction: Image → MS-CNN
Textual feature extraction: Tags (Pet, Home) → NN
Concatenate (visual and textual features)
Multi-class classification: NN → Cat, Dog, Rug, Grass, ...
Label quantity prediction: NN → 3
Output: Cat, Dog, Rug
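
The final step of this slide, combining per-label classification scores with a predicted label quantity, can be sketched as below. The scores, the function names, and the rounding and clamping of the predicted quantity are assumptions made for illustration; the slides do not specify how the two outputs are combined in the paper.

```python
def top_k_labels(scores, k):
    """Return the k highest-scoring labels, best first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


def annotate(scores, predicted_quantity):
    """Keep as many labels as the quantity predictor suggests.

    The prediction is rounded and clamped to [1, number of labels];
    this clamping rule is an assumption, not taken from the slides.
    """
    n = max(1, min(int(round(predicted_quantity)), len(scores)))
    return top_k_labels(scores, n)


# Hypothetical classifier scores for the labels shown on the slide.
scores = {"Cat": 0.91, "Dog": 0.85, "Rug": 0.40, "Grass": 0.05}

print(annotate(scores, 3))    # quantity 3 -> ['Cat', 'Dog', 'Rug']
print(annotate(scores, 1.2))  # quantity ~1 -> ['Cat']
```

Compared with a fixed top-k rule, the number of returned labels follows the per-image prediction, so an image with one true label is not padded out to k labels.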

12 Experiments (1/5) Datasets: NUS-WIDE, MSCOCO
T.-S. Chua, J. Tang, R. Hong, H. Li, Z. Luo, Y. Zheng, "NUS-WIDE: A real-world Web image database from National University of Singapore", Proc. CIVR, pp. 48, Jan
O. Vinyals, A. Toshev, S. Bengio, D. Erhan, "Show and tell: Lessons learned from the 2015 MSCOCO image captioning challenge", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 39, No. 4, pp , Apr

13 Experiments (2/5) NUS-WIDE
Cat, Dog, Rug; Dog, Chair, Refrigerator; Dog, Blanket; Dog, Chair, Door; Dog, Blanket; Cat, Dog, Rug

14 Experiments (3/5) MSCOCO
Cat, Dog, Rug; Dog, Chair, Refrigerator; Dog, Blanket; Dog, Chair, Door; Dog, Blanket; Cat, Dog, Rug

15 Experiments (4/5) NUS-WIDE MSCOCO

16 Experiments (5/5)

17 Conclusions
Multi-scale
Adaptive label

18 Thanks for listening


