Wei Liu, Chaofeng Chen and Kwan-Yee K. Wong

1 Char-Net: A Character-Aware Neural Network for Distorted Scene Text Recognition
Wei Liu, Chaofeng Chen and Kwan-Yee K. Wong
Department of Computer Science, The University of Hong Kong
AAAI-18

2 Motivation of Char-Net
*The original images are from Google Street View. Wei Liu, Chaofeng Chen and Kwan-Yee K. Wong. “Char-Net: A Character-Aware Neural Network for Distorted Scene Text Recognition”. AAAI-18.

6 Design of Char-Net
Distorted Text ([Shi et al, CVPR’2016], [Liu et al, BMVC’2016])
Pipeline: Text Image -> Spatial Transformer -> Encoder (Convolutional Neural Network -> convolutional feature map -> Bi-Directional LSTMs) -> CTC-based Decoder or Attention-based Decoder

7 Design of Char-Net
Spatial Transformer: TPS Transformation

9 Design of Char-Net
Spatial Transformer: TPS Transformation; Rotation

10 Design of Char-Net
Global and Complicated Transformation: TPS Transformation
Local and Simple Transformation: Rotation
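
The contrast on this slide can be illustrated with a small sketch: a rotation is controlled by a single angle, whereas a TPS transformation needs a whole set of control points. The code below is only an illustration under assumptions, not the authors' implementation; the function name `rotation_grid` and the normalized [-1, 1] coordinate convention are hypothetical choices borrowed from common spatial-transformer practice.

```python
import math


def rotation_grid(height, width, theta):
    """For each output pixel, return the (x, y) source point obtained by
    rotating normalized coordinates in [-1, 1] by theta radians about the centre."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    grid = []
    for i in range(height):
        # Map the row index to a normalized y in [-1, 1] (0 for a single row).
        y = 2.0 * i / (height - 1) - 1.0 if height > 1 else 0.0
        row = []
        for j in range(width):
            x = 2.0 * j / (width - 1) - 1.0 if width > 1 else 0.0
            # A one-parameter rotation: a simple, local transformation.
            row.append((cos_t * x - sin_t * y, sin_t * x + cos_t * y))
        grid.append(row)
    return grid


identity = rotation_grid(3, 3, 0.0)          # theta = 0 leaves the grid unchanged
quarter = rotation_grid(3, 3, math.pi / 2)   # quarter turn about the centre
```

With `theta = 0` the grid is the identity mapping; any other angle rotates every sampling point about the patch centre.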

11 Architecture of Char-Net
Diagram labels: Input Image, Word-Level Encoder, Character-Level Encoder, hyper-connections

17 Hierarchical Attention Mechanism (HAM)
Diagram labels: Input Image, Traditional Attention

18 HAM: Recurrent RoIWarp Layer
Diagram labels: Input Image, Traditional Attention, Recurrent Localization Network, Grid Generator, Bilinear Sampler

19 HAM: Recurrent RoIWarp Layer
Added label: Character Location and Size

20 HAM: Recurrent RoIWarp Layer
Added label: Traditional Attention Mechanism
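
As a rough illustration of what a traditional (soft) attention mechanism usually computes in attention-based recognizers: a score per feature is normalized with a softmax, and the result weights a sum over the features. This is a generic sketch and not the formulation on the slide; how the scores are produced is left out, and the name `soft_attention` is hypothetical.

```python
import math


def soft_attention(scores, features):
    """Softmax-normalize the scores and return (weights, weighted sum of features).

    scores:   one float per feature vector
    features: list of equal-length feature vectors (lists of floats)
    """
    # Subtract the maximum score for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    # The context vector is the attention-weighted sum of the feature vectors.
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, context


weights, context = soft_attention([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Equal scores give equal weights, so the context vector here is the plain average of the two features.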

22 HAM: Recurrent RoIWarp Layer
Diagram labels: Input Image, Recurrent Localization Network, Grid Generator, Bilinear Sampler
Crop the variable-size character of interest and warp it to a fixed size.

23 HAM: Recurrent RoIWarp Layer
1. Grid Generator: (u, v) is a point in the fixed-size warped output and (u', v') is its corresponding sampling point in the input.

24 HAM: Recurrent RoIWarp Layer
2. Bilinear Sampler
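
The two steps named on these slides, a grid generator followed by a bilinear sampler, can be sketched in a few lines. This is a minimal sketch under assumptions rather than the authors' layer: the region is taken to be an axis-aligned box described by a centre `(cx, cy)` and a size `(w, h)`, and the function names `make_grid`, `bilinear_sample` and `roi_warp` are hypothetical.

```python
def make_grid(cx, cy, w, h, out_w, out_h):
    """Grid generator: map each output point (u, v) to a sampling point
    (u', v') inside the box centred at (cx, cy) with size (w, h)."""
    grid = []
    for v in range(out_h):
        row = []
        for u in range(out_w):
            # Normalize (u, v) to [-0.5, 0.5], then scale and shift into the box.
            nu = u / (out_w - 1) - 0.5 if out_w > 1 else 0.0
            nv = v / (out_h - 1) - 0.5 if out_h > 1 else 0.0
            row.append((cx + nu * w, cy + nv * h))
        grid.append(row)
    return grid


def bilinear_sample(image, x, y):
    """Bilinear sampler: interpolate image (a list of rows) at real-valued (x, y)."""
    rows, cols = len(image), len(image[0])
    # Clamp so that the four neighbours stay inside the image.
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * image[y0][x0] + ax * image[y0][x1]
    bottom = (1 - ax) * image[y1][x0] + ax * image[y1][x1]
    return (1 - ay) * top + ay * bottom


def roi_warp(image, cx, cy, w, h, out_w, out_h):
    """Crop a variable-size box and warp it to a fixed out_h x out_w patch."""
    grid = make_grid(cx, cy, w, h, out_w, out_h)
    return [[bilinear_sample(image, x, y) for (x, y) in row] for row in grid]


image = [[0.0, 1.0, 2.0],
         [3.0, 4.0, 5.0],
         [6.0, 7.0, 8.0]]
patch = roi_warp(image, cx=1.0, cy=1.0, w=1.0, h=1.0, out_w=2, out_h=2)
# patch == [[2.0, 3.0], [5.0, 6.0]]
```

Because the output size is fixed while the box size is free, regions of any size end up as patches of the same shape, and the interpolation keeps the operation differentiable with respect to the box parameters in a real framework.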

25 HAM: Character-Level Attention
Takes the form of the traditional attention mechanism
Essential for end-to-end training
Semi-supervised learning of character locations
Diagram labels: Input Image, Distortion of the whole text

26 Experiment
Six public benchmarks:
ICDAR-2003 (IC-03)
Street View Text (SVT)
IIIT5K
Street View Text Perspective (SVT-P)
ICDAR Incidental Scene Text (IC-IST)

27 Experiment - Comparison with Previous Methods
Experimental Setting:
Classes: 37 (26 case-insensitive letters + 10 digits + eos)
Training dataset: 8 million synthetic images (Jaderberg et al. 2014)
Image size: 100 x 32
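
The class count for this setting follows from 26 + 10 + 1 = 37. A small sketch of such a label set is given below; the token name `<eos>`, the index ordering and the helper `encode` are assumptions made for illustration, not details taken from the slides.

```python
import string

EOS = "<eos>"  # hypothetical name for the end-of-sequence class

# 26 case-insensitive letters + 10 digits + eos = 37 classes.
CHARSET = list(string.ascii_lowercase) + list(string.digits) + [EOS]
CHAR_TO_INDEX = {c: i for i, c in enumerate(CHARSET)}


def encode(word):
    """Lower-case the word (labels are case-insensitive) and append eos."""
    return [CHAR_TO_INDEX[c] for c in word.lower()] + [CHAR_TO_INDEX[EOS]]


labels = encode("Toast")
# len(CHARSET) == 37; labels == [19, 14, 0, 18, 19, 36]
```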

28 Experiment - General Scene Text Recognition
Experimental Setting:
Classes: 96 (26 upper-case letters + 26 lower-case letters + 10 digits + 33 punctuation marks + eos)
Training datasets: 12 million synthetic images (Jaderberg et al. 2014; Gupta et al. 2016)
Image size: 100 x 100
Data augmentation

29 Experiment – Qualitative Results
Character-Level Attention examples, with predictions: toast, beyond, wishing

30 Conclusion
A simple but efficient network for distorted text recognition
End-to-end trainable framework
Hierarchical attention mechanism
Character-level encoder

