Published by Hendri Tanuwidjaja. Modified over 6 years ago.
Char-Net: A Character-Aware Neural Network for Distorted Scene Text Recognition
Wei Liu, Chaofeng Chen and Kwan-Yee K. Wong
Department of Computer Science, The University of Hong Kong
AAAI-18
Motivation of Char-Net
*The original images are from Google Street View. Wei Liu, Chaofeng Chen and Kwan-Yee K. Wong. “Char-Net: A Character-Aware Neural Network for Distorted Scene Text Recognition”. AAAI-18.
Design of Char-Net
Distorted Text ([Shi et al., CVPR 2016], [Liu et al., BMVC 2016])
Text Image → Spatial Transformer → Encoder (Convolutional Neural Network → convolutional feature map → Bi-Directional LSTMs) → CTC-based Decoder or Attention-based Decoder
Design of Char-Net
Spatial Transformer with TPS Transformation: Global and Complicated Transformation
Spatial Transformer with Rotation: Local and Simple Transformation
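To make the "local and simple" case concrete, here is a minimal sketch of a sampling grid for a pure rotation. The slide only names "Rotation", so the normalised [-1, 1] coordinates, the rotation about the patch centre, and the function name `rotation_grid` are all assumptions of this example, not details taken from the paper.

```python
import math

def rotation_grid(height, width, theta):
    """Sampling points for a rotation by `theta` radians about the centre.

    Assumption of this sketch: coordinates are normalised to [-1, 1].
    grid[i][j] is the source point (u', v') for output row i, column j.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    grid = []
    for i in range(height):
        # Normalised vertical coordinate of output row i.
        v = -1.0 + 2.0 * i / (height - 1) if height > 1 else 0.0
        row = []
        for j in range(width):
            # Normalised horizontal coordinate of output column j.
            u = -1.0 + 2.0 * j / (width - 1) if width > 1 else 0.0
            # Standard 2-D rotation of (u, v) by theta.
            row.append((cos_t * u - sin_t * v, sin_t * u + cos_t * v))
        grid.append(row)
    return grid
```

A rotation has a single free parameter (the angle), whereas a TPS transformation is controlled by a whole set of control points, which is one way to read the "simple" versus "complicated" contrast on the slide.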
Architecture of Char-Net
[Figure labels: Input Image; Word-Level Encoder; Character-Level Encoder; hyper-connection (shown twice)]
Hierarchical Attention Mechanism (HAM)
[Figure labels: Input Image; Traditional Attention]
HAM: Recurrent RoIWarp Layer
Components: Recurrent Localization Network, Grid Generator, Bilinear Sampler
[Figure labels: Input Image; Traditional Attention; formulas labelled "Character Location and Size" and "Traditional Attention Mechanism"]
HAM: Recurrent RoIWarp Layer
Components: Recurrent Localization Network, Grid Generator, Bilinear Sampler
Crop and warp: a variable-size character of interest is cropped from the input and warped to a fixed size.
1. Grid Generator: for each point (u, v) in the fixed-size output, computes its corresponding sampling point (u', v') in the source.
2. Bilinear Sampler
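The two steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the paper's layer: the box is taken to be axis-aligned and described by a hypothetical centre `(cx, cy)` and size `(w, h)` in source pixel units, the source is a 2-D list of numbers, and samples falling outside the source count as zero.

```python
import math

def make_grid(cx, cy, w, h, out_h, out_w):
    """Grid generator: map each output point (u, v) to a source point (u', v').

    Assumed parameterisation: (cx, cy) is the box centre and (w, h) is the
    distance between the first and last sample along each axis.
    """
    grid = []
    for i in range(out_h):
        # v runs over [-0.5, 0.5] across the output height.
        v = (i / (out_h - 1) - 0.5) if out_h > 1 else 0.0
        row = []
        for j in range(out_w):
            # u runs over [-0.5, 0.5] across the output width.
            u = (j / (out_w - 1) - 0.5) if out_w > 1 else 0.0
            row.append((cx + u * w, cy + v * h))
        grid.append(row)
    return grid

def bilinear_sample(src, x, y):
    """Bilinearly interpolate `src` at (x, y); pixels outside `src` count as 0."""
    x0, y0 = math.floor(x), math.floor(y)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < len(src) and 0 <= xx < len(src[0]):
                # Weight falls off linearly with distance to each neighbour.
                weight = (1 - abs(x - xx)) * (1 - abs(y - yy))
                value += weight * src[yy][xx]
    return value

def roi_warp(src, cx, cy, w, h, out_h, out_w):
    """Crop the box from `src` and warp it to a fixed out_h x out_w patch."""
    grid = make_grid(cx, cy, w, h, out_h, out_w)
    return [[bilinear_sample(src, x, y) for (x, y) in row] for row in grid]
```

Because the output size is fixed by `out_h` and `out_w`, boxes of any size produce patches of the same shape, which matches the "variable-size character of interest with a fixed size" note on the slide.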
HAM: Character-Level Attention
Takes the form of the traditional attention mechanism
Essential for end-to-end training
Semi-supervised learning for character locations
Distortion of the whole text
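As a reminder of what "the form of the traditional attention mechanism" usually means, the sketch below computes softmax-normalised weights over a set of feature vectors and returns their weighted sum. How the raw scores are produced is not given on the slides, so they are passed in directly here; the function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(features, scores):
    """Return (weights, context): softmax weights and the weighted feature sum.

    `features` is a list of equal-length vectors; `scores` holds one raw
    relevance score per vector (assumed to come from a learned scorer).
    """
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)
    ]
    return weights, context
```

Every step here is differentiable with respect to the scores and the features, which is the usual reason soft attention of this kind suits end-to-end training.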
Experiment
Six public benchmarks:
ICDAR-2003 (IC-03)
Street View Text (SVT)
IIIT5K
Street View Text Perspective (SVT-P)
ICDAR Incidental Scene Text (IC-IST)
Experiment - Comparison with Previous Methods
Experimental Setting:
37 classes (26 case-insensitive characters + 10 digits + eos)
Training datasets: 8-million synthetic images (Jaderberg et al. 2014)
Image size: 100 × 32
Experiment - General Scene Text Recognition
Experimental Setting:
96 classes (26 upper-case letters + 26 lower-case letters + 10 digits + 33 punctuation marks + eos)
Training datasets: 12-million synthetic images (Jaderberg et al. 2014; Gupta et al. 2016)
Image size: 100 × 100
Data augmentation
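The class counts in the two experimental settings can be checked with a short sketch. The token name `<eos>` and the helper `build_charset` are hypothetical, and the 33 punctuation marks are assumed here to be the 32 ASCII punctuation characters plus the space character; the slides do not say which symbols are included.

```python
import string

EOS = "<eos>"  # hypothetical name for the end-of-sequence class

def build_charset(case_sensitive, punctuation):
    """Return the list of output classes: letters, digits, punctuation, eos."""
    letters = string.ascii_letters if case_sensitive else string.ascii_lowercase
    return list(letters) + list(string.digits) + list(punctuation) + [EOS]

# Setting 1: 26 case-insensitive letters + 10 digits + eos.
classes_37 = build_charset(case_sensitive=False, punctuation="")

# Setting 2: 52 letters + 10 digits + 33 punctuation marks + eos.
# Assumption: the 33 marks are string.punctuation (32 characters) plus space.
classes_96 = build_charset(case_sensitive=True, punctuation=string.punctuation + " ")
```

With these assumptions `len(classes_37)` is 37 and `len(classes_96)` is 96, matching the totals stated on the slides.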
Experiment - Qualitative Results
[Figure: three examples with character-level attention; predictions: toast, beyond, wishing]
Conclusion
A simple but efficient network for distorted text recognition
End-to-end trainable framework
Hierarchical attention mechanism
Character-level encoder