End-to-End Text Recognition with Convolutional Neural Networks

Name: End-to-End Text Recognition with Convolutional Neural Networks
Uploaded: 2017-07-18T12:43:02+00:00
Duration: PT6M53S
Channel: Myrtle Mosley
Description: End-to-End Text Recognition with Convolutional Neural Networks

End-to-End Text Recognition with Convolutional Neural Networks
Tao Wang*, David J. Wu*, Adam Coates, Andrew Y. Ng
Computer Science Department, Stanford University
* Denotes equal contribution

Scene Text Recognition Overview
Text “in the wild” is hard to recognize
Wide range of variations in backgrounds, textures, fonts, and lighting conditions
Example datasets: ICDAR 2003 Dataset (S. Lucas et al., 2003); Street View Text Dataset (K. Wang et al., 2011)

Two-Stage Framework
1. Detection/Classification
2. High-level Inference
Example output: “HOTEL”

Works                   | Classification and detection | High-level inference
Weinman et al., 2008    | Appearance + Geometry        | Semi-Markov CRF
K. Wang et al., 2011    | HOG + Random Ferns           | Pictorial Structure
Mishra et al., 2012     | HOG + SVM with RBF Kernel    | CRF + N-gram model
Neumann and Matas, 2012 | MSER + SVM with RBF Kernel   | Exhaustive Graph Search

Approach              | Classification and detection                      | High-level inference
Most other approaches | Hand-designed features + off-the-shelf classifier | Graph based inference models
Our approach          | Learnt features + 2-layer CNN                     | Simple off-the-shelf heuristics

SOTA on Various Benchmarks
Detection/Classification: ICDAR 62-way cropped character classification (SOTA)
After high-level inference: ICDAR and SVT cropped word recognition (SOTA on ICDAR)
End-to-end system: lexicon-based ICDAR and SVT end-to-end text recognition (SOTA)

Unsupervised Feature Learning
Contrast Normalization + ZCA whitening, then K-Means (Coates et al., 2011)
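The preprocessing named on this slide can be written out as a short worked formulation. This is a sketch under assumptions, not the authors' exact recipe: the constants \epsilon_n and \epsilon_z, the symbols, and the use of plain K-means (the cited work may use a variant) are illustrative choices not given on the slide.

```latex
% Per-patch contrast normalization (\epsilon_n is an assumed small constant)
\tilde{x} = \frac{x - \operatorname{mean}(x)}{\sqrt{\operatorname{var}(x) + \epsilon_n}}

% ZCA whitening from the eigendecomposition of the patch covariance
\Sigma = \frac{1}{m}\sum_{i=1}^{m} (\tilde{x}^{(i)} - \mu)(\tilde{x}^{(i)} - \mu)^{\top} = V \Lambda V^{\top},
\qquad
x_{\mathrm{ZCA}} = V (\Lambda + \epsilon_z I)^{-1/2} V^{\top} (\tilde{x} - \mu)

% Plain K-means on the whitened patches: assign, then re-estimate centroids
c^{(i)} = \arg\min_{k} \lVert x_{\mathrm{ZCA}}^{(i)} - d_k \rVert_2^2,
\qquad
d_k = \frac{1}{\lvert \{ i : c^{(i)} = k \} \rvert} \sum_{i : c^{(i)} = k} x_{\mathrm{ZCA}}^{(i)}
```

The learned centroids d_k then serve as the first-layer filters.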

1st layer: Convolution (96) + Spatial Pooling
2nd layer: Convolution (256) + Spatial Pooling
L2-SVM Classifier: √ Text, × Non-Text
Backpropagation
~10K parameters for detection
~50K parameters for classification
Large representation but not enough data. Overfitting?
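To make the "large representation" concern concrete, the sketch below tracks how the feature map shrinks through two convolution + pooling stages. Only the map counts 96 and 256 come from the slide; the 32x32 input, the kernel sizes, and the pooling windows are assumed values chosen for illustration, and the helper names are hypothetical.

```python
def conv_valid(size, kernel):
    """Side length after a 'valid' convolution (no padding, stride 1)."""
    if kernel > size:
        raise ValueError("kernel larger than input")
    return size - kernel + 1


def pool(size, window):
    """Side length after non-overlapping spatial pooling."""
    if size % window != 0:
        raise ValueError("pooling window must divide the input size")
    return size // window


def feature_dim(input_size, layers):
    """Return (final side length, flattened feature count).

    `layers` is a list of (kernel, pool_window, num_maps) tuples.
    """
    size = input_size
    maps = 1
    for kernel, window, num_maps in layers:
        size = pool(conv_valid(size, kernel), window)
        maps = num_maps
    return size, size * size * maps


# Assumed sizes: 32x32 input, 8x8 kernels + 5x5 pooling, then 2x2 kernels + 2x2 pooling.
# Only the map counts (96, 256) are taken from the slide.
side, dim = feature_dim(32, [(8, 5, 96), (2, 2, 256)])
print(side, dim)
```

Under these assumed sizes the classifier sees a 2x2x256 response per window, which is the vector handed to the L2-SVM.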

Synthetic Data
Panels: Real Data; Unrealistic Synthetic Data; Synthetic Data
Java.Font + Natural backgrounds; Color Statistics; Synthetic “hard negatives”

Detector Performance

Text Line Bounding boxes
Candidate spaces

Classifier Performance
62-way classification accuracy on ICDAR cropped characters
Figure: Accuracy (%) on ICDAR-Sample characters, higher is better; value shown: 83.9

Figure: Char Class vs. Sliding window position

Word Recognition max ∑ Lexicon: … MAKE S E R I E S SERIES ESTATE
POKER S E R I E S -5.45 7.82 -1.74 -9.02 max ∑ Tao Wang
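The "max ∑" scoring on this slide can be sketched as a small dynamic program: each character of a lexicon word is assigned to a sliding-window position, positions must increase left to right, and the word score is the best total. This is a minimal illustration rather than the authors' implementation; the function names and the toy score table are hypothetical.

```python
NEG_INF = float("-inf")


def alignment_score(word, responses):
    """Best sum of character scores over left-to-right alignments.

    `responses[j]` maps a character to its classifier score at window
    position j. Characters missing from a position are treated as impossible.
    """
    n = len(responses)
    # prev[j]: best score for the characters consumed so far using the first j positions
    prev = [0.0] * (n + 1)
    for ch in word:
        cur = [NEG_INF] * (n + 1)
        for j in range(1, n + 1):
            place_here = prev[j - 1] + responses[j - 1].get(ch, NEG_INF)
            cur[j] = max(cur[j - 1], place_here)  # skip position j, or use it
        prev = cur
    return prev[n]


def best_word(lexicon, responses):
    """Return the lexicon word with the highest alignment score."""
    return max(lexicon, key=lambda w: alignment_score(w, responses))


# Toy responses for three window positions (made-up numbers).
toy = [
    {"A": 1.0, "B": -1.0},
    {"A": -0.5, "B": 2.0},
    {"A": 0.5, "B": 0.2},
]
print(best_word(["AB", "BA"], toy))
```

A word with more characters than there are window positions scores negative infinity, so it can never win.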

Cropped Word Recognition Accuracy
Figure: Cropped Words Benchmarks (higher is better)

Candidate spaces generated by detector …

End-to-end text recognition results
Figure: End-to-end Benchmarks, F-Score (higher is better)

Sample Output Images from SVT

Sample Output Images from ICDAR-FULL

Hunspell as a source of the LEXICON
Raw recognized strings passed to Hunspell: PEOST, PEOSTEL
Suggested Words: POSE, POST; PEOPLE, PISTOL, …; POS, POST
c: “confidence margin”
Our F-score: 0.38
Neumann and Matas, 2010: 0.40
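The idea of turning spell-checker suggestions into a per-word lexicon can be sketched as follows. Hunspell itself is not called here: Python's standard `difflib.get_close_matches` is swapped in as a stand-in for the suggestion step, and the dictionary contents, function name, and cutoff are assumptions made for illustration.

```python
import difflib


def suggest_lexicon(raw_word, dictionary, max_suggestions=5, cutoff=0.6):
    """Return dictionary words similar to a raw recognized string.

    Stand-in for a spell checker's suggestion list; similarity is
    difflib's sequence ratio, not Hunspell's own ranking.
    """
    candidates = [w.upper() for w in dictionary]
    return difflib.get_close_matches(
        raw_word.upper(), candidates, n=max_suggestions, cutoff=cutoff
    )


# Hypothetical dictionary; a real system would query the spell checker instead.
dictionary = ["POSE", "POST", "PEOPLE", "PISTOL", "HOTEL"]
lexicon = suggest_lexicon("PEOST", dictionary)
print(lexicon)
```

The resulting short list would then be scored against the classifier responses in the same way as a fixed lexicon.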

Conclusion
Learnt features + 2-layer CNN for character detection and classification
Simple heuristics to build an end-to-end scene text recognition system
State-of-the-art performance on
- ICDAR cropped character classification
- ICDAR cropped word recognition
- Lexicon based end-to-end recognition on ICDAR and SVT
Extensible to a more general lexicon with an off-the-shelf spelling checker

Questions?
