1
Text Detection in Images and Video
2
Outline
Background
Video demo (Google Translate, HuayuNavi, Camcard)
Optical character recognition (OCR)
ICDAR competition
Part I: Caption detection in video
Part II: Scene text detection
Conclusion
3
Background
Detecting text and captions in videos is important and in great demand for video retrieval, annotation, indexing, and content analysis. Captions provide high-level semantic information such as program name, speaker name, speech content, sports scores, date, time, and location. Scene text detection attempts to extract textual information from images and videos captured in more natural settings.
4
Possible Applications
Annotation, tagging, indexing, search
Translation
Navigation
Book cover recognition
License plate recognition
Computerized aid for the visually impaired
Automatic geocoding of businesses
5
Demo Video
Google Translate, HuayuNavi, Camcard
6
OCR
Tesseract OCR (pptx file)
7
ICDAR Competitions
International Conference on Document Analysis and Recognition (ICDAR) competitions
Robust Reading Competition, Challenge 2: Reading Text in Scene Images
8
Text Localization Task
9
Word Recognition Task
10
Part I: Caption Detection
Text From Corners: A Novel Approach to Detect Text and Caption in Videos, IEEE Transactions on Image Processing, 2011.
11
Existing Methods
Most existing approaches fall into three categories: texture-based methods, connected-component-based methods, and edge-based methods.
12
Corner Features
Corner points offer three advantages:
Corners are frequent and essential patterns in text regions.
The distribution of corner points in text regions is usually orderly.
Corner points yield more flexible and efficient criteria, under which the margin between text and non-text regions in the feature space is discriminative.
13
Corner Extraction
14
Feature Description (1/2)
Morphological dilation is applied to the binary corner image. Corner points in text are dense and usually regularly placed along a horizontal string, so dilation merges them into connected regions. The text can then be detected effectively from the shape properties of the formed regions.
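The dilation step can be illustrated with a minimal pure-Python sketch. The square structuring element, the `radius` parameter, and the function name `dilate` are assumptions made for illustration; the slides do not state which structuring element is used.

```python
def dilate(binary, radius=1):
    """Dilate a binary image given as a list of rows of 0/1 values.

    Assumption: a square (2*radius+1) x (2*radius+1) structuring element.
    """
    height, width = len(binary), len(binary[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not binary[y][x]:
                continue
            # Turn on every in-bounds pixel within the square neighbourhood.
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        out[ny][nx] = 1
    return out


# Three isolated corner points placed regularly along a horizontal string.
corners = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
merged = dilate(corners, radius=1)
```

With this toy input the three separate points grow into one connected block, which is the kind of formed region whose shape properties are measured next.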
15
Feature Description (2/2)
Five region properties: area (Ra), saturation (Rs), orientation (Ro), aspect ratio (Ras), and position (Rc).
Bounding box: the smallest rectangle that completely encloses a region formed from the corner points.
16
Area
The area of a region is defined as the number of foreground pixels in the region enclosed by its rectangular bounding box.
17
Saturation
Saturation specifies the proportion of the pixels in the bounding box that also belong to the region, that is, the region's area Ra divided by the total number of pixels in its bounding box.
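A small sketch of area and saturation for a single region follows. It reads saturation as the region's pixel count divided by the number of pixels in its bounding box, which is an interpretation of the definition on the slide; the set-of-coordinates representation and the function names are illustrative assumptions.

```python
def bounding_box(region):
    """Return (top, left, bottom, right), inclusive, for a set of (row, col) pixels."""
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    return min(rows), min(cols), max(rows), max(cols)


def area(region):
    """Area: the number of foreground pixels in the region."""
    return len(region)


def saturation(region):
    """Saturation (assumed form): region area over bounding-box pixel count."""
    top, left, bottom, right = bounding_box(region)
    box_pixels = (bottom - top + 1) * (right - left + 1)
    return area(region) / box_pixels


# L-shaped region: it fills only part of its 3 x 3 bounding box.
region = {(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)}
region_area = area(region)              # 5 pixels
region_saturation = saturation(region)  # 5 of the 9 box pixels
```

A region that fills its bounding box completely has saturation 1, while sparse or irregular regions score lower.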
18
Orientation
Orientation is defined as the angle between the x-axis and the major axis of the ellipse that has the same second moments as the region.
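One common way to obtain such an angle is from the second central moments of the region's pixel coordinates. The sketch below assumes the ellipse in question is the one with the same second moments as the region, and it measures the angle in image coordinates (x is the column, y is the row and grows downward); both choices are assumptions, as are the function and variable names.

```python
import math


def orientation_deg(region):
    """Angle in degrees between the x-axis and the region's major axis.

    region: set of (row, col) pixels. The angle is computed in image
    coordinates, so a positive value tilts toward increasing row (downward).
    """
    n = len(region)
    mean_y = sum(r for r, _ in region) / n
    mean_x = sum(c for _, c in region) / n
    # Second central moments of the pixel coordinates.
    mu20 = sum((c - mean_x) ** 2 for _, c in region)
    mu02 = sum((r - mean_y) ** 2 for r, _ in region)
    mu11 = sum((c - mean_x) * (r - mean_y) for r, c in region)
    return math.degrees(0.5 * math.atan2(2 * mu11, mu20 - mu02))


horizontal = {(0, c) for c in range(5)}   # flat string of pixels
vertical = {(r, 0) for r in range(5)}     # upright string of pixels
diagonal = {(i, i) for i in range(5)}     # down-right diagonal
```

Here `orientation_deg(horizontal)` is 0 and `orientation_deg(diagonal)` is 45, while the vertical string gives an angle of magnitude 90; horizontal captions would therefore cluster near 0.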
19
Aspect Ratio and Position
Aspect Ratio: the aspect ratio of a bounding box is defined as the ratio of its width to its height.
Position: the position of a region is described by its centroid.
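The last two properties can be sketched the same way. The region representation (a set of (row, col) pixels) and the function names are illustrative assumptions, not taken from the slides.

```python
def aspect_ratio(region):
    """Width of the bounding box divided by its height."""
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    width = max(cols) - min(cols) + 1
    height = max(rows) - min(rows) + 1
    return width / height


def centroid(region):
    """Mean (row, col) of the region's pixels, used as its position."""
    n = len(region)
    mean_row = sum(r for r, _ in region) / n
    mean_col = sum(c for _, c in region) / n
    return mean_row, mean_col


# A wide, short block such as a dilated caption line: 2 rows by 6 columns.
caption_like = {(r, c) for r in range(2) for c in range(6)}
ratio = aspect_ratio(caption_like)   # 6 / 2
position = centroid(caption_like)    # centre of the block
```

Horizontal text lines tend to give aspect ratios well above 1, and the centroid records where in the frame the region sits.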
20
Language-independent
21
Part II: Scene Text Detection
Paper 1: Detecting text in natural scenes with stroke width transform, CVPR 2010.
Paper 2: How salient is scene text? Proceedings of the IAPR International Workshop on Document Analysis Systems.
Paper 3: Stroke Filter
22
Paper 1
Detecting text in natural scenes with stroke width transform, CVPR 2010.
23
Detected Text Area in Natural Scenes
24
The Extraction Process
25
Stroke Width Transform (SWT)
26
Flowchart
27
Performance Evaluation
28
Paper 2 How Salient is Scene Text?
29
How Salient is Scene Text?
Comparing the performance of four attention models in scene text detection:
Torralba’s saliency map (color)
Torralba’s saliency map (intensity)
Harel’s GBVS
Zhang’s Fast Saliency
Itti’s saliency map (N2P2CI)
30
Input and Ground Truth
31
Results
32
Performance Comparison
33
Stroke Filter (1/3) Original
34
Results
35
Stroke Filter (2/3) Improved
36
Stroke Filter (3/3) Fast
37
Conclusions
Caption detection is easier because of how captions originate: they are overlaid on the video rather than captured as part of the scene.
Robust scene text detection is more challenging because of unconstrained environments. Detecting Chinese text in natural images is still an open problem.