1
Text detection in images taken from a video stream for semantic indexing
Christian Wolf
Laboratoire d'Informatique en Images et Systèmes d'information (LIRIS), FRE 2672 CNRS
Bât. Jules Verne, INSA de Lyon, 69621 Villeurbanne cedex
1 July 2004
Christian.wolf@liris.cnrs.fr
http://rfv.insa-lyon.fr/~wolf
2
Plan: Introduction, Features, Evaluation / Choice of features, Text detection, Experimental Results, Conclusion
3
Image/video indexing
- Content-based image retrieval (Master's degree): query by example; indexing based on local texture (Gabor) features.
- Video indexing using semantic descriptors (PhD): text detection, enhancement, segmentation and recognition.
- Indexing phase: key words taken from the text shown in the video. Sample frame text: Patrick Mayhew / Min. chargé de l´irlande de Nord / ISRAEL / Jerusalem / montage T.Nouel...
- Result: keyword-based search.
4
Text detection
Steps: Detection, Enhancement, Segmentation
Example text: “Soukaina Oufkir”
5
Detection in an image
Problems:
- Which features?
  - Contrast and edge features
  - Geometrical features
  - Texture features
  - Color features
  - Corner features
  - Region/stroke segmentation
- How can the decision (text vs. non-text) be taken?
  - Separate populations (discriminant analysis)
  - Learning a model (SVM, etc.)
  - Reinforcement learning (Master's thesis of Graham Taylor)
  - Heuristics
6
Plan: Introduction, Features, Evaluation / Choice of features, Text detection, Experimental Results, Conclusion
7
Videos vs. scanned documents
- Temporal aspects
- Complex and moving background
- Artificial shadows
8
Videos vs. scanned documents
- Low resolution
- Low quality
- Anti-aliasing artifacts
- Compression artifacts
- Color bleeding
9
What is text? - character segmentation
- Artificial text
- Scene text
10
What is text? - texture
Example: Gabor energy features on a text image
[Figure panels: Original image; Filter tuned to the example text; Gabor energy; Thresholded Gabor energy]
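As a rough illustration of the Gabor energy named on this slide, here is a minimal sketch in plain Python. It assumes the common definition of Gabor energy as the sum of the squared responses of an even (cosine) and an odd (sine) Gabor filter. The function names, the parameters (`radius`, `wavelength`, `sigma`, `theta`, `gamma`) and the zero-mean correction of the even kernel are assumptions of this sketch and are not taken from the slides.

```python
import math


def gabor_kernels(radius, wavelength, sigma, theta=0.0, gamma=1.0):
    """Build an even (cosine) and an odd (sine) Gabor kernel of size (2*radius+1)^2."""
    even, odd = [], []
    for dy in range(-radius, radius + 1):
        row_even, row_odd = [], []
        for dx in range(-radius, radius + 1):
            # Rotate coordinates so the carrier wave runs along direction theta.
            xr = dx * math.cos(theta) + dy * math.sin(theta)
            yr = -dx * math.sin(theta) + dy * math.cos(theta)
            envelope = math.exp(-(xr * xr + gamma * gamma * yr * yr) / (2 * sigma * sigma))
            phase = 2 * math.pi * xr / wavelength
            row_even.append(envelope * math.cos(phase))
            row_odd.append(envelope * math.sin(phase))
        even.append(row_even)
        odd.append(row_odd)
    # Assumption of this sketch: remove the mean of the even kernel so that
    # flat image regions produce (numerically) zero response.
    count = (2 * radius + 1) ** 2
    dc = sum(map(sum, even)) / count
    even = [[v - dc for v in row] for row in even]
    return even, odd


def gabor_energy_at(img, y, x, even, odd):
    """Gabor energy at pixel (y, x): squared even response plus squared odd response."""
    radius = len(even) // 2
    re = im = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            p = img[y + dy][x + dx]
            re += p * even[dy + radius][dx + radius]
            im += p * odd[dy + radius][dx + radius]
    return re * re + im * im
```

With a filter whose wavelength matches the stroke period, a striped (text-like) patch gives a large energy while a flat patch gives an energy close to zero; thresholding that energy yields a map like the last panel. The caller is expected to keep `(y, x)` at least `radius` pixels away from the image border.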
11
What is text? - texture
12
What is text? - corners
Unthresholded “Harris” corner response
[Figure labels: Derivative; 2nd derivative smeared]
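The unthresholded Harris response mentioned above can be sketched as follows. This is a simplified illustration, not the implementation behind the slide: it assumes central-difference gradients, a plain 3x3 box window instead of the usual Gaussian weighting, and the commonly used constant `k = 0.04`. The function name `harris_response` is hypothetical.

```python
def harris_response(img, k=0.04):
    """Return the unthresholded Harris response R = det(M) - k * trace(M)^2.

    M is the structure tensor summed over a 3x3 window (a box window is an
    assumption of this sketch). Border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    # Central-difference gradients on interior pixels.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a = b = c = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx = ix[y + dy][x + dx]
                    gy = iy[y + dy][x + dx]
                    a += gx * gx  # sum of Ix^2
                    b += gy * gy  # sum of Iy^2
                    c += gx * gy  # sum of Ix*Iy
            resp[y][x] = (a * b - c * c) - k * (a + b) ** 2
    return resp
```

The response is positive at corners, negative along straight edges and zero in flat regions. Text regions contain many stroke ends and junctions, which is why the raw response can serve as a feature before any thresholding.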
13
What is text? - contrast & geometry
[Figure panels: Example image; Accumulated horizontal Sobel edges]
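A minimal sketch of what "accumulated horizontal Sobel edges" could look like in code. It assumes that the absolute response of the horizontal-derivative Sobel kernel is summed over a horizontal window around each pixel; the window size `half_width` and both function names are assumptions of this sketch.

```python
def sobel_horizontal(img):
    """Absolute response of the Sobel kernel for the horizontal derivative."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            right = img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
            left = img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1]
            out[y][x] = abs(right - left)
    return out


def accumulate_horizontally(edges, half_width):
    """Sum the edge magnitudes in a horizontal window of 2*half_width+1 pixels."""
    h, w = len(edges), len(edges[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            lo = max(0, x - half_width)
            hi = min(w, x + half_width + 1)
            out[y][x] = sum(edges[y][lo:hi])
    return out
```

Because characters produce many vertical strokes close together, the accumulated value is high along a text line and low on smooth background, which combines the contrast cue with the horizontal geometry of text.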
14
What is text? - color
[Figure panels: Original image; Sobel on grayscale image; Modified Sobel on L*u*v* image]
Special cases of text:
- Low contrast in the luminance plane
- High(er) contrast in the color plane
15
Plan: Introduction, Features, Evaluation / Choice of features, Text detection, Experimental Results, Conclusion
16
Evaluation
A good evaluation algorithm:
- Permits a simple and intuitive interpretation of the obtained performance.
- Permits an objective comparison between the different algorithms to evaluate.
- Gives a good correspondence between the performance measures and the real performance, taking into account the objective of the algorithm (goal-oriented approach).
- Takes into account only the performance of the algorithm, without side effects of other processing steps.
17
Evaluation at different levels
- Statistical separation: Bhattacharyya distance
- Error rate, Recall/Precision on pixel level
- Recall/Precision on rectangle level
- Goal oriented: Recall/Precision on character level
Arrow labels: Higher relevance to the application; Lower influence of later stages; Lower computational complexity
[Figure panels: Detection result; Ground truth. Sample frame text: Patrick Mayhew / Min. chargé de l´irlande de Nord / ISRAEL / Jerusalem / montage T.Nouel...]
18
Evaluation on rectangle level
[Figure panels: Detection; Ground truth]
Pure overlap is ambiguous on multiple images. A recall of 50% could mean:
- 50% of the text rectangles have been detected perfectly,
- 100% of the rectangles have been detected with 50% of their surface,
- or anything between the two.
19
Evaluation on rectangle level
Requirements of an evaluation measure:
- Tells intuitively how many rectangles have been detected, and how many false alarms occurred
- Measures the detection quality
- Takes into account one-to-one, one-to-many and many-to-one matches
- Scales up to multiple images
Problem: counting the number of correctly detected rectangles and measuring the detection quality are contradictory requirements.
20
Performance graphs
Ground truth rectangles G_i, detected rectangles D_i
“Surface” recall and precision, thresholded by different thresholds on recall and precision.
For each rectangle, a quality threshold then decides whether it counts as detected or not.
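The formulas of this slide did not survive in the transcript, so the following is only a sketch under stated assumptions: surface recall is taken as the intersection area divided by the ground-truth area, surface precision as the intersection area divided by the detection area, and only one-to-one matches are handled. The `Rect` type and all function names are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x0: float
    y0: float
    x1: float
    y1: float


def area(r):
    return max(0.0, r.x1 - r.x0) * max(0.0, r.y1 - r.y0)


def intersection_area(a, b):
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    return max(0.0, w) * max(0.0, h)


def surface_recall(g, d):
    """Fraction of the ground-truth rectangle g covered by detection d."""
    return intersection_area(g, d) / area(g)


def surface_precision(g, d):
    """Fraction of the detection d that lies inside ground-truth rectangle g."""
    return intersection_area(g, d) / area(d)


def is_detected(g, detections, t_recall, t_precision):
    """True if some detection matches g above both quality thresholds.

    Assumes g has non-zero area; detections with zero area are skipped.
    """
    return any(
        surface_recall(g, d) >= t_recall and surface_precision(g, d) >= t_precision
        for d in detections
        if area(d) > 0
    )
```

Counting the ground-truth rectangles for which `is_detected` is true, while sweeping one threshold and fixing the other, gives one curve of a performance graph: each point is an object count, and the threshold axis carries the detection quality.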
21
Performance graphs
[Figure panels: Threshold on surface recall; Threshold on surface precision]
22
Comparison of different detection algorithms
- Method 1: Local contrast
- Method 2: SVM learning
23
The influence of the test database
[Figure panels: Local contrast; SVM learning]
24
Plan: Introduction, Features, Evaluation / Choice of features, Text detection, Experimental Results, Conclusion
25
The local contrast method (F. LeBourgeois)
1. Calculate a text probability image according to a text model (1 value per pixel).
2. Separate the probability values into 2 classes (Fisher/Otsu).
3. Post-processing:
   - Mathematical morphology
   - Geometrical constraints
   - Verification of special cases
   - Combination of rectangles
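The two-class separation step can be illustrated with a small Otsu-style threshold search. This is a generic sketch of Otsu's criterion (maximizing the between-class variance over a histogram), not the code used for the slides; the function name, the `bins` parameter and the convention that higher values mean "more likely text" are assumptions.

```python
def otsu_threshold(values, bins=256):
    """Return a threshold t; values < t form one class, values >= t the other."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # all values identical: nothing to separate
    scale = (bins - 1) / (hi - lo)
    hist = [0] * bins
    for v in values:
        hist[min(bins - 1, int((v - lo) * scale))] += 1

    total = len(values)
    sum_all = sum(i * n for i, n in enumerate(hist))
    w0 = 0    # number of samples in the lower class
    sum0 = 0  # sum of bin indices in the lower class
    best_k, best_var = 0, -1.0
    for k in range(bins - 1):
        w0 += hist[k]
        sum0 += k * hist[k]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2  # between-class variance (unnormalized)
        if between > best_var:
            best_var, best_k = between, k
    # Upper edge of bin best_k, mapped back to the original value range.
    return lo + (best_k + 1) / scale
```

Applied to the flattened text probability image, pixels at or above the returned threshold would be kept as text candidates and passed on to the post-processing steps.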
26
The learning method
- Learning gray values and edge maps alone may not generalize enough.
- Texture alone is not reliable, especially if the text is short.
- Geometry is a valuable feature.
- State of the art: enforce geometrical constraints in the post-processing step (mathematical morphology).
- We propose using geometrical features very early in the detection process, i.e. not during post-processing.
27
Geometrical features: baseline
Text consists of:
- A high density of strokes in the direction of the text baseline.
- A consistent baseline (a rectangular region with an upper and lower border).
Two detection philosophies:
- Detection of the baseline directly, before detecting the text region.
- Detection of the baseline as the boundary area of the detected text region, in order to refine the detection quality.
28
Estimation of the text rectangle height
[Figure panels: Original image; Accumulated gradients]
29
Features
- Mode width (= rectangle height)
- Mode height (= contrast)
- Difference in height, left-right
- Mode mean
- Mode standard deviation
- Difference in mode width
30
Learning with Support Vector Machines
- Training image database: positive samples, negative samples
- Bootstrapping, cross-validation
- Classification step: a reduction of the computational complexity is necessary:
  - Sub-sampling of the pixels to classify (4x4)
  - Approximation of the SVM model by SVM regression
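One way to read the 4x4 sub-sampling is that the expensive classifier is evaluated once per 4x4 block and its decision is reused for the whole block. The sketch below follows that reading, which is an assumption; `classify_pixel` stands in for a trained model and is hypothetical, as is the function name.

```python
def classify_subsampled(height, width, classify_pixel, step=4):
    """Classify one pixel per step x step block and copy the label to the block.

    classify_pixel(y, x) is a placeholder for the (expensive) trained
    classifier; it is called roughly height * width / step^2 times.
    """
    labels = [[False] * width for _ in range(height)]
    for by in range(0, height, step):
        for bx in range(0, width, step):
            # Evaluate the classifier once, near the block centre.
            cy = min(by + step // 2, height - 1)
            cx = min(bx + step // 2, width - 1)
            label = classify_pixel(cy, cx)
            for y in range(by, min(by + step, height)):
                for x in range(bx, min(bx + step, width)):
                    labels[y][x] = label
    return labels
```

With `step=4` the number of classifier evaluations drops by a factor of about 16, at the cost of a coarser text map whose borders are refined in later stages.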
31
Plan: Introduction, Features, Evaluation / Choice of features, Text detection, Experimental Results, Conclusion
32
- AIM2: Commercials
- AIM3: News
- AIM4: Cartoons, News
- AIM5: News
33
Detection in still images
[Figure panels: Local contrast; SVM learning]
34
[Figure panels: Local contrast; SVM learning]
35
[Figure panels: Local contrast; SVM learning]
36
Detection in video sequences
37
Character segmentation: examples
[Figure panels: Original image; Fisher/Otsu; Fisher/Otsu (windowed); Yanowitz-B.; Yanowitz-B. + post-proc.; Niblack; Sauvola et al.; Contrast maximiz.]
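Two of the local binarization methods compared here, Niblack and Sauvola et al., differ only in how the local threshold is derived from the window mean and standard deviation. The sketch below uses the formulas as they are usually cited, with commonly quoted default parameters (`k = -0.2` for Niblack, `k = 0.5` and `R = 128` for Sauvola); these values, the function names and the convention "1 = background, 0 = dark text" are assumptions of the sketch, not settings from the slides.

```python
from statistics import mean, pstdev


def niblack_threshold(window, k=-0.2):
    """Niblack: T = m + k * s (m, s = mean and standard deviation of the window)."""
    return mean(window) + k * pstdev(window)


def sauvola_threshold(window, k=0.5, R=128.0):
    """Sauvola et al.: T = m * (1 + k * (s / R - 1))."""
    m, s = mean(window), pstdev(window)
    return m * (1.0 + k * (s / R - 1.0))


def binarize(img, radius, threshold_fn):
    """Label each pixel 1 (background) or 0 (dark text) using a local threshold."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = 1 if img[y][x] > threshold_fn(window) else 0
    return out
```

On a perfectly flat window the Niblack threshold equals the mean, so flat background is not cleanly separated, whereas the Sauvola threshold drops well below the mean and leaves the background untouched. This is the usual motivation for the Sauvola variant.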
38
OCR results
Recognition by ABBYY FineReader 5.0
- Local contrast based binarization
- Sauvola et al.
- MRF: Bayesian estimation using a Markov random field prior
39
TREC 2002
Collaboration with Laboratory LAMP, University of Maryland
“Dance”, “Energy Gas”, “Music”, “Oil”, “Airline”, “Air plane”
40
Conclusion
- The choice of features is crucial in vision.
- We developed a new system for detection, tracking, enhancement and binarization of text.
- Detection performance is high due to the integration of several types of features at a very early stage. The learning method is less sensitive to textured noise in the image.
- We propose a new evaluation method which allows intuitive visualization of the detection quality by performance graphs.
41
Outlook
- Possible improvement of the features (e.g. contrast normalization, non-linear texture filters).
- Integration of different feature types (statistical, structural, ...).
- Usage of a priori knowledge on text in order to decrease the number of false alarms.
- Integration of the detected text into an indexing/browsing/segmentation framework.
42
Optional slides
43
The Bhattacharyya distance
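Only the title of this slide survives in the transcript. For reference, the standard definition of the Bhattacharyya distance between two densities, and its closed form for Gaussian classes, are given below. Reading $p_1$ and $p_2$ as the feature distributions of the text and non-text populations is an assumption based on the "statistical separation" item of the evaluation slides.

```latex
% General definition
D_B(p_1, p_2) = -\ln \int \sqrt{p_1(x)\, p_2(x)} \; dx

% Closed form for two Gaussians N(\mu_1, \Sigma_1) and N(\mu_2, \Sigma_2)
D_B = \frac{1}{8} (\mu_1 - \mu_2)^{\top} \Sigma^{-1} (\mu_1 - \mu_2)
    + \frac{1}{2} \ln \frac{\det \Sigma}{\sqrt{\det \Sigma_1 \, \det \Sigma_2}},
\qquad \Sigma = \frac{\Sigma_1 + \Sigma_2}{2}
```

A larger distance means the two populations overlap less, so a feature with a larger $D_B$ separates text from non-text better in the statistical sense.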