Published by Malcolm Lang. Modified over 9 years ago.
1
Detection and Extraction of Artificial Text from Videos
Project France Télécom Research & Development 001B575
Christian Wolf and Jean-Michel Jolion, http://rfv.insa-lyon.fr/~{wolf,jolion}
Laboratoire de Reconnaissance de Formes et Vision, Bât. Jules Verne, INSA, 69621 Villeurbanne CEDEX
10th July 2001
2
Plan of the presentation
- Introduction
- Detection
- Image enhancement - multiple frame integration
- Binarisation of the text boxes
- Setup of the experiments
- Results
  - Detection
  - Binarisation
  - OCR
- Conclusion and outlook
Slides: 6, 8, 3, 10, 11, 6, 2 (46 in total)
Navigation bar: Intro | Detection | Enhancement | Binarisation | Results | Experiments
3
Content-based image retrieval
[Diagram: example image, indexing phase, similarity function, result]
4
Similarity measures
[Figure: image pairs labelled "similar" and "not similar"]
5
Indexing using text
Keyword-based search
[Diagram: indexing phase, key word, result. Example text found in the frames: "Patrick Mayhew", "Min. chargé de l'irlande de Nord", "ISRAEL Jerusalem", "montage T.Nouel", ...]
6
Video properties
[Figure labels: 80 px, 12 px, 8 px]
7
Text extraction: general scheme
Input: video
1. Detection of the text in single frames
2. Tracking
3. Image enhancement - multiple frame integration
4. Segmentation / binarisation
5. OCR
Example output: "EVENEMENT", "ACTU", "SPELEOS", "Gouffre Berger (Isére)", "aujourd'hui", "France 3 Alpes", "un spéléologue sauveteur"
8
Detection in single frames
Input: video
1. Calculation of the gradient
2. Accumulation
3. Binarisation
4. Mathematical morphology
5. Connected components analysis
6. Verification of geometric constraints
7. Combination of the rectangles
8. Verification of special cases
Output: list of rectangles
9
Detection in single frames: examples
10
A filter for text detection
Accumulation of horizontal gradients.
Justification: text forms a regular texture containing vertical edges which are aligned horizontally.
[Figure: accumulation window W]
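The accumulation idea can be sketched as follows. This is a minimal illustration and not the authors' filter: the forward-difference gradient, the absolute value, the plain sum and the name `half_width` are all assumptions, since the slide's formula did not survive in the transcript.

```python
def accumulated_horizontal_gradient(image, half_width):
    """Sum horizontal gradient magnitudes over a horizontal window.

    image is a list of rows of gray values. The gradient estimate
    (forward difference) and the plain sum are assumptions.
    """
    result = []
    for row in image:
        width = len(row)
        # Horizontal gradient magnitude; the last column has no right neighbour.
        grad = [abs(row[x + 1] - row[x]) for x in range(width - 1)] + [0]
        acc = []
        for x in range(width):
            lo = max(0, x - half_width)
            hi = min(width, x + half_width + 1)
            acc.append(sum(grad[lo:hi]))  # accumulate along the text direction
        result.append(acc)
    return result


# A row with alternating strokes (text-like) and a flat background row.
image = [
    [0, 0, 255, 0, 255, 0, 0],
    [10, 10, 10, 10, 10, 10, 10],
]
response = accumulated_horizontal_gradient(image, half_width=1)
print(response[0])  # strong response on the textured row
print(response[1])  # no response on the flat row
```

Rows containing many vertical edges give a high accumulated value, which is what the subsequent binarisation step of the detector can threshold.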
11
Mathematical morphology
- Close
- Deletion of small bridges between the components: erode horizontally, dilate horizontally
- Dilate (special) to connect characters
- Erode (special) to connect characters
12
Detection in video sequences
1. Detection per single frame (output: list of rectangles per frame)
2. Tracking - keeping track of text occurrences
3. Suppression of false alarms
4. Image enhancement - multiple frame integration
[Diagram: text occurrences over frame number (time)]
13
Integration of the rectangles into text occurrences
At every new frame, the detected rectangles must be matched with the stored text occurrences. The integration is done using overlap information (overlap matrix) between:
- the list of rectangles detected for the current frame, and
- the list containing the most recent rectangle of each text occurrence.
[Diagram: text occurrences over frame number (time)]
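A minimal sketch of matching via an overlap matrix, under stated assumptions: rectangles are `(x0, y0, x1, y1)` tuples, overlap is the intersection area divided by the smaller rectangle's area, and `min_overlap = 0.5` is a hypothetical threshold. The slides do not specify any of these choices.

```python
def area(r):
    """Area of a rectangle (x0, y0, x1, y1); 0 for empty rectangles."""
    return max(0, r[2] - r[0]) * max(0, r[3] - r[1])


def overlap(a, b):
    """Intersection area relative to the smaller rectangle (assumed measure)."""
    inter = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    smaller = min(area(a), area(b))
    return area(inter) / smaller if smaller > 0 else 0.0


def match_rectangles(detected, occurrences, min_overlap=0.5):
    """Match each detected rectangle to the best-overlapping stored occurrence.

    Returns the overlap matrix, a dict {detected index: occurrence index},
    and the indices of detections that start a new text occurrence.
    """
    matrix = [[overlap(d, o) for o in occurrences] for d in detected]
    matches, new = {}, []
    for i, row in enumerate(matrix):
        best = max(range(len(row)), key=row.__getitem__, default=None)
        if best is not None and row[best] >= min_overlap:
            matches[i] = best
        else:
            new.append(i)
    return matrix, matches, new


# Most recent rectangle of each stored text occurrence.
occurrences = [(10, 10, 110, 30), (10, 200, 90, 220)]
# Rectangles detected in the current frame.
detected = [(12, 11, 112, 31), (200, 50, 260, 70)]
matrix, matches, new = match_rectangles(detected, occurrences)
print(matches, new)
```

The first detection continues the first stored occurrence, while the second overlaps nothing and would open a new occurrence.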
14
Suppression of false alarms: examples
[Figure: all detections; after suppression of false alarms]
15
Image enhancement
Integration of multiple frames to create a single image of higher quality.
- Multiple frame integration: averaging
- Super-resolution (interpolation)
Notation: F_i is the i-th image, M the mean image, V the standard deviation image.
An additional weight is included in the interpolation scheme, giving two variants: robust bi-linear and robust bi-cubic.
[Figure: interpolation neighbourhood with points M1, M2, M3, M4]
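The mean image M and the standard deviation image V can be computed pixel-wise from the frames F_i, as in the sketch below. It covers only these two statistics. The robust interpolation weight is not given in the transcript and is deliberately not reproduced; the function name and the list-of-rows image representation are assumptions.

```python
import math


def mean_and_std_images(frames):
    """Pixel-wise mean image M and standard deviation image V of aligned frames."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    mean = [[sum(f[y][x] for f in frames) / n for x in range(width)]
            for y in range(height)]
    std = [[math.sqrt(sum((f[y][x] - mean[y][x]) ** 2 for f in frames) / n)
            for x in range(width)]
           for y in range(height)]
    return mean, std


# Three 1x2 frames of the same text box: the first pixel flickers, the second is stable.
frames = [[[10, 100]], [[20, 100]], [[30, 100]]]
M, V = mean_and_std_images(frames)
print(M, V)
```

A pixel with a large value in V varies strongly over time, which is the kind of information a robust weighting scheme could use to trust stable pixels more.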
16
Interpolation: examples
[Figure panels: bi-linear interpolation; robust bi-linear interpolation; robust bi-cubic interpolation]
17
Interpolation: thresholded examples
[Figure panels: bi-linear interpolation; robust bi-linear interpolation; robust bi-cubic interpolation]
18
Binarisation
Different binarisation algorithms have been implemented and evaluated:
- Fisher/Otsu and the windowed Fisher/Otsu algorithm
- Yanowitz-Bruckstein
- Niblack, Sauvola
- Our adaptive version of Niblack's/Sauvola's method
19
Binarisation methods
- Yanowitz-Bruckstein: the threshold surface is calculated from the edge information.
- Windowed Fisher, Niblack, Sauvola: the threshold surface is calculated from statistics collected in a window which is shifted across the image.
[Figure: threshold surface]
20
Binarisation by Niblack
Niblack proposed a method which calculates a threshold surface by sliding a rectangular window over the image and calculating statistics on this window:
T = m + k * s
- m: mean
- s: standard deviation
- k: parameter, k = -0.2
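A minimal sketch of Niblack's rule, T = m + k * s with k = -0.2, evaluated in a window around every pixel. The window size (`half`), the clipping of windows at the image border and the convention that dark pixels are text are assumptions made for this illustration.

```python
import math


def niblack_binarise(image, half=1, k=-0.2):
    """Label a pixel 1 (text) when it is darker than the local threshold m + k*s."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Window of (2*half+1)^2 pixels, clipped at the image border.
            values = [image[j][i]
                      for j in range(max(0, y - half), min(height, y + half + 1))
                      for i in range(max(0, x - half), min(width, x + half + 1))]
            m = sum(values) / len(values)
            s = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
            threshold = m + k * s
            out[y][x] = 1 if image[y][x] < threshold else 0
    return out


# A single dark pixel on a light background.
image = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
binary = niblack_binarise(image)
print(binary)
```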
21
Binarisation by Niblack: problems
Light textures in the background are a problem: they are treated as low-contrast text.
[Figure: example]
22
Binarisation: improvement by Sauvola
To overcome these problems, Sauvola et al. proposed an improved formula to calculate the threshold:
T = m * (1 + k * (s/R - 1))
- m: mean
- s: standard deviation
- k: parameter, k = 0.5
- R: parameter (dynamic range of the standard deviation), R = 128
Rewriting the formula as T = m - k * (1 - s/R) * m shows that a hypothesis on the gray values of text and non-text is used to remove the noise produced by background textures.
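The effect on a light background texture can be checked numerically. The sketch below compares the two thresholds for a hypothetical window with mean 200 and standard deviation 5. These window statistics are made up for illustration, and dark-on-light text is assumed.

```python
def niblack_threshold(m, s, k=-0.2):
    """Niblack: T = m + k*s."""
    return m + k * s


def sauvola_threshold(m, s, k=0.5, R=128.0):
    """Sauvola et al.: T = m * (1 + k * (s/R - 1))."""
    return m * (1 + k * (s / R - 1))


# Hypothetical light, slightly textured background window.
m, s = 200.0, 5.0
pixel = 195.0  # a marginally darker background pixel

t_niblack = niblack_threshold(m, s)
t_sauvola = sauvola_threshold(m, s)
print("Niblack:", t_niblack, "-> text" if pixel < t_niblack else "-> background")
print("Sauvola:", t_sauvola, "-> text" if pixel < t_sauvola else "-> background")
```

With low local contrast, Sauvola's threshold drops towards m * (1 - k), so only pixels much darker than the local mean are accepted as text. When s approaches R, the threshold approaches the mean itself.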
23
Binarisation by Sauvola: examples
[Figure panels: original image; binarised using Niblack's method; binarised using Sauvola et al.'s method]
24
Improvement: adaptive dynamic range
Fixing the dynamic range at R = 128 may be acceptable for document images, but not for text boxes taken from videos: if the contrast of the image is lower, the binarisation will not be correct. We therefore set the parameter R to the maximum standard deviation over all windows: R = max(s).
To avoid two passes of the windowing algorithm, the mean and standard deviation can be stored in a table during the first pass and the threshold surface calculated from this data.
[Figure panels: Niblack; Sauvola with R = 128; adaptive R]
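The table-based variant can be sketched as follows: window statistics are computed once and stored, R is taken as the maximum stored standard deviation, and the threshold surface is then evaluated from the tables. Window size, border handling, the dark-text convention and the function names are assumptions for this illustration.

```python
import math


def local_stats(image, half):
    """First pass: tables of window mean and standard deviation per pixel."""
    height, width = len(image), len(image[0])
    means = [[0.0] * width for _ in range(height)]
    stds = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [image[j][i]
                      for j in range(max(0, y - half), min(height, y + half + 1))
                      for i in range(max(0, x - half), min(width, x + half + 1))]
            m = sum(values) / len(values)
            means[y][x] = m
            stds[y][x] = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return means, stds


def binarise_sauvola(image, half=1, k=0.5, fixed_R=None):
    """Sauvola threshold with fixed R, or R = max(s) when fixed_R is None."""
    means, stds = local_stats(image, half)
    R = fixed_R if fixed_R is not None else max(max(row) for row in stds)
    height, width = len(image), len(image[0])
    if R == 0:  # completely flat image: nothing to segment
        return [[0] * width for _ in range(height)]
    return [[1 if image[y][x] < means[y][x] * (1 + k * (stds[y][x] / R - 1)) else 0
             for x in range(width)]
            for y in range(height)]


# Low-contrast text box: a stroke of gray value 40 on a background of 60.
image = [[60, 60, 40, 60, 60] for _ in range(3)]
fixed = binarise_sauvola(image, fixed_R=128.0)
adaptive = binarise_sauvola(image)
print(fixed)
print(adaptive)
```

In this toy case the fixed R = 128 misses the stroke entirely, while the adaptive R recovers it.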
25
Improvement: shift of the image range
The strong hypothesis on the gray values (text pixels must be near zero) is not justified for some video text boxes.
[Figure panels: gray value histogram; Niblack; Sauvola with R = 128; adaptive R]
26
Improvement: shift of the image range (continued)
A correction of the image's histogram resolves this problem.
[Figure panels: original image; corrected image; binarised, R adaptive]
The same effect can also be achieved by changing the threshold formula, with:
- m: mean
- s: standard deviation
- k: parameter, k = 0.5
- R: maximum of the standard deviations of all windows
- M: minimum gray value of the text box
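The modified formula itself is missing from the transcript. Assuming the histogram correction is a shift of the gray values by the minimum M (so that the darkest pixel becomes zero), one way to derive an equivalent threshold is to apply Sauvola's formula to the shifted image and shift the result back. The following is a sketch under that assumption, not necessarily the exact form shown on the slide:

```latex
% Shifted image: I' = I - M, hence m' = m - M and s' = s.
\begin{align*}
T' &= (m - M)\left(1 + k\left(\frac{s}{R} - 1\right)\right)
      && \text{Sauvola on the shifted image} \\
T  &= T' + M
      && \text{shift back to the original range} \\
   &= m - k\left(1 - \frac{s}{R}\right)(m - M) \\
   &= (1 - k)\,m + k\,M + k\,\frac{s}{R}\,(m - M)
\end{align*}
```

For M = 0 this reduces to Sauvola's threshold T = m (1 + k (s/R - 1)), so the change only matters when the text box contains no pixels near zero.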
27
Fast incremental calculation
Mean and variance can be calculated in one pass:
- At the beginning of each line, the full window is calculated and the variables a and b are kept.
- After each shift, a and b are updated incrementally by subtracting the column of pixels which left the window (L) and adding the column which entered the window (R).
- Mean and standard deviation are stored in 2D tables; the maximum R = max(s) is then computed before calculating the threshold surface.
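A minimal sketch of the incremental update. It assumes that a is the sum and b the sum of squares of the window pixels (the slide does not define them), and it only evaluates windows lying fully inside the image.

```python
import math


def sliding_stats(image, win_w, win_h):
    """Tables of window mean and std, updated column by column, plus R = max(s)."""
    height, width = len(image), len(image[0])
    n = win_w * win_h
    means, stds = [], []
    for top in range(height - win_h + 1):
        rows = image[top:top + win_h]
        # Start of the line: compute the full window once.
        a = sum(v for r in rows for v in r[:win_w])      # sum of gray values
        b = sum(v * v for r in rows for v in r[:win_w])  # sum of squares
        mean_row, std_row = [], []
        for left in range(width - win_w + 1):
            if left > 0:
                for r in rows:
                    leaving = r[left - 1]           # column that left the window
                    entering = r[left + win_w - 1]  # column that entered the window
                    a += entering - leaving
                    b += entering * entering - leaving * leaving
            m = a / n
            variance = max(0.0, b / n - m * m)  # guard against rounding below 0
            mean_row.append(m)
            std_row.append(math.sqrt(variance))
        means.append(mean_row)
        stds.append(std_row)
    R = max(max(row) for row in stds)
    return means, stds, R


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
means, stds, R = sliding_stats(image, win_w=2, win_h=2)
print(means, R)
```

Each shift touches only two columns of the window instead of all of its pixels, which is where the saving comes from for wide windows.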
28
The experiments
Description of the experiments:
- The videos used in the experiments
- Description of the evaluation process (OCR evaluation)
Results for:
- Text detection
- Binarisation
- OCR
29
Test videos
We performed experiments on 5 different MPEG-1 videos of resolution 384x288.
30
- AIM3: news
- AIM4: cartoon, news
- AIM5: news
- AIM2: commercials
31
Video example - France Télécom
~22 minutes of video, ~33000 frames
32
The interface to the OCR software
Ideal situation: pass individual (binarised) text boxes to OCR software which recognises the contents box by box.
In reality: we used standard commercial OCR software for our tests. This software has been designed to recognise scanned A4 or US letter pages and cannot directly process text boxes.
[Figure: A4 page]
33
OCR page - manual
[Figure: an input image, ready for the OCR]
34
OCR output
051Q07Ô7 N*Verf 05JQ0707 PUBLICITE IPUBIIÏITE IPUBLICITE prenez prenez prenez boyard boyard boyard ^française ^française ^française FRANCE FRANCE FRANCE FRANCE FRANCE c'est plus musclé iï 'J fort fort fort fort fort.fort.fort.fort cotHfUet blé cotHfUet blé cQ#tfUet blé uutàfruuk On va beaucoup {&*$ loin avec Itineris. Partout Partout Partout Partout Partout I22h35 I22h35 I22h35 I22h35 I22h35 PUBLICITE \PUBLICITE \PUBLICITE >3h55l23h55l23h55l23h55l23h55l23h55 20h.50120h50 |20h50120h50 |20h50120h50,f ort boyard,f ort boyard 2,4 Kg J 2,4 Kg g 2,4 Kg J 2,4 Kg g 2,4 Kg J 2,4 Kg g 2,4 Kg J 2,4 Kg g 2,4 Kg J II II II II II II II II II gà dentsgà dents gà dents IIH r Lessive classique lljir Lessive classique I[HT Lessive classique le temps le temps le temps le temps le temps ^PUBLICITE ^PUBLICITE ^PUBLICITE I Par Amour du Goût. Il Par Amour du Goût. I en en en en en en en en en révolution révolution révolution
35
Post-processing of OCR output
Post-processed OCR output: 23h55 051Q07Ô7 PUBLICITE prenez boyard ^française FRANCE c'est plus musclé fort blé cotHfUet uutàfruuk On va beaucoup {&*$ loin avec Itineris. Partout I22h35 PUBLICITE \ >3h55l 20h.50,f ort boyard
Ground truth: dimanche 23h55 N Vert 05100707 Berlingo PUBLICITE prenez diffusion simultanée en stéréo sur boyard française FRANCE c'est plus musclé PUBLICITE fort Coral blé complet fruits On va beaucoup Plus loin avec Itineris. Bohême Partout 22h35 PUBLICITE 23h55 20h50 fort fort boyard
36
Automatic evaluation using markers
The manual processing of the OCR output (separating the output strings and finding the corresponding input box) is time-consuming and error-prone, especially when the quality of the OCR output is very poor.
Automatic processing of the OCR output can be achieved by placing marker images between the text boxes. The marker boxes contain text which is easily recognised by the OCR software.
In the results section we present results for both types of evaluation.
37
[Figure: an input image with markers, ready for the OCR]
38
OCR evaluation
Inputs:
- OCR output (example): Tkenchar 037 'gfrançaise 'gfrançaise Tkenchar 038 Mpe pire de| fj^e pire de| fj^e pire de| Tkenchar 039
- Raw ground truth (example): @S Par Amour du Goût. @S en @S révolution @S la @S française @S le pire de @S 20H45
- Structure log (example): # Page 1: P 1 T 1 2 M 1 2 T 2 3 M 2 2 T 3 2
Steps:
- Search the output for individual text boxes. Result: a list of strings, each corresponding to the output for a text box, possibly occurring multiple times.
- Prepare the ground truth. Result: a list of strings, each corresponding to the ground truth for a text box. Each string is repeated the same number of times as the corresponding text image in the OCR input image.
- Evaluation: transformation cost, recall, precision.
39
OCR evaluation: Wagner & Fischer cost
A measure of the resemblance of two character strings: the cost to transform string A into string B is calculated. Basic transformation operations (substitution, insertion, deletion) are used, each with an associated cost, and the total cost is minimised.
Example: "AirbagGtroônn" compared with "Airbag Citroën"
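The minimisation is the classic dynamic-programming edit distance. The sketch below uses a unit cost for every operation, which is an assumption: the per-operation costs on the slide are missing from the transcript.

```python
def wagner_fischer(a, b, sub_cost=1, ins_cost=1, del_cost=1):
    """Minimum cost of transforming string a into string b."""
    # prev[j] holds the cost of transforming the processed prefix of a into b[:j].
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, 1):
        cur = [i * del_cost]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + del_cost,                           # delete ca
                cur[j - 1] + ins_cost,                        # insert cb
                prev[j - 1] + (0 if ca == cb else sub_cost),  # match or substitute
            ))
        prev = cur
    return prev[-1]


# The slide's example pair: OCR output versus ground truth.
cost = wagner_fischer("AirbagGtroônn", "Airbag Citroën")
print(cost)
```

Only one previous row of the cost table is kept, so memory grows with the length of the ground-truth string rather than with the product of both lengths.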
40
Detection results - INA videos
No suppression of false alarms.
[Results table not preserved in the transcript]
41
Binarisation methods: examples
[Figure panels: original image; Fisher; Fisher (windowed); Yanowitz B.; Yanowitz B. + PP; Niblack; Sauvola et al.; our method]
42
Binarisation methods: examples (continued)
[Figure panels: original image; Fisher; Fisher (windowed); Yanowitz B.; Yanowitz B. + PP; Niblack; Sauvola et al.; our method]
43
OCR results - classification by binarisation method
Results obtained using the manual evaluation method (no markers in the input page).
44 pages, robust bi-cubic interpolation.
[Results table not preserved in the transcript]
44
OCR results: interpolation methods
Results obtained using the automatic evaluation method (including markers in the input page).
97 pages; methods compared: robust bi-linear interpolation and robust bi-cubic interpolation.
[Results table not preserved in the transcript]
45
Conclusion
- We developed a system for detection, tracking, enhancement and binarisation of text.
- A detection performance of 93.5% is obtained.
- We derived a new binarisation method adapted to the type of text found in videos.
- The total recognition rate is surprisingly high, given the quality of the text, but not yet good enough for indexing purposes.
- OCR integration problem: no software development kits are available for direct access to the recognition functions. A collaboration with an OCR company seems to be inevitable.
46
Outlook
Future work lies in extending the existing algorithms to text with more difficult properties, and in improving and studying the existing techniques in more depth:
- Scene text: the binarisation techniques developed in the last 30 years are aimed either at document images or at images from computer vision. The method we introduced in this project improves on the work already presented, but the quality of the text is not yet satisfactory. The binarisation of scene text in particular will demand the development of new methods.
- Detection recall: we are convinced that the recall of the detection system can still be increased by further research, e.g. on the binarisation technique applied to the map of accumulated gradients.