Dr. István Marosi Scansoft-Recognita, Inc., Hungary SSIP 2005, Szeged Character Recognition Internals.

Slides:



Advertisements
Similar presentations
Don’t Type it! OCR it! How to use an online OCR..
Advertisements

Patient information extraction in digitized X-ray imagery Hsien-Huang P. Wu Department of Electrical Engineering, National Yunlin University of Science.
Chapter 3 – Web Design Tables & Page Layout
Standard Grade Notes General Purpose Packages. These are Software packages which allow the user to solve a range of problems.
Road-Sign Detection and Recognition Based on Support Vector Machines Saturnino, Sergio et al. Yunjia Man ECG 782 Dr. Brendan.
Segmentation of Touching Characters in Devnagari & Bangla Scripts Using Fuzzy MultiFactorial Analysis Presented By: Sanjeev Maharjan St. Xavier’s College.
Advanced Scanning Techniques. Workflow  Separate pages one by one Gives a feel for book structure and paper  Pick a few representative pages (6 max)
Premier Director Document Imaging
Microsoft Excel 2013 An Overview. Environment Quick Access Toolbar Customizable toolbar for one-click shortcuts Tabs Backstage View Tools located outside.
Dori Baldwin Student Info Mgmt Coordinator Kent ISD.
Flowcharts Using Visio. Definitions  An Algorithm is just a detailed sequence of simple steps that are needed to solve a problem.  A Flowchart is a.
Each pixel is 0 or 1, background or foreground Image processing to
Segmentation (2): edge detection
Extraction of text data and hyperlink structure from scanned images of mathematical journals Ann Arbor, March 19, 2002 Masakazu Suzuki (Kyushu University)
Virtual Dart: An Augmented Reality Game on Mobile Device Supervisor: Professor Michael R. Lyu Prepared by: Lai Chung Sum Siu Ho Tung.
Text Detection in Video Min Cai Background  Video OCR: Text detection, extraction and recognition  Detection Target: Artificial text  Text.
Prénom Nom Document Analysis: Segmentation & Layout Analysis Prof. Rolf Ingold, University of Fribourg Master course, spring semester 2008.
Highlights Lecture on the image part (10) Automatic Perception 16
Artificial Neural Networks (ANNs)
TITLE: Template for a 84.1 cm wide x cm tall poster (A0)
Processing PDF: How to Go from PDF to E-text to Audio Gaeir Dietrich Director High Tech Center Training Unit of the California Community Colleges Foothill.
Digital Image Processing & Pattern Analysis (CSCE 563) Course Outline & Introduction Prof. Amr Goneid Department of Computer Science & Engineering The.
Handwritten Character Recognition using Hidden Markov Models Quantifying the marginal benefit of exploiting correlations between adjacent characters and.
Database Design IST 7-10 Presented by Miss Egan and Miss Richards.
Advanced OCR with OmniPage and FineReader. Overview Optical character recognition Optical character recognition Structural recognition Structural recognition.
Creating a Web Page HTML, FrontPage, Word, Composer.
Track, Trace & Control Solutions © 2010 Microscan Systems, Inc. Machine Vision Tools for Solving Auto ID Applications Part 3 of a 3-part webinar series:
Wizards, Templates, Styles & Macros Chapter 3. Contents This presentation covers the following: – Purpose, Characteristics, Advantages and Disadvantages.
TH-OCR NK. content introduction go to next page background assumptions overall structure chart IPO for overall structure dataflow diagram of overall structure.
Presented by: Kamakhaya Argulewar Guided by: Prof. Shweta V. Jain
Classification with Hyperplanes Defines a boundary between various points of data which represent examples plotted in multidimensional space according.
Aniko T. Valko, Keymodule Ltd.
© Cheltenham Computer Training 2002 Microsoft Publisher 2002 – Slide No 1 Microsoft Publisher 2002 Intermediate Level Course.
: Chapter 10: Image Recognition 1 Montri Karnjanadecha ac.th/~montri Image Processing.
Millionaire Kristin Strickland & Johnny Fuller. What does the wavy, red line under a word in a presentation mean? a. Misspelling b. Grammar Error c. Synonym.
CTS130 Spreadsheet Lesson 3 Using Editing and Formatting Tools.
Chapter 1 Review Images Links Images II Pictures and Extensions.
October 14, 2014Computer Vision Lecture 11: Image Segmentation I 1Contours How should we represent contours? A good contour representation should meet.
Artificial Neural Nets and AI Connectionism Sub symbolic reasoning.
EE 492 ENGINEERING PROJECT LIP TRACKING Yusuf Ziya Işık & Ashat Turlibayev Yusuf Ziya Işık & Ashat Turlibayev Advisor: Prof. Dr. Bülent Sankur Advisor:
Standard Grade Computing General Purpose Packages WORD-PROCESSING WORD-PROCESSING Chapter 2.
S EGMENTATION FOR H ANDWRITTEN D OCUMENTS Omar Alaql Fab. 20, 2014.
Problem description and pipeline
CS 6825: Binary Image Processing – binary blob metrics
Welcome to BIM Jeopardy. CopyrightAccessWordExcelPowerPoint
0 Paper rocess Scanner Throughput P eople PP P Effective Scanner Throughput Consider KOFAX – VRS (Virtual Re-Scan) Increase Productivity.
Scanner Applications Fall ‘08. Scanner Log in Names The scanner applications are now installed locally on all of the scanner computers – This means you.
Ascending  Sort order arranging text or numbers from A to Z, from smallest to largest, or from earliest to latest.
Presented By: Abirami Poonkundran Authors: Jeff Yan, Ahmad El Ahmad.
RESEARCH POSTER PRESENTATION DESIGN © entations.com QUICK DESIGN GUIDE (--THIS SECTION DOES NOT PRINT--) This PowerPoint 2007 template.
Scanned Documents INST 734 Module 10 Doug Oard. Agenda Document image retrieval  Representation Retrieval Thanks for David Doermann for most of these.
Additional Features in Microsoft Word Session Version 1.0 © 2011 Aptech Limited.
C - IT Acumens. COMIT Acumens. COM. To demonstrate the use of Neural Networks in the field of Character and Pattern Recognition by simulating a neural.
WP3: Image Segmentation - OCR Stavros Perantonis, Vassilis Maragos Edinburgh, March 6-7, 2003 Institute of Informatics & Telecommunications NCSR “Demokritos”
1 A Statistical Matching Method in Wavelet Domain for Handwritten Character Recognition Presented by Te-Wei Chiang July, 2005.
Surface Defect Inspection: an Artificial Immune Approach Dr. Hong Zheng and Dr. Saeid Nahavandi School of Engineering and Technology.
Optical Character Recognition
OCR Reading.
Microsoft Office 2007-Illustrated
S.Rajeswari Head , Scientific Information Resource Division
What is an ANN ? The inventor of the first neuro computer, Dr. Robert defines a neural network as,A human brain like system consisting of a large number.
Chapter 4 Application Software
Conversion accuracy Users often need to convert PDF files into more familiar and capable editing tools such as Microsoft Word, Excel and PowerPoint PDF.
Aniko T. Valko, Keymodule Ltd.
Dr. István Marosi Recosoft Ltd., Hungary
EE 492 ENGINEERING PROJECT
Ann Arbor, March 19, 2002 Masakazu Suzuki (Kyushu University)
Procedure for Developing a Multimedia Presentation
Creating Accessible PDF’s
Presentation transcript:

Dr. István Marosi Scansoft-Recognita, Inc., Hungary SSIP 2005, Szeged Character Recognition Internals

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Layout recognition Text recognition User assisted correction Result exportation

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Get image B/W Scanning Gray Scanning Color Scanning Load from image file Preprocess image Layout recognition Text recognition User assisted correction Result exportation

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Get image Preprocess image Color separation Thresholding Despeckling Rotation Deskewing Layout recognition Text recognition User assisted correction Result exportation Color Separation De-speckle, de-skew

04 Jul 2005Istvan Marosi The Preprocessed Image Joined chars

04 Jul 2005Istvan Marosi Joined chars The Preprocessed Image

04 Jul 2005Istvan Marosi The Preprocessed Image Joined chars

04 Jul 2005Istvan Marosi The Preprocessed Image Broken chars

04 Jul 2005Istvan Marosi The Preprocessed Image Broken chars

04 Jul 2005Istvan Marosi The Preprocessed Image Broken chars

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Layout recognition Text zones Columns of flowed text Tables Inverse text Graphic zones Text recognition User assisted correction Result exportation

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Layout recognition Text zones Graphic zones Line Art Photo Text recognition User assisted correction Result exportation

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Layout recognition Text recognition Segmentation Calculation of Feature Vector Elements Classification Language Analysis Voting User assisted correction Result exportation

04 Jul 2005Istvan Marosi Segmentation What are those pixel groups belonging to a single letter?

04 Jul 2005Istvan Marosi Segmentation What are those pixel groups belonging to a single letter?

04 Jul 2005Istvan Marosi Segmentation What are those pixel groups belonging to a single letter?

04 Jul 2005Istvan Marosi Segmentation What are those pixel groups belonging to a single letter?

04 Jul 2005Istvan Marosi Segmentation What are those pixel groups belonging to a single letter?

04 Jul 2005Istvan Marosi Segmentation What are those pixel groups belonging to a single letter?

04 Jul 2005Istvan Marosi Segmentation What are those pixel groups belonging to a single letter?

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Layout recognition Text recognition Segmentation Calculation of Feature Vector Elements Classification Language Analysis Voting User assisted correction Result exportation

04 Jul 2005Istvan Marosi Calculation of FV Elements: Contour Tracing Find a (new) white-black transition Follow the “edge” of the pixels using the MIN or MAX rule Administrate the already traced white-black transitions Collect information while going around And repeat the process on new shapes...

04 Jul 2005Istvan Marosi Contour Tracing Find a (new) white-black transition Follow the “edge” of the pixels using the MIN or MAX rule Administrate the already traced white-black transitions Collect information while going around And repeat the process on new shapes...

04 Jul 2005Istvan Marosi Contour Tracing Find a (new) white-black transition Follow the “edge” of the pixels using the MIN or MAX rule if black(a) then turn(ccw) else if black(b) then forward else turn(cw) ab

04 Jul 2005Istvan Marosi Contour Tracing Find a (new) white-black transition Follow the “edge” of the pixels using the MIN or MAX rule if black(a) then turn(ccw) else if black(b) then forward else turn(cw) ab if white(b) then turn(cw) else if white(a) then forward else turn(ccw) a b

04 Jul 2005Istvan Marosi Contour Tracing Find a (new) white-black transition Follow the “edge” of the pixels using the MIN or MAX rule Administrate the already traced white-black transitions Collect information while going around And repeat the process on new shapes...

04 Jul 2005Istvan Marosi Some Easily Calculatable Data Problem #1 Turning CW: I n =I n-1 +1 Turning CCW: I n =I n-1 -1 Going Forward: I n =I n-1

04 Jul 2005Istvan Marosi Some Easily Calculatable Data Problem #2 Turning CW: I n =I n-1 +1 Turning CCW: I n =I n-1 -1 Going Forward: I n =I n-1

04 Jul 2005Istvan Marosi Some Easily Calculatable Data Problem #3 Going Up: I n =I n-1 -X n Going Down: I n =I n-1 +X n Going Right: I n =I n-1 Going Left: I n =I n-1

04 Jul 2005Istvan Marosi Some Easily Calculatable Data Problem #4 Going Up: I n =I n-1 -X n Going Down: I n =I n-1 +X n Going Right: I n =I n-1 Going Left: I n =I n-1

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Layout recognition Text recognition Segmentation Calculation of Feature Vector Elements Classification Language Analysis Voting User assisted correction Result exportation A B A B

04 Jul 2005Istvan Marosi A B A B Classification; Training models Restricted Coulomb Energy (RCE) Network (Dr. Leon Cooper, Dr. Charles Elbaum)

04 Jul 2005Istvan Marosi Classification; Training models Restricted Coulomb Energy (RCE) Network (Dr. Leon Cooper, Dr. Charles Elbaum) A B A B

04 Jul 2005Istvan Marosi Classification; Training models Nestor Learning System (NLS)

04 Jul 2005Istvan Marosi Classification; Training models Nestor Learning System (NLS) Default radius R max

04 Jul 2005Istvan Marosi Classification; Training models Nestor Learning System (NLS)

04 Jul 2005Istvan Marosi Classification; Training models Nestor Learning System (NLS) Default radius R max

04 Jul 2005Istvan Marosi Classification; Training models Nestor Learning System (NLS)

04 Jul 2005Istvan Marosi Classification; Training models Nestor Learning System (NLS)

04 Jul 2005Istvan Marosi Classification; Training models Nestor Learning System (NLS) Default radius R max

04 Jul 2005Istvan Marosi Classification; Training models Nestor Learning System (NLS)

04 Jul 2005Istvan Marosi Classification; Training models Nestor Learning System (NLS) Decreased radius

04 Jul 2005Istvan Marosi Classification; Training models Nestor Learning System (NLS)

04 Jul 2005Istvan Marosi Classification; Training models Nestor Learning System (NLS) Decreased radius R min

04 Jul 2005Istvan Marosi Classification; Training models Nestor Learning System (NLS) Pass 2 Decreased radius

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Layout recognition Text recognition Segmentation Calculation of Feature Vector Elements Classification Language Analysis Voting User assisted correction Result exportation

04 Jul 2005Istvan Marosi Voting Text recognition in OmniPage Pro OCR Engines available: Caere’s engine (codename: Salt & Pepper) Recognita’s engine (codename: Paprika) ScanSoft’s engine (codename: Fireworx)

04 Jul 2005Istvan Marosi Text recognition in OmniPage Pro OCR Engines available: Caere’s engine (Salt & Pepper) Uses a Matrix Matching based algorithm feature set: 40 cells of an 8x5 grid good overall description of a shape weaker at detailed structure Recognita’s engine (Paprika) Uses a Contour Tracing based algorithm feture set: convex and concave arcs on the contour good detailed description of a shape weaker at overall structure Voting

04 Jul 2005Istvan Marosi Text recognition in OmniPage Pro OCR Engines available: Caere’s engine (Salt & Pepper) Recognita’s engine (Paprika) ScanSoft’s engine (Fireworx) Segmentation algorithms: Voting

04 Jul 2005Istvan Marosi Text recognition in OmniPage Pro OCR Engines available: Caere’s engine (Salt & Pepper) Recognita’s engine (Paprika) ScanSoft’s engine (Fireworx) Segmentation algorithms: Developed by independent groups Have different strengths and weaknesses Voting

04 Jul 2005Istvan Marosi Text recognition in OmniPage Pro OCR Engines available Segmentation algorithms Conclusion: They are complementary Let’s create a voting system Voting

04 Jul 2005Istvan Marosi Voting strategies External „Black box” voting ~20% gain Image Paprika Salt & Pepper Vote Txt 3Txt 1 Dict Final Txt Voting Fire- worx Txt 2

04 Jul 2005Istvan Marosi Voting strategies External „Black box” voting Internal „Shape” voting Voting Image Paprika Fire- worx Bronze Txt 3 Txt 2 Dict Final Txt Salt & Pepper Txt 1

04 Jul 2005Istvan Marosi Paprika Original segmentation: Every independent connected component is a character Good segmentation:recognize Bad segmentation:reject Image Recognize original segmentation K.B. Voting

04 Jul 2005Istvan Marosi Paprika Image Recognize original segmentation Txt 2 Train adaptive classifier from original shapes K.B. Adaptive K.B. Voting Txt 1

04 Jul 2005Istvan Marosi Paprika Try several segmentations Loop if unrecognizable Image Recognize original segmentation Txt 2 Train adaptive classifier from original shapes Recognize broken and joined shapes K.B. Adaptive K.B. Voting Txt 1 Dict

04 Jul 2005Istvan Marosi Paprika Image Recognize original segmentation Txt 2 Train adaptive classifier from original shapes Recognize broken and joined shapes K.B. Adaptive K.B. Train adaptive classifier from ‘ugly’ shapes Voting Txt 1 Dict

04 Jul 2005Istvan Marosi Paprika Image Recognize original segmentation Txt 3 Txt 2 Train adaptive classifier from original shapes Recognize broken and joined shapes K.B. Adaptive K.B. Train adaptive classifier from ‘ugly’ shapes Recognize more broken and joined shapes Try several segmentations Loop if unrecognizable Voting Txt 1 Dict

04 Jul 2005Istvan Marosi Image Paprika Fire- worx Bronze Txt 3 Txt 1 Dict Final Txt Salt & Pepper Txt 1 Voting strategies ~60% gain Voting

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Layout recognition Text recognition User assisted correction By the user’s random editing... Pop-up verifier Manual Training By proofreading of doubtful words Result exportation

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Layout recognition Text recognition User assisted correction By the user’s random editing... By proofreading of doubtful words Correct: User dictionary Changed: IntelliTrain Remember trained characters Apply them on following pages Result exportation

04 Jul 2005Istvan Marosi IntelliTrain Recognized word: sorneUüng

04 Jul 2005Istvan Marosi IntelliTrain Recognized word: sorneUüng Fixed word: something

04 Jul 2005Istvan Marosi IntelliTrain Recognized word: sorneUüng Fixed word: something

04 Jul 2005Istvan Marosi IntelliTrain Recognized word: sorneUüng Fixed word: something Substitutions found:m  rn thi  Uü

04 Jul 2005Istvan Marosi IntelliTrain Recognized word: sorneUüng Fixed word: something Substitutions found:m  rn thi  Uü Perform automatically: Learn image pattern and substitution info Find similar substituted (‘ blue ’) text on actual page Match against pattern of substitution and correct Find such errors on following pages, too

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Layout recognition Text recognition User assisted correction Result exportation Combine pages into a Document Header / Footer recognition Page numbers Hyperlinks (e.g. „ See Table 20 ”) Save results

04 Jul 2005Istvan Marosi OCR Internals Main tasks of an OCR system: Image acquisition Layout recognition Text recognition User assisted correction Result exportation Combine pages into a Document Save results doc file Speech synthesizer

04 Jul 2005Istvan Marosi