Session 302: Using Optical Character Recognition Programs
Gaeir Dietrich, Director, High Tech Center Training Unit of the California Community Colleges



8/13/2015, CTEBVI Conference

Overview
Optical character recognition
Structural recognition
Options
Loading
Zoning
OCR
Editing

Optical Character Recognition (OCR)
OCR turns pictures of text into e-text
Does well unless…
– The picture is fuzzy
– The contrast is poor
– The font is unusual
– The font is too small or too large
– The material has unusual characters
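
One of the conditions above, poor contrast, can be checked roughly before a page ever reaches the OCR program. The following is a minimal sketch, not part of any OCR product: it assumes the scan is already available as rows of grayscale values (0 to 255), and both the function names and the 0.15 threshold are hypothetical choices for illustration.

```python
from statistics import pstdev


def rms_contrast(gray_rows):
    """Return RMS contrast: the standard deviation of intensities scaled to 0..1."""
    values = [value / 255 for row in gray_rows for value in row]
    return pstdev(values)


def looks_low_contrast(gray_rows, threshold=0.15):
    """Flag a page whose contrast falls below an assumed, tunable threshold."""
    return rms_contrast(gray_rows) < threshold


# Two tiny stand-in "pages": crisp black on white, and faded gray on gray.
crisp = [[0, 255, 0, 255], [255, 0, 255, 0]]
faded = [[120, 140, 120, 140], [140, 120, 140, 120]]

for name, page in (("crisp", crisp), ("faded", faded)):
    print(name, round(rms_contrast(page), 3), looks_low_contrast(page))
```

A page flagged this way is a candidate for rescanning with different scanner settings before any recognition is attempted.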

Structural Recognition
Analyzes the layout of the page
– Columns
– Headings
– Graphics
– Tables
Usually does fairly well, unless the layout is non-standard

Getting Better…but…
Although the programs are improving all the time, it is unwise to trust to the automated features.
Learn to know what the program is doing and correct it when it errs.

Programs that Run OCR
Programs for consumers
– Kurzweil 1000, 3000
– OpenBook
– Intel Reader
– Many others…
Programs for production
– ABBYY FineReader
– Nuance OmniPage

Consumer Programs
Highly automated
Designed for individuals who have print disabilities
Are not good production tools
– Do not provide flexibility
– Do not allow much overriding
– Interfaces not designed for editing

Production Programs in General
A good program for production allows you to…
– Control the zones (areas or blocks of text and graphics): add, delete, change
– Edit easily
– Improve recognition

Preferred Programs
ABBYY FineReader
– Relatively easy to learn
– Fairly intuitive
– Good structural recognition
Nuance OmniPage
– Less intuitive but more accessible
– Often does better with technical materials

Both Good Tools
If you can afford to have both, it’s nice, but not absolutely necessary.
If you have both, run a couple test pages through each to see which is doing better on a particular job.
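
The test-page comparison can be made a little less subjective by scoring each program's output against a short passage you have proofread by hand. This is only a sketch under stated assumptions: the two program names and the sample strings are made up, and the similarity ratio from Python's standard `difflib` module is a rough stand-in for a real accuracy measure.

```python
from difflib import SequenceMatcher


def similarity(ocr_text, reference_text):
    """Return a 0..1 similarity ratio between OCR output and a proofread reference."""
    return SequenceMatcher(None, ocr_text, reference_text, autojunk=False).ratio()


def pick_better(results, reference_text):
    """results maps a program name to its OCR text; return the best name and all scores."""
    scores = {name: similarity(text, reference_text) for name, text in results.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical test page: one output is clean, the other has typical OCR confusions.
reference = "The quick brown fox"
results = {
    "program_a": "The quick brown fox",
    "program_b": "Tbe qu1ck brovvn fox",
}
best, scores = pick_better(results, reference)
print(best, scores)
```

Because results vary by job, the comparison is worth repeating with a page or two from each new book rather than relying on a past winner.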

For Today
Focus on ABBYY FineReader
– A little less expensive
– Easier for folks who do not use an OCR program every day
Let’s launch and go!

Wizards Are Evil…
Turn off the automated “Tasks” manager
Uncheck the Show at startup check box
– Bottom left corner of the Tasks box
Choose Open Image/PDF

Under the Hood
For best results with a program, set up your options before you begin!
Tools > Options
Shortcut keys: Ctrl + Shift + O

Document Tab
Languages drop-down menu allows you to select the languages that are in your document.

More Languages
If you do not see the languages you need, select More Languages.
Notice at the end of the list, it includes computer languages, numbers, and chemical formulas.
Turn on what you need, but only what you need.

Tip
If you are running OCR on math, turn on Greek.
– Greek will allow the program to recognize alphas, deltas, sigmas, etc.
For foreign language, turn on all the languages in the book.
– It will recognize the diacritical marks.
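
If a typed excerpt of the book is on hand, a quick script can hint at which recognition languages to turn on. The sketch below is an illustration only and has nothing to do with how FineReader itself works: it uses Python's standard `unicodedata` module to report which scripts appear in a sample and whether any letters carry diacritical marks. The function names are hypothetical, and a script name (such as GREEK) is only a hint, not a language.

```python
import unicodedata


def scripts_in(text):
    """Return the set of script names (first word of each letter's Unicode name)."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def has_diacritics(text):
    """Return True if any character decomposes into a base letter plus a combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return any(unicodedata.combining(ch) for ch in decomposed)


math_sample = "Δx = α + σ"
french_sample = "café"

print(scripts_in(math_sample))        # Greek letters alongside Latin ones
print(has_diacritics(french_sample))  # accented letter present
```

A sample that reports Greek alongside Latin suggests turning on Greek; a sample with diacritics suggests checking that every language in the book is selected.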

Scan and Open Tab
Change the radio button under General to “Do not read and analyze acquired page images automatically.”
Remember…wizards are evil…

Another Decision
Under Image Preprocessing, you have the choice to Detect page orientation.
– Try it if you have many pages turned, but it sometimes goofs.
Also note the Split facing pages feature.
– Nice if you have a two-page spread.

Read Tab
The “pattern editor” is useful if you have a book with a very unusual font.
– You can map the letters by telling the program what each letter is.
– Not worth it for occasional errors, but very useful for books filled with otherwise unreadable fonts.

Save Tab
Specify which format you want as an end product.
For Word docs, choose either Formatted Text or Plain Text.
– Otherwise, you can get the dreaded “textbox.”

Considerations
You may or may not want to keep headers and footers.
– I generally keep them to pull the page numbers.
You may want to keep the page breaks.
– Retaining page breaks helps to maintain one-to-one page correspondence with the book.
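
To show why kept headers and page breaks are useful later, here is a minimal sketch of pulling printed page numbers out of exported text. It rests on assumptions not stated in the slides: that the export is plain text, that pages are separated by a form-feed character, and that the page number sits at the start or end of the first or last line of each page. The function names and sample text are hypothetical.

```python
import re

PAGE_BREAK = "\f"  # assumption: the exported text marks page breaks with a form feed


def split_pages(text):
    """Split exported text into one string per page."""
    return text.split(PAGE_BREAK)


def printed_page_number(page_text):
    """Look for a number at the start or end of the first or last non-empty line."""
    lines = [line.strip() for line in page_text.splitlines() if line.strip()]
    if not lines:
        return None
    for candidate in (lines[0], lines[-1]):
        match = re.search(r"^(\d{1,4})\b|\b(\d{1,4})$", candidate)
        if match:
            return int(match.group(1) or match.group(2))
    return None


sample = (
    "12 OCR Basics\nText of page twelve."
    + PAGE_BREAK
    + "Structural Recognition 13\nText of page thirteen."
)
numbers = [printed_page_number(page) for page in split_pages(sample)]
print(numbers)
```

A body line that happens to end in a number would fool this check, so the result is a starting point for proofreading rather than a replacement for it.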

Paper Size
In some cases, you may wish to work with a custom paper size and choose “Increase paper size to fit content.”
This feature can be helpful when you are retaining everything on the page but not the layout.

View Tab
The view tab has some nice features for those with visual impairments.
Colors are completely customizable.
– Choose the mark-up, then click on the color swatch.
– Choose Define Custom Colors for more choices.

More Choices
The View Tab also allows you to control the appearance of your working window.
Pages window > Thumbnails
– Shows graphics of the pages on the left-hand side (under “Pages”).

More Accessible
Instead, you can see a detail view.
Detail view is more accessible for screen readers.
Otherwise, it is personal preference.
Pages window > Details
– Shows text instead of graphics

Advanced Tab
This tab has choices about spell check and editing.
Please note that if the program is handling spacing around punctuation incorrectly, there is an option on this tab to fix the problem.
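
If punctuation spacing problems still show up in text that has already been exported, they can also be tidied afterward. The sketch below is a generic clean-up in Python, not the FineReader option itself; the function name is hypothetical and the rules assume English-style punctuation (some languages, such as French, intentionally put a space before certain marks).

```python
import re


def fix_punctuation_spacing(text):
    """Remove spaces before punctuation and add a missing space after it."""
    # Drop spaces or tabs that appear before common punctuation marks.
    text = re.sub(r"[ \t]+([,.;:!?])", r"\1", text)
    # Add a space after punctuation that runs straight into a letter.
    # The period is left out on purpose so "e.g." and file names are untouched.
    text = re.sub(r"([,;:!?])(?=[A-Za-z])", r"\1 ", text)
    return text


messy = "Hello , world !How are you ?"
print(fix_punctuation_spacing(messy))
```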

Customizing Tools
Choose Tools > Customize
Under Categories, select Image
Move two tools to your Quick Access toolbar
– Select the tool and use the double arrow button to move the tool

Move Eraser

Move Order Areas

Turn on Quick Tools
View > Toolbars > Quick Access

Ready
We have set our options.
We have customized our tools.
These features are now set.
– You do not need to do this again until you reinstall the program.

Time to Start Working!

Please Note
Although you can scan with the program, the preference is to scan with your scanning utility (the one that came with your scanner) and load the resulting TIFFs or JPEGs into FineReader.
No scanning utility? Then go ahead and scan with FineReader (Ctrl + K).

Loading a File
Open an Image
– Click the open icon
– Control + O
Image files include TIFF, JPEG, PDF, BMP, GIF, etc.

Workspace
The program has three primary areas
Pages Pane
– Either thumbnails or details
– Allows simple navigation of pages
Image Pane
– Your graphic
Text Pane
– Area where the text from OCR will show

Handy Tip
Whichever pane has your focus, bring up more information by using the shortcut Alt + Enter.
– Use the shortcut again to toggle it off
Under the Image Pane, you get information about the image.

Understanding the Menus
ABBYY designates three different “chunks” that it works with.
– Actions applied to entire documents: Document Menu
– Actions applied only to the selected page: Page Menu
– Actions applied only to the selected area: Areas Menu

To Avoid Confusion
Always be aware of what is selected when you apply an action

To Edit the Image
Sometimes it is useful to clean up an image before processing it.
A scan of a page marked with black pen, for instance, may benefit from erasing some of the stray marks.
Choose Edit Image from the tools.

Edit Image
The eraser tool allows you to remove stray marks.
– Just lasso whatever you want to delete.

Erasing
We can remove the graphic in the middle of the text.

ABBYY Quirk
It works best to separate the layout analysis from the character recognition.
Analyze layout first, adjust as necessary, then read the document.

Layout First
Choose Document > Analyze Layout
Keyboard shortcut: Ctrl + Shift + E
(Please note: If you use Dolphin products, you may experience some keyboard conflicts.)

Areas Are Blocked
There are now colored blocks around the areas.
– Text is green
– Graphics are red
– Tables are blue
To change an area, right click.
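
It can help to think of the blocked areas as simple records: each has a type, a position, and a place in the reading order. The sketch below models that idea in Python. It is not FineReader's internal format or algorithm; the `Zone` class, the field names, and the naive two-column ordering rule are all assumptions made for illustration. Only the color mapping comes from the slide.

```python
from dataclasses import dataclass

# Colors as described on the slide; the kind names are labels chosen for this sketch.
ZONE_COLORS = {"text": "green", "graphic": "red", "table": "blue"}


@dataclass
class Zone:
    label: str
    kind: str   # "text", "graphic", or "table"
    left: int   # distance from the left edge of the page image
    top: int    # distance from the top edge of the page image


def reading_order(zones, column_split=None):
    """Order zones top to bottom; with column_split, read the left column first."""
    if column_split is None:
        return sorted(zones, key=lambda z: (z.top, z.left))
    left_col = sorted((z for z in zones if z.left < column_split), key=lambda z: z.top)
    right_col = sorted((z for z in zones if z.left >= column_split), key=lambda z: z.top)
    return left_col + right_col


page = [
    Zone("right paragraph", "text", 320, 100),
    Zone("heading", "text", 50, 20),
    Zone("figure", "graphic", 320, 400),
    Zone("left paragraph", "text", 50, 100),
    Zone("table", "table", 50, 400),
]

ordered = [z.label for z in reading_order(page, column_split=300)]
for zone in reading_order(page, column_split=300):
    print(zone.label, ZONE_COLORS[zone.kind])
```

On a non-standard layout a simple rule like this breaks down quickly, which is the same reason the automatic analysis needs to be checked and corrected by hand.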

Right Click in Area

Modify Area
Choose the white arrow tool (on the image toolbar) to modify the area.
Please note: You can also draw the areas yourself using the tools at the top of the Image Pane.

Change First
Make sure that you make any changes to the layout before you run OCR.
ABBYY does not like having lots of changes made after the text has been recognized.
– Crashing can result.

Now Read
Choose Document > Read
Shortcut: Ctrl + Shift + R

Edit
You can visually scan for errors
Or use the verification tool.
The verification tool brings up the error and the graphic on one screen.
– It works like spell-check for proofreading.

Save the File
Save to Word.
– The default is RTF, but you can choose DOC or DOCX.
– Create a single file for all pages or individual page files (under File Options).

Two Ways to Save
To save the FineReader file, choose File > Save FineReader Document
This saves your work file.
You can close the FineReader file under the same menu.

You did it!
You now have e-text!

Production Tips
Work with dual monitors
– Check your computer and video card
Stretching an OCR program across two monitors is a HUGE time-saver!
Learn to use keyboard shortcuts.
– They save tons of time!

Happy OCR-ing!
Gaeir (rhymes with “fire”) Dietrich