Processing PDF: How to Go from PDF to E-text to Audio

Processing PDF: How to Go from PDF to E-text to Audio Gaeir Dietrich Director High Tech Center Training Unit of the California Community Colleges Foothill Community College District

PDF from Publishers: Portable document format (PDF); reads the same on any computer; looks like the book; smaller than TIFFs; contains all the text (always check to make sure the book is the right one!); easy for publishers.

Requesting through ATN: Access Text Network (now free for requesting files from ATN-member publishers; paid membership to exchange files); not all publishers are members, but ATN does have the largest ones.

Other Resources at ATN: Accessible Textbook Finder; link to Publisher Lookup (you will have to contact non-ATN-member publishers directly).

Using Publisher PDFs  Sometimes students can use files directly  Often files will need further processing for student use  At the very least, large files may need to be broken into chapters

PDF Strengths: Good format for large print (cropping; fit to page on large pages; printing sections on large pages, i.e., tiling); Adobe Reader has some nice features (change colors; reflow; limited voicing); works on both Mac and PC; easy for most publishers to create.

PDF Weaknesses: Not always fully accessible (screen readers do not always like them, even when they are text-based; reading order can be problematic); may be graphics (pictures of text); may have too much security.

As an Aside: When faculty create PDFs, the PDF always started as something else, usually a Word file. Try to get the starting document if the student prefers audio. Security concerns? Word files can be password protected: Office Button > Prepare > Encrypt.

Types of PDF Documents: Text-based (text can be selected); graphical (a picture of text, i.e., a graphic; text cannot be selected); use the text-select tool to tell the difference; files may be “locked”.

Processing PDFs: Adobe Acrobat Professional (check College Buys for a discount); a good OCR program (ABBYY FineReader or Nuance OmniPage); if you are a Kurzweil campus, you will also need Kurzweil.

Adobe Tools: Adobe Reader (free; useful for students who need minimal accessibility features); Adobe Acrobat Professional (essential for alt media specialists; extract text, create accessible PDFs, enable Adobe Reader features; discounted price).

Acrobat Reader: Reads aloud (but does not highlight or track); enlarges text (nice reflow feature); changes text/background colors; text highlighting, sticky notes, and comments; access for text-based PDFs.

Production Features in Reader: Really designed for reading, not reformatting; Export PDF is a subscription service (about $20/year): upload the PDF file, the service auto-converts it to Word, then download the result.

Process with Acrobat Pro: Cropping; enlargement for printing; tiling; extracting/deleting pages; combining/inserting pages; text extraction (works best with text-based PDFs; does have built-in OCR capability).

Customize Quick Tools  Click on the “gear”  View > Show/hide > Toolbar Items > Quick Tools

Quick Tools Menu

Customize

Please Note: To enable single-key shortcuts, open the Preferences dialog box (Ctrl + K) and, under General, select Use Single-Key Accelerators To Access Tools (the first checkbox under Basic Tools).

Cropping: Tools > Pages > Crop; shortcut: C. (Please note: this shortcut brings up the mouse-driven cropping tool; you must double-click to open the dialog box!)

Crop Tool

Crop Toolbox

Enlarging  Choose paper size/printer  File > Print > Size…to Fit  Shortcut: Ctrl + P (tab through)  Tip: Crop document before enlarging

Print to Fit

Tiling  Choose paper size/printer  File > Print > Poster > Tile Scale and Overlap  Shortcut: Ctrl + P (tab through)  Tip: Crop document before tiling
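The tip about cropping before tiling can be illustrated with a little arithmetic. The sketch below is only an approximation: it assumes each sheet shares a fixed overlap strip with its neighbor and ignores printer margins, so Acrobat's own tile count may differ. The function name `tiles_needed` and the sample page sizes are hypothetical.

```python
import math


def tiles_needed(page_w, page_h, scale, sheet_w, sheet_h, overlap):
    """Estimate sheets needed to tile one enlarged page (all sizes in inches).

    Assumption: neighboring sheets share `overlap` inches, and printer
    margins are ignored. This is not Acrobat's actual algorithm.
    """
    if overlap >= min(sheet_w, sheet_h):
        raise ValueError("overlap must be smaller than the sheet")

    def along(length, sheet):
        scaled = length * scale
        if scaled <= sheet:
            return 1
        # Every sheet after the first adds (sheet - overlap) of new coverage.
        return math.ceil((scaled - overlap) / (sheet - overlap))

    across = along(page_w, sheet_w)
    down = along(page_h, sheet_h)
    return across, down, across * down


# An uncropped letter page versus the same page cropped to 6 x 9 inches,
# both enlarged to 200% and tiled onto letter paper with a 0.5 inch overlap.
print(tiles_needed(8.5, 11, 2.0, 8.5, 11, 0.5))
print(tiles_needed(6, 9, 2.0, 8.5, 11, 0.5))
```

Under these assumptions the cropped page needs fewer sheets than the uncropped one, which is one practical reason to crop first.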

Enlarge with Tiling

Extracting Pages  Tools > Pages > Extract  Delete Shortcut: Ctrl + Shift + D  Extract Pages Shortcut: Alt V + T + P (opens Pages pane; F6 focuses in pane and can arrow down)

Extraction Tool

Tips for Extracting Chapters  Crop on complete file before extracting  Work on a copy!!!!!  Extract from end toward front!  Use table of contents to help  Place focus on first page of chapter to extract (beginning with last)
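The back-to-front order can be planned from the table of contents before opening Acrobat. The sketch below assumes you have noted the PDF page on which each chapter starts; the function name and the sample page numbers are hypothetical. A likely reason for working from the end is that, if pages are deleted as they are extracted, the page positions of earlier chapters do not shift.

```python
def chapter_ranges_back_to_front(chapter_starts, total_pages):
    """Return (chapter_number, first_page, last_page) tuples, last chapter first.

    chapter_starts: 1-based PDF page on which each chapter begins, ascending.
    """
    ranges = []
    for i, start in enumerate(chapter_starts):
        if i + 1 < len(chapter_starts):
            end = chapter_starts[i + 1] - 1  # page before the next chapter
        else:
            end = total_pages  # last chapter runs to the end of the file
        ranges.append((i + 1, start, end))
    return list(reversed(ranges))


# Hypothetical book: chapters start on PDF pages 1, 15, and 40; 60 pages total.
plan = chapter_ranges_back_to_front([1, 15, 40], total_pages=60)
for number, first, last in plan:
    print(f"Chapter {number:02d}: extract pages {first}-{last}")
```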

Starting from the Back

Combining  File > Pages > Insert  OR  Create > Combine files

Inserting Pages

Combining Pages

Auto Extracting Text: File > Save As > MS Word (retains styles and paragraphs); File > Save As > More options…, then either Text (Accessible), which loses styles and places hard returns at the end of each line, or Text (Plain), which loses styles but keeps paragraphs. Shortcut: Alt F + A.
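If a file has already been saved with hard returns at the end of every line, the lines can be rejoined afterwards. The sketch below is a minimal illustration, assuming (and this is only an assumption about the exported file) that paragraphs are separated by blank lines; the function name is hypothetical.

```python
def unwrap_hard_returns(text: str) -> str:
    """Join hard-wrapped lines into paragraphs.

    Assumption: a blank line marks a paragraph break. If the export does not
    leave blank lines between paragraphs, this will merge everything.
    """
    paragraphs = []
    current = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "\n\n".join(paragraphs)


wrapped = "PDF reads the same\non any computer.\n\nIt looks like\nthe book.\n"
print(unwrap_hard_returns(wrapped))
```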

Save As Options

Better Text Extraction: OCR programs analyze text and structure (Acrobat Pro has built-in OCR, but other programs provide more control); you can control which text to include.

More Control over Text: For graphical PDFs, or to maintain more control over extracting text from text-based PDFs, use an OCR program!

Processing Graphical PDFs: Must run optical character recognition (OCR), because computers cannot read pictures; OCR programs recognize the “characters” in the picture. How you process the file depends on the end format the student wants!

Want to Stay in PDF? Sometimes students do want a text-based PDF. You can OCR in Acrobat Pro: Tools > Recognize Text.

Under Tools

Want Text Out: OmniPage or FineReader (FineReader is generally easier to learn; save to Word, HTML, or Text based on student preference); use the virtual printer with Kurzweil (creates KESI files); R&W (save as Word).

Which One When? Want a Word file? The best choice is OmniPage or FineReader. Want a Kurzweil document? Use Kurzweil to process the PDF. For students to do themselves? Whichever program they prefer.

Why?  OCR programs are designed to make extraction and editing easy  Document readers (R&W, Kurzweil, etc.) are designed to make reading easy…NOT editing.

NEVER!!! Do NOT run OCR with FineReader or OmniPage, save to PDF, and then take the file into Kurzweil, R&W, etc. Kurzweil, R&W, and WYNN will run their own OCR on the PDF! Doing OCR twice wastes time and adds errors.

OCR Programs: Treat PDFs the same as a TIFF (if you OCR scanned documents, use the same process); load the image file; select zones; create templates as needed.

OCR Process Details: Crop before loading into the OCR engine; turn on multiple languages as needed (if doing math, turn on Greek; only turn on the languages you need); edit in the OCR program (some OCR programs have font matching features); save to Word.

Captions and Such: For students who want audio or who are using screen readers, separate the main body of the text from the “ancillary text” (captions, sidebars, footnotes) and create two documents, 00 Chapter and 00A Chapter. This allows the student to hear the main text uninterrupted.

Two Doc Workflow: Open the PDF in the OCR program; analyze the layout for the entire document and save a copy; on one copy, delete all ancillary text and save to Word as 00 Chapter; on the other copy, delete all main body text and save as 00A Chapter. Keep page numbers in both documents!
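The two-document idea can be sketched in code for cases where the text has already been extracted and each block labeled by hand. Everything here is illustrative: the block labels ("main", "ancillary", "page"), the function name, and the reading of "00" as a two-digit chapter number are assumptions, not part of any OCR program.

```python
def split_chapter(chapter_number, blocks):
    """Split labeled text blocks into a main document and an ancillary one.

    blocks: list of (kind, text) pairs where kind is "main", "ancillary",
    or "page" (a page-number marker). Page markers go into both documents.
    """
    main_name = f"{chapter_number:02d} Chapter"
    ancillary_name = f"{chapter_number:02d}A Chapter"
    main, ancillary = [], []
    for kind, text in blocks:
        if kind == "page":
            main.append(text)
            ancillary.append(text)
        elif kind == "main":
            main.append(text)
        elif kind == "ancillary":
            ancillary.append(text)
        else:
            raise ValueError(f"unknown block kind: {kind!r}")
    return {main_name: "\n".join(main), ancillary_name: "\n".join(ancillary)}


# Hypothetical chapter content.
blocks = [
    ("page", "[Page 12]"),
    ("main", "Cells divide by mitosis."),
    ("ancillary", "Figure 2.1: Stages of mitosis."),
    ("page", "[Page 13]"),
    ("main", "Meiosis differs."),
]
for name, body in split_chapter(2, blocks).items():
    print(name)
    print(body)
    print()
```

Keeping the page markers in both outputs lets a student match a caption in the ancillary file to the page where the main text discusses it.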

Once in Word: Learn to use “show hidden” (Ctrl + Shift + 8); beware of the optional hyphen (search and replace to delete: search for ^- and replace with nothing); run spell check; use styles to structure files for the braille program.
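The same cleanup can be done outside Word on exported text. The sketch below assumes the optional hyphens arrive as the Unicode soft hyphen (U+00AD), which is how they commonly appear in copied or exported text; that mapping is an assumption worth checking against your own files. Ordinary hyphens are left alone.

```python
SOFT_HYPHEN = "\u00ad"  # assumed stand-in for Word's optional hyphen


def strip_optional_hyphens(text: str) -> str:
    """Remove every soft hyphen while keeping real hyphens such as 'text-based'."""
    return text.replace(SOFT_HYPHEN, "")


sample = "acces\u00adsible text-based PDF"
print(strip_optional_hyphens(sample))
```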

Converting Files

Mobile Readers? Check the formats that the device can handle (some handle PDF and DOC, some do not); all readers handle TXT (also called text or ASCII; can be saved from Word as plain text).

Magic Conversion Tool: Calibre (converts to and from many formats; fairly intuitive; free!).
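Besides its graphical interface, Calibre installs a command-line converter named `ebook-convert`, which infers the input and output formats from the file extensions. The sketch below shows one way to call it from a script; the file names are hypothetical, and the conversion only runs if Calibre is installed and on the PATH.

```python
import shutil
import subprocess


def build_convert_command(source, target):
    """Build the ebook-convert command line; formats come from the extensions."""
    return ["ebook-convert", source, target]


def convert(source, target):
    """Run the conversion, failing clearly if Calibre is not installed."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed?")
    subprocess.run(build_convert_command(source, target), check=True)


# Hypothetical file names; this only prints the command that would be run.
print(build_convert_command("02 Chapter.docx", "02 Chapter.epub"))
```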

Another Conversion Tool: TechAdapt; TechAdapt Accessible Media Center (TAMC), for converting NIMAS and DAISY; DAISY to… RTF, HTML.

File Transfer  Can use DropBox or Box to transfer files for most readers  Kindle and iPad can often use

Resource Info  Gaeir Dietrich     Alt media listserv  Manuals online