Scanning From a File (or Fixing Inaccessible PDFs)

Optical Character Recognition (OCR) programs such as OmniPage and TextHelp Read & Write usually scan from a physical Canon, Epson or HP scanner that sits on your desk. They can also scan directly from an image file or PDF on the computer.

Inaccessible PDFs

This feature is most often used to fix inaccessible PDFs.

Some poorly authored PDF files contain text only as an image, which is not accessible to a screenreader or other text-to-speech software such as PDF Aloud or ReadPlease. Inaccessible PDFs are read aloud in the wrong order or, more commonly, as a stream of random letters and symbols that bears no relation to the original.
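As a rough first check before running OCR, the image-only case can sometimes be spotted by looking at the raw file. The sketch below is only a heuristic and not part of Read & Write: it assumes that a PDF which never mentions a font contains no real text. The function names are made up for illustration. Newer PDFs can store these entries in compressed form, so a file may be wrongly flagged as image-only, and the check does not catch PDFs that read out random characters.

```python
def looks_image_only(pdf_bytes: bytes) -> bool:
    """Rough guess: True when the raw PDF data never mentions a font."""
    # Text in a PDF is drawn with fonts, so a file with no /Font entry
    # anywhere is very likely made up of scanned page images.
    # Caveat: newer PDFs can compress these entries, hiding them from this scan.
    return b"/Font" not in pdf_bytes


def file_looks_image_only(path: str) -> bool:
    """Read a PDF from disk and apply the same rough check."""
    with open(path, "rb") as handle:
        return looks_image_only(handle.read())
```

If the check returns True, the file is a good candidate for the Scan from File steps described in this guide.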

Fixing a PDF

Some people have found that they can fix these inaccessible PDFs by printing them out and re-scanning them into the computer as a new PDF. This method works, but it is time-consuming and can cost quite a lot in ink and paper. It is much more efficient to scan directly from the inaccessible PDF file.

In the following guide I use TextHelp Read & Write version 8.0 to demonstrate the quickest way to use OCR software to fix an inaccessible PDF.

Sample Inaccessible PDF

An Inaccessible PDF

Some PDFs are poorly designed, making them inaccessible to screenreaders and other text-to-speech programs.

Many documents will produce nothing but useless gobbledygook.

OCR Menu in TextHelp Read & Write

Select the options in Read & Write

Ensure that you select Scan from File, or the program will look for the document in your scanner.

Normally you would scan from PDF to PDF. You could also convert your inaccessible PDF into a Word-compatible format (RTF), but the layout may be affected.

Scan Button on the Read & Write Toolbar

Press the Scan button

Open file from HDD dialog

Open Source PDF File

Locate and select the inaccessible PDF file. If the file is from the Internet you might need to download it and store it in your My Documents folder so that your OCR software can find it.

You can download any PDF by right-clicking on the hyperlink that links to it and selecting "Save Target As" or "Save As".
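For anyone who prefers to script the download, here is a minimal Python sketch of the same step. It uses only the standard library, and the function names, the fallback file name and the folder argument are illustrative assumptions rather than anything Read & Write provides.

```python
import shutil
from pathlib import Path, PurePosixPath
from urllib.parse import unquote, urlparse
from urllib.request import urlopen


def filename_from_url(url: str, default: str = "download.pdf") -> str:
    """Take the last part of the URL path as the file name."""
    name = PurePosixPath(unquote(urlparse(url).path)).name
    # Fall back to a generic name when the URL has no file part.
    return name or default


def download_pdf(url: str, folder: str) -> Path:
    """Save the PDF at `url` into `folder` and return the saved path."""
    target = Path(folder) / filename_from_url(url)
    target.parent.mkdir(parents=True, exist_ok=True)
    with urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

Calling download_pdf with the address of the PDF and the path to your documents folder would save the file where your OCR software can open it.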

Confirm Multiple Pages Dialog Box

Confirm Multiple Pages

Select how many pages of the document you want to include. This prompt will not appear for single-page PDFs.

OCR Save Dialog


Don't save over the original PDF. Give the new file a similar name, for example one ending in "accessible".
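If you convert many files, the naming rule can be automated. The following is a small sketch under assumed conventions: the "-accessible" tag and the function name are illustrative choices, and a counter is added only if a file with the new name already exists.

```python
from pathlib import Path


def accessible_copy_name(original: str, tag: str = "accessible") -> Path:
    """Build a new file name next to the original, never the original itself."""
    source = Path(original)
    candidate = source.with_name(f"{source.stem}-{tag}{source.suffix}")
    counter = 2
    # If that name is already taken, try -accessible-2, -accessible-3, ...
    while candidate.exists():
        candidate = source.with_name(f"{source.stem}-{tag}-{counter}{source.suffix}")
        counter += 1
    return candidate
```

For a file called report.pdf this suggests report-accessible.pdf in the same folder, leaving the original untouched.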

OCR Progress Message

Wait ...

... while the program recognises the text. Multi-page documents can take a lot longer to convert than single-page documents.

Sample of an Accessible PDF

Accessible PDF

Up pops the new accessible PDF in Adobe Reader.

Now, when we use the same screenreader or other text-to-speech program, the text should be read as it appears on the page.



