Optical Character Recognition (OCR) programs such as OmniPage and TextHelp Read & Write usually scan from a physical Canon, Epson or HP scanner that sits on your desk. But they can also scan directly from an image file or PDF on the computer.
This feature is most often used to fix inaccessible PDFs.
Some poorly authored PDF files contain text only as an image, which is not accessible to a screenreader or other text-to-speech software such as PDF Aloud or ReadPlease. Inaccessible PDFs are read aloud in the wrong order or, more commonly, as a stream of random letters and symbols that bears no relation to the original text.
Some people have found that they can fix these inaccessible PDFs by printing them out and re-scanning them into the computer as a new PDF. While this method works, it is rather time-consuming and can cost quite a lot in ink and paper. It is much more efficient to scan directly from the inaccessible PDF file.
In the following guide I use TextHelp Read & Write version 8.0 to demonstrate the quickest way to use OCR software to fix an inaccessible PDF.
Some PDFs are poorly designed, making them inaccessible to screenreaders and other text-to-speech programs.
Many documents will produce nothing but useless gobbledygook.
Locate and select the inaccessible PDF file. If the file is from the Internet you might need to download it and store it in your My Documents folder so that your OCR software can find it.
You can download any PDF by right-clicking on the hyperlink that links to it and selecting "Save Target As" or "Save As".
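If you would rather script the download than right-click each link, the step above can be sketched in Python using only the standard library. This is a minimal sketch, not part of Read & Write: the function name download_pdf and the fallback file name "download.pdf" are assumptions made for illustration.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import unquote, urlparse


def download_pdf(url: str, dest_dir: Path) -> Path:
    """Save the file at `url` into `dest_dir` and return the saved path.

    Hypothetical helper: the name and the fallback file name are
    illustrative assumptions, not taken from any OCR product.
    """
    # Use the last part of the URL path as the local file name.
    name = Path(unquote(urlparse(url).path)).name or "download.pdf"
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / name
    # Stream the response to disk so large PDFs are not held in memory.
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target


# Example call (the URL is a placeholder, not a real document):
# download_pdf("https://example.com/files/report.pdf", Path.home() / "Documents")
```

The saved file can then be opened from your OCR software in the same way as any other PDF on the computer.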
Select how many pages of the document you want to include. This prompt will not appear for single-page PDFs.
Don't save over the original PDF. Give it a similar name ending in "accessible" or similar.
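When converting many files, the naming rule above can be applied automatically. The following Python sketch shows one way to do it, assuming a suffix of "-accessible"; the function name accessible_copy_path and the numbering scheme for repeated names are illustrative choices, not something Read & Write does for you.

```python
from pathlib import Path


def accessible_copy_path(original: Path, suffix: str = "-accessible") -> Path:
    """Return a path next to `original` that does not overwrite any existing file.

    Hypothetical helper: the suffix and the numbering scheme are assumptions.
    """
    candidate = original.with_name(f"{original.stem}{suffix}{original.suffix}")
    counter = 2
    # If that name is already taken, add a number instead of overwriting.
    while candidate.exists():
        candidate = original.with_name(
            f"{original.stem}{suffix}-{counter}{original.suffix}"
        )
        counter += 1
    return candidate


if __name__ == "__main__":
    # "report.pdf" is a placeholder file name.
    print(accessible_copy_path(Path("report.pdf")))
```

Because the suffix is never empty, the suggested name always differs from the original, so the inaccessible PDF is kept as a backup.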
... while the program recognises the text. Multi-page documents can take a lot longer to convert than single-page documents.
Up pops the new accessible PDF in Adobe Reader.
Now, when we use the same screenreader or other text-to-speech program, the text should be read as it appears on the page.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.