
Help: OCR (Optical Character Recognition)

OCR (Optical Character Recognition) is the process of converting a bitmap image of text (like a scanned document) into text that can be selected, copied and searched by PDFpenPro and other text editing software.

OCR technology will not produce a perfect rendering of the bitmapped text. You will need to proofread and edit the text that results from OCR.

Using OCR in PDFpenPro

  1. Open a scanned PDF in PDFpenPro.
  2. An alert box opens with the message "This document appears to be scanned. Would you like to perform optical character recognition (OCR) on it? OCR will allow you to select the text." You have three options:
    • Cancel
      No OCR will be performed.
    • OCR Page
      OCR will be performed on the current page.
    • OCR Document
      If your document has multiple pages, OCR will be performed on all of the pages.
    In the same alert box, you can also pick which languages are recognized by OCR.

While PDFpenPro is performing the OCR, a progress bar will appear. The operation can take a few seconds or much longer, depending on the size and the contents of the scanned document.

To perform OCR manually, choose Edit > OCR. PDFpenPro starts the OCR operation and the progress bar appears.

Selecting, Copying and Correcting OCR Text

The text generated by the OCR operation can be selected, copied and edited like any other text. See Working with Text.

Searching OCR Text

The text generated by the OCR operation can be searched like any other text. See Searching Within A PDF.

Tips to Improve the OCR Results of Your Document

  • The quality of the original document affects the accuracy of the OCR results. Crisp, clean originals with clear text will produce much better results than crumpled, faded photocopies.
  • Place your original document on the scanner as straight as possible. If you have a scanned page that is not straight, you can "deskew", or straighten, the image in PDFpenPro by going to Edit > Deskew and Adjust Image…
  • Increase the contrast of your scanned document so that the background is as white as possible. You can adjust the contrast of the image by going to Edit > Deskew and Adjust Image…
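
To illustrate what increasing contrast does to a scan, here is a minimal sketch in Python. It is not PDFpenPro's implementation: the function name stretch_contrast and the representation of a page as a list of grayscale values from 0 (black) to 255 (white) are assumptions made for illustration. The sketch applies a simple linear stretch so that the lightest pixel becomes pure white and the darkest becomes pure black.

```python
def stretch_contrast(pixels):
    """Linearly rescale grayscale values so they span the full 0-255 range.

    `pixels` is a hypothetical flat list of grayscale values (0 = black,
    255 = white); a real scan would be a two-dimensional image.
    """
    lo, hi = min(pixels), max(pixels)
    if lo == hi:
        # A flat image has no contrast to stretch; return it unchanged.
        return list(pixels)
    scale = 255 / (hi - lo)
    return [round((p - lo) * scale) for p in pixels]


# A grayish background (200) with dark-gray text (60).
scan = [200, 200, 60, 200, 60, 60, 200]
print(stretch_contrast(scan))  # background becomes 255, text becomes 0
```

After the stretch, the background is white and the text is black, which is the kind of separation that tends to help OCR.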

How to Force PDFpenPro to Perform OCR

PDFpenPro examines the document and, if it finds a single image the size of a page, assumes that the document is a scan and automatically offers to perform OCR. In some cases, PDFpenPro may not recognize a scanned document, and OCR... under the Edit menu will be grayed out and unavailable. To force OCR in that case:

  1. Hold down the Command and Option keys together.
  2. Choose Edit > OCR... from the menu.
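
To see why automatic detection can miss a scan, the "one image the size of a page" check described in this section can be sketched in Python. This is only an illustration under stated assumptions, not PDFpenPro's actual code: the Image and Page classes, the looks_scanned function and the 5% size tolerance are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    width: float
    height: float


@dataclass
class Page:
    width: float
    height: float
    images: list = field(default_factory=list)


def looks_scanned(page, tolerance=0.05):
    """Return True if the page holds exactly one image about the size of the page.

    The tolerance (a fraction of the page dimensions) is an assumed value.
    """
    if len(page.images) != 1:
        return False
    img = page.images[0]
    return (abs(img.width - page.width) <= tolerance * page.width
            and abs(img.height - page.height) <= tolerance * page.height)


# A US Letter page (612 x 792 points) filled by one scanned image.
full_page_scan = Page(612, 792, [Image(612, 792)])
# The same page stored as two half-page image strips.
two_strips = Page(612, 792, [Image(612, 396), Image(612, 396)])

print(looks_scanned(full_page_scan))  # True
print(looks_scanned(two_strips))      # False
```

A check of this kind would not flag a scan that is stored as several image strips, which is one situation where forcing OCR with the Command and Option keys is useful.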

 

 

 
 
© 2003-2012 SmileOnMyMac, LLC dba Smile. All rights reserved.
PDFpen and PDFpenPro are registered trademarks of Smile. The Smile logo is a trademark of Smile.