Optical Character Recognition

When you scan a document, you create an image file. Even though the image seems to have text, to the computer it is just a picture.

Optical character recognition (OCR) is a process that extracts text from a scanned image by looking for recognizable letters and words. A good OCR engine can pull the text out of a scanned image with a high degree of accuracy (see OCR Accuracy).

Why Bother with OCR?

If your only concern is archiving paper documents as electronic files, OCR may not matter to you. But if you want to copy and paste text from a scan, or do a text search of the scan’s contents, you will need OCR.

Where Does the OCR Text End Up?

If you convert your files to PDF format, QuikFile embeds the OCR text in the PDF file:

  • The original scanned image of the document is on top
  • The OCR text of the document is embedded invisibly behind
  • Each word of OCR text is aligned behind the image of the word

QuikFile can also create a plain text file, which is just the unformatted OCR text without the original document image.
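
The two outputs described above can be pictured with a small conceptual model. This is only an illustrative sketch, not QuikFile's internal format: the OcrWord class, the find_word and to_plain_text helpers, and the sample coordinates are all hypothetical names and values chosen for the example. It assumes each recognized word is stored with the rectangle it occupies on the page image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrWord:
    """One recognized word and the rectangle it occupies on the page image."""
    text: str
    x: float       # left edge, in page units (hypothetical coordinate system)
    y: float       # top edge, in page units
    width: float
    height: float


def find_word(words, query):
    """Return every OcrWord whose text matches the query, ignoring case."""
    wanted = query.casefold()
    return [w for w in words if w.text.casefold() == wanted]


def to_plain_text(words):
    """Join the recognized words into unformatted text, dropping positions."""
    return " ".join(w.text for w in words)


# Hypothetical OCR output for part of a scanned invoice.
page_words = [
    OcrWord("Invoice", 72.0, 90.0, 58.0, 14.0),
    OcrWord("Total:", 72.0, 120.0, 40.0, 14.0),
    OcrWord("$42.00", 118.0, 120.0, 48.0, 14.0),
]

hits = find_word(page_words, "total:")
plain = to_plain_text(page_words)
```

In this model, a search hit carries the rectangle of the matching word, which is why a viewer can highlight the right spot on the scanned image even though the text itself is invisible. The plain text output corresponds to keeping only the words and discarding the positions and the image.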

How Do I Turn On OCR?

When you set up your job, you’ll be asked whether you want to include OCR as part of the process.

Can I Re-Run OCR?

You can redo OCR by re-converting the file. To do this, set up a Conversion Job. Under the Source option, set the input file type to PDF and select the Redo PDFs That Are Already Searchable option. PDF files that already have OCR text will then be re-converted: QuikFile discards the old OCR text, re-runs OCR, and embeds the new text in the file.

What OCR Engine Is QuikFile Using?

When you set up a Conversion Job, you can choose among several OCR engines. See Optical Character Recognition for more information.
