Improve OCR Accuracy

This article describes the factors that affect OCR accuracy.

OCR depends on a good, clear source document. If the letters are too bold and blur together, the OCR engine will have a hard time telling them apart. Conversely, if the letters are too faint and have "open" sections (gaps in the strokes), the engine will misread them. This is quite common with faxed documents.

Another common problem is extra speckles or "noise" on the scan, which can confuse the engine. Skewed text can also throw it off, since the OCR engine expects text that is relatively horizontal. You will also want to avoid decorative fonts, since these can be hard to recognize.
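To make the idea of scan "noise" concrete, here is a minimal sketch of speckle removal in Python. It assumes the page is already available as a grid of 0 (paper) and 1 (ink) values. The function name `remove_speckles` and the `min_neighbors` parameter are made up for this illustration and are not part of Enterprise Organizer Pro or any scanner driver.

```python
def remove_speckles(bitmap, min_neighbors=1):
    """Return a copy of a 0/1 bitmap with isolated ink pixels cleared.

    An ink pixel is treated as noise when fewer than `min_neighbors`
    of its 8 surrounding pixels are also ink.
    """
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    cleaned = [row[:] for row in bitmap]  # leave the input untouched
    for y in range(height):
        for x in range(width):
            if bitmap[y][x] != 1:
                continue
            neighbors = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbors += bitmap[ny][nx]
            if neighbors < min_neighbors:
                cleaned[y][x] = 0
    return cleaned


# A short horizontal stroke plus one stray speckle in the bottom-right corner.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
cleaned = remove_speckles(page)
```

In this example the stroke on the second row is kept because each of its pixels touches another ink pixel, while the lone pixel in the corner is cleared. Real cleanup filters are more careful than this, since a rule this simple could also erase small marks such as periods on a low-resolution scan.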

The best image for OCR is black and white, scanned at 200-300 dpi. Ideally, it will use standard font faces, like Times or Arial. It should be free of background noise and have as few images as possible.
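As a rough illustration of the resolution guideline, the sketch below works out the effective dpi of a scan from its pixel dimensions and the physical page size, and converts grayscale values to black and white with a fixed threshold. The function names and the threshold of 128 are assumptions chosen for this example, not settings taken from any particular scanner or OCR engine.

```python
def effective_dpi(width_px, height_px, page_width_in, page_height_in):
    """Return the lower of the horizontal and vertical dots per inch."""
    return min(width_px / page_width_in, height_px / page_height_in)


def in_recommended_range(dpi, low=200, high=300):
    """True when the resolution falls in the 200-300 dpi range."""
    return low <= dpi <= high


def to_black_and_white(gray_rows, threshold=128):
    """Map grayscale values (0 = black, 255 = white) to 1 = ink, 0 = paper."""
    return [[1 if value < threshold else 0 for value in row] for row in gray_rows]


# A US Letter page (8.5 x 11 inches) scanned to 2550 x 3300 pixels.
letter_dpi = effective_dpi(2550, 3300, 8.5, 11)
letter_ok = in_recommended_range(letter_dpi)

# The same page at half the pixel dimensions falls below the range.
draft_dpi = effective_dpi(1275, 1650, 8.5, 11)
draft_ok = in_recommended_range(draft_dpi)

bw = to_black_and_white([[250, 30, 140], [10, 200, 127]])
```

A fixed threshold only works well when the scan is evenly exposed. Pages with a gray or uneven background generally need the scanner's own exposure controls or an adaptive threshold instead.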

Some scanning problems can be cleaned up automatically. For example, many scanners will automatically deskew scans, especially sheet-fed scanners where sheets are sometimes pulled through crooked. Some scanners also have automatic exposure options, which can reduce background noise and make sure that text has the right "weight".
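For readers curious how automatic deskew can work, here is a small sketch of one common approach, the projection profile method. It tries a range of candidate slopes, shifts each ink pixel back along that slope, and keeps the slope that packs the ink into the fewest rows. This is an illustration only: the function names, the search range, and the scoring rule are assumptions, and it does not describe how any specific scanner implements deskew.

```python
import math


def row_profile_score(bitmap, slope):
    """Score how well ink lines up in rows after undoing a given slope.

    Each ink pixel at column x is shifted up by round(slope * x) rows.
    The sum of squared row counts is highest when ink is concentrated
    in a few rows, i.e. when the text lines are horizontal.
    """
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    row_sums = [0] * height
    for y in range(height):
        for x in range(width):
            if bitmap[y][x]:
                target = y - round(slope * x)
                if 0 <= target < height:
                    row_sums[target] += 1
    return sum(count * count for count in row_sums)


def estimate_skew_degrees(bitmap, max_slope=0.5, steps=21):
    """Return the candidate skew angle (in degrees) with the best score.

    Rows are numbered top to bottom, so a positive angle means the text
    drifts downward from left to right.
    """
    step = (2 * max_slope) / (steps - 1)
    candidates = [-max_slope + i * step for i in range(steps)]
    best = max(candidates, key=lambda s: row_profile_score(bitmap, s))
    return math.degrees(math.atan(best))


def make_line(width, height, drop_every=None):
    """Build a test bitmap with one line of ink, optionally stepping down."""
    bitmap = [[0] * width for _ in range(height)]
    for x in range(width):
        y = 2 + (x // drop_every if drop_every else 0)
        bitmap[y][x] = 1
    return bitmap


straight_angle = estimate_skew_degrees(make_line(16, 8))
skewed_angle = estimate_skew_degrees(make_line(16, 8, drop_every=4))
```

The estimate is only as fine as the candidate grid, so a real implementation would search more angles (often coarse first, then fine) and then rotate the image by the negative of the detected angle before running OCR.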

Another factor is the OCR engine itself. Enterprise Organizer Pro includes an "Advanced" OCR engine, which has excellent accuracy. As an alternative, if you have Microsoft Office 2003 or newer installed on your machine, you can use the "Microsoft Office Document Imaging" (MODI) engine, which also has very good accuracy and good speed.
