OCR Engine Comparison

Today, I tested four OCR engines currently available to software developers.

OCR Engines Tested

  • Tesseract  Tesseract is free and open source.
  • Microsoft OCR Library Sample App  The Microsoft OCR Library appears to be free.  The implementation I used seems to work only for Windows Store apps.
  • Abbyy Fine Reader 12  Abbyy Fine Reader is packaged software, but it appears that they also license their engine to developers.
  • LEADTOOLS OCR SDK  LEADTOOLS is well known for its imaging-related developer tools.

The Test Image

I created this image containing text in various sizes and styles and fed it to the OCR engines.

The OCR test image

Tesseract Result

Tesseract output the following text:

DneszmsDcKzmngruHywurk> H -
How about a bigger font?
1254-5678'? I13
Whai‘ abow{’Hr\4'4«fow{'?
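A result like this can be scored rather than judged by eye. The sketch below computes a character error rate (CER): the Levenshtein edit distance between a reference string and the OCR output, divided by the reference length. The function names are illustrative, and the reference string is an assumption (the exact text of the test image is not reproduced here), so treat this as a minimal sketch and not the method used for this test.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,          # delete ch_a
                current[j - 1] + 1,       # insert ch_b
                previous[j - 1] + cost,   # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance normalized by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, ocr_output) / len(reference)


# ASSUMPTION: the second line of the test image reads exactly like this.
reference = "How about a bigger font?"

# The Tesseract output for that line matches the assumed reference.
print(character_error_rate(reference, "How about a bigger font?"))  # 0.0

# A hypothetical misread (one wrong character, one dropped character).
print(character_error_rate(reference, "How about a bigger f0nt"))
```

A CER of 0.0 means a perfect match; values can exceed 1.0 when the engine emits much more text than the reference contains.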

Microsoft OCR Library Result

Using the Microsoft OCR Library Sample App, this was the result:

Microsoft OCR Library test result

Abbyy Fine Reader Result

Here is a partial screenshot of the results produced by Abbyy Fine Reader 12:

Abbyy Fine Reader Result

LEADTOOLS OCR SDK Result


The LEADTOOLS OCR SDK produced the following results:



These OCR systems seem to perform well on typewritten text, but fail on the script font and on handwritten text.  The OCR engines are not as good as I expected them to be.

  1. There are several reasons for that, given that LEADTOOLS, ABBYY, and our own Yunmai solution are used by various industries around the world:

    1] The image you are trying to OCR is far too small (a low number of pixels per character).
    2] The sample size is ridiculously small (one picture?).

    To appreciate the accuracy and performance of OCR, you need to test under "real" working conditions: 300 dpi for an A4 document, and a large variety of samples.

    If you would like to change your mind about OCR in daily use, try downloading a business card scanner application to your mobile phone. You'll see that, even if the OCR results are not 100% accurate, you can save a lot of time on data entry with this kind of software.
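The resolution point can be made concrete with a little arithmetic. The sketch below converts a page size and scan resolution into pixel dimensions, and a font size into its nominal height in pixels. The function names are illustrative, and the 96 dpi comparison figure is an assumption about a typical screen-resolution image, not a measurement of the test image above.

```python
from typing import Tuple

MM_PER_INCH = 25.4
POINTS_PER_INCH = 72  # typographic points per inch


def page_pixels(width_mm: float, height_mm: float, dpi: int) -> Tuple[int, int]:
    """Pixel dimensions of a page scanned at the given resolution."""
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


def font_height_pixels(point_size: float, dpi: int) -> float:
    """Nominal height (em size) of a font in pixels at the given resolution."""
    return point_size / POINTS_PER_INCH * dpi


# An A4 page is 210 mm x 297 mm.
print(page_pixels(210, 297, 300))  # (2480, 3508)

# A 10 pt font at scan resolution versus an assumed 96 dpi screen image.
print(round(font_height_pixels(10, 300), 1))  # 41.7
print(round(font_height_pixels(10, 96), 1))   # 13.3
```

Under these assumptions, the same 10 pt text has roughly three times the height in pixels at 300 dpi as at 96 dpi, which is the "pixels per character" gap described above.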
