Stage Details

Step 1: Image import and processing

An image is loaded from memory and prepared for OCR. Image binarization separates text from the background, producing a black-and-white image that is much smaller in size than the color original. Additional skew correction and document orientation detection can be applied.
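As an illustration of what binarization does, here is a minimal sketch using Otsu's method, a widely used global thresholding technique. This is not the ABBYY Mobile OCR Engine's own algorithm, which the text does not describe; the function names and the list-of-lists image format are assumptions made for the example.

```python
def otsu_threshold(gray):
    """Return an Otsu threshold for a 2D list of 0-255 grayscale values."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_threshold, best_variance = 0, -1.0
    weight_dark = 0
    sum_dark = 0
    for t in range(256):
        weight_dark += hist[t]
        if weight_dark == 0:
            continue
        weight_light = total - weight_dark
        if weight_light == 0:
            break
        sum_dark += t * hist[t]
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        # Between-class variance: large when the two groups are well separated.
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, t
    return best_threshold


def binarize(gray):
    """Map each pixel to 0 (dark, likely text) or 1 (light, likely background)."""
    t = otsu_threshold(gray)
    return [[1 if value > t else 0 for value in row] for row in gray]


# Hypothetical 2x3 grayscale patch: dark text pixels next to a light background.
patch = [[10, 12, 200], [11, 210, 205]]
black_and_white = binarize(patch)
```

Because each output pixel needs only one bit instead of one or more bytes, the binarized image is much smaller than the color original.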

Automatic image skew correction function

Photographs taken with a mobile device camera and images scanned with a portable scanner are often skewed, which reduces recognition quality. The ABBYY Mobile OCR Engine detects and corrects skew with an accuracy of one degree, which significantly improves recognition quality.
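To make the idea of skew detection concrete, the following sketch estimates a skew angle by fitting a straight line through points sampled along a text baseline, then rotates a point back by that angle. It is a generic least-squares illustration, not the engine's method; the function names and the input format (a list of `(x, y)` points) are assumptions.

```python
import math


def estimate_skew_degrees(baseline_points):
    """Estimate the skew angle from (x, y) points lying along one text baseline."""
    n = len(baseline_points)
    mean_x = sum(x for x, _ in baseline_points) / n
    mean_y = sum(y for _, y in baseline_points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in baseline_points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in baseline_points)
    # atan2(sxy, sxx) is the angle of the least-squares line through the points.
    return math.degrees(math.atan2(sxy, sxx))


def deskew_point(x, y, angle_degrees):
    """Rotate a point about the origin by the negative of the skew angle."""
    a = math.radians(-angle_degrees)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))


# Hypothetical baseline tilted by 2 degrees.
tilt = math.tan(math.radians(2.0))
points = [(x, x * tilt) for x in range(0, 500, 50)]
angle = estimate_skew_degrees(points)
```

A real implementation would derive the baseline points from the image itself and resample the whole image rather than individual points.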

Document orientation detection function

Image pre-processing automatically detects the orientation of the page to be recognized, for example whether it is sideways or upside down.
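Once an orientation has been detected, restoring the page is a matter of rotating it in 90-degree steps. The sketch below shows only that second step on a small pixel grid; the detection itself is the engine's job and is not reproduced here. The function names and the convention that the detected value is a clockwise rotation in degrees are assumptions.

```python
def rotate_clockwise(grid):
    """Rotate a 2D list of pixels by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def make_upright(grid, detected_clockwise_degrees):
    """Undo a detected clockwise rotation of 0, 90, 180 or 270 degrees."""
    if detected_clockwise_degrees % 90 != 0:
        raise ValueError("orientation must be a multiple of 90 degrees")
    # Undoing a clockwise turn of d degrees equals turning (360 - d) clockwise.
    turns = ((360 - detected_clockwise_degrees) % 360) // 90
    for _ in range(turns):
        grid = rotate_clockwise(grid)
    return grid


# Hypothetical 2x2 page, then the same page captured sideways and upside down.
upright = [[1, 2], [3, 4]]
sideways = rotate_clockwise(upright)
upside_down = rotate_clockwise(sideways)
```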

Step 2: Document analysis

Document Analysis is a set of algorithms that analyze the prepared image. They detect letters, join the letters into words, the words into lines of text, and finally the lines into paragraphs. The reading area is also cleaned and noise is removed.

Hyphenation support

If the engine encounters the first part of a hyphenated word at the end of one line and the second part at the start of the next line, it joins them into one word.
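The joining behavior can be sketched on plain text lines as follows. This is a simplified illustration with a hypothetical function name; it treats every trailing hyphen as a line-break hyphen, so a real implementation would need a dictionary check to keep genuinely hyphenated compounds intact.

```python
def join_hyphenated(lines):
    """Join recognized lines into one string, merging words split by a hyphen."""
    words = []
    carry = ""  # first half of a word that was split at the end of a line
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        if carry:
            tokens[0] = carry + tokens[0]
            carry = ""
        if tokens[-1].endswith("-") and len(tokens[-1]) > 1:
            carry = tokens.pop()[:-1]
        words.extend(tokens)
    if carry:
        # The last line ended with a hyphen and nothing followed it.
        words.append(carry + "-")
    return " ".join(words)


text = join_hyphenated(["The engine recog-", "nizes text"])
```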

Preserving multi-column text

The Engine has a function called Paragraph Assembly. It identifies text block borders and recognizes each block separately, reading left to right and top to bottom within each block. This preserves the layout of multi-column text, paragraphs and text segments.
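The effect on reading order can be illustrated with a small sketch that groups text blocks into columns by their left edge and then reads each column from top to bottom. This is not the Paragraph Assembly algorithm itself; the block format (dictionaries with `left`, `top` and `text` keys), the function name and the `column_tolerance` parameter are all assumptions.

```python
def reading_order(blocks, column_tolerance=50):
    """Return block texts column by column, top to bottom within each column."""
    columns = []
    for block in sorted(blocks, key=lambda b: b["left"]):
        # A block joins the current column if its left edge is close to the
        # left edge of the block that started that column.
        if columns and block["left"] - columns[-1][0]["left"] <= column_tolerance:
            columns[-1].append(block)
        else:
            columns.append([block])

    ordered = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b["top"]))
    return [block["text"] for block in ordered]


# Hypothetical two-column page with blocks listed in arbitrary order.
page = [
    {"left": 405, "top": 220, "text": "D"},
    {"left": 10, "top": 10, "text": "A"},
    {"left": 400, "top": 10, "text": "C"},
    {"left": 12, "top": 200, "text": "B"},
]
order = reading_order(page)
```

Reading the page row by row instead would interleave the two columns, which is the failure this kind of block analysis avoids.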

Preserving character fonts

ABBYY Mobile OCR Engine identifies the font properties of the source text, namely bold, italic and underlined type.

Confidence level indicator

This function shows the level of certainty for recognized text, enabling developers to set flexible criteria for implementation of proofreading and verification functions.  

Additionally, spell checking during text recognition considerably improves the quality of the output text.

Step 3: Optical Character Recognition (OCR)

Detected blocks on the image are recognized using special language and pattern definitions. If dictionaries are available, then the text is also compared to them to improve overall recognition quality. Additionally, each character is assigned a confidence level – showing how confident the recognition engine was in its final choice of character.
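As one way a developer might use per-character confidence levels for proofreading, the sketch below flags any word that contains at least one low-confidence character. The data layout, the 0 to 1 confidence scale, the function name and the default threshold are assumptions for illustration and do not reflect the engine's actual result format.

```python
def words_needing_review(words, min_confidence=0.8):
    """Return the words that contain a character below the confidence threshold."""
    flagged = []
    for text, char_confidences in words:
        if char_confidences and min(char_confidences) < min_confidence:
            flagged.append(text)
    return flagged


# Hypothetical recognition output: (word, confidence for each character).
recognized = [
    ("Invoice", [0.99, 0.98, 0.97, 0.99, 0.98, 0.99, 0.97]),
    ("t0tal", [0.95, 0.41, 0.93, 0.96, 0.90]),
]
to_review = words_needing_review(recognized)
```

Raising `min_confidence` sends more words to manual verification, while lowering it trusts the recognizer more.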

There are two mobile recognition modes:

  • Fast mode. Best suited to good-quality images, this mode cuts the time required for recognition and processing.

  • Full mode. Intended for accurate recognition of low-quality images, where more processing time is needed to achieve the best result.

Business card processing (optional)

Recognition results are analyzed and the relevant contact information is extracted from business cards, including first name, last name, title, phone number, e-mail and address. The technology can recognize cards in 23 languages.
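A much simplified version of this kind of field extraction can be sketched with regular expressions over the recognized lines. The patterns and the function name below are assumptions chosen for illustration; they cover only e-mail addresses and phone numbers and are far less robust than a dedicated business card parser.

```python
import re

# Deliberately simple patterns; real-world e-mail and phone formats vary widely.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def extract_contact_fields(lines):
    """Return the first e-mail address and phone number found in the lines."""
    fields = {"email": None, "phone": None}
    for line in lines:
        if fields["email"] is None:
            match = EMAIL_RE.search(line)
            if match:
                fields["email"] = match.group(0)
        if fields["phone"] is None:
            match = PHONE_RE.search(line)
            if match:
                fields["phone"] = match.group(0)
    return fields


# Hypothetical recognized text from a business card.
card_lines = [
    "Jane Doe",
    "Sales Manager",
    "Tel: +1 (555) 010-9999",
    "jane.doe@example.com",
]
contact = extract_contact_fields(card_lines)
```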

Barcode recognition (optional)

This function provides barcode recognition and supports a wide range of 1D and 2D barcodes.  
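The text does not list the supported symbologies, so the following sketch uses EAN-13, a common 1D barcode, purely as an assumed example. It shows how a decoded value can be validated with the standard EAN-13 check digit before it is passed on; the function names are hypothetical.

```python
def ean13_check_digit(first_twelve):
    """Compute the EAN-13 check digit for a string of twelve digits."""
    digits = [int(c) for c in first_twelve]
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return (10 - total % 10) % 10


def is_valid_ean13(code):
    """Return True if the string is thirteen digits with a correct check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    return ean13_check_digit(code[:12]) == int(code[12])


decoded = "4006381333931"
ok = is_valid_ean13(decoded)
```

Rejecting values with a wrong check digit is a cheap way to catch misreads from blurred or partially hidden barcodes.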

Step 4: Result processing

Recognition results can be processed and exported. The developer has full control over the OCR results. 

