ABBYY FineReader Engine
The most comprehensive OCR SDK for software developers

Automated document analysis
During the document analysis step, the logical structure of the document is analyzed: the first and last pages of the document are identified, and formatting elements such as footnotes, headers, footers, and tables of contents are detected.
At the same time, the layout of each page is detected, and each page is divided into individual objects such as text blocks, pictures, tables and table cells, barcodes, and separators. Additionally, the document analysis algorithms detect page orientation, identify double pages, detect vertical text, and define page areas that are not relevant to the OCR process.
As a result, ABBYY FineReader Engine can specify which text areas and fields should be recognized and which page areas, such as images or diagrams, should be kept in their original form. At the same time, it receives information about the logical document structure (including its formatting), which is used at the end of the OCR process, when the document is reconstructed exactly.
The results of this analysis are used to retrieve the document structure and layout when documents are processed for further reuse, which means that the documents need to be reconstructed exactly. All pictures and diagrams are preserved in their original form, without recognizing the text inside pictures or logos.
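The split between areas to recognize and areas to keep in their original form can be illustrated with a small, self-contained sketch. This is not the ABBYY FineReader Engine API: the names `BlockType`, `Block`, `RECOGNIZED_TYPES`, and `split_blocks` are hypothetical and exist only for illustration, and the choice of which block types count as "recognized" is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class BlockType(Enum):
    """Hypothetical block kinds, mirroring the objects listed in the text."""
    TEXT = "text"
    TABLE = "table"
    BARCODE = "barcode"
    PICTURE = "picture"
    SEPARATOR = "separator"


# Assumption for this sketch: text, tables, and barcodes are sent to
# recognition; everything else is kept in its original form.
RECOGNIZED_TYPES = {BlockType.TEXT, BlockType.TABLE, BlockType.BARCODE}


@dataclass(frozen=True)
class Block:
    """A rectangular page region found by layout analysis (illustrative)."""
    block_type: BlockType
    left: int
    top: int
    right: int
    bottom: int


def split_blocks(blocks):
    """Partition blocks into (to_recognize, to_preserve), keeping page order."""
    to_recognize = [b for b in blocks if b.block_type in RECOGNIZED_TYPES]
    to_preserve = [b for b in blocks if b.block_type not in RECOGNIZED_TYPES]
    return to_recognize, to_preserve


if __name__ == "__main__":
    page = [
        Block(BlockType.TEXT, 0, 0, 100, 20),
        Block(BlockType.PICTURE, 0, 30, 100, 90),
        Block(BlockType.BARCODE, 0, 100, 50, 120),
    ]
    to_recognize, to_preserve = split_blocks(page)
    print([b.block_type.value for b in to_recognize])
    print([b.block_type.value for b in to_preserve])
```

In this sketch the picture block ends up in the preserved list, so any text inside it would not be recognized, which matches the behavior described above for pictures and logos.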
Manual blocks specification for field-level recognition