For the text recognition step itself, ABBYY FineReader Engine offers a comprehensive set of recognition technologies: recognition of machine-printed text (OCR), hand-printed text (ICR), optical marks (OMR) and barcodes (OBR). As a market leader, ABBYY offers the highest number of OCR languages, which can be combined individually. The available technologies and processing options are listed below.
OCR technology is available for more than 200 languages:
ICR technology is available for more than 120 languages:
Hand-printed information in different languages can also be recognized (multilingual ICR).
ABBYY’s OMR technology recognizes simple checkmarks, grouped checkmarks, model checkmarks and checkmarks with hand-made “corrections” in different variations:
By defining a recognition mode, developers can quickly tune processing speed and accuracy. In addition to the default processing mode, both OCR and ICR can be performed in normal, fast and balanced recognition modes:
Normal mode: The most accurate mode, achieving the highest recognition quality, for example if you plan to reuse the recognized content.
Fast mode: Increases processing speed by 200-250%, making the technology well suited to content management (CMS), document management (DMS) and archiving systems.
Balanced mode: Provides higher speed than Normal mode while achieving almost the same accuracy.
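As a rough illustration of what the Fast mode figure means for batch processing, the sketch below converts a 200-250% speed increase into processing time. It assumes the increase is measured against Normal mode and that a 200% increase means three times the throughput; the function name and the one-hour baseline are hypothetical and not part of the FineReader Engine API.

```python
def fast_mode_time(normal_seconds: float, speed_increase_pct: float) -> float:
    """Time for the same batch when throughput rises by the given percentage.

    Assumption: a 200% increase means 1 + 200/100 = 3 times the throughput.
    """
    speedup = 1.0 + speed_increase_pct / 100.0
    return normal_seconds / speedup


# Hypothetical baseline: a batch that takes one hour in Normal mode.
normal_batch_seconds = 3600.0

slowest = fast_mode_time(normal_batch_seconds, 200.0)  # lower end of the range
fastest = fast_mode_time(normal_batch_seconds, 250.0)  # upper end of the range

print(f"Fast mode estimate: {fastest / 60:.1f} to {slowest / 60:.1f} minutes")
```

Under these assumptions the one-hour batch would finish in roughly 17 to 20 minutes; actual gains depend on the documents and hardware.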
In general, two types of recognition are possible: full text and field-level recognition. Full text recognition is used for document conversion and usually relies on OCR technology. Field-level recognition is used to extract specific data and draws on OCR, ICR and other technologies.
The following table shows the differences:
|Specification|Full text recognition|Field-level recognition|
|---|---|---|
|Used for|Document conversion, book archiving|Data capture / data extraction|
|Document analysis|General document analysis, document analysis for invoices, document analysis for full-text indexing|Manual block specification for field-level recognition|
|Recognition technologies|OCR with up to 99% accuracy|OCR, ICR, OMR and barcode recognition with predefined data types and value ranges; 99.99% accurate|
|Verification|Recommended (for content reuse)|Obligatory (as accuracy is critical in most cases)|
|Synthesis|Used for document reconstruction|Not used|
|Export format|Document files (RTF, DOC, PDF, etc.)|Export to XML file or database|
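To make the field-level column more concrete, here is a minimal sketch of how predefined data types and value ranges can flag recognized fields for verification before export to XML. It uses only the Python standard library; `FieldRule`, `needs_verification`, `export_fields` and the sample field names are hypothetical illustrations, not FineReader Engine API calls.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FieldRule:
    name: str
    pattern: str  # regular expression the whole value must match (data type)
    in_range: Callable[[str], bool] = lambda value: True  # value-range check


def needs_verification(rule: FieldRule, value: str) -> bool:
    """Return True when the recognized value violates its type or range."""
    if re.fullmatch(rule.pattern, value) is None:
        return True
    return not rule.in_range(value)


def export_fields(rules: List[FieldRule], recognized: Dict[str, str]) -> str:
    """Serialize recognized fields to XML, marking those that need review."""
    root = ET.Element("fields")
    for rule in rules:
        value = recognized.get(rule.name, "")
        flag = str(needs_verification(rule, value)).lower()
        node = ET.SubElement(root, "field", name=rule.name, verify=flag)
        node.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical form definition and recognition output.
rules = [
    FieldRule("quantity", r"\d{1,3}", lambda v: 1 <= int(v) <= 500),
    FieldRule("postcode", r"\d{5}"),
]
recognized = {"quantity": "12", "postcode": "1O115"}  # letter O read for zero

xml_text = export_fields(rules, recognized)
print(xml_text)
```

Here the postcode fails its digits-only rule and is marked `verify="true"`, which is the kind of result that would be routed to an operator in the obligatory verification step.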
Full text recognition is the basic recognition type for a range of tasks, such as:
All of them require recognition (OCR) of the whole text. The resulting text is exported as plain text or as a complete document in the requested format.
To support key business processes such as forms processing, keyword classification, machine vision or robotic process automation, ABBYY FineReader Engine extracts text from fields or zones. Key functionality includes multilingual OCR and ICR, OMR, barcode recognition and a range of specific functions, such as:
Field-level recognition is also supported by special developer tools such as the Voting API and "On-the-Fly" Recognition Tuning.
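The general idea behind voting, combining several readings of the same field into one result, can be sketched as follows. This is a generic character-level majority vote written for illustration only; it is not the Voting API itself, and the `vote` function and sample readings are hypothetical.

```python
from collections import Counter
from typing import Sequence


def vote(candidates: Sequence[str]) -> str:
    """Majority vote per character position across readings of one field.

    Readings whose length differs from the most common length are ignored,
    so the remaining strings can be compared position by position.
    """
    if not candidates:
        return ""
    length = Counter(len(c) for c in candidates).most_common(1)[0][0]
    aligned = [c for c in candidates if len(c) == length]
    return "".join(
        Counter(chars).most_common(1)[0][0] for chars in zip(*aligned)
    )


# Three hypothetical readings of the same postcode field.
readings = ["10115", "1O115", "10115"]
print(vote(readings))  # 10115
```

A production setup would typically weight each reading by its confidence rather than counting all readings equally.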
To increase the recognition quality, ABBYY FineReader Engine provides an API for creating and editing recognition languages, adjusting predefined recognition languages, and adding new words to user languages. Examples:
In the vast majority of cases, FineReader Engine can read texts successfully without any prior training. However, when recognizing decorative or outlined fonts or documents with low print quality, you can train custom patterns to increase recognition quality.