OCR
Optical Character Recognition (OCR)
- OCR technology for 195 languages. See The Full List of Supported Languages
- 47 languages have dictionary/morphology support
- Recognition of multilingual documents
- Recognition of dot-matrix documents. FineReader Engine recognizes many types of printed dot-matrix text. It has been trained on several thousand samples produced by a variety of printers, including dot matrix, daisy wheel, chain and band printers, and in both draft and Near Letter Quality (NLQ) printing modes.
- Recognition of typewritten documents
- Chinese, Japanese, and Korean (CJK) character recognition
- Thai and Hebrew OCR
- Recognition of OCR-A, OCR-B, MICR (E13B) and CMC7
- FineReader XIX – a special OCR module for digitizing old documents
“FineReader XIX” is an OCR module designed specifically for digitizing and archiving old documents, books and newspapers published from the 17th to the 20th century, many of which are rare or unique. Stored in the historical archives of libraries and government organizations, they are part of the national heritage and must be preserved. FineReader XIX can recognize texts published between 1600 and 1937 in English, French, German, Italian and Spanish. It supports old fonts such as Fraktur, Schwabacher and the majority of Gothic fonts.
Intelligent Character Recognition (ICR)
- ICR technology for more than 110 languages
- About 30 languages (with Latin, Greek and Cyrillic alphabets) with morphology/dictionary support.
- About 85 languages with Latin characters without dictionaries.
- ICR for Indian digits used in Arab states.
- 22 regional styles of hand-printing used in different countries and regions of the world (for supported ICR languages).
- Recognition of hand-printed characters in fields and frames – underlined fields, boxes, comb-style fields, etc.
- Multilingual ICR. One of the main advantages of ABBYY ICR technology is that it delivers almost the same high accuracy on digits and digits combined with letters of one or several languages, even if the fields contain both upper and lower case letters.
Recognition modes
With the Engine's predefined processing modes, developers can quickly tune processing speed and accuracy to suit their needs. In addition to the default processing mode, both OCR and ICR can be performed in fast and balanced recognition modes:
- Fast recognition mode
It is designed for high-volume document processing and for cases where speed is of primary importance. This mode increases processing speed by 200-250%, making the technology well suited for use in content management (CMS), document management (DMS) and archiving systems.
- Balanced recognition mode
It strikes a balance between recognition speed and accuracy.
Optical Mark Recognition (OMR)
ABBYY's OMR technology recognizes simple checkmarks, grouped checkmarks, model checkmarks and checkmarks with hand-made “corrections” in the following variations:
- Char Box Series
- Comb In Frame
- Grey Boxes
- Partitioned Frame
- Simple Comb
- Text In Frame
- Underlined Text
OMR delivers an accuracy rate of 99.995%.
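To put the accuracy figure in perspective, the short Python sketch below converts it into an expected number of misread marks. It assumes the 99.995% figure is a per-mark accuracy, which the text above does not specify; the function name and the batch size are hypothetical and chosen only for illustration.

```python
def expected_misreads(accuracy_percent, marks):
    """Expected number of misread marks at a given per-mark accuracy."""
    error_rate = (100.0 - accuracy_percent) / 100.0
    return error_rate * marks


# Hypothetical batch: 1,000 forms with 100 checkmark fields each.
misreads = expected_misreads(99.995, 1_000 * 100)
print(f"Expected misreads: {misreads:.1f}")
```

Under that assumption, the stated rate corresponds to roughly 5 misread marks per 100,000 processed.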
Optical Barcode Recognition (OBR)
- 1D and 2D barcode types. ABBYY OCR SDK supports recognition of popular types of 1D and 2D barcodes. See The Full List of Supported Barcodes
- Fast barcode extraction. This feature enables automated detection and recognition of barcodes at any angle on a document. It works for both 1D and 2D barcodes.
User languages
ABBYY FineReader Engine provides an API for creating and editing recognition languages, creating copies of predefined recognition languages and adjusting them, and adding new words to user languages.
Below are two examples illustrating how user languages can help you to improve recognition quality:
- In documents filled out by hand, the values in the form fields usually belong to a specific set such as city names, countries, zip codes, product codes, sums, etc. To improve the quality of ICR recognition, you can use user languages to describe the information which may be entered in each field.
- If a document contains "structures" such as product codes, telephone numbers, passport numbers etc., recognition errors may occur. This happens because the program reads such structures letter by letter. To improve the recognition of product codes and the like, you can create a new recognition language which will help the program to read specific types of data correctly.
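The idea behind both examples can be sketched in plain Python. This is not the FineReader Engine API: it is a minimal illustration of constraining a recognition result to a known vocabulary and to an expected structure. The function names, the city list and the product-code format are all hypothetical.

```python
import difflib
import re

# Hypothetical vocabulary for a "City" form field (illustrative values only).
CITY_VOCABULARY = ["Berlin", "Bern", "Bonn", "Bremen"]

# Hypothetical product-code structure: two uppercase letters, a hyphen, four digits.
PRODUCT_CODE = re.compile(r"[A-Z]{2}-\d{4}")


def snap_to_vocabulary(raw, vocabulary, cutoff=0.6):
    """Return the vocabulary entry closest to a raw recognition result, or None."""
    matches = difflib.get_close_matches(raw, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def pick_structured_candidate(candidates, pattern):
    """Return the first candidate reading that fits the expected structure, or None."""
    for candidate in candidates:
        if pattern.fullmatch(candidate):
            return candidate
    return None
```

For instance, `snap_to_vocabulary("Ber1in", CITY_VOCABULARY)` maps a reading with a misrecognized character back to a valid city name, and `pick_structured_candidate(["AB-12O4", "AB-1204"], PRODUCT_CODE)` rejects the reading in which a digit was confused with the letter O. A real user language would apply such constraints inside the recognizer rather than as a post-processing step.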
Pattern Training
In the vast majority of cases, FineReader Engine can read texts successfully without prior training. However, preliminary pattern training is useful in cases such as recognizing decorative or outlined fonts or bulk-processing documents with low print quality.
The OCR SDK allows you to create and use user patterns or import them from the ABBYY FineReader desktop application (Professional or Corporate Edition). FineReader Engine is flexible enough to be built into applications of any architecture, whether a client workstation or a server-based solution.