
ABBYY Releases OCR SDK for LINUX

September 16, 2003

Widely Acclaimed OCR Engine Now Available for Developers, VARs, and Integrators Programming for Linux Operating Environments

ABBYY®, a leading developer of document recognition and linguistic technologies, today announced that it has made its optical character recognition (OCR) platform accessible to the Linux community. Available now for beta trial, ABBYY FineReader Engine 6.0 for Linux is a full Software Development Kit (SDK) for integrating ABBYY OCR technologies into Linux applications.

FineReader Engine 6.0 for Linux marks ABBYY’s first offering developed specifically for the Linux community. Created with special consideration for the functionality and unique characteristics of the Linux platform, it offers access to advanced OCR technology known on the market for superior accuracy and format retention. VARs, system integrators, third-party developers, and corporate developers who want to build document management applications in a Linux-based development environment can now easily take advantage of technology that has already been chosen by numerous customers, including Canon, Cardiff, Fujitsu, Kofax, Panasonic, Saperion, and ZyLab.

Providing higher accuracy and broader OCR functionality than was previously available on the Linux platform (according to ABBYY’s internal tests), FineReader Engine 6.0 for Linux enables hardware and software developers, VARs, and system integrators to integrate ABBYY’s advanced recognition technologies into content management, data capture, and archiving applications developed for the Linux platform.

The FineReader Engine can be used in image or document processing applications that require accurate and flexible OCR functionality. Examples of applications for which FineReader Engine is well suited include:

  • Archiving and document processing applications requiring speed and flexible access to OCR functions.
     
  • Control systems for verification of bank checks, passports, tickets, credit cards, driver licenses, registered car numbers, etc.
     
  • Document conversion applications for output to PDF, HTML, RTF with full page layout retention.
     
  • Fax processing applications in which OCR is used to convert faxed images into editable text.
     
  • Content creation and management applications such as Web publishing systems, Intranet archiving applications, and press/media-clipping solutions.

Technology advantage

Thanks to IPA technology and Multilevel Document Analysis (MDA), FineReader delivers unprecedented recognition accuracy. FineReader also maintains higher accuracy on multilingual documents than any other OCR software*.

ABBYY FineReader is a truly innovative technology that differs substantially from its predecessors. Moving beyond previous versions that focused on text alone, the new FineReader technology is designed to recognize documents as a whole. The objective of this technology is to produce a file that mirrors the paper original, analyzing not just text but also formatting, such as image placement, font size and spacing, column placement, and text orientation.

ABBYY FineReader Engine 6.0 functionality

ABBYY FineReader Engine for Linux offers the same quality of recognition and document analysis, and the same wide range of advanced OCR functions, as the Windows version of ABBYY FineReader Engine.

Users of FineReader Engine for Linux benefit from the following features offered in core ABBYY OCR technology:

1. Recognition:

  • Unsurpassed recognition quality.
    Built on IPA technology and the MDA method, the recognition technology has earned more than 100 awards and editors’ choice distinctions, and has won independent comparative tests run by major IT magazines worldwide, since 1998.
     
  • Extended language support for multilingual documents.
    FineReader technology recognizes documents in 177 languages and delivers nearly the same quality on multilingual documents as on single-language documents. This is a distinctive advantage for developers whose products target global markets. FineReader recognizes more languages than any other OCR system. 34 of these languages have dictionary/morphology support.
     
  • "Intelligent Background Filtering" and “Adaptive Binarization” Technologies for OCR (Machine Print Recognition).
    Uses individual recognition parameters for each image fragment, increasing accuracy on difficult-to-read documents such as newsprint, color documents, faxes, and copies. Also includes filters for screening out background noise.
     
  • Recognition Module for Dot Matrix Recognition.
    The module recognizes dot matrix text of all types. Tested with several thousand samples on a variety of printers including dot matrix, daisy wheel, chain and band printers, draft and Near Letter Quality (NLQ).
     
  • OCR-A, OCR-B and MICR (E13B) Font Support.
    Recognizes characters frequently used by banks, credit card companies and other financial institutions.
     
  • Fast Mode recognition.
    Designed for high-volume document processing applications where speed is more important than accuracy. This mode increases processing speed by 200-250%, making it particularly useful with document management and archiving systems.
     
  • Recognition of vertical texts in tables.
     
  • Recognition of typewritten documents.
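
The idea of using individual parameters for each image fragment, mentioned above for Adaptive Binarization, can be illustrated with a small sketch. The Python code below is not ABBYY's algorithm and uses no FineReader Engine calls; the function names, the tile size, and the sample pixel values are all assumptions chosen for illustration. It compares one global threshold with a simple per-tile mean threshold on a tiny grayscale image whose right half has a darker background.

```python
# Illustrative sketch only: a simple per-tile mean threshold.
# This is NOT ABBYY's Adaptive Binarization and not a FineReader Engine API;
# all names and values here are hypothetical.

def binarize_global(gray, threshold=128):
    """Mark a pixel as ink (1) when it is darker than one fixed threshold."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]

def binarize_by_tile(gray, tile=2):
    """Mark a pixel as ink (1) when it is darker than the mean of its own tile."""
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            rows = range(top, min(top + tile, height))
            cols = range(left, min(left + tile, width))
            pixels = [gray[y][x] for y in rows for x in cols]
            local_mean = sum(pixels) / len(pixels)
            for y in rows:
                for x in cols:
                    out[y][x] = 1 if gray[y][x] < local_mean else 0
    return out

# Left half: light background (220) with faint text (150).
# Right half: dark background (110) with dark text (30).
gray = [
    [220, 150, 110, 30],
    [220, 220, 110, 110],
]

global_result = binarize_global(gray)  # loses the faint text, floods the right half
tile_result = binarize_by_tile(gray)   # keeps exactly one ink pixel per half

print(global_result)
print(tile_result)
```

With a single threshold the faint text on the left disappears and the whole right half turns to ink, while the per-tile threshold isolates the one text pixel in each half. Production binarizers are considerably more elaborate; the sketch only shows why per-fragment parameters help on unevenly lit or colored pages.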

2. Document Analysis:

  • Superior document analysis and full layout retention.
    Using MDA and IPA-technology principles, FineReader retains the exact look of printed documents, be it wrap-around text, columns, tables, non-rectangular pictures, varying fonts or varying spacing between characters.
     
  • Auto-detection of page orientation (90, 180, 270 degrees);
     
  • Auto-detection of text blocks, tables, barcodes and pictures. Manual block zoning also supported (adding, removing and editing blocks);
     
  • Auto-detection of vertical text in table cells.

3. Export and Synthesis:

  • Exact layout retention in RTF and PDF formats.
    Thanks to the MDA method and IPA technology, FineReader’s document analysis makes it possible to save the full formatting of recognized text in PDF and RTF formats. The exported files look the same as the original printed documents, including special elements of the RTF format that are new to the Linux platform;
     
  • Multiple levels of text format retention (from simple text with no formatting to complete page layout retention, including columns, tables, frames, fonts, font size, paragraph styles, borders, etc.);
     
  • Full picture and text color retention;
     
  • Saving non-rectangular pictures;
     
  • Recreation of bullets and numbering.
     
  • Export to PDF format:
    • Includes “text and image”, “text over image”, “text under image”, and “image-only” options for saving PDF files;
    • Support for multilingual PDF files, eliminating the need to install additional fonts;
    • Can also replace uncertain characters with their corresponding character images when saving in PDF format.

4. Advanced functionality:

  • Opening images from files and memory.
     
  • A set of functions for image preprocessing.
    • Automatic image deskewing;
    • Skew calculation based on the information of reference blocks;
    • Rotation (90, 180, 270 degrees) and mirroring of images;
    • Inverting of images;
    • Image despeckling;
    • Image scaling and clipping;
    • Dual-page splitting;
    • Preview creation;
    • Text color or background manipulation inside rectangles.
       
  • Full access to recognized text including the detailed information of character formatting and coordinates.
     
  • A set of functions for post-editing and post-formatting the recognized text before export.
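
Several of the image preprocessing operations listed above (rotation by 90, 180, or 270 degrees, mirroring, and inverting) can be sketched in a few lines of Python. The code below is a generic illustration on a bitonal image stored as a list of rows; the function names are hypothetical and are not FineReader Engine API calls.

```python
# Illustrative sketch only: generic image operations on a bitonal image
# stored as a list of rows (1 = ink, 0 = background).
# These helpers are hypothetical and are NOT FineReader Engine API calls.

def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def rotate(image, degrees):
    """Rotate clockwise by a multiple of 90 degrees."""
    if degrees % 90 != 0:
        raise ValueError("degrees must be a multiple of 90")
    result = [list(row) for row in image]
    for _ in range((degrees // 90) % 4):
        result = rotate90(result)
    return result

def mirror(image):
    """Flip the image left to right."""
    return [row[::-1] for row in image]

def invert(image):
    """Swap ink and background pixels."""
    return [[1 - p for p in row] for row in image]

page = [
    [1, 0, 0],
    [1, 1, 0],
]

print(rotate(page, 90))   # 3 rows by 2 columns
print(rotate(page, 180))
print(mirror(page))
print(invert(page))
```

Note that a 90 or 270 degree rotation swaps the image's width and height, which is why layout coordinates have to be recomputed after such a step.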

FineReader OCR functions are easily accessible via an application programming interface (API). ABBYY has also developed a special utility for Linux programmers that gives them access to a variety of FineReader Engine controls from the command line, a preferred way of working in the Linux community. The API for FineReader Engine 6.0 for Linux is identical to that of FineReader Engine 6.0 for Windows, making it easy for developers of multi-platform applications to use the same API on different platforms.

Trial version

ABBYY offers a time-limited, fully functional version of ABBYY FineReader Engine 6.0 for Linux for free evaluation, so that prospective customers can test it under real working conditions without any functional limitations. To obtain an evaluation copy, please contact your ABBYY salesperson.

Pricing and Availability

ABBYY FineReader Engine 6.0 for Linux is now available. ABBYY offers flexible pricing options that allow developers to select the type of licensing model that is best suited to their product and sales strategy. For additional product information visit the company’s website at http://www.abbyy.com/
