ABBYY FineReader Engine

The most comprehensive OCR SDK for software developers

Integrate AI-powered OCR features into your applications.

Powerful PDF processing tools

The SDK converts scans, digital photos, and TIFF, JPEG, BMP and other image formats into many types of searchable PDF and PDF/A files. It can also import PDF and PDF/A files and process them in different ways.

For both types of PDF processing, converting documents into PDF and processing existing PDF and PDF/A files, ABBYY FineReader Engine offers a broad range of options and tools. It can process and create electronic documents that conform to the PDF/A-3 standard, as well as ZUGFeRD-compliant electronic invoices.

Conversion of documents into PDF

ABBYY FineReader Engine can save recognized text in PDF and PDF/A formats. The PDF format is used in electronic archives for data storage, where a small file size combined with high visual quality is of significant importance. A special compression technology, MRC (Mixed Raster Content), is used to minimize the size of PDF and PDF/A files.

PDF Input and Processing

  • Intelligent PDF processing

    ABBYY FineReader Engine analyzes internal information within the source PDF files such as:

    • annotations
    • metadata
    • text objects
    • font dictionaries
    • content stream

    The FineReader SDK enhances PDF conversion performance and speed through efficient and accurate text detection. If text is embedded in the PDF file, the OCR engine examines the integrity of the text layer and decides whether to extract the text or apply OCR on a block-by-block basis. This functionality can be turned off if needed.

  • Capture of internal PDF information

    ABBYY FineReader Engine extracts internal PDF links, hyperlinks and document properties such as subject, author, title, keywords and bookmarks.

  • Extraction of attachments from PDF/A-3

    When a PDF/A-3 file is opened, embedded data such as XML can be extracted and passed to other systems, for example when processing electronic invoices.
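
The text-layer decision described under "Intelligent PDF processing" can be pictured with a small sketch. It does not use or reproduce the FineReader Engine API: the `Block` type, the `text_layer_looks_intact` heuristic and its 0.9 threshold are all illustrative assumptions, chosen only to show what a per-block "extract or OCR" choice might look like.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One page region plus whatever the PDF text layer holds for it (hypothetical type)."""
    embedded_text: str  # "" when the region has no text layer


def text_layer_looks_intact(text: str, min_valid_ratio: float = 0.9) -> bool:
    """Assumed heuristic: the layer is usable if most characters are printable
    and are not the Unicode replacement character U+FFFD."""
    if not text.strip():
        return False
    valid = sum(1 for ch in text if ch.isprintable() and ch != "\ufffd")
    return valid / len(text) >= min_valid_ratio


def choose_action(block: Block) -> str:
    """Return "extract" to reuse the embedded text, or "ocr" to recognize the block."""
    return "extract" if text_layer_looks_intact(block.embedded_text) else "ocr"


if __name__ == "__main__":
    for sample in ("Invoice No. 4711", "", "\ufffd\ufffd\ufffdab"):
        print(repr(sample), "->", choose_action(Block(sample)))
```

A real engine would weigh far more signals (fonts, encodings, glyph positions); the point here is only that the choice is made per block rather than per document.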

PDF Output

  • PDF security and encryption support

    ABBYY FineReader Engine supports a variety of PDF security settings:

    • "Open File" password settings designed to prevent unauthorized access to a document.
    • Ability to restrict certain operations, such as printing, editing or extracting file content, by assigning permission passwords.
    • Support for the latest encryption standards.
  • Output in Tagged PDF format

    A tagged PDF can be "reflowed" to fit different page or screen sizes such as screens of smartphones, tablets or reader devices used by visually impaired users.

  • Page size settings

    Ability to set the size for all pages of an output file.

  • Metadata export

    ABBYY FineReader Engine enables export of metadata (bookmarks, hyperlinks, cross-references, etc.).

  • Conversion to PDF/A format

    PDF/A is a standard for long-term preservation of documents. ABBYY’s technologies allow saving documents to PDF/A formats of different compliance levels: PDF/A-1a, PDF/A-1b, PDF/A-2a, PDF/A-2b, PDF/A-2u, PDF/A-3a and PDF/A-3b.

  • Chinese, Japanese, Korean to PDF export

    Enables conversion of documents in Chinese (both simplified and traditional), Japanese and Korean into PDF format.
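
The permission restrictions listed under "PDF security and encryption support" correspond to a permissions bit field in the PDF standard security handler. The sketch below is independent of the FineReader Engine API and only shows how such a field could be computed. The class and function names are illustrative assumptions; the bit positions and reserved bits follow the PDF specification (ISO 32000-1, user access permissions).

```python
from enum import IntFlag


class PdfPermission(IntFlag):
    """User access permission bits (bit n of the spec is 1 << (n - 1))."""
    PRINT = 1 << 2                  # bit 3
    MODIFY = 1 << 3                 # bit 4
    EXTRACT = 1 << 4                # bit 5
    ANNOTATE = 1 << 5               # bit 6
    FILL_FORMS = 1 << 8             # bit 9
    EXTRACT_ACCESSIBILITY = 1 << 9  # bit 10
    ASSEMBLE = 1 << 10              # bit 11
    PRINT_HIGH_QUALITY = 1 << 11    # bit 12


# Bits 7-8 and 13-32 are reserved and must be set to 1.
_RESERVED_ONES = 0xFFFFF0C0


def permission_value(allowed: PdfPermission) -> int:
    """Return the permissions entry as a signed 32-bit integer."""
    raw = _RESERVED_ONES | int(allowed)
    return raw - (1 << 32) if raw & 0x80000000 else raw


if __name__ == "__main__":
    # Allow printing only; editing and content extraction stay restricted.
    print(permission_value(PdfPermission.PRINT))
```

The restrictions take effect only when the file is encrypted with a permission (owner) password; a viewer that opens the file with the "Open File" password alone is expected to honor the cleared bits.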

PDF (PDF/A) MRC compression

  • A special compression technology called MRC (Mixed Raster Content) is used to minimize the size of PDF and PDF/A files. Compressed files are significantly smaller while maintaining the original’s visual quality. MRC PDF compression is ideal when color documents are scanned and processed. It offers better document appearance and files up to 8-10x smaller than with JPEG compression.
  • Document image files can be large because of the image background, which often makes up as much as 90% of the file size. The MRC compression technology locates the color background and compresses it to a high degree. Picture objects (diagrams, graphs, logos, photos, drawings, stamps, signatures, etc.) are also slightly compressed, but only to an extent that does not lower visual quality. The resulting PDF is significantly smaller.
  • The MRC technology also analyzes the outlines of similar characters in the document, creates an average character template and applies it in place of each individual character. This way, possible defects are corrected and character outlines become more precise. The resulting PDF document contains an unobtrusive background with fine text and pictures, while providing smaller size and better readability than the original. This “reconstruction” of the document can be useful when you process low-quality images.
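
The layer idea behind Mixed Raster Content can be illustrated with a deliberately simplified sketch. It is not ABBYY's implementation: the function names, the fixed threshold of 128 and the reduction of each layer to a single average tone are assumptions made to keep the example short. Real MRC encoders keep low-resolution color layers and compress each layer with a codec suited to it.

```python
def split_layers(pixels, threshold=128):
    """Split a grayscale page (rows of 0-255 values) into a binary mask,
    one foreground tone and one background tone."""
    mask = [[1 if p < threshold else 0 for p in row] for row in pixels]
    fg_values = [p for row in pixels for p in row if p < threshold]
    bg_values = [p for row in pixels for p in row if p >= threshold]
    foreground = sum(fg_values) // len(fg_values) if fg_values else 0
    background = sum(bg_values) // len(bg_values) if bg_values else 255
    return mask, foreground, background


def reconstruct(mask, foreground, background):
    """Rebuild the page from the three layers."""
    return [[foreground if m else background for m in row] for row in mask]


if __name__ == "__main__":
    # A tiny "page": a dark vertical stroke on a slightly noisy light background.
    page = [[250, 20, 245],
            [240, 30, 255]]
    mask, fg, bg = split_layers(page)
    print(mask, fg, bg)
    print(reconstruct(mask, fg, bg))
```

The mask keeps the sharp text outlines at full resolution, while the noisy background collapses to very little data, which is where most of the size reduction comes from.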

PDF Export Profiles: Quality and Speed Tuning

To achieve optimal PDF conversion results, developers can use predefined export profiles that contain pre-set options for different requirements.

PDF Export Profile    Description
MaxQuality            PDF (PDF/A) export is optimized to deliver the highest visual quality of the output file.
Balanced              PDF (PDF/A) export settings are balanced between the visual quality of the resulting file, its size and the time needed for processing.
MinSize               PDF (PDF/A) export is optimized to deliver the minimum file size.
MaxSpeed              PDF (PDF/A) export is optimized to reach the highest processing speed.
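
An application might choose among these profiles based on what matters most for a given job. The sketch below is a hypothetical helper, not part of the FineReader Engine API: only the four profile names come from the table above, while the `ExportProfile` enum, the `pick_profile` function and the priority keywords are assumptions for illustration.

```python
from enum import Enum


class ExportProfile(Enum):
    """The four profile names from the table above."""
    MAX_QUALITY = "MaxQuality"
    BALANCED = "Balanced"
    MIN_SIZE = "MinSize"
    MAX_SPEED = "MaxSpeed"


def pick_profile(priority: str) -> ExportProfile:
    """Map a job priority ("quality", "size" or "speed") to a profile.
    Anything else falls back to Balanced."""
    mapping = {
        "quality": ExportProfile.MAX_QUALITY,
        "size": ExportProfile.MIN_SIZE,
        "speed": ExportProfile.MAX_SPEED,
    }
    return mapping.get(priority.lower(), ExportProfile.BALANCED)


if __name__ == "__main__":
    for job in ("quality", "size", "speed", "archive"):
        print(job, "->", pick_profile(job).value)
```

Falling back to Balanced is a design choice of this sketch: it keeps unknown job types on the middle-ground settings instead of failing.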

Start benefiting from ABBYY FineReader Engine today
