Text export and document reconstruction
After the document has been analyzed and the text areas recognized, the results are exported according to the predefined options.
The ABBYY FineReader Engine OCR API offers a wide range of options for exporting recognition results, for example as plain text, XML, different types of PDF and PDF/A, many image-only formats, editable Microsoft® Office documents, and others. If required, it provides a completely reconstructed document – including all formatting and the re-created original document structure.
During the export step, it is possible to:
- Specify various levels of layout reconstruction and text format retention, from simple text without formatting to complete retention of the page layout, including columns, tables, frames, fonts, font sizes, paragraph styles, borders, etc.
- Access detailed information about each recognized character
- Use functions for editing and formatting recognized text before the actual export
- Export the recognized text to a variety of output formats
- Fully retain the layout of documents – including the logical structure and document formatting elements such as the table of contents, headers, footers, footnotes and other elements
- Replace uncertainly recognized characters with corresponding character images when saving to PDF
- Fully retain the colors of pictures and text
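
To make two of the options above more concrete, the following is a minimal, purely illustrative Python sketch of how a retention level and the substitution of uncertain characters might interact at export time. It does not use or reproduce the ABBYY FineReader Engine API: the names `Retention`, `RecognizedChar` and `export_text`, the `<b>` markup and the `[img]` placeholder (standing in for a character image) are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import groupby


class Retention(Enum):
    """Hypothetical retention levels (not ABBYY identifiers)."""
    PLAIN = "plain"          # text only, no formatting
    FORMATTED = "formatted"  # keep character-level formatting such as bold


@dataclass(frozen=True)
class RecognizedChar:
    """Hypothetical per-character recognition result."""
    char: str
    uncertain: bool = False  # recognizer was not confident about this character
    bold: bool = False


def export_text(chars, retention=Retention.PLAIN, image_for_uncertain=False):
    """Render recognized characters as a string.

    With image_for_uncertain=True, uncertain characters are replaced by the
    placeholder "[img]", which stands in for a cropped character image.
    """
    parts = []
    # Group consecutive characters that share the same bold attribute.
    for is_bold, run in groupby(chars, key=lambda c: c.bold):
        text = "".join(
            "[img]" if (image_for_uncertain and c.uncertain) else c.char
            for c in run
        )
        if retention is Retention.FORMATTED and is_bold:
            text = f"<b>{text}</b>"
        parts.append(text)
    return "".join(parts)


sample = [
    RecognizedChar("O", bold=True),
    RecognizedChar("C", bold=True),
    RecognizedChar("R", bold=True),
    RecognizedChar(" "),
    RecognizedChar("o"),
    RecognizedChar("k", uncertain=True),
]

print(export_text(sample))                                 # OCR ok
print(export_text(sample, Retention.FORMATTED))            # <b>OCR</b> ok
print(export_text(sample, image_for_uncertain=True))       # OCR o[img]
```

In this sketch the same recognition result yields different output depending only on the export options, which mirrors the idea that export settings are chosen independently of the recognition step.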
The full list of export formats can be found here