The PDF format was originally intended to display the exact same content and layout regardless of the operating system, device, or software application used to view it. Nowadays, however, users also need to search through PDF documents, extract information from them, or convert complete documents into editable formats. This is not always easy, especially if the PDF is an image-only file created with a scanner. OCR and PDF conversion technology is needed to search, extract and repurpose the information in such files.
What is OCR?
Optical Character Recognition (OCR), or text recognition, unlocks the information “trapped” in a scanned or photographed image of a document. OCR software such as ABBYY FineReader “reads” the content of a document (text and structure) by interpreting character images and assigning each one an electronic equivalent. This makes it possible to transfer the content and layout of the document into searchable and editable formats.
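To make the idea of "interpreting character images and assigning them an electronic equivalent" concrete, here is a deliberately tiny sketch. It is not how ABBYY FineReader or any production OCR engine works. It only assumes that each character arrives as a small black-and-white bitmap and picks the closest match from a handful of made-up templates. The names `TEMPLATES`, `pixel_distance`, `recognize_glyph` and `recognize_line` are hypothetical and exist only for this illustration.

```python
# Toy illustration only: real OCR engines are far more sophisticated.
# Hypothetical 3x5 bitmap templates for a few characters ('#' = ink, '.' = paper).
TEMPLATES = {
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
    "O": ("###", "#.#", "#.#", "#.#", "###"),
}


def pixel_distance(a, b):
    """Count the pixels that differ between two same-sized glyph bitmaps."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def recognize_glyph(glyph):
    """Return the character whose template differs least from the glyph."""
    return min(TEMPLATES, key=lambda ch: pixel_distance(glyph, TEMPLATES[ch]))


def recognize_line(glyphs):
    """Turn a sequence of glyph bitmaps into a text string."""
    return "".join(recognize_glyph(g) for g in glyphs)


# A slightly damaged "L" (one pixel missing), as a scanner might produce.
noisy_l = ("#..", "#..", "#..", "#..", "##.")
print(recognize_line([noisy_l, TEMPLATES["O"], TEMPLATES["T"]]))  # prints "LOT"
```

Even with one pixel missing, the damaged glyph is still closest to the "L" template, which hints at why recognition has to tolerate scanning noise.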
Some examples of common conversion tasks:
PDF conversion to searchable PDF and PDF/A
- Creating PDF documents with a scanner results in image-only PDFs without a text layer. Converting these scanned PDFs into PDFs containing selectable and searchable text enables easy management, copying and indexing of the content as well as full-text search. Organizations, especially in the legal, education or public sectors, can thus benefit from fast access to information via electronic applications such as eDiscovery or document management systems (DMS).
- Converting PDF documents to the PDF/A format enables long-term archiving while meeting the latest compliance standards for archiving processes.
PDF conversion to Word
- Converting PDF documents to Microsoft® Word® gives you access to information locked in a PDF file (e.g. contracts, reports or articles).
- By converting a PDF into a Microsoft Word document, you can easily edit or change its content without wasting time re-typing and re-formatting.
- The original layout, including images, tables and columns, is exactly reproduced.
PDF conversion to Excel
- By converting a PDF document to Microsoft Excel®, its tables become available as spreadsheets for further editing or calculation.
- The original row and column structure is preserved, allowing you to easily format and work with numerical data from price lists, analytical reports or newspaper articles, directly in Excel.
- If you are interested only in the tables contained in a large multipage document, the text outside of the tables can be ignored in the conversion process.
PDF conversion to RTF
- Converting PDF to RTF allows you to edit the text from the original PDF on different operating systems such as Mac®, Linux® or Windows®.
- Once converted to RTF, the document can be opened and further used in any text processing software, whether it is an older version of Microsoft Word, WordPerfect®, OpenOffice™ Writer or Apple® Pages.
PDF conversion to EPUB
- Converting PDFs to the EPUB format makes reading electronic books or articles on a mobile phone or tablet significantly more convenient, allowing you to change the font size and adjust the text flow to fit the screen.
PDF conversion to HTML
- Converting PDF documents to HTML for publication on the Web preserves the document structure of the original, which can save significant amounts of time and effort.
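
As a rough illustration of what "preserving the document structure" can mean for HTML output, the sketch below maps already-recognized blocks to matching HTML elements. The block format (a list of `(kind, content)` pairs) and the function name `blocks_to_html` are assumptions made up for this example. They do not describe the output of any particular conversion product.

```python
from html import escape


def blocks_to_html(blocks):
    """Map recognized document blocks to HTML elements, keeping their order.

    `blocks` is a hypothetical list of (kind, content) pairs, where kind is
    "heading", "paragraph" or "table" (a table being a list of rows of cells).
    """
    parts = []
    for kind, content in blocks:
        if kind == "heading":
            parts.append(f"<h2>{escape(content)}</h2>")
        elif kind == "paragraph":
            parts.append(f"<p>{escape(content)}</p>")
        elif kind == "table":
            rows = "".join(
                "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
                for row in content
            )
            parts.append(f"<table>{rows}</table>")
    return "\n".join(parts)


page = [
    ("heading", "Price list"),
    ("paragraph", "Pens & pencils"),
    ("table", [["Item", "Cost"], ["Pen", "1"]]),
]
print(blocks_to_html(page))
```

Because headings, paragraphs and tables each keep their own element, the published page does not need to be restructured by hand.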