Convert PDF to searchable and editable formats

The PDF format was originally intended to display the same content and layout regardless of the operating system, device, or software application it is viewed on. Nowadays, however, it has become necessary to search through PDF documents, extract information from them, or convert complete documents into editable formats. This is not always easy, especially for PDFs created by scanning, known as “scanned” or “image-only” PDFs. Because such files contain only page images and no text, PDF conversion tools must incorporate OCR technology to enable search, extraction and repurposing of information.

What is OCR?

Optical Character Recognition (OCR), or text recognition, unlocks the information “trapped” in a scanned or photographed image of a document. OCR software such as ABBYY FineReader “reads” the content of a document (text and structure) by interpreting character images and assigning each of them an electronic equivalent, that is, a character code. This makes it possible to transfer the content and layout of the document into searchable and editable formats.
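
The idea of interpreting character images and assigning them an electronic equivalent can be illustrated with a deliberately tiny sketch. It assumes hypothetical 3×5 bitmaps (`TEMPLATES`) and a hypothetical `recognize` function that picks the template with the fewest differing pixels. Production OCR engines such as ABBYY FineReader use far more sophisticated methods, which are not described here.

```python
# Toy illustration only: nearest-template matching on 3x5 bitmaps.
# TEMPLATES and recognize() are hypothetical names, not part of any OCR product.

TEMPLATES = {
    "I": ("###",
          ".#.",
          ".#.",
          ".#.",
          "###"),
    "L": ("#..",
          "#..",
          "#..",
          "#..",
          "###"),
    "T": ("###",
          ".#.",
          ".#.",
          ".#.",
          ".#."),
}


def pixel_distance(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def recognize(glyph):
    """Return the character whose template differs from the glyph in the fewest pixels."""
    return min(TEMPLATES, key=lambda ch: pixel_distance(TEMPLATES[ch], glyph))


# A "scanned" letter T with one noisy pixel in the middle row.
noisy_t = ("###",
           ".#.",
           "##.",
           ".#.",
           ".#.")

# Each character image is mapped to a character code, producing editable text.
text = "".join(recognize(g) for g in (noisy_t, TEMPLATES["I"], TEMPLATES["L"]))
print(text)
```

The noisy glyph is still read as "T" because it differs from the T template in a single pixel, while every other template is further away.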


Some examples of common conversion tasks:

PDF to searchable PDF, PDF/A and PDF/UA

  • Creating PDF documents with a scanner results in image-only PDFs without a text layer.
  • Converting scanned PDF documents into PDFs containing selectable and searchable text enables easy management, copying and indexing of the content as well as full-text search. Organizations, especially in the legal, education or public sectors, can thus benefit from fast access to information via applications such as eDiscovery tools or document management systems (DMS).
  • Converting PDF documents to the PDF/A format (ISO 19005) enables long-term archiving in line with compliance requirements for archiving processes.
  • Converting PDF documents to the PDF/UA format (ISO 14289) makes them accessible to assistive technologies (such as screen readers). Already mandatory for state and public institutions in many countries, universal accessibility of published information is also becoming increasingly important for socially responsible businesses around the world.
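
What a text layer adds to a scanned page can be sketched as a small data model: each recognized word is stored together with its page and position, so a query can be answered with a location rather than a raw image. The `Word` record, the sample coordinates and the `find` function are assumptions made for illustration and do not reflect the internal structure of any PDF tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    page: int
    box: tuple  # (x0, y0, x1, y1) in page units; the values below are made up


# Hypothetical OCR output for a two-page scanned contract.
text_layer = [
    Word("Lease", 1, (72, 700, 110, 714)),
    Word("Agreement", 1, (114, 700, 190, 714)),
    Word("Termination", 2, (72, 520, 150, 534)),
    Word("of", 2, (154, 520, 166, 534)),
    Word("lease", 2, (170, 520, 204, 534)),
]


def find(layer, query):
    """Return (page, box) for every word matching the query, ignoring case."""
    needle = query.casefold()
    return [(w.page, w.box) for w in layer if w.text.casefold() == needle]


hits = find(text_layer, "lease")
for page, box in hits:
    print(f"page {page}: {box}")
```

An image-only PDF corresponds to an empty `text_layer`: the pages can be displayed, but every search returns nothing.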

PDF to Word

  • Converting PDF documents to Microsoft® Word® gives you access to information locked in a PDF file (e.g. contracts, reports or articles).
  • By converting a PDF into a Microsoft Word document, you can easily edit or change its content without wasting time re-typing and re-formatting.
  • The original layout, including images, tables and columns, is reproduced in the resulting Word document.

PDF to Excel

  • Converting a PDF document to Microsoft Excel® makes its tables available as spreadsheets for further editing or calculation.
  • The original row and column structure is preserved, allowing you to easily format and work with numerical data from price lists, analytical reports or newspaper articles, directly in Excel.
  • If you are interested only in the tables contained in a large multipage document, the text outside of the tables can be ignored in the conversion process.
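
The table-only export described above can be sketched as follows. The example assumes a hypothetical `blocks` list in which a converter has already tagged each piece of the document as text or table, and it writes CSV (which Excel can open) rather than a native Excel file, so it should be read as an illustration of the idea, not as the behavior of a specific product.

```python
import csv
import io

# Hypothetical result of analysing a converted document: an ordered list of
# blocks, each tagged as running text or as a table of rows and columns.
blocks = [
    {"kind": "text", "content": "Price list, valid from January"},
    {"kind": "table", "rows": [["Item", "Unit price"], ["Paper", "4.50"], ["Toner", "62.00"]]},
    {"kind": "text", "content": "All prices exclude tax."},
]


def tables_to_csv(blocks):
    """Write only the table blocks to CSV, keeping each row and column in place."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for block in blocks:
        if block["kind"] != "table":
            continue  # text outside tables is skipped
        writer.writerows(block["rows"])
    return buffer.getvalue()


csv_text = tables_to_csv(blocks)
print(csv_text)

# Because the columns survive the export, numeric work is straightforward afterwards.
rows = list(csv.reader(io.StringIO(csv_text)))
total = sum(float(price) for _, price in rows[1:])
print(f"total: {total:.2f}")
```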

PDF to RTF

  • Converting PDF to RTF allows you to edit the text from the original PDF on different operating systems such as Mac®, Linux® or Windows®.
  • Once converted to RTF, the document can be opened and further used in any text processing software, whether it is an older version of Microsoft Word, WordPerfect®, OpenOffice™ Writer or Apple® Pages.

PDF to EPUB

  • Converting PDFs to the EPUB format makes reading electronic books or articles on a mobile phone or tablet significantly more convenient, allowing you to change the font size and adjust the text flow to fit the screen.
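
The difference between a fixed PDF page and reflowable text can be shown with a short sketch. It uses Python's standard `textwrap` module as a stand-in for a reading application; the `reflow` helper and the two line widths are assumptions chosen for illustration.

```python
import textwrap

paragraph = (
    "Reflowable text has no fixed line breaks, so the reading application "
    "decides where each line ends based on the screen and the chosen font size."
)


def reflow(text, chars_per_line):
    """Break the text into lines that fit the given width, as a reader app might."""
    return textwrap.wrap(text, width=chars_per_line)


# Hypothetical widths: a phone with a large font and a tablet with a small font.
for label, width in (("phone", 28), ("tablet", 60)):
    lines = reflow(paragraph, width)
    print(f"--- {label}: {len(lines)} lines ---")
    print("\n".join(lines))
```

The same words are laid out differently for each width, which is what a fixed-layout PDF page cannot do without zooming and scrolling.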

PDF to HTML

  • Converting PDF documents to HTML in order to publish the information on the Web can save significant amounts of time and effort by preserving the document structure of the original.
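
How preserved structure carries over to the Web can be sketched as a mapping from structural blocks to HTML elements. The `document` list and the `to_html` function below are hypothetical and only illustrate the principle; a real converter recognizes many more block types and also handles styling.

```python
from html import escape

# Hypothetical structured output of a conversion: (block type, content) pairs.
document = [
    ("heading", "Terms & Conditions"),
    ("paragraph", "Prices are valid if quantity < 100."),
    ("list", ["Delivery", "Payment"]),
]


def to_html(blocks):
    """Map each structural block to the matching HTML element."""
    parts = []
    for kind, content in blocks:
        if kind == "heading":
            parts.append(f"<h1>{escape(content)}</h1>")
        elif kind == "paragraph":
            parts.append(f"<p>{escape(content)}</p>")
        elif kind == "list":
            items = "".join(f"<li>{escape(item)}</li>" for item in content)
            parts.append(f"<ul>{items}</ul>")
        else:
            raise ValueError(f"unknown block type: {kind}")
    return "\n".join(parts)


html_text = to_html(document)
print(html_text)
```

Special characters such as `&` and `<` are escaped, so text taken from the PDF cannot be mistaken for markup.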