Boehringer Ingelheim has accumulated an extensive archive of product licensing documents in its document management system. Many of the archived documents are up to 50 years old, and together they form a backlog of 30 million pages. Searching for specific information was costly, difficult, and time-consuming, and full-text search was not possible. To solve the problem, Boehringer Ingelheim looked for a solution that would make existing data accessible to full-text search and allow new documents to be saved in a searchable format. Boehringer Ingelheim has developed an interface that is user-friendly and offers flexible search options. The basis for this solution is text recognition technology (Optical Character Recognition, OCR). Only the precise conversion of scanned documents into digital files could ensure that the information was available for search at any time. A powerful OCR solution was therefore needed, one that would offer recognition precision, stability, easy oversight, and good integration options.
After a comprehensive evaluation of the OCR products available on the market, Boehringer Ingelheim chose ABBYY Recognition Server, a highly scalable server-based solution built on ABBYY’s award-winning technology. The out-of-the-box OCR solution converts paper documents and images into text files and searchable PDFs. These files can then be categorized, archived, searched, and saved in any document management system. Recognition Server was specifically designed to handle large volumes of documents. Recognition and conversion take place on dedicated machines, while supervision is mainly conducted via a server manager. Incoming documents are automatically distributed to available processing stations, and if one processing station fails, its assignments are automatically executed by another one. Running it as a service guarantees continuous availability and allows it to be used universally. Recognition Server has a flexible interface (API) that allows easy, smooth integration with the existing IT solution. The Management Console enables the person in charge at Boehringer Ingelheim to centrally define the parameters and administer recognition stations, thereby determining the workflow of the specific workgroups and projects.
In this project, documents from the existing document management system are automatically read and recognized, and the results are attached to the original document as indexable reproductions. Because Boehringer Ingelheim employees worldwide scan around 10,000 new pages a day, newly scanned documents have to be handled with priority. ABBYY’s Recognition Server is the foundation of this solution.
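The priority rule described above, where newly scanned documents are recognized ahead of the archived backlog, can be illustrated with a small priority queue. This is only a minimal sketch of the idea, not a description of how Recognition Server schedules work internally; the class name, priority constants, and document identifiers are all hypothetical.

```python
import heapq
import itertools

# Hypothetical priority levels: a lower number is served first.
PRIORITY_NEW_SCAN = 0
PRIORITY_BACKLOG = 1


class RecognitionQueue:
    """Toy job queue that serves new scans before archived backlog documents."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so that documents with the same
        # priority are served in the order they were submitted.
        self._counter = itertools.count()

    def submit(self, document_id, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), document_id))

    def next_document(self):
        """Return the next document id to recognize, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, document_id = heapq.heappop(self._heap)
        return document_id


queue = RecognitionQueue()
queue.submit("archive-0001", PRIORITY_BACKLOG)
queue.submit("archive-0002", PRIORITY_BACKLOG)
queue.submit("scan-today-01", PRIORITY_NEW_SCAN)

# The new scan is served first even though it was submitted last.
order = [queue.next_document() for _ in range(3)]
print(order)
```

With a scheme like this, the backlog is worked through whenever no new scans are waiting, so the daily intake never queues up behind the archive.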
The powerful interface made it easy to integrate into existing IT structures, and in everyday practice ABBYY Recognition Server also proved its value and convinced users. Planning, testing, and installing Recognition Server were completed on time and within budget, so the system integration was launched in September 2007.
For the Boehringer Ingelheim project, ABBYY Recognition Server was installed with four processing stations. The goal of the project is to recognize the archived documents within approximately one year. All new documents are also added to this system, so once the project is concluded Boehringer Ingelheim will have a complete archive of full-text searchable documents. Dr. Dirk Ehrhard, who is responsible for planning and implementing this project at Boehringer Ingelheim, is delighted with the results achieved so far.
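To give a sense of scale, the figures quoted in this case study (30 million backlog pages, around 10,000 new pages a day, four processing stations, roughly one year) can be turned into a rough throughput estimate. The sketch below is a back-of-the-envelope calculation only; it assumes a 365-day schedule, stations running around the clock, and an even split of work between stations, none of which is stated in the text.

```python
BACKLOG_PAGES = 30_000_000   # from the text
NEW_PAGES_PER_DAY = 10_000   # from the text
STATIONS = 4                 # from the text
TARGET_DAYS = 365            # assumption: "approximately one year", processing every day
HOURS_PER_DAY = 24           # assumption: stations run around the clock


def required_throughput(backlog, new_per_day, stations, days, hours_per_day):
    """Return (pages per day overall, per station per day, per station per minute)."""
    total_per_day = backlog / days + new_per_day
    per_station_per_day = total_per_day / stations
    per_station_per_minute = per_station_per_day / (hours_per_day * 60)
    return total_per_day, per_station_per_day, per_station_per_minute


total, per_station, per_minute = required_throughput(
    BACKLOG_PAGES, NEW_PAGES_PER_DAY, STATIONS, TARGET_DAYS, HOURS_PER_DAY
)
print(f"Overall:     {total:,.0f} pages/day")
print(f"Per station: {per_station:,.0f} pages/day")
print(f"Per station: {per_minute:.1f} pages/minute")
```

Under these assumptions the system would need to process roughly 92,000 pages a day, or about 23,000 pages per station per day (around 16 pages per minute per station). Shorter operating hours or downtime would raise the required rate accordingly.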