IMPACT Project to Improve Access to Historical Texts Through Technology Innovation and Collaboration
ABBYY, a leading provider of document recognition, document capture, and linguistic technologies and services, today announced the first results of its participation in the European Commission’s IMProving ACcess to Text (IMPACT) project. ABBYY provides Optical Character Recognition (OCR) software and expertise in recognizing historic fonts for this Europe-wide research project, which aims to transform Europe’s printed heritage into digitally available resources. Enriched by the experience gained in the four-year project, ABBYY now offers products equipped with the most advanced text recognition technology for printed historic documents.
The IMPACT consortium brings together twenty-six European national and regional libraries, research institutions and technology partners sharing their knowledge and best practice, and developing innovative tools to enhance the capabilities of OCR engines. By the end of 2011, the project will significantly improve access to historical texts produced before the year 1900. These joint research and development efforts mark an important step towards the digitization of European cultural heritage.
Since 2008, ABBYY has played a key role in the IMPACT project by providing state-of-the-art OCR technology and expertise for digitizing historic fonts and extracting text from old documents. ABBYY’s OCR technology is particularly suited to recognizing text on images of documents in various historic typefaces, making it possible for those resources to be fully digitized, searchable and integrated with systems used for lexicon building. ABBYY research and development teams have been working closely with members of the IMPACT project teams to address the main technological issues faced by large library digitization projects. Using historically relevant samples collected by leading European libraries, ABBYY has delivered new technology advancements in image pre-processing and analysis of document layouts for better character recognition results.
“Previously, OCR software lacked the advanced features needed to satisfactorily transform scanned pages into full text or XML, particularly when it came to old books, magazines and newspapers,” said Aly Conteh, Executive Board member of the IMPACT Project and Digitization Programme Manager at the British Library. “Being part of the IMPACT project we have gained access to cutting-edge OCR from ABBYY which enables IMPACT members to process documents of often poor quality much more efficiently. And we know we can rely on this collaboration, as ABBYY is continuously working closely with the libraries and the other partners to improve the core technology.”
“Digitization not only can help preserve European intellectual heritage for future generations. It can also unlock and make those treasures available for millions of researchers, students and regular readers around the globe,” explained Andrey Isaev, Director of the SDK Products Department at ABBYY. “ABBYY is excited to contribute to this worthwhile goal through OCR innovation and sharing of best practice. And we very much appreciate the close and productive collaboration with people who are just as enthusiastic about text recognition as ourselves.”
ABBYY’s participation in the IMPACT project builds on previous successful European digitization projects such as the METAe project, in which ABBYY developed FineReader XIX, an omnifont OCR engine designed specifically to recognize Fraktur or “Black Letter” historic typefaces such as those found in texts published from 1800 to 1938, and the Digitization-on-Demand project, which aims to digitize millions of books and make them available in electronic form. The Fraunhofer Institute for Media Communication (IMK) in Germany also successfully digitized the archive of the Neue Zürcher Zeitung newspaper using the ABBYY FineReader SDK with historic font OCR enabled. The latest enhancements to ABBYY recognition technologies are now available commercially for use by institutions and organizations around the world. ABBYY FineReader Engine SDK and ABBYY Recognition Server 3.0 deliver some of ABBYY’s recent improvements in the recognition of specialized texts and documents.
More information on the IMPACT project can be found here: http://www.impact-project.eu/. For more information on ABBYY OCR technology for recognizing historical texts visit http://www.frakturschrift.com.