Microsoft utilizes ABBYY FineReader Engine to achieve unmatched accuracy in data capture for Live Search Books.
Leading OCR Technology Selected for Massive Book Scanning Project
ABBYY, a leading provider of document recognition, data capture, and linguistic technologies, is pleased to announce that Microsoft Corp. has selected its FineReader® Engine software development kit (SDK) to accurately convert the text from books to searchable electronic data for the extremely high-volume Live Search Books project.
Microsoft is helping to bring together the world’s books into a collective, globally accessible and searchable archive. The scanning of book titles from a number of large library collections is underway, including those of the University of California, the University of Toronto, The British Library, Yale University, The New York Public Library and Cornell University.
ABBYY’s OCR technology has been selected for a series of high-profile book scanning projects, including the Gutenberg-DE Project, The Million Book Project at Bibliotheca Alexandrina, and The Million Book Project, an international undertaking led by Carnegie Mellon University. As part of its efforts to support library and book digitization projects, ABBYY developed a special version of its FineReader OCR software, called FineReader XIX, to recognize the elaborate "Fraktur" or "black letter" texts from the period between 1600 and 1938, and old European type fonts found in books printed in the 17th to 20th centuries. The special omnifont OCR system was first developed as part of the METAe project, a research and development project co-funded by the European Commission (5th Framework, IST Programme, Area "Digital Heritage and Cultural Content").
ABBYY is passionate about efforts to convert printed materials to searchable and accessible electronic formats. Through its close partnerships with large-scale book conversion projects, it has achieved nearly 100 percent accuracy in book scanning.
“It is exciting to announce this alliance with Microsoft and to be part of this large undertaking to bring the world’s books to electronic format for both preservation purposes and to make them more accessible and searchable via the Web,” said Dean Tang, CEO at ABBYY USA. “Through our extensive experience with book scanning and significant investment of R&D resources in this area, our latest version of FineReader has evolved to a level of unmatched accuracy, converting books of varying quality, typefaces and languages while retaining the formatting elements of even very sophisticated layouts, and keeping the context of the original books intact.”
Book scanning projects require the most accurate and reliable technology. Converting books and manuscripts from an array of sources, at unprecedented scale and diversity, presents unique challenges. ABBYY FineReader Engine was selected for this project due to its demonstrated accuracy, scalability, stability and overall throughput.
“In our electronic world, the ability to quickly find relevant information is expected and required, and access to full-text books is no exception,” said Jay Girotto, Group Program Manager at Microsoft. “As a key part of our strategy to create an extensive, searchable online library, we have carefully selected technology partners with reliable and proven solutions. ABBYY is a leading OCR vendor, and its solid capture and conversion capabilities hold up even for this extremely large-scale project.”
For more information about ABBYY and its products, visit www.ABBYY.com.