Document Analysis
Document Analysis is a set of functions for automatically detecting objects on a page: text blocks, pictures, tables, barcodes and separators.
Additionally, document analysis provides special features for preparing an image for OCR.
This preparation is important for specifying which fields on the page should be recognized and which should be kept in their original form.
A field can also be designated for recognition manually. In this case you have to set the field's coordinates and the type of data it contains. This approach is used in the Field-Level Recognition scenario, mostly for data capture.
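A manually designated field can be pictured as a rectangle plus a data type. The following is a minimal sketch of that idea, not the FineReader Engine API: the names `Field`, `FieldDataType` and `crop_field`, the pixel coordinate convention (right and bottom edges exclusive) and the page represented as a list of pixel rows are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class FieldDataType(Enum):
    """Hypothetical kinds of content a field may hold."""
    TEXT = "text"
    DIGITS = "digits"
    BARCODE = "barcode"


@dataclass(frozen=True)
class Field:
    """A manually designated zone in pixel coordinates (right/bottom exclusive)."""
    left: int
    top: int
    right: int
    bottom: int
    data_type: FieldDataType

    def __post_init__(self) -> None:
        # Reject rectangles that cannot describe a real zone on a page.
        if self.left < 0 or self.top < 0:
            raise ValueError("coordinates must be non-negative")
        if self.right <= self.left or self.bottom <= self.top:
            raise ValueError("field must have positive width and height")


def crop_field(page: List[List[int]], field: Field) -> List[List[int]]:
    """Return the sub-image covered by the field; only this zone would be passed to OCR."""
    height = len(page)
    width = len(page[0]) if page else 0
    if field.right > width or field.bottom > height:
        raise ValueError("field lies outside the page")
    return [row[field.left:field.right] for row in page[field.top:field.bottom]]
```

In this sketch the data type travels with the rectangle, so a recognizer could pick a suitable alphabet or decoder for the zone without analyzing the rest of the page.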
ABBYY FineReader Engine 10 provides three automatic types and one manual type of document analysis:
This is the default document analysis type, which searches for all objects: text blocks, pictures, tables, barcodes and separators. The results of this analysis are used to retrieve document structure and layout in the content reuse scenario. All pictures and diagrams are preserved in their original form, and the text on them is not recognized.
This is a preprocessing engine for converting semi-structured documents, such as invoices, payment drafts, bills, waybills, business cards, agreements and health claim forms. It has been designed to accurately locate all the text on these documents, including characters and numbers, even if this information is located within stamps, pictures, logos or small-text areas.
Unlike the standard full-page document analysis, this type assumes that all printed information on a document is text. It also ensures that important text is not identified as a graphic element and that words and numerical values are not split into separate characters. As a result, the maximum amount of information about the text, including its coordinates, is available for analysis, field-by-field processing and parsing by other systems at subsequent processing stages.
This type automatically detects and recognizes all text on documents, including text embedded in pictures, charts and diagrams. Developers may choose this mode of document analysis to extract the exhaustive full-text information needed for building a document index (as in DMS, CMS and archiving systems).
This case does not need any analysis, because the recognition field is defined directly by the user or the application. The recognizer receives the coordinates of the field and the type of text, and performs OCR in the specified zone.
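One way to read the descriptions above is that the analysis types differ in which detected regions are handed on to text recognition. The toy model below illustrates only the contrast between the default type (pictures preserved, their text not recognized) and the full-text type (text inside pictures is recognized too). `AnalysisMode`, `Block` and `text_to_recognize` are hypothetical names, not part of the FineReader Engine API, and the model ignores tables, barcodes and separators.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class AnalysisMode(Enum):
    """Hypothetical labels for two of the analysis types described above."""
    DEFAULT = auto()    # pictures and diagrams are kept as images
    FULL_TEXT = auto()  # text embedded in pictures is recognized as well


@dataclass
class Block:
    """A detected region: its kind and the text printed inside it (may be empty)."""
    kind: str  # "text" or "picture" in this toy model
    printed_text: str


def text_to_recognize(blocks: List[Block], mode: AnalysisMode) -> List[str]:
    """Collect the text that would be recognized under the given mode."""
    collected = []
    for block in blocks:
        # In the default mode, picture regions are preserved untouched.
        if block.kind == "picture" and mode is AnalysisMode.DEFAULT:
            continue
        if block.printed_text:
            collected.append(block.printed_text)
    return collected


page_blocks = [
    Block("text", "Invoice 2041"),
    Block("picture", "ACME Ltd."),  # a logo with printed text
    Block("picture", ""),           # a photo with no text
]
```

Calling `text_to_recognize(page_blocks, AnalysisMode.FULL_TEXT)` also returns the logo text, which is the behaviour an index-building system would want, while the default mode leaves the logo as an image.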