The exact mechanisms that allow humans to recognize objects are not yet fully understood, but scientists have already established three basic principles: integrity, purposefulness, and adaptability (IPA). These principles form the core of ABBYY FineReader OCR and allow it to replicate natural, human-like recognition.
Let’s take a look at how FineReader OCR recognizes text. First, the program analyzes the structure of the document image. It divides the page into elements such as blocks of text, tables, and images. Lines of text are divided into words, and words into characters. Once the characters have been singled out, the program compares them with a set of pattern images and advances numerous hypotheses about what each character is. Based on these hypotheses, the program analyzes different ways of breaking lines into words and words into characters. After processing a huge number of such probabilistic hypotheses, the program makes its final decision and presents the recognized text.
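The idea of weighing competing segmentation hypotheses can be illustrated with a small sketch. This is not ABBYY's actual algorithm, which is not described here. It is a generic dynamic-programming search, and the function name `best_segmentation`, the cut positions, and the probabilities are all assumptions made for illustration. Each candidate segment of a word image carries a list of character guesses with probabilities, and the search picks the way of cutting the word whose combined probability is highest.

```python
import math


def best_segmentation(n_cuts, candidates):
    """Pick the most probable reading of a word image.

    n_cuts     -- index of the last cut position (cut 0 is the left edge).
    candidates -- hypothetical classifier output: maps a segment (start, end)
                  to a list of (character, probability) guesses.
    Returns (log_probability, text), or None if no full reading exists.
    """
    # best[i] holds the best (log-probability, text) covering cuts 0..i.
    best = {0: (0.0, "")}
    for end in range(1, n_cuts + 1):
        for start in range(end):
            if start not in best or (start, end) not in candidates:
                continue
            # Keep only the strongest character guess for this segment.
            char, prob = max(candidates[(start, end)], key=lambda c: c[1])
            score = best[start][0] + math.log(prob)
            if end not in best or score > best[end][0]:
                best[end] = (score, best[start][1] + char)
    return best.get(n_cuts)


# Illustrative data only: two narrow shapes that might be "r" + "n",
# or a single wide "m", followed by an "e".
candidates = {
    (0, 1): [("r", 0.6), ("n", 0.3)],
    (1, 2): [("n", 0.7)],
    (0, 2): [("m", 0.5)],
    (2, 3): [("e", 0.9)],
}

score, text = best_segmentation(3, candidates)
print(text, round(math.exp(score), 2))
```

With these made-up numbers, the reading "m" + "e" (0.5 × 0.9 = 0.45) beats "r" + "n" + "e" (0.6 × 0.7 × 0.9 ≈ 0.38), so the wider segment wins even though "r" is the strongest single guess.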
In addition, ABBYY FineReader provides dictionary support for 48 languages. This enables a secondary analysis of text elements at the word level. With dictionary support, the program analyzes and recognizes documents even more accurately, and subsequent verification of the recognition results becomes simpler.
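A word-level dictionary check of this kind might look like the following sketch. It is a simplified assumption, not FineReader's implementation: the function `rerank_with_dictionary`, the toy dictionary, the scores, and the `bonus` factor are all hypothetical. Word hypotheses found in the dictionary get their score boosted before the best one is chosen.

```python
def rerank_with_dictionary(hypotheses, dictionary, bonus=2.0):
    """Choose the best word hypothesis, favoring dictionary words.

    hypotheses -- list of (word, score) pairs from character-level recognition.
    dictionary -- set of known lowercase words for the document language.
    bonus      -- assumed multiplier applied to the score of dictionary words.
    """
    rescored = [
        (word, score * bonus if word.lower() in dictionary else score)
        for word, score in hypotheses
    ]
    return max(rescored, key=lambda h: h[1])[0]


# Illustrative data only: "rn" and "m" look alike, so the character-level
# score slightly prefers the non-word.
DICTIONARY = {"modern", "document"}
word_hypotheses = [("rnodern", 0.45), ("modern", 0.40)]

print(rerank_with_dictionary(word_hypotheses, DICTIONARY))
print(rerank_with_dictionary(word_hypotheses, set()))
```

With the dictionary, the sketch selects "modern"; with an empty dictionary, it falls back to the raw character-level winner "rnodern". A multiplicative bonus is only one possible design; a real system could equally weight dictionary evidence in other ways.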