ABBYY FlexiCapture technology enables automated extraction of relevant data from paper-based documents and forms. It intelligently identifies incoming documents by their type (such as invoices, contracts, or purchase orders) and analyzes them according to predefined rules.
FlexiCapture is an important part of ABBYY's data and document capture technologies. In a software product, these technologies usually work together to perform various document processing tasks efficiently. ABBYY's OCR technology recognizes machine-printed text on documents and forms, while Intelligent Character Recognition (ICR) technology reads isolated hand-printed characters. They are complemented by technologies for recognizing checkmarks (OMR) and barcodes.
The key role of FlexiCapture technology in the data capture process is to analyze documents with flexible layouts (or semi-structured documents) such as invoices. FlexiCapture uses logic and hypotheses to accurately locate and recognize relevant data despite its variable location on diverse documents in a stream.
Fixed forms (or structured documents) are created from a single master copy, with the geometrical position of all form elements standardized in relation to each other and to the page edges. Fixed machine-readable forms have reference blocks to aid form field identification. Using these, the software can compensate for any skew and scanning defects it encounters and detect the form type (also called template matching). Once a template has been matched, recognition can begin. Each type of structured document requires a separate template, which for simple forms takes just a few minutes to create.
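To make the role of reference blocks concrete, here is a minimal sketch in Python of how two reference marks could be used to compensate for skew on a fixed form. It is an illustration only, not ABBYY's implementation: the function names and the coordinates are hypothetical, and it assumes the scan differs from the template only by rotation, uniform scale, and shift.

```python
import cmath
import math

def similarity_from_marks(template_marks, scanned_marks):
    """Return (a, b) such that scanned = a * template + b.

    Points are treated as complex numbers x + yj. Two reference marks are
    enough to fix rotation, uniform scale, and translation.
    """
    p0, p1 = (complex(*m) for m in template_marks)
    q0, q1 = (complex(*m) for m in scanned_marks)
    a = (q1 - q0) / (p1 - p0)  # rotation and scale
    b = q0 - a * p0            # translation
    return a, b

def map_point(transform, point):
    """Map a template coordinate to the corresponding scan coordinate."""
    a, b = transform
    z = a * complex(*point) + b
    return (z.real, z.imag)

def skew_degrees(transform):
    """Rotation angle of the scan relative to the template, in degrees."""
    return math.degrees(cmath.phase(transform[0]))

# Hypothetical example: the page was scanned rotated by a quarter turn.
transform = similarity_from_marks(
    template_marks=[(0, 0), (100, 0)],
    scanned_marks=[(10, 20), (10, 120)],
)
field_corner_on_scan = map_point(transform, (50, 0))
```

Once the transform is known, every field rectangle stored in the template can be mapped onto the scanned image before recognition starts. Real scans also suffer from stretching and local distortion, which this two-point model does not cover.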
Unlike fixed forms, semi-structured and unstructured documents do not have layouts or object positions that are standardized, either in relation to each other or to the page edges. This greatly complicates processing and requires more sophisticated document analysis. This is where ABBYY FlexiCapture technology takes the leading role. It gives data capture solutions the capability to classify and recognize documents with flexible layouts at the same accuracy rates as structured forms.
FlexiCapture technology allows customers to create, for each document type, a logical definition of the data layout called a FlexiLayout™, which is then interpreted by the data capture system. The FlexiLayout enables the system to find the necessary fields on the document by using a set of unique algorithms. Once located, the data in the fields can be captured using the OCR/ICR/OMR and barcode-reading functionality. The technology can find data fields anywhere on the document, using any information available: relation to other objects on the page, contents of the field, its size, lines drawn around it, etc.
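The idea of locating a field by its relation to other objects and by its contents can be sketched in a few lines of Python. This is not the FlexiLayout language or its interpreter; the Word class, the find_field_right_of function, and the sample coordinates are hypothetical and serve only to illustrate a rule of the form "the value is the nearest matching text to the right of the label, on the same line".

```python
import re
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    x: float  # left edge
    y: float  # top edge (y grows downward)

def find_field_right_of(words, anchor_text, value_pattern, max_dy=5.0):
    """Return the nearest word to the right of the anchor, on the same
    line, whose text fully matches value_pattern; None if nothing fits."""
    anchors = [w for w in words if w.text.lower() == anchor_text.lower()]
    best = None  # (horizontal distance, word)
    for anchor in anchors:
        for w in words:
            same_line = abs(w.y - anchor.y) <= max_dy
            to_the_right = w.x > anchor.x
            if same_line and to_the_right and re.fullmatch(value_pattern, w.text):
                distance = w.x - anchor.x
                if best is None or distance < best[0]:
                    best = (distance, w)
    return best[1] if best else None

# Hypothetical OCR output for a small invoice fragment.
words = [
    Word("Invoice", 10, 10), Word("No:", 60, 10), Word("INV-0042", 100, 11),
    Word("Total", 10, 200), Word("1250.00", 100, 201),
]
total = find_field_right_of(words, "Total", r"\d+\.\d{2}")
```

Here the rule combines a geometric relation (same line, to the right) with a constraint on the field contents (a number with two decimals), so it keeps working when the "Total" line moves up or down between documents.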
FlexiCapture technology is flexible and powerful enough to be applied to a wide variety of data capture tasks. The end solution that is delivered to the customer is the FlexiLayout and a software product (see below for products supporting FlexiCapture technology) that has the FlexiLayout interpreter incorporated in it.
Because it is based on Artificial Intelligence principles that imitate the way humans recognize objects (see the IPA Principles below), FlexiCapture technology never relies on any fixed presumptions: you may specify any object or its properties as tentative.
The principles of Integrity, Purposefulness and Adaptability take recognition a step further by using artificial intelligence to train the computer to analyze documents in the same way that the human brain would analyze them:
Built-in Adaptability allows FlexiCapture to generate hypotheses about specific objects more precisely, based on the information collected from other parts of the image. As the technology improves, its adaptability will extend to automatic adjustments that improve the FlexiLayout based on the analysis of the real documents being processed. In other words, the system learns and trains itself over time.
Using the principles of IPA, FlexiCapture generates a set of hypotheses based on rules provided by a human operator, and then picks the best hypothesis for the whole set of objects on the page. This last point is very important as the technology makes decisions not by analyzing each object separately, but instead, by taking into account the relationships between all the objects and the characteristics of each object. Only then does it determine the best match for the whole set of objects.
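Choosing the best hypothesis for the whole set of objects, rather than for each object on its own, can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not the algorithm used by FlexiCapture: each candidate carries a hypothetical quality score between 0 and 1, the relationship between objects is scored by a user-supplied function, and the scores are simply multiplied.

```python
from itertools import product

def best_joint_hypothesis(candidates, relation_quality):
    """Pick one candidate per element so that the combined score is highest.

    candidates: dict mapping element name -> list of (position, quality).
    relation_quality: function taking a dict name -> position and
    returning a score between 0 and 1 for how well the positions fit.
    """
    names = list(candidates)
    best_score, best_assignment = 0.0, None
    for combo in product(*(candidates[n] for n in names)):
        assignment = {n: pos for n, (pos, _) in zip(names, combo)}
        score = relation_quality(assignment)
        for _, quality in combo:
            score *= quality
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_assignment, best_score

def label_left_of_value(assignment):
    """Hypothetical rule: the value sits on the label's line, to its right."""
    (lx, ly), (vx, vy) = assignment["label"], assignment["value"]
    return 1.0 if abs(ly - vy) <= 5 and vx > lx else 0.1

candidates = {
    "label": [((10, 50), 0.9), ((10, 300), 0.7)],
    "value": [((120, 300), 0.8), ((400, 52), 0.5)],
}
assignment, score = best_joint_hypothesis(candidates, label_left_of_value)
```

In this example the highest-quality label candidate, at (10, 50), is not part of the winning set, because the best value candidate lies on a different line. The pair at (10, 300) and (120, 300) scores higher as a whole. The exhaustive search shown here grows quickly with the number of elements and candidates, so it is suitable only as an illustration.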
In practice, capturing data from documents can be more complicated than it may seem at first glance. For instance, if the anchor object (e.g. the word “Address” before or below the field with a person’s address) is a text string, then it is quite possible that the text is not perfectly readable and that the OCR may capture only a part of the text, or the whole text with some mistakes. Also, the same word, or even several words, found in the anchor text may appear elsewhere on the document.
It can also be that on some forms the field is on a single line, while on others it spans several lines, and it is not obvious how to distinguish lines belonging to the field from the other text on the form. Sometimes there is no anchor text for a field, and one must rely on borders or lines drawn nearby. In this case, there is a high probability that the line is broken or uneven due to an imperfect scan or careless handling of the paper. The great advantage of FlexiCapture technology is that its document analysis works despite such variations.
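A small Python sketch shows why an exact text comparison is not enough for anchor words and how a tolerant comparison can help. It uses the standard library's difflib.SequenceMatcher as a stand-in similarity measure; the anchor_candidates function, the 0.7 threshold, and the sample words are assumptions made for the illustration, not part of FlexiCapture.

```python
from difflib import SequenceMatcher

def anchor_candidates(ocr_words, anchor="Address", threshold=0.7):
    """Return (index, word, similarity) for every OCR word that resembles
    the anchor closely enough, sorted with the best match first."""
    scored = []
    for i, word in enumerate(ocr_words):
        ratio = SequenceMatcher(None, word.lower(), anchor.lower()).ratio()
        if ratio >= threshold:
            scored.append((i, word, ratio))
    return sorted(scored, key=lambda item: item[2], reverse=True)

# Hypothetical OCR output: one misread anchor and one exact repeat.
ocr_words = ["Name", "Addrcss", "Main", "Street", "Address"]
matches = anchor_candidates(ocr_words)
```

Both "Addrcss" (a misread "Address") and the exact word further down the page pass the threshold, so the search yields two candidates rather than one. Text similarity alone cannot say which of them is the real anchor; that decision needs the position of each candidate relative to the other objects on the page.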