The flexible and scalable architecture of ABBYY FineReader Engine lets it use multi-core CPUs and process images in parallel on multiple threads, which can significantly increase processing speed.
By default, ABBYY FineReader Engine decides automatically whether to use multiprocessing. The decision depends on several factors, such as the number of physical or logical CPU cores available on the system, the number of CPU cores allowed by the license, and the number of pages in the document. If needed, the developer can change the multiprocessing settings and tune the number of processes to run.
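The factors listed above can be combined into a simple upper bound on the number of worker processes. The following is a minimal sketch of that idea, not the Engine's actual decision logic: the function names and the `min` rule are assumptions made for illustration only.

```python
import os


def choose_process_count(page_count, licensed_cores, available_cores=None):
    """Hypothetical heuristic (not the Engine's real logic).

    Never start more workers than the machine has cores, than the
    license allows, or than there are pages to recognize.
    """
    if available_cores is None:
        # os.cpu_count() may return None on unusual platforms.
        available_cores = os.cpu_count() or 1
    limit = min(available_cores, licensed_cores, page_count)
    # Always keep at least one process so processing can proceed.
    return max(1, limit)


def use_multiprocessing(page_count, licensed_cores, available_cores=None):
    """Parallel processing only pays off with more than one worker."""
    return choose_process_count(page_count, licensed_cores, available_cores) > 1
```

Under this assumed rule, a single-page document is always processed sequentially, regardless of how many cores the machine or the license provides.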
Processing multi-page documents generally means processing books, long reports, and similar documents. In this case, you can recognize the pages of the document in parallel and then perform synthesis and export in the main process. With a pool of Engines, you can also process several multi-page documents simultaneously, but memory consumption can become very high and may even lead to "out of memory" errors.
For parallel processing of multi-page documents, we recommend using FRDocument. It is the easiest way to code multiprocessing, because you do not have to implement any additional interfaces.
In this case, opening, pre-processing, analysis, and recognition are performed in parallel; synthesis is performed sequentially in the main process, and then export is again performed in parallel.
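The split between per-page stages that run in parallel and a document-level stage that runs sequentially can be illustrated with a generic sketch. It does not use the FineReader Engine API: `recognize_page`, `synthesize`, and `process_document` are hypothetical placeholders, threads stand in for the separate processes the Engine uses, and the export stage is left out.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_page(page_image):
    # Placeholder for opening, pre-processing, analysis and recognition
    # of a single page. Pages are independent, so this can run in parallel.
    return page_image.upper()


def synthesize(recognized_pages):
    # Placeholder for document-level synthesis. It needs every page at
    # once, so it runs sequentially in the caller.
    return "\n".join(recognized_pages)


def process_document(page_images, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so pages stay in
        # document order even if they finish out of order.
        recognized = list(pool.map(recognize_page, page_images))
    return synthesize(recognized)
```

The key design point is that only the per-page work is handed to the pool; anything that depends on the whole document waits until all pages are back.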
Processing many one-page documents is the typical case for invoices, contracts, letters, and similar documents. Parallel processing is recommended here because one-page documents do not depend on each other and do not require large amounts of memory at once.
FineReader Engine provides two options that can be used in this scenario:
The advantage of this method is that it can be used when you do not know the number of documents in advance, when the documents can be of different types, and when they must be processed as soon as they arrive. This method requires more implementation effort: you have to implement interfaces for a file adapter and a custom source of images. Opening, pre-processing, analysis, and recognition are performed in parallel.
This method is the fastest and automatically avoids the difficulties related to multi-threading: all operations with ABBYY FineReader Engine objects are serialized by means of COM.
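For the first option, the one built around a custom source of images, the core idea is a source that workers poll for the next file while documents keep arriving. The sketch below shows that pattern with a thread-safe queue. It is an illustration under stated assumptions: `QueueImageSource`, `next_image`, and `run_workers` are hypothetical names, not the interfaces FineReader Engine asks you to implement.

```python
import queue
import threading


class QueueImageSource:
    """Hypothetical image source fed by documents as they arrive."""

    _DONE = object()  # sentinel that tells one worker to stop

    def __init__(self):
        self._queue = queue.Queue()

    def add(self, path):
        # Called by the producer whenever a new document arrives.
        self._queue.put(path)

    def close(self, worker_count):
        # One sentinel per worker, so every worker terminates.
        for _ in range(worker_count):
            self._queue.put(self._DONE)

    def next_image(self):
        # Blocks until a document or a sentinel is available.
        item = self._queue.get()
        return None if item is self._DONE else item


def run_workers(source, recognize, worker_count=2):
    """Start workers that pull files from the source until it is closed."""
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            path = source.next_image()
            if path is None:
                break
            text = recognize(path)
            with lock:
                results.append((path, text))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    return threads, results
```

A caller starts the workers first, adds files as they arrive, then calls `close` with the same worker count and joins the threads before reading `results`. The total number of documents never has to be known up front.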