Our open-source, cross-platform machine learning library, NeoML, appeared on GitHub about a year ago. Since then, the ABBYY team has been hard at work updating the framework: we moved to Azure DevOps for regular builds, added support for new platforms, expanded the list of available algorithms, improved performance, and integrated a Python interface into the library, the feature you have all been waiting for so patiently.
Why did ABBYY make its development open source?
Open-source code is the main source of innovation in today’s software development. For that reason, we are striving to make the NeoML library accessible to an even wider user base. First, this lets us get help from machine learning specialists all over the world who use, test, and refine our technology. Second, we can explore opportunities to use ABBYY algorithms in new business scenarios. Finally, applying the company’s technologies to a variety of problems will eventually enable us to create an ecosystem built around our software.
What has changed since the first version was released?
- NeoML 2.0 has become faster: classical algorithms now run ten times faster on a range of problems, while neural network training is 30% faster. This optimization is particularly useful to specialists and companies that train ML models on cloud services, and it will also simplify the development of mobile applications for customers.
- The library supports about 20 new machine learning methods, including 10 new network layers and new optimization methods. This means that members of the business and science communities can now add object identification, classification, regression, clustering, semantic segmentation, and verification capabilities to their applications, using the very latest platforms and architectures to perform these tasks.
- The new version supports automatic calculation of gradients, an important capability for quickly building and training neural networks of various architectures.
- The framework also works in new environments, including Apple M1 processors, GPUs on Linux, and integrated Intel graphics, which considerably increases the opportunities to develop applications for customers.
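To make the idea of automatic gradient calculation mentioned above more concrete, here is a minimal sketch of reverse-mode automatic differentiation in plain Python. It does not use NeoML and is not how NeoML implements the feature; the `Value` class and its methods are hypothetical names chosen purely for illustration. Each arithmetic operation records its inputs and local derivatives, and `backward()` then applies the chain rule from the output back to the inputs.

```python
class Value:
    """Hypothetical scalar type that remembers how it was computed."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = float(data)
        self.grad = 0.0                  # d(output)/d(this value), filled in by backward()
        self._parents = parents          # values this one was computed from
        self._local_grads = local_grads  # d(this)/d(parent) for each parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a + b)/da = 1, d(a + b)/db = 1
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # d(a * b)/da = b, d(a * b)/db = a
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        """Propagate gradients from this value back to everything it depends on."""
        order, seen = [], set()

        def visit(node):
            # Depth-first traversal producing a topological order of the graph.
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node._parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output towards the inputs, accumulating the chain rule.
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


# A tiny "network": y = w * x + b, loss = y * y
w, x, b = Value(3.0), Value(2.0), Value(1.0)
y = w * x + b
loss = y * y
loss.backward()

print(loss.data)  # 49.0
print(w.grad)     # 28.0  (2 * y * x)
print(b.grad)     # 14.0  (2 * y)
```

With gradients produced automatically like this, a new layer or loss only needs its forward computation to be written down; the backward pass follows from the recorded operations.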
Now, the library is also accessible to users of Python, one of the most popular programming languages for data analysis and machine learning.
The new Python interface has allowed us not only to widen the library’s scope of application, but also to refine how NeoML works. It covers many scenarios that previously required external ML frameworks, which simplifies and streamlines the work of ABBYY’s engineers. The very process of creating the new interface helped us look at the existing platform more critically and correct some flaws. In addition, it is easier to get started and begin experimenting with the library in Python.
Furthermore, the Python version of NeoML is fully compatible with the C++ version. Any model using the library’s standard elements can be loaded and used in the C++ version, and vice versa.
How and where can NeoML be used?
ABBYY uses NeoML tools in all of its products that involve natural language processing and computer vision tasks, including the Intelligent Document Processing platforms ABBYY FlexiCapture and ABBYY Vantage, mobile applications, and other innovative solutions for the global market.
Furthermore, NeoML is already being used by independent developers and researchers from the USA, Canada, Germany, the Netherlands, Brazil, China, India, Vietnam, South Korea, and other countries.
The source code is available in the official project repository on GitHub. NeoML can be used on Windows, Linux, macOS, iOS, and Android. The library supports both CPUs and GPUs. The framework’s source code is released under the Apache License 2.0.
In the near future, our engineers at ABBYY intend to enhance the Python interface, tackle distributed machine learning and JIT compilation, add new methods, and optimize existing ones. Stay tuned!