Understanding of the power of AI is growing with its adoption and with ongoing research. Convolutional Neural Networks excel at object recognition, object detection and image segmentation; LSTMs and transformers lead the way in sequence analysis, including translation and search engine tasks. Loosely based on the micro-level architecture of the brain, these networks can – like the brain – be trained for specific, usually very bespoke, applications. The process of using the trained network in an application is called inference.
Taking the relatively simple case of object detection, training – carried out on chips such as the Nervana Neural Network Processor – requires a dataset of many thousands, or millions, of labelled images; the more the better. Training iterates a cycle of inference, comparison of the network's output with the expected label, and backpropagation of the error using calculus. There are many pretrained networks, such as ResNet and YOLO, available from online sources (some of which have suitable licences) that can be fine-tuned for similar applications through a process of transfer learning.
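The inference–comparison–backpropagation cycle described above can be sketched in miniature for a single sigmoid neuron, in plain Python with no framework. The toy dataset, learning rate and epoch count below are purely illustrative:

```python
import math
import random

def train_neuron(data, epochs=500, lr=0.5):
    """Train one sigmoid neuron by the loop described above:
    forward pass (inference), compare with the label, backpropagate."""
    random.seed(0)
    w, b = random.random(), 0.0
    for _ in range(epochs):
        for x, label in data:
            y = 1.0 / (1.0 + math.exp(-(w * x + b)))  # inference
            error = y - label                          # compare with label
            # backpropagation: gradient of cross-entropy loss w.r.t. w and b
            w -= lr * error * x
            b -= lr * error
    return w, b

# Toy labelled "dataset": inputs above 0.5 belong to class 1
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
w, b = train_neuron(data)
predict = lambda x: 1.0 / (1.0 + math.exp(-(w * x + b)))
```

Real training repeats exactly this loop, just over millions of weights and images at once, which is what makes dedicated training silicon attractive.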
Moving beyond the training phase, each application brings its own inference and associated software requirements. Explicit constraints on power consumption, operating environment, security and performance will influence the choice of processor. Additionally, image resolution and network-structure parameters affect the latency, throughput and accuracy of the system. The decisions and compromises made at the requirements stage are shaped by both hardware and software capabilities.
The Intel OpenVINO (Open Visual Inferencing and Neural Network Optimization) software platform is fast becoming a noted tool for efficient inference across Intel devices. The built-in Inference Engine allows easy integration with different hardware targets, of which FPGAs are a prime example. Until recently the platform has centred on vision applications, supporting a wide range of CNN architectures for classification, segmentation and object detection; however, the product is expanding to cover translation and NLP.
OpenVINO consists of a Python-based Model Optimiser, which acts as a built-in converter and platform-agnostic optimisation tool, and the Inference Engine. The Model Optimiser can take a network trained with any major framework (TensorFlow, PyTorch, MXNet, Caffe, etc.) and, by stripping out training-only operations and merging superfluous layers, effectively improve inference times.
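One example of merging superfluous layers can be shown in miniature: a batch-normalisation layer that follows a fully connected layer can be folded into that layer's weights and bias, so the fused network performs the same arithmetic in a single step. The sketch below is just the underlying algebra, not the Model Optimiser's actual implementation:

```python
import numpy as np

def fold_batchnorm(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold a batch-norm layer (gamma, beta, running mean/var) into the
    weights W and bias b of the preceding fully connected layer."""
    scale = gamma / np.sqrt(var + eps)   # per-output-channel scale factor
    W_fused = W * scale[:, None]         # scale each output row of W
    b_fused = (b - mean) * scale + beta
    return W_fused, b_fused

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
b = rng.standard_normal(4)
gamma, beta = rng.standard_normal(4), rng.standard_normal(4)
mean, var = rng.standard_normal(4), rng.random(4) + 0.5

x = rng.standard_normal(3)
# Two layers at inference time...
unfused = gamma * ((W @ x + b) - mean) / np.sqrt(var + 1e-5) + beta
# ...become one after folding
W_f, b_f = fold_batchnorm(W, b, gamma, beta, mean, var)
fused = W_f @ x + b_f
```

Because batch-norm statistics are frozen after training, the fold is exact: one matrix multiply replaces two layers, with no loss of accuracy.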
Integrated plugins provide an easy route to inference on FPGA accelerator cards, and a separate flow covers the embedded realm. The advantage of the embedded route is a low-power, bespoke solution for many applications. Working together, the Programmable Solutions Group (including the newly acquired Omnitek) can provide IP ranging from image preprocessing, to security functions such as weights encryption, to quantised and binary options for exceptionally fast inference.
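The speed of quantised inference comes from replacing wide floating-point multiplies with narrow integer ones that FPGA fabric handles cheaply. A hand-rolled sketch of symmetric int8 quantisation illustrates the idea (the values and tolerance below are illustrative; real IP applies this per channel, in hardware):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantisation: map floats onto int8 via one scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.array([0.31, -1.20, 0.05, 0.88])   # example weights
x = np.array([1.00, 0.50, -2.00, 0.25])   # example activations

qw, sw = quantize_int8(w)
qx, sx = quantize_int8(x)

# Pure integer dot product, rescaled once at the end
int_dot = int(qw.astype(np.int32) @ qx.astype(np.int32))
approx = int_dot * sw * sx
exact = float(w @ x)
```

Binary networks push the same trade-off further, reducing each multiply to an XNOR at a larger cost in accuracy.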