14–16 Mar 2023
European Space Research and Technology Centre (ESTEC)

Intel’s Solution for Fast and Efficient Inference

15 Mar 2023, 11:10
30m
Erasmus High Bay
European Space Research and Technology Centre (ESTEC)
Keplerlaan 1, 2201 AZ Noordwijk ZH, The Netherlands
Poster Session

Speaker

Ruth Abra

Description

The power of AI is increasing with ongoing research and investment. Convolutional Neural Networks excel at object recognition, object detection and image segmentation; transformers lead the way in sequence analysis, including translation, chatbot and search-engine tasks. Loosely based on the micro-level architecture of the brain, these networks can, like the brain, be trained for specific and usually very bespoke applications. Using the trained network in an application is called inference.

Taking the relatively simple case of object detection, training requires a dataset of many thousands or millions of labelled images: the more the better. Training iterates a cycle of inference, comparison of the network's output with the expected label, and backpropagation of the error using calculus to update the weights. Many pretrained networks, such as the ResNet family and YOLO, are available from online sources and can be fine-tuned for similar applications through a process called transfer learning, as sketched below.
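A minimal transfer-learning sketch in PyTorch (the framework choice, class count and batch are illustrative assumptions, not part of the abstract): the final layer of a pretrained ResNet is replaced and retrained, and the single training step shows the inference, comparison and backpropagation cycle described above.

    import torch
    import torch.nn as nn
    from torchvision import models

    # Load a ResNet-18 pretrained on ImageNet and freeze its feature extractor.
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for param in model.parameters():
        param.requires_grad = False

    # Replace the final fully connected layer for the new task
    # (num_classes is hypothetical and depends on the application).
    num_classes = 4
    model.fc = nn.Linear(model.fc.in_features, num_classes)

    # Only the new layer's parameters are updated during fine-tuning.
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # One training step: inference, comparison with the labels, backpropagation.
    images = torch.randn(8, 3, 224, 224)          # stand-in image batch
    labels = torch.randint(0, num_classes, (8,))  # stand-in labels
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()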

Moving beyond the training phase, each application has its own inference and associated software requirements. Explicit constraints on power consumption, operating environment, security and performance will influence the choice of processor. In addition, image resolution and network-structure parameters affect the latency, throughput and accuracy of the system. The decisions and compromises made at the requirements stage are shaped by both hardware and software capabilities.

The Intel OpenVINO (Open Visual Inference and Neural Network Optimization) software platform is fast becoming a noted tool for efficient inference across Intel devices. The built-in Inference Engine allows easy integration with different hardware targets, of which FPGAs are a prime example. Until recently the platform has centred on vision applications, supporting a wide range of CNN architectures for classification, segmentation and object detection; however, the product is expanding to cover translation and other natural language processing tasks.
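A minimal sketch of running inference through OpenVINO's Python API (names follow the openvino.runtime API of recent releases; the model path, device and input shape are assumptions):

    import numpy as np
    from openvino.runtime import Core

    # The Core object discovers the devices available to the runtime.
    core = Core()

    # Read a model already converted to OpenVINO IR (path is hypothetical).
    model = core.read_model("model.xml")

    # Compile for a specific device; "CPU" could be swapped for another plugin.
    compiled = core.compile_model(model, device_name="CPU")

    # Run inference on a stand-in input matching the network's expected shape.
    input_tensor = np.random.rand(1, 3, 224, 224).astype(np.float32)
    results = compiled([input_tensor])
    output = results[compiled.output(0)]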

OpenVINO consists of a Python-based Model Optimiser, which acts as a built-in converter and platform-agnostic optimisation tool, and the Inference Engine. The Model Optimiser can take a network from any of the major training frameworks (TensorFlow, PyTorch, MXNet, Caffe, etc.) and, after removing the training hooks, condense it to improve inference times.
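A short sketch of the conversion step, assuming the Python conversion API shipped with recent OpenVINO releases (the ONNX file name is hypothetical; the classic mo command-line tool performs the same role):

    from openvino.tools.mo import convert_model
    from openvino.runtime import serialize

    # Convert a trained network (here an ONNX export) into OpenVINO's
    # Intermediate Representation, stripping training-only operations.
    ov_model = convert_model("trained_network.onnx")

    # Persist the optimised model as an .xml/.bin IR pair for the Inference Engine.
    serialize(ov_model, "trained_network.xml")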

Integrated plugins provide an easy route to inference on FPGAs, whether on accelerator cards or embedded platforms. The main advantage of embedded devices is that they offer a low-power, bespoke solution for many applications. Intel's Programmable Solutions Group can provide IP ranging from image preprocessing, to security functions such as weights encryption, to quantised and binary options for exceptionally fast inference.
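A sketch of targeting an FPGA through the plugin mechanism, with the caveat that the "FPGA" device name is an assumption: the names actually exposed depend on the OpenVINO version and on which plugins are installed, so available_devices should be checked first.

    from openvino.runtime import Core

    core = Core()
    model = core.read_model("trained_network.xml")

    # List the devices visible to the runtime; an FPGA plugin, where installed,
    # appears here alongside CPU and GPU.
    print(core.available_devices)

    # HETERO splits the graph: layers the FPGA plugin supports run on the FPGA
    # and the remainder falls back to the CPU. "FPGA" is a hypothetical
    # device name and depends on the installed plugin.
    compiled = core.compile_model(model, device_name="HETERO:FPGA,CPU")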

Primary author

Ruth Abra

Co-authors

Mr Dmitry Denisenko (Intel Corporation)
Richard Allen (ex-Intel Corporation)
Mr Tim Vanderhoek (Intel Corporation)
Ms Sarah Wolstencroft (ex-Intel Corporation)
Mr Mark Gibson (ex-Intel Corporation)

Presentation materials