Published in: The Journal of Supercomputing 9/2021

24-02-2021

Support NNEF execution model for NNAPI

Authors: Yuan-Ming Chang, Chia-Yu Sung, Yu-Chien Sheu, Meng-Shiun Yu, Min-Yih Hsu, Jenq-Kuen Lee

Abstract

With growing applications such as image recognition, speech recognition, ADAS, and AIoT, artificial intelligence (AI) frameworks are becoming popular in various industries. Many neural network frameworks now exist for executing AI models in applications, especially for training and inference, including TensorFlow, Caffe, MXNet, PyTorch, Core ML, TensorFlow Lite, and NNAPI. With so many emerging frameworks, exchange formats are needed to move models between them. To meet this need, the Khronos Group created a standard draft known as the Neural Network Exchange Format (NNEF). However, because NNEF is new, conversion tools that would allow model exchange among the various AI frameworks remain missing. In this work, we fill this gap by devising NNAPI conversion tools for NNEF. Our work allows NNEF to execute inference tasks on host and Android platforms and flexibly invokes the Android Neural Networks API (NNAPI) on the Android platform to speed up inference operations. We invoke NNAPI by dividing the input NNEF model into multiple submodels and letting NNAPI execute these submodels. We develop an algorithm named BFSelector, based on a classic breadth-first search with cost constraints, to determine how to divide the input model. Our preliminary experimental results show that our support of NNEF on NNAPI achieves a speedup over the baseline of 1.32 to 22.52 times for API 27 and of 4.56 to 211 times for API 28, where the baseline is the NNEF-to-Android platform conversion without invoking NNAPI. The experiments include AI models such as LeNet, AlexNet, MobileNet_V1, MobileNet_V2, VGG-16, and VGG-19.
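The abstract describes partitioning the input NNEF model graph into submodels via a breadth-first traversal bounded by a cost constraint. The sketch below is a minimal illustration of that general idea, not the authors' actual BFSelector: the graph representation, the `costs` table, and the single `budget` parameter are all assumptions made for the example.

```python
from collections import deque

def bfs_partition(graph, costs, roots, budget):
    """Group a model's ops into submodels via breadth-first traversal.

    graph  : dict mapping each op name to a list of successor op names
    costs  : dict mapping each op name to an estimated execution cost
    roots  : input (source) op names where the traversal starts
    budget : maximum accumulated cost allowed per submodel
    """
    visited = set()
    queue = deque(roots)
    submodels = []          # each submodel is a list of op names
    current, spent = [], 0  # the submodel being built and its cost so far

    while queue:
        op = queue.popleft()
        if op in visited:
            continue
        visited.add(op)
        # Close the current submodel if adding this op would exceed the budget.
        if current and spent + costs[op] > budget:
            submodels.append(current)
            current, spent = [], 0
        current.append(op)
        spent += costs[op]
        queue.extend(graph.get(op, []))

    if current:
        submodels.append(current)
    return submodels

# Hypothetical three-op chain: conv1 -> relu1 -> pool1
graph = {"conv1": ["relu1"], "relu1": ["pool1"], "pool1": []}
costs = {"conv1": 5, "relu1": 1, "pool1": 2}
parts = bfs_partition(graph, costs, ["conv1"], budget=6)
```

Under these assumed costs, `conv1` and `relu1` fit within one budget of 6 and form one submodel, while `pool1` starts a second. In the paper's flow, each resulting submodel would then be handed to NNAPI for execution.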


Metadata
Title
Support NNEF execution model for NNAPI
Authors
Yuan-Ming Chang
Chia-Yu Sung
Yu-Chien Sheu
Meng-Shiun Yu
Min-Yih Hsu
Jenq-Kuen Lee
Publication date
24-02-2021
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 9/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-021-03625-7
