Published in: The Journal of Supercomputing 9/2021

24-02-2021

Support NNEF execution model for NNAPI

Authors: Yuan-Ming Chang, Chia-Yu Sung, Yu-Chien Sheu, Meng-Shiun Yu, Min-Yih Hsu, Jenq-Kuen Lee

Abstract

With growing applications such as image recognition, speech recognition, ADAS, and AIoT, artificial intelligence (AI) frameworks are becoming popular in various industries. Many neural network frameworks now exist for executing AI models in applications, especially for training and inference, including TensorFlow, Caffe, MXNet, PyTorch, Core ML, TensorFlow Lite, and NNAPI. With so many emerging frameworks, exchange formats are needed to move models between them. To meet this need, the Khronos Group created a standard draft known as the Neural Network Exchange Format (NNEF). However, because NNEF is new, conversion tools that would allow model exchange among the various AI frameworks remain missing. In this work, we fill this gap by devising NNAPI conversion tools for NNEF. Our work allows NNEF to execute inference tasks on host and Android platforms and flexibly invokes the Android Neural Networks API (NNAPI) on the Android platform to speed up inference operations. We invoke NNAPI by dividing the input NNEF model into multiple submodels and letting NNAPI execute these submodels. We develop an algorithm named BFSelector, based on a classic breadth-first search with cost constraints, to determine how to divide the input model. Our preliminary experimental results show that our support of NNEF on NNAPI achieves a speedup over the baseline of 1.32 to 22.52 times for API 27 and of 4.56 to 211 times for API 28, where the baseline is the NNEF-to-Android platform conversion without invoking NNAPI. The experiments include AI models such as LeNet, AlexNet, MobileNet_V1, MobileNet_V2, VGG-16, and VGG-19.
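The abstract describes partitioning the input NNEF model graph into submodels via a breadth-first traversal bounded by a cost constraint. The sketch below is a minimal illustration of that general idea, not the authors' actual BFSelector: the graph representation, the `costs` table, and the single `budget` parameter are all assumptions made for the example.

```python
from collections import deque

def bfs_partition(graph, costs, roots, budget):
    """Group a model's ops into submodels via breadth-first traversal.

    graph  : dict mapping each op name to a list of successor op names
    costs  : dict mapping each op name to an estimated execution cost
    roots  : input (source) op names where the traversal starts
    budget : maximum accumulated cost allowed per submodel
    """
    visited = set()
    queue = deque(roots)
    submodels = []          # each submodel is a list of op names
    current, spent = [], 0  # the submodel being built and its cost so far

    while queue:
        op = queue.popleft()
        if op in visited:
            continue
        visited.add(op)
        # Close the current submodel if adding this op would exceed the budget.
        if current and spent + costs[op] > budget:
            submodels.append(current)
            current, spent = [], 0
        current.append(op)
        spent += costs[op]
        queue.extend(graph.get(op, []))

    if current:
        submodels.append(current)
    return submodels

# Hypothetical three-op chain: conv1 -> relu1 -> pool1
graph = {"conv1": ["relu1"], "relu1": ["pool1"], "pool1": []}
costs = {"conv1": 5, "relu1": 1, "pool1": 2}
parts = bfs_partition(graph, costs, ["conv1"], budget=6)
```

Under these assumed costs, `conv1` and `relu1` fit within one budget of 6 and form one submodel, while `pool1` starts a second. In the paper's flow, each resulting submodel would then be handed to NNAPI for execution.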


Metadata
Title
Support NNEF execution model for NNAPI
Authors
Yuan-Ming Chang
Chia-Yu Sung
Yu-Chien Sheu
Meng-Shiun Yu
Min-Yih Hsu
Jenq-Kuen Lee
Publication date
24-02-2021
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 9/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-021-03625-7
