Published in: Peer-to-Peer Networking and Applications, Issue 6/2021

16 August 2021

A DNN inference acceleration algorithm combining model partition and task allocation in heterogeneous edge computing system

Authors: Lei Shi, Zhigang Xu, Yabo Sun, Yi Shi, Yuqi Fan, Xu Ding

Abstract

Edge intelligence is a new computing paradigm that executes part of an Artificial Intelligence (AI)-based task at the edge in order to reduce latency and energy consumption and to improve privacy. Deep Neural Networks (DNNs), one of the most important AI techniques, are widely used in various fields. For DNN-based tasks, a computing scheme called DNN model partition can further reduce execution time: it splits the DNN task into two parts, one executed on the end device and the other on an edge server. In a complex edge computing system, however, it is difficult to coordinate DNN model partition with task allocation. In this work, we study this problem in a heterogeneous edge computing system. We first establish a mathematical model of adaptive DNN model partition and task offloading. The model contains a large number of binary variables, so in a multi-task scenario the solution space is too large to search directly. We then use dynamic programming and a greedy strategy to reduce the solution space while still obtaining a good solution, and propose an offline algorithm named GSPI. To account for practical conditions, we also propose an online algorithm. Experiments and simulations show that, compared with the end-only and server-only schemes, the proposed GSPI algorithm reduces the system time cost by 30% on average and the online algorithm reduces it by 28% on average.
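
The two ideas the abstract combines, choosing where to split a DNN and deciding which server takes which task, can be illustrated with a minimal Python sketch. It is not the GSPI algorithm, whose details are not given on this page. It assumes a chain-structured DNN with known per-layer timings, ignores the cost of returning the result, and uses a generic earliest-finish greedy heuristic for allocation. All names (`best_split`, `greedy_allocate`, `kb_per_ms`) and all numbers are hypothetical.

```python
from typing import List, Tuple


def best_split(device_ms: List[float], server_ms: List[float],
               out_kb: List[float], input_kb: float,
               kb_per_ms: float) -> Tuple[int, float]:
    """Pick the split point k with the lowest end-to-end latency.

    Layers [0, k) run on the end device and layers [k, n) on the edge
    server, so k = 0 is server-only and k = n is end-only.
    Assumption: returning the final result costs nothing.
    """
    n = len(device_ms)
    best_k, best_latency = 0, float("inf")
    device_prefix = 0.0                    # device time for layers [0, k)
    server_suffix = float(sum(server_ms))  # server time for layers [k, n)
    for k in range(n + 1):
        if k == n:
            transfer = 0.0                 # end-only: nothing is uploaded
        else:
            # Upload the raw input (k = 0) or the output of layer k - 1.
            sent_kb = input_kb if k == 0 else out_kb[k - 1]
            transfer = sent_kb / kb_per_ms
        latency = device_prefix + transfer + server_suffix
        if latency < best_latency:
            best_k, best_latency = k, latency
        if k < n:                          # move layer k from server to device
            device_prefix += device_ms[k]
            server_suffix -= server_ms[k]
    return best_k, best_latency


def greedy_allocate(task_work: List[float],
                    server_speed: List[float]) -> List[int]:
    """Assign tasks, largest first, to the server that finishes them earliest.

    Returns a list where entry i is the server index chosen for task i.
    """
    busy_until = [0.0] * len(server_speed)
    assignment = [-1] * len(task_work)
    order = sorted(range(len(task_work)), key=lambda i: -task_work[i])
    for t in order:
        s = min(range(len(server_speed)),
                key=lambda j: busy_until[j] + task_work[t] / server_speed[j])
        busy_until[s] += task_work[t] / server_speed[s]
        assignment[t] = s
    return assignment
```

With the made-up inputs `best_split([10.0, 20.0, 30.0], [1.0, 2.0, 3.0], [400.0, 50.0, 5.0], 600.0, 10.0)`, the function returns `(2, 38.0)`: running two layers locally beats both server-only (66 ms) and end-only (60 ms) because the second layer's output is much smaller than the raw input. Likewise, `greedy_allocate([8.0, 4.0, 4.0], [2.0, 1.0])` returns `[0, 1, 0]`.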

Metadata
Title
A DNN inference acceleration algorithm combining model partition and task allocation in heterogeneous edge computing system
Authors
Lei Shi
Zhigang Xu
Yabo Sun
Yi Shi
Yuqi Fan
Xu Ding
Publication date
16 August 2021
Publisher
Springer US
Published in
Peer-to-Peer Networking and Applications / Issue 6/2021
Print ISSN: 1936-6442
Electronic ISSN: 1936-6450
DOI
https://doi.org/10.1007/s12083-021-01223-1