Published in: Peer-to-Peer Networking and Applications 1/2022

06-10-2021

Deep action: A mobile action recognition framework using edge offloading

Authors: Deyu Zhang, Heguo Zhang, Sijing Duan, Yunzhen Luo, Fucheng Jia, Feng Liu

Abstract

Recording users’ lives as short-form videos has become a trend with the advance of mobile devices. These videos contain a wealth of information, but retrieving it requires a significant amount of computation. In this paper, we propose Deep action, a framework that leverages edge offloading to enable human action recognition on mobile devices. Deep action first samples frames from a video according to the accuracy requirement. The sampled frames are then compressed and fed into deep learning models to generate an action label. Because the condition of the wireless connection varies, we design an online scheduler that strategically offloads compressed video snippets to the edge server. Furthermore, we use OpenCL to implement the video compression-related operations on the mobile GPU, so that model inference and video compression can run in parallel on the mobile device. We implement Deep action on the Android OS and evaluate it on a commercial off-the-shelf mobile device and an edge server. The performance evaluation demonstrates that Deep action achieves up to 19× and 13× execution speedup compared to the local-only and remote-only strategies, respectively.
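
The abstract does not spell out how the online scheduler decides, so the following is only a minimal sketch of one plausible per-snippet rule, not the authors' algorithm: offload a compressed snippet when the estimated upload-plus-edge-inference time is shorter than the estimated on-device inference time. The names `Snippet`, `offload_latency_s`, and `choose_target`, as well as the latency estimates they take, are hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    """Hypothetical description of one compressed video snippet."""
    size_bytes: int        # compressed size of the snippet
    local_infer_s: float   # assumed estimate of on-device inference time
    edge_infer_s: float    # assumed estimate of inference time on the edge server


def offload_latency_s(snippet: Snippet, bandwidth_bps: float, rtt_s: float) -> float:
    """Estimated time to upload the snippet, infer on the edge, and get the label back."""
    if bandwidth_bps <= 0:
        # No usable uplink: offloading can never finish.
        return float("inf")
    upload_s = snippet.size_bytes * 8 / bandwidth_bps
    return rtt_s + upload_s + snippet.edge_infer_s


def choose_target(snippet: Snippet, bandwidth_bps: float, rtt_s: float) -> str:
    """Return "edge" if offloading is estimated to finish sooner, else "local"."""
    if offload_latency_s(snippet, bandwidth_bps, rtt_s) < snippet.local_infer_s:
        return "edge"
    return "local"
```

Under these assumptions, a 1 MB snippet that takes 2 s to process locally would be offloaded over a 40 Mbit/s link and kept on the device over a 1 Mbit/s link. A real scheduler would also have to account for queued work and for errors in the bandwidth estimate.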

Metadata
Title: Deep action: A mobile action recognition framework using edge offloading
Authors: Deyu Zhang, Heguo Zhang, Sijing Duan, Yunzhen Luo, Fucheng Jia, Feng Liu
Publication date: 06-10-2021
Publisher: Springer US
Published in: Peer-to-Peer Networking and Applications, Issue 1/2022
Print ISSN: 1936-6442
Electronic ISSN: 1936-6450
DOI: https://doi.org/10.1007/s12083-021-01232-0
