
07.01.2022

Driver attention prediction based on convolution and transformers

Authors: Chao Gou, Yuchen Zhou, Dan Li

Published in: The Journal of Supercomputing | Issue 6/2022


Abstract

In recent years, studying how drivers allocate their attention while driving has become critical to achieving human-like cognitive ability for autonomous vehicles, and it is an active topic in the community of human–machine augmented intelligence for self-driving. However, existing state-of-the-art methods for driver attention prediction are mainly built upon convolutional neural networks (CNNs), whose local receptive fields limit their ability to capture long-range dependencies. In this work, we propose a novel attention prediction method based on CNNs and Transformers, termed ACT-Net. In particular, a CNN and a Transformer are combined into a block, and these blocks are stacked to form a deep model. Through this design, the network captures both local and long-range dependencies, both of which are crucial for driver attention prediction. Extensive comparison experiments against other state-of-the-art techniques, conducted on the widely used BDD-A dataset and on privately collected data based on BDD-X, validate the effectiveness of the proposed ACT-Net.
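The paper's exact ACT-Net architecture is not reproduced on this page, so the following is only a minimal PyTorch sketch of the idea stated in the abstract: a stackable block that pairs a convolution (local receptive field) with Transformer self-attention (long-range dependencies), followed by a one-channel head that predicts a driver attention map. All class names, layer sizes, and hyperparameters below are hypothetical illustrations, not values from the paper.

```python
import torch
import torch.nn as nn

class ConvTransformerBlock(nn.Module):
    """Hypothetical block pairing a 3x3 convolution (local features)
    with multi-head self-attention over spatial tokens (global context)."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.conv(x)                       # (B, C, H, W): local features
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C): one token per pixel
        q = self.norm(tokens)
        attn_out, _ = self.attn(q, q, q)       # long-range self-attention
        tokens = tokens + attn_out             # residual connection
        return tokens.transpose(1, 2).reshape(b, c, h, w)

class ACTNetSketch(nn.Module):
    """Stacked conv+Transformer blocks with a saliency-map head."""

    def __init__(self, in_channels: int = 3, channels: int = 64, depth: int = 4):
        super().__init__()
        # Strided stem downsamples 4x so the attention token count stays small.
        self.stem = nn.Conv2d(in_channels, channels, kernel_size=7, stride=4, padding=3)
        self.blocks = nn.Sequential(*[ConvTransformerBlock(channels) for _ in range(depth)])
        self.head = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, frame: torch.Tensor) -> torch.Tensor:
        x = self.blocks(self.stem(frame))
        return torch.sigmoid(self.head(x))     # per-pixel attention probability

# Example: predict a (quarter-resolution) attention map for one 128x128 frame.
model = ACTNetSketch()
saliency = model(torch.randn(1, 3, 128, 128))  # -> shape (1, 1, 32, 32)
```

A real driver-attention model of this kind would typically be trained against recorded gaze maps (e.g., with a KL-divergence or correlation loss); the sketch above only shows how convolution and self-attention can be composed in one stackable block.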


Metadata
Title
Driver attention prediction based on convolution and transformers
Authors
Chao Gou
Yuchen Zhou
Dan Li
Publication date
07.01.2022
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 6/2022
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-021-04151-2
