2022 | OriginalPaper | Book chapter

A Coarse-to-Fine Human Visual Focus Estimation for ASD Toddlers in Early Screening

Authors: Xinming Wang, Zhihao Yang, Hanlin Zhang, Zuode Liu, Weihong Ren, Xiu Xu, Qiong Xu, Honghai Liu

Published in: Intelligent Robotics and Applications

Publisher: Springer International Publishing

Abstract

Human visual focus is a vital feature for uncovering subjects’ underlying cognitive processes. To predict a subject’s visual focus, existing deep learning methods learn to combine head orientation, location, and scene content to estimate the visual focal point. However, these methods face three main problems: the visual focal point prediction depends solely on learned spatial distribution heatmaps, the reasoning process in post-processing is non-learnable, and the learning of gaze salience representation could utilize more prior knowledge. We therefore propose a coarse-to-fine human visual focus estimation method that addresses these problems and improves estimation performance. First, we introduce a coarse-to-fine regression module, in which the coarse branch estimates the subject’s possible attention area while the fine branch directly outputs the estimated visual focal point position, thus avoiding sequential reasoning and making visual focal point estimation fully learnable. Furthermore, the human visual field prior is used to guide the learning of gaze salience so that target-related representations are encoded better. Extensive experimental results demonstrate that our method outperforms existing state-of-the-art methods on self-collected ASD-attention datasets.
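
The abstract contrasts reading a focal point off a heatmap in a non-learnable post-processing step with regressing the point directly. As a rough illustration only, and not the paper's coarse-to-fine module (whose details are not given on this page), the sketch below compares a hard argmax over a heatmap with a soft-argmax, a commonly used differentiable substitute. The function names, the `temperature` parameter, and the toy heatmap are hypothetical.

```python
import math


def argmax_point(heatmap):
    """Hard argmax: return the (row, col) of the hottest cell.

    This selection step has no useful gradient, which is one way a
    heatmap post-processing stage ends up non-learnable.
    """
    cells = ((r, c) for r in range(len(heatmap)) for c in range(len(heatmap[0])))
    return max(cells, key=lambda rc: heatmap[rc[0]][rc[1]])


def soft_argmax_point(heatmap, temperature=1.0):
    """Soft-argmax: softmax-weighted mean of the cell coordinates.

    The result is a smooth function of the heatmap values, so a loss on
    the predicted point could be back-propagated through it.
    """
    width = len(heatmap[0])
    flat = [value / temperature for row in heatmap for value in row]
    peak = max(flat)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in flat]
    total = sum(weights)
    row = sum(w * (i // width) for i, w in enumerate(weights)) / total
    col = sum(w * (i % width) for i, w in enumerate(weights)) / total
    return row, col


# Toy 3x3 heatmap with a single peak in the centre (hypothetical data).
toy_heatmap = [
    [0.0, 0.0, 0.0],
    [0.0, 5.0, 0.0],
    [0.0, 0.0, 0.0],
]
hard_point = argmax_point(toy_heatmap)
soft_point = soft_argmax_point(toy_heatmap)
```

On this symmetric toy heatmap both functions land on the centre cell. A lower `temperature` pulls the soft estimate toward the hard argmax, while a higher one averages over a wider area.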

Metadata
Title
A Coarse-to-Fine Human Visual Focus Estimation for ASD Toddlers in Early Screening
Authors
Xinming Wang
Zhihao Yang
Hanlin Zhang
Zuode Liu
Weihong Ren
Xiu Xu
Qiong Xu
Honghai Liu
Copyright year
2022
DOI
https://doi.org/10.1007/978-3-031-13844-7_43