2020 | OriginalPaper | Book chapter

Towards Capturing Sonographic Experience: Cognition-Inspired Ultrasound Video Saliency Prediction

Authors: Richard Droste, Yifan Cai, Harshita Sharma, Pierre Chatelain, Aris T. Papageorghiou, J. Alison Noble

Published in: Medical Image Understanding and Analysis

Publisher: Springer International Publishing

Abstract

For visual tasks like ultrasound (US) scanning, experts direct their gaze towards regions of task-relevant information. Therefore, learning to predict the gaze of sonographers on US videos captures the spatio-temporal patterns that are important for US scanning. The spatial distribution of gaze points on video frames can be represented through heat maps termed saliency maps. Here, we propose a temporally bidirectional model for video saliency prediction (BDS-Net), drawing inspiration from modern theories of human cognition. The model consists of a convolutional neural network (CNN) encoder followed by a bidirectional gated-recurrent-unit recurrent convolutional network (GRU-RCN) decoder. The temporal bidirectionality mimics human cognition, which simultaneously reacts to past and predicts future sensory inputs. We train the BDS-Net alongside spatial and temporally one-directional comparative models on the task of predicting saliency in videos of US abdominal circumference plane detection. The BDS-Net outperforms the comparative models on four out of five saliency metrics. We present a qualitative analysis on representative examples to explain the model’s superior performance.
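The temporal bidirectionality described above can be illustrated with a toy example. The sketch below is not the BDS-Net: it replaces the convolutional GRU-RCN with a scalar GRU cell, treats each frame as a single number, and uses made-up weights (the dictionary keys `wz`, `uz`, `wr`, `ur`, `wh`, `uh` are hypothetical names). It only shows how a forward pass (past context) and a backward pass (future context) are paired per frame.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def gru_step(x, h, p):
    """One scalar GRU update; p holds hypothetical weights."""
    z = sigmoid(p["wz"] * x + p["uz"] * h)             # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h)             # reset gate
    cand = math.tanh(p["wh"] * x + p["uh"] * (r * h))  # candidate state
    return (1.0 - z) * h + z * cand                    # blend old and new state

def run(xs, p):
    """Unroll the GRU over xs in the given order, starting from a zero state."""
    h, out = 0.0, []
    for x in xs:
        h = gru_step(x, h, p)
        out.append(h)
    return out

def bidirectional(xs, p_fwd, p_bwd):
    """Pair each frame's forward state with its backward state."""
    fwd = run(xs, p_fwd)               # state at t depends on frames 0..t
    bwd = run(xs[::-1], p_bwd)[::-1]   # state at t depends on frames t..T-1
    return list(zip(fwd, bwd))
```

With this structure, changing the last frame of a sequence leaves the forward state of the first frame untouched but alters its backward state, which is the sense in which a bidirectional decoder can use future frames when predicting the current one.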

In our implementation, for numerical stability, we compute \(\log(\hat{s}^t_i)\) with a log-softmax function instead of computing the softmax and logarithm sequentially.
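A minimal sketch of why the fused log-softmax is more stable, written in plain Python rather than in the authors' framework (which this page does not name). The function names `naive_log_softmax` and `log_softmax` are illustrative, and the stable version uses the standard log-sum-exp shift by the maximum logit.

```python
import math

def naive_log_softmax(logits):
    """Softmax followed by log; exp() overflows for large logits."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [math.log(e / total) for e in exps]

def log_softmax(logits):
    """Stable form: z_i - (m + log(sum_j exp(z_j - m))), with m = max(z)."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]
```

For moderate inputs such as `[1.0, 2.0, 3.0]` the two functions agree to floating-point precision. For `[1000.0, 0.0, -1000.0]` the naive version raises `OverflowError` inside `math.exp`, while `log_softmax` returns `[0.0, -1000.0, -2000.0]`.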

https://github.com/cvzoya/saliency.


Title: Towards Capturing Sonographic Experience: Cognition-Inspired Ultrasound Video Saliency Prediction
Authors: Richard Droste, Yifan Cai, Harshita Sharma, Pierre Chatelain, Aris T. Papageorghiou, J. Alison Noble
Publisher: Springer International Publishing
Book: Medical Image Understanding and Analysis
Print ISBN: 978-3-030-39342-7
Electronic ISBN: 978-3-030-39343-4
Copyright year: 2020
DOI: https://doi.org/10.1007/978-3-030-39343-4_15
