Skip to main content
Top

2019 | OriginalPaper | Chapter

Learning Where to Look While Tracking Instruments in Robot-Assisted Surgery

Authors : Mobarakol Islam, Yueyuan Li, Hongliang Ren

Published in: Medical Image Computing and Computer Assisted Intervention – MICCAI 2019

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Directing of the task-specific attention while tracking instrument in surgery holds great potential in robot-assisted intervention. For this purpose, we propose an end-to-end trainable multitask learning (MTL) model for real-time surgical instrument segmentation and attention prediction. Our model is designed with a weight-shared encoder and two task-oriented decoders and optimized for the joint tasks. We introduce batch-Wasserstein (bW) loss and construct a soft attention module to refine the distinctive visual region for efficient saliency learning. For multitask optimization, it is always challenging to obtain convergence of both tasks in the same epoch. We deal with this problem by adopting ‘poly’ loss weight and two phases of training. We further propose a novel way to generate task-aware saliency map and scanpath of the instruments on MICCAI robotic instrument segmentation dataset. Compared to the state of the art segmentation and saliency models, our model outperforms most of the evaluation metrics.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
2.
go back to reference Chaurasia, A., Culurciello, E.: LinkNet: exploiting encoder representations for efficient semantic segmentation. In: 2017 IEEE Visual Communications and Image Processing (VCIP), pp. 1–4. IEEE (2017) Chaurasia, A., Culurciello, E.: LinkNet: exploiting encoder representations for efficient semantic segmentation. In: 2017 IEEE Visual Communications and Image Processing (VCIP), pp. 1–4. IEEE (2017)
3.
go back to reference Chen, Z., Zhao, Z., Cheng, X.: Surgical instruments tracking based on deep learning with lines detection and spatio-temporal context. In: 2017 Chinese Automation Congress (CAC), pp. 2711–2714. IEEE (2017) Chen, Z., Zhao, Z., Cheng, X.: Surgical instruments tracking based on deep learning with lines detection and spatio-temporal context. In: 2017 Chinese Automation Congress (CAC), pp. 2711–2714. IEEE (2017)
4.
go back to reference Dvornik, N., Shmelkov, K., Mairal, J., Schmid, C.: BlitzNet: a real-time deep network for scene understanding. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4154–4162 (2017) Dvornik, N., Shmelkov, K., Mairal, J., Schmid, C.: BlitzNet: a real-time deep network for scene understanding. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4154–4162 (2017)
5.
go back to reference Frogner, C., Zhang, C., Mobahi, H., Araya, M., Poggio, T.A.: Learning with a Wasserstein loss. In: Advances in Neural Information Processing Systems, pp. 2053–2061 (2015) Frogner, C., Zhang, C., Mobahi, H., Araya, M., Poggio, T.A.: Learning with a Wasserstein loss. In: Advances in Neural Information Processing Systems, pp. 2053–2061 (2015)
6.
go back to reference García-Peraza-Herrera, L.C., et al.: ToolNet: holistically-nested real-time segmentation of robotic surgical tools. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5717–5722. IEEE (2017) García-Peraza-Herrera, L.C., et al.: ToolNet: holistically-nested real-time segmentation of robotic surgical tools. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5717–5722. IEEE (2017)
7.
go back to reference Islam, M., Atputharuban, D.A., Ramesh, R., Ren, H.: Real-time instrument segmentation in robotic surgery using auxiliary supervised deep adversarial learning. IEEE Robot. Autom. Lett. 4, 2188–2195 (2019)CrossRef Islam, M., Atputharuban, D.A., Ramesh, R., Ren, H.: Real-time instrument segmentation in robotic surgery using auxiliary supervised deep adversarial learning. IEEE Robot. Autom. Lett. 4, 2188–2195 (2019)CrossRef
8.
go back to reference Jiang, M., Huang, S., Duan, J., Zhao, Q.: SALICON: saliency in context. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1072–1080 (2015) Jiang, M., Huang, S., Duan, J., Zhao, Q.: SALICON: saliency in context. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1072–1080 (2015)
9.
go back to reference Judd, T., Ehinger, K., Durand, F., Torralba, A.: Learning to predict where humans look. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 2106–2113. IEEE (2009) Judd, T., Ehinger, K., Durand, F., Torralba, A.: Learning to predict where humans look. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 2106–2113. IEEE (2009)
10.
go back to reference Liu, N., Han, J.: A deep spatial contextual long-term recurrent convolutional network for saliency detection. IEEE Trans. Image Process. 27(7), 3264–3274 (2018)MathSciNetCrossRef Liu, N., Han, J.: A deep spatial contextual long-term recurrent convolutional network for saliency detection. IEEE Trans. Image Process. 27(7), 3264–3274 (2018)MathSciNetCrossRef
12.
go back to reference Nekrasov, V., Dharmasiri, T., Spek, A., Drummond, T., Shen, C., Reid, I.: Real-time joint semantic segmentation and depth estimation using asymmetric annotations. arXiv preprint arXiv:1809.04766 (2018) Nekrasov, V., Dharmasiri, T., Spek, A., Drummond, T., Shen, C., Reid, I.: Real-time joint semantic segmentation and depth estimation using asymmetric annotations. arXiv preprint arXiv:​1809.​04766 (2018)
13.
go back to reference Ngu, J.C.Y., Tsang, C.B.S., Koh, D.C.S.: The da Vinci Xi: a review of its capabilities, versatility, and potential role in robotic colorectal surgery. Robot. Surg.: Res. Rev. 4, 77–85 (2017)CrossRef Ngu, J.C.Y., Tsang, C.B.S., Koh, D.C.S.: The da Vinci Xi: a review of its capabilities, versatility, and potential role in robotic colorectal surgery. Robot. Surg.: Res. Rev. 4, 77–85 (2017)CrossRef
14.
go back to reference Palazzi, A., Abati, D., Calderara, S., Solera, F., Cucchiara, R.: Predictingthe driver’s focus of attention: the DR (eye) VE project. IEEE Trans. Pattern Anal. Mach. Intell. 41, 1720–1733 (2018)CrossRef Palazzi, A., Abati, D., Calderara, S., Solera, F., Cucchiara, R.: Predictingthe driver’s focus of attention: the DR (eye) VE project. IEEE Trans. Pattern Anal. Mach. Intell. 41, 1720–1733 (2018)CrossRef
15.
16.
go back to reference Peng, C., Zhang, X., Yu, G., Luo, G., Sun, J.: Large kernel matters-improve semantic segmentation by global convolutional network. In: Proceedings of the IEEE Conference On Computer Vision and Pattern Recognition, pp. 4353–4361 (2017) Peng, C., Zhang, X., Yu, G., Luo, G., Sun, J.: Large kernel matters-improve semantic segmentation by global convolutional network. In: Proceedings of the IEEE Conference On Computer Vision and Pattern Recognition, pp. 4353–4361 (2017)
17.
18.
go back to reference Shvets, A.A., Rakhlin, A., Kalinin, A.A., Iglovikov, V.I.: Automatic instrument segmentation in robot-assisted surgery using deep learning. In: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 624–628. IEEE (2018) Shvets, A.A., Rakhlin, A., Kalinin, A.A., Iglovikov, V.I.: Automatic instrument segmentation in robot-assisted surgery using deep learning. In: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 624–628. IEEE (2018)
19.
go back to reference Tian, Z., Shen, C., He, T., Yan, Y.: Decoders matter for semantic segmentation: data-dependent decoding enables flexible feature aggregation. arXiv preprint arXiv:1903.02120 (2019) Tian, Z., Shen, C., He, T., Yan, Y.: Decoders matter for semantic segmentation: data-dependent decoding enables flexible feature aggregation. arXiv preprint arXiv:​1903.​02120 (2019)
20.
go back to reference Zhao, Z., Voros, S., Weng, Y., Chang, F., Li, R.: Tracking-by-detection of surgical instruments in minimally invasive surgery via the convolutional neural network deep learning-based method. Comput. Assist. Surg. 22(sup1), 26–35 (2017)CrossRef Zhao, Z., Voros, S., Weng, Y., Chang, F., Li, R.: Tracking-by-detection of surgical instruments in minimally invasive surgery via the convolutional neural network deep learning-based method. Comput. Assist. Surg. 22(sup1), 26–35 (2017)CrossRef
Metadata
Title
Learning Where to Look While Tracking Instruments in Robot-Assisted Surgery
Authors
Mobarakol Islam
Yueyuan Li
Hongliang Ren
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-32254-0_46

Premium Partner