Published in: International Journal of Computer Assisted Radiology and Surgery 10/2022

10.06.2022 | Original Article

A parallel network utilizing local features and global representations for segmentation of surgical instruments

By: Xinan Sun, Yuelin Zou, Shuxin Wang, He Su, Bo Guan


Abstract

Purpose

Automatic image segmentation of surgical instruments is a fundamental task in robot-assisted minimally invasive surgery, as it greatly improves surgeons' context awareness during the operation. This paper proposes a novel method based on Mask R-CNN to achieve accurate instance segmentation of surgical instruments.

Methods

A novel feature-extraction backbone is built that extracts local features through a convolutional neural network (CNN) branch and global representations through a Swin-Transformer branch. Moreover, skip fusions are applied in the backbone to fuse the two kinds of features and improve the generalization ability of the network.
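The abstract does not specify how the two branches are combined; one common fusion choice is channel-wise concatenation followed by a 1x1 projection back to the original channel count. The NumPy sketch below illustrates only that general idea; the shapes, the fixed random projection (standing in for a learned convolution), and all names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical outputs at one pyramid level, both C x H x W:
local_feat = rng.standard_normal((64, 32, 32))    # CNN branch (local features)
global_feat = rng.standard_normal((64, 32, 32))   # Swin branch (global representations)

def skip_fusion(a, b):
    # Concatenate along the channel axis, then project 2C -> C with a
    # 1x1 linear map (a fixed random matrix here; learned in practice).
    fused = np.concatenate([a, b], axis=0)                                # 2C x H x W
    w = rng.standard_normal((a.shape[0], fused.shape[0])) / np.sqrt(fused.shape[0])
    return np.einsum('oc,chw->ohw', w, fused)                             # C x H x W

out = skip_fusion(local_feat, global_feat)
print(out.shape)  # (64, 32, 32)
```

The projected map keeps the spatial resolution and channel count of each branch, so it can feed the same downstream heads as a single-branch backbone.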

Results

The proposed method is evaluated on three segmentation tasks from the MICCAI 2017 EndoVis Challenge dataset and shows state-of-the-art performance, with an mIoU of 0.5873 in type segmentation and 0.7408 in part segmentation. Furthermore, ablation studies show that the proposed backbone contributes at least a 17% improvement in mIoU.
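The mIoU metric reported above averages the per-class intersection-over-union between predicted and ground-truth masks. A minimal sketch of the computation, assuming the mean is taken over classes that appear in either mask (conventions vary between benchmarks):

```python
import numpy as np

def miou(pred, gt, num_classes):
    # Mean intersection-over-union over classes present in pred or gt.
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

# Tiny 3x3 label maps with three classes:
pred = np.array([[0, 0, 1], [1, 1, 2], [2, 2, 2]])
gt = np.array([[0, 0, 1], [1, 1, 1], [2, 2, 2]])
print(round(miou(pred, gt, 3), 4))  # → 0.8333
```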

Conclusion

The promising results demonstrate that our method effectively extracts global representations as well as local features when segmenting surgical instruments, improving segmentation accuracy. With the proposed backbone, the network segments the contours of surgical instruments' end tips more precisely. The method can provide more accurate data for localization and pose estimation of surgical instruments, contributing further to the automation of robot-assisted minimally invasive surgery.


Metadata
Title
A parallel network utilizing local features and global representations for segmentation of surgical instruments
Authors
Xinan Sun
Yuelin Zou
Shuxin Wang
He Su
Bo Guan
Publication date
10.06.2022
Publisher
Springer International Publishing
Published in
International Journal of Computer Assisted Radiology and Surgery / Issue 10/2022
Print ISSN: 1861-6410
Electronic ISSN: 1861-6429
DOI
https://doi.org/10.1007/s11548-022-02687-z
