
2020 | Original Paper | Book Chapter

Towards Accurate and Interpretable Surgical Skill Assessment: A Video-Based Method Incorporating Recognized Surgical Gestures and Skill Levels


Abstract

Surgical skill assessment is becoming increasingly important for surgical training, given the rapid growth of automation technologies. Existing work on skill score prediction is limited, and its results leave room for improvement. The challenges lie in the complexity of surgical tasks and in generalizing to new subjects as trial performers. Moreover, previous work mostly provides local feedback tied to individual video frames or clips, which carry no human-interpretable semantics on their own. To overcome these issues and enable more accurate and interpretable skill score prediction, we propose a novel video-based method that incorporates recognized surgical gestures (segments) and skill levels (for both performers and gestures). Our method consists of two correlated multi-task learning frameworks. In the first framework, the main task is to predict the final skill score of a surgical trial, while the auxiliary tasks are to recognize surgical gestures and to classify performers into self-proclaimed skill levels. The second framework, which operates on gesture-level features accumulated up to the end of each previously identified gesture, incrementally generates running intermediate skill scores for feedback decoding. Experiments on the JIGSAWS dataset show that our first framework, applied to C3D features, pushes state-of-the-art prediction performance to Spearman's correlations of 0.83, 0.86 and 0.69 on the three surgical tasks under the leave-one-user-out (LOUO) validation scheme; it even achieves 0.68 when generalizing across these tasks. For the second framework, additional gesture-level skill levels and captions were annotated by experts. The trend of the predicted intermediate skill scores, which flags problematic gestures, is demonstrated as interpretable feedback, and this trend turns out to resemble a human rater's scoring process.
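As a concrete illustration of the first framework, the sketch below shows one plausible way to wire up such a multi-task setup in PyTorch. It is a minimal sketch under stated assumptions, not the authors' implementation: the 4096-dimensional input matches the standard C3D fc-layer feature width, while the LSTM encoder, hidden size, gesture and skill-level counts, and loss weights are all illustrative choices made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskSkillModel(nn.Module):
    """Minimal sketch of a multi-task skill-assessment model.

    Input: a sequence of precomputed C3D clip features (4096-d, the
    usual C3D fc-layer width). The LSTM encoder, hidden size, and
    task cardinalities are illustrative assumptions, not taken from
    the paper.
    """

    def __init__(self, feat_dim=4096, hidden=256,
                 n_gestures=15, n_levels=3):
        super().__init__()
        # Shared temporal encoder over the clip-level feature sequence.
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True)
        # Main task: regress the trial's final skill score.
        self.score_head = nn.Linear(hidden, 1)
        # Auxiliary task 1: per-clip surgical gesture recognition.
        self.gesture_head = nn.Linear(hidden, n_gestures)
        # Auxiliary task 2: performer's self-proclaimed skill level.
        self.level_head = nn.Linear(hidden, n_levels)

    def forward(self, feats):
        # feats: (batch, time, feat_dim)
        h, _ = self.encoder(feats)
        trial = h[:, -1]                             # trial-level summary
        return (self.score_head(trial).squeeze(-1),  # final skill score
                self.gesture_head(h),                # per-clip gesture logits
                self.level_head(trial))              # skill-level logits

def multi_task_loss(preds, targets, w_gesture=0.5, w_level=0.5):
    """Weighted sum of the three task losses; the weights are arbitrary."""
    score, gestures, level = preds
    score_gt, gesture_gt, level_gt = targets
    return (F.mse_loss(score, score_gt)
            + w_gesture * F.cross_entropy(gestures.flatten(0, 1),
                                          gesture_gt.flatten())
            + w_level * F.cross_entropy(level, level_gt))
```

The reported metric is Spearman's rank correlation between predicted and ground-truth scores under the leave-one-user-out splits; given arrays of predictions and ground truth for a held-out user, `scipy.stats.spearmanr(pred, gt).correlation` computes it.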


Footnotes
1. Additional annotations for the JIGSAWS dataset can be accessed upon request.
Metadata
Title
Towards Accurate and Interpretable Surgical Skill Assessment: A Video-Based Method Incorporating Recognized Surgical Gestures and Skill Levels
Authors
Tianyu Wang
Yijie Wang
Mian Li
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-59716-0_64
