2021 | OriginalPaper | Chapter

Intelligent and Interactive Video Annotation for Instance Segmentation Using Siamese Neural Networks

Authors : Jan Schneegans, Maarten Bieshaar, Florian Heidecker, Bernhard Sick

Published in: Pattern Recognition. ICPR International Workshops and Challenges

Publisher: Springer International Publishing

Abstract

Training machine learning models in a supervised manner requires vast amounts of labeled data. These labels are typically provided by humans who manually annotate samples using a variety of tools. In this work, we propose an intelligent annotation tool that combines the fast and efficient labeling capabilities of modern machine learning models with the reliable and accurate, but slow, correction capabilities of human annotators. We present our approach to interactively condition a model on previously predicted and manually annotated or corrected instances, and we explore an iterative workflow that combines the advantages of the intelligent model and the human annotator for the task of instance segmentation in videos. In this workflow, the intelligent model conducts the bulk of the work, performing instance detection, tracking, and segmentation, while enabling the human annotator to selectively correct individual frames and instances. The proposed approach avoids the computational cost of online retraining by building on the one-shot learning paradigm: we use Siamese neural networks to transfer annotations from one video frame to another. Multiple interaction options regarding the choice of additional input data to the neural network, e.g., model predictions or manual corrections, are explored to refine the model's labeling performance and speed up the annotation process.
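The core idea — transferring an instance label from an annotated frame to a new frame by comparing embeddings from a shared network branch, without any retraining — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `embed` function stands in for the Siamese CNN branch (here a toy intensity histogram), and all patch data, instance ids, and function names are hypothetical.

```python
import numpy as np

def embed(patch: np.ndarray) -> np.ndarray:
    """Stand-in for the shared Siamese branch. In the paper this would be a
    deep CNN; a normalized intensity histogram serves as a toy embedding."""
    hist, _ = np.histogram(patch, bins=16, range=(0.0, 1.0), density=True)
    return hist / (np.linalg.norm(hist) + 1e-8)

def transfer_annotation(annotated: dict, new_patch: np.ndarray) -> str:
    """One-shot matching: assign the new patch the instance id of the most
    similar annotated reference patch (cosine similarity, no retraining)."""
    query = embed(new_patch)
    scores = {iid: float(embed(ref) @ query) for iid, ref in annotated.items()}
    return max(scores, key=scores.get)

# Hypothetical reference instances from an annotated frame.
rng = np.random.default_rng(0)
refs = {
    "pedestrian_1": rng.uniform(0.0, 0.3, size=(32, 32)),  # dark patch
    "vehicle_2": rng.uniform(0.7, 1.0, size=(32, 32)),     # bright patch
}
# A detection in the next frame: same dark appearance as the pedestrian.
query_patch = rng.uniform(0.0, 0.3, size=(32, 32))
print(transfer_annotation(refs, query_patch))  # pedestrian_1
```

In the actual system, the human annotator's corrections would replace or augment the reference patches in `refs`, conditioning subsequent predictions on corrected instances rather than on possibly erroneous model output.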


Metadata
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-68799-1_27
