Published in: The Journal of Supercomputing 14/2023

25-04-2023

Interactive object annotation based on one-click guidance

Authors: Yijin Xiong, Xin Gao, Guoying Zhang

Abstract

Manual annotation of datasets suffers from a heavy workload, uneven data quality, and a high professional threshold. Following the idea of semi-automatic annotation, this article explores interactive methods for obtaining accurate object annotations. We propose a human–machine interactive object annotation method based on one-click guidance: the annotator clicks a point close to the center of the object, and the prior information carried by this point guides the model. The advantages of our method are fourfold: (1) the simulated-click scheme is transferable and supports labeling across datasets; (2) clicks help eliminate irrelevant areas within the bounding box; (3) the operation is more convenient, since no box needs to be drawn manually and only the location information is required; (4) the method supports additional click annotations for further correction. To verify the effectiveness of the proposed method, we conducted extensive experiments on the KITTI and PASCAL VOC2012 datasets; the results show that our model improves average IoU by 18.1% and 14.6% over Anno-Mage and CVAT, respectively. Our method focuses on improving the accuracy and efficiency of annotation and offers a new direction for semi-automatic annotation.
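The abstract does not spell out how a center click is simulated or fed to the model, but the general recipe in click-based interactive annotation can be sketched as follows. This is a minimal illustration, not the authors' implementation: the jitter fraction, the Gaussian encoding of the click as an extra input channel, and all function names are assumptions, while the IoU metric used for evaluation is standard.

```python
import numpy as np

def simulate_click(box, jitter=0.1, rng=None):
    """Simulate a user click near the center of a ground-truth box.

    box: (x1, y1, x2, y2); jitter: max offset as a fraction of box size.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    # Perturb the exact center so training reflects imprecise human clicks.
    dx = rng.uniform(-jitter, jitter) * (x2 - x1)
    dy = rng.uniform(-jitter, jitter) * (y2 - y1)
    return cx + dx, cy + dy

def click_guidance_map(shape, click, sigma=10.0):
    """Encode a click as a 2D Gaussian heatmap that a network can
    consume as an additional input channel (a common guidance scheme)."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cx, cy = click
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes, the metric
    used in the paper's comparison against Anno-Mage and CVAT."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

At annotation time the guidance map would be stacked with the RGB image before the forward pass; a correction click simply adds another Gaussian peak to the guidance channel.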


Metadata
Title
Interactive object annotation based on one-click guidance
Authors
Yijin Xiong
Xin Gao
Guoying Zhang
Publication date
25-04-2023
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 14/2023
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-023-05279-z
