Training machine learning models in a supervised manner requires vast amounts of labeled data. These labels are typically provided by humans who manually annotate samples using a variety of tools. In this work, we propose an intelligent annotation tool that combines the fast and efficient labeling capabilities of modern machine learning models with the reliable and accurate, but slow, correction capabilities of human annotators. We present our approach to interactively conditioning a model on previously predicted and manually annotated or corrected instances, and we explore an iterative workflow that combines the advantages of the intelligent model and the human annotator for the task of instance segmentation in videos. In this workflow, the intelligent model conducts the bulk of the work, performing instance detection, tracking, and segmentation, and the human annotator selectively corrects individual frames and instances. The proposed approach builds on the one-shot learning paradigm and thereby avoids the computational cost of online retraining. For this purpose, we use Siamese neural networks to transfer annotations from one video frame to another. We explore multiple interaction options regarding the choice of additional input data to the neural network, e.g., model predictions or manual corrections, to refine the given model’s labeling performance and speed up the annotation process.
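The iterative workflow described above can be illustrated with a minimal sketch. It is not the implementation from this work: the names `annotate_video`, `copy_transfer`, and the toy `Mask` type are hypothetical, and the `transfer` callable only stands in for the Siamese network that transfers annotations from one frame to the next. The sketch assumes that each frame is annotated from its predecessor and that a manual correction, where one exists, replaces the prediction and becomes the reference for the following frames.

```python
from typing import Callable, Dict, List, Optional

# Toy instance mask (assumption): 2D grid of instance ids, 0 = background.
Mask = List[List[int]]


def annotate_video(
    frames: List[object],
    first_annotation: Mask,
    transfer: Callable[[object, Mask, object], Mask],
    corrections: Optional[Dict[int, Mask]] = None,
) -> List[Mask]:
    """Propagate annotations frame by frame; manual corrections override predictions."""
    corrections = corrections or {}
    masks = [corrections.get(0, first_annotation)]
    for t in range(1, len(frames)):
        if t in corrections:
            # Human-corrected frame: keep it and use it as the next reference.
            mask = corrections[t]
        else:
            # Otherwise condition the model on the previous frame and its mask.
            mask = transfer(frames[t - 1], masks[t - 1], frames[t])
        masks.append(mask)
    return masks


def copy_transfer(ref_frame: object, ref_mask: Mask, target_frame: object) -> Mask:
    """Placeholder for a learned one-shot model: copies the reference mask."""
    return [row[:] for row in ref_mask]


# Usage: four dummy frames, one initial annotation, one correction at frame 2.
frames = ["f0", "f1", "f2", "f3"]
initial = [[1, 0], [0, 0]]
fix = {2: [[1, 1], [0, 0]]}
masks = annotate_video(frames, initial, copy_transfer, fix)
```

In this toy run, frame 1 inherits the initial annotation, frame 2 takes the manual correction, and frame 3 is predicted from the corrected frame 2, so a single correction carries forward without any retraining of the model.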