Introduction
- Hypothesis 1 (H1): Keypoint learning can tackle the challenges typical of fetoscopic videos acquired during TTTS surgery and provide robust keypoints for mosaicking without relying on the segmentation of anatomical structures in the FoV.
- Hypothesis 2 (H2): Mosaicking performance can be boosted by filtering irrelevant keypoints using semantic information and rejecting inconsistent homographies.
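To make H2 concrete, the two filtering steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the method evaluated in this work: it assumes that the semantic information is available as a per-pixel boolean mask (`keep_mask`, a hypothetical name), and it uses a simple scale-based plausibility test with hypothetical thresholds (`min_scale`, `max_scale`) as a stand-in for the homography consistency criterion.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


def reject_keypoints(
    keypoints: Sequence[Point],
    keep_mask: Sequence[Sequence[bool]],
) -> List[Point]:
    """Keep only keypoints whose rounded location lies on a True mask cell.

    Assumption: keep_mask[row][col] is True where keypoints are relevant.
    Keypoints falling outside the mask are discarded as well.
    """
    height = len(keep_mask)
    width = len(keep_mask[0]) if height else 0
    kept = []
    for x, y in keypoints:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < height and 0 <= col < width and keep_mask[row][col]:
            kept.append((x, y))
    return kept


def is_plausible_homography(
    H: Sequence[Sequence[float]],
    min_scale: float = 0.5,  # hypothetical threshold
    max_scale: float = 2.0,  # hypothetical threshold
) -> bool:
    """Illustrative consistency test on a 3x3 homography.

    The determinant of the upper-left 2x2 block approximates the area
    scaling between two frames. Negative values (reflections) and
    extreme values are treated as inconsistent.
    """
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return min_scale <= det <= max_scale
```

In such a sketch, a frame pair whose homography fails the test would simply be skipped or re-estimated before being chained into the mosaic; the actual criterion is the one given in the “Inconsistent homography filtering” section.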
Contribution
Related work
Proposed method
SuperPoint: the keypoint proposal network
Keypoint proposal computation
KPN training
Video ID | Number of frames | Frame resolution [pixels] | Placenta position |
---|---|---|---|
1 | 400 | 470 \(\times \) 470 | Posterior |
2 | 300 | 540 \(\times \) 540 | Posterior |
3 | 150 | 550 \(\times \) 550 | Anterior |
4 | 200 | 640 \(\times \) 640 | Posterior |
5 | 200 | 720 \(\times \) 720 | Anterior |
6 | 200 | 720 \(\times \) 720 | Posterior |
Semantic keypoint rejection
Registration for mosaicking
Homography estimation
Inconsistent homography filtering
Experimental setup
Dataset
Experiment | KPN | Irrelevant keypoint rejection | Inconsistent homography filtering |
---|---|---|---|
E0 | X\(^*\) | | |
E1 | X | | |
E2 | X | X | |
Proposed | X | X | X |
Method | Video 1 | Video 2 | Video 3 |
---|---|---|---|
SIFT | \(0.662 \pm 0.115\) | \(0.732 \pm 0.120\) | \(0.749 \pm 0.279\) |
Bano et al. [8] | \(\mathbf {0.757 \pm 0.081}\) | \(\mathbf {0.788 \pm 0.050}\) | \(0.839 \pm 0.208\) |
Pre-trained SuperPoint (E0) | \(0.528 \pm 0.247\) | \(0.202 \pm 0.264\) | \(0.219 \pm 0.266\) |
Vanilla SuperPoint (E1) | \(0.731 \pm 0.116\) | \(0.740\pm 0.079\) | \(0.809 \pm 0.174\) |
Semantic KPN (E2) | \(0.730 \pm 0.112\) | \(0.743\pm 0.071\) | \(0.813 \pm 0.172\) |
Proposed | \(0.750 \pm 0.081\) | \(0.766 \pm 0.048\) | \(\mathbf {0.884 \pm 0.075}\) |

Method | Video 4 | Video 5 | Video 6 |
---|---|---|---|
SIFT | \(0.660 \pm 0.347\) | \(0.5164 \pm 0.402\) | \(0.485 \pm 0.389\) |
Bano et al. [8] | \(0.745 \pm 0.257\) | \(0.890 \pm 0.070\) | \(0.861 \pm 0.205\) |
Pre-trained SuperPoint (E0) | \(0.322 \pm 0.362\) | \(0.341 \pm 0.284\) | \(0.209 \pm 0.336\) |
Vanilla SuperPoint (E1) | \(0.801\pm 0.111\) | \(0.829\pm 0.091\) | \(0.817 \pm 0.076\) |
Semantic KPN (E2) | \(0.818 \pm 0.111\) | \(0.832 \pm 0.090\) | \(0.817 \pm 0.073\) |
Proposed | \(\mathbf {0.870 \pm 0.125}\) | \(\mathbf {0.897 \pm 0.012}\) | \(\mathbf {0.909 \pm 0.021}\) |
Implementation details
Performance metrics
Comparison with the literature and ablation study
- Experiment 0 (E0): SuperPoint pre-trained on the MS-COCO 2014 dataset, without any fine-tuning on fetoscopy data.
- Experiment 1 (E1): Vanilla KPN, as described in the “Keypoint proposal computation” section. Here, both irrelevant keypoint rejection and inconsistent homography filtering are excluded.
- Experiment 2 (E2): Semantic KPN, as described in the “SuperPoint: the keypoint proposal network” section. Hence, only inconsistent homography filtering is excluded.
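The ablation settings listed above can be summarized as a small configuration table in code. The following is only an illustrative sketch: the class name `PipelineConfig`, its field names, and the helper `enabled_stages` are hypothetical, and the `fine_tuned_kpn` flag is inferred from the description of E0 as the only experiment without fine-tuning on fetoscopy data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PipelineConfig:
    """Which optional stages are active in one experiment (hypothetical names)."""
    fine_tuned_kpn: bool        # False: SuperPoint pre-trained on MS-COCO 2014 only
    keypoint_rejection: bool    # irrelevant keypoint rejection
    homography_filtering: bool  # inconsistent homography filtering


EXPERIMENTS = {
    "E0": PipelineConfig(fine_tuned_kpn=False, keypoint_rejection=False,
                         homography_filtering=False),
    "E1": PipelineConfig(fine_tuned_kpn=True, keypoint_rejection=False,
                         homography_filtering=False),
    "E2": PipelineConfig(fine_tuned_kpn=True, keypoint_rejection=True,
                         homography_filtering=False),
    "Proposed": PipelineConfig(fine_tuned_kpn=True, keypoint_rejection=True,
                               homography_filtering=True),
}


def enabled_stages(name: str) -> List[str]:
    """Return the names of the optional stages switched on for an experiment."""
    cfg = EXPERIMENTS[name]
    stages = []
    if cfg.fine_tuned_kpn:
        stages.append("fine_tuned_kpn")
    if cfg.keypoint_rejection:
        stages.append("keypoint_rejection")
    if cfg.homography_filtering:
        stages.append("homography_filtering")
    return stages
```

Each experiment adds exactly one stage to the previous one, so the difference between consecutive rows of the results isolates the contribution of that stage.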
Results
Method | Video 1 | Video 2 | Video 3 |
---|---|---|---|
E2 | \(0.735 \pm 0.154\) | \(0.710 \pm 0.014\) | \(0.811 \pm 0.210\) |
Proposed | \(0.751 \pm 0.098\) | \(0.771 \pm 0.072\) | \(0.886 \pm 0.091\) |

Method | Video 4 | Video 5 | Video 6 |
---|---|---|---|
E2 | \(0.810 \pm 0.140\) | \(0.802 \pm 0.320\) | \(0.791 \pm 0.164\) |
Proposed | \(0.872 \pm 0.132\) | \(0.896 \pm 0.022\) | \(0.901 \pm 0.051\) |