Introduction
- We propose a stereo spatial decoupling network (TSDNets) that explores the spatial guidance relationships of an object along three directions of a medical image: horizontal, vertical, and depth. Because it screens features from multiple perspectives, the proposed attention is more accurate than the traditional one-way attention mechanism. In addition, the proposed attention filters features rather than fusing them directly, so it requires fewer parameter operations than traditional attention mechanisms [27, 33].
- We develop a cross-feature screening module (CFSM) that uses a two-gate threshold screening strategy to divide features into three types (important, secondary, and redundant) and then fuses them in a targeted way for deep feature fusion.
- We construct a semantic guided decoupling module (SGDM) that performs feature selection by setting different gate thresholds for shallow and deep features, thereby extracting more discriminative features.
Related work
TSDNets
Cross feature screening module (CFSM)
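As a rough illustration of the two-gate threshold idea behind CFSM, the sketch below splits a list of features into important, secondary, and redundant groups according to a per-feature score. It is a minimal sketch under stated assumptions, not the module used in TSDNets: the function name `two_gate_screen`, the use of one scalar score per feature, the comparison rule, and the default thresholds `t_high` and `t_low` are all hypothetical.

```python
def two_gate_screen(features, scores, t_high=0.6, t_low=0.3):
    """Partition features into three groups using two gate thresholds.

    Assumption (not from the paper): each feature has one scalar score in
    [0, 1]; scores >= t_high are "important", scores in [t_low, t_high) are
    "secondary", and scores < t_low are "redundant".
    """
    if not t_low < t_high:
        raise ValueError("t_low must be smaller than t_high")
    if len(features) != len(scores):
        raise ValueError("features and scores must have the same length")

    groups = {"important": [], "secondary": [], "redundant": []}
    for feature, score in zip(features, scores):
        if score >= t_high:
            groups["important"].append(feature)
        elif score >= t_low:
            groups["secondary"].append(feature)
        else:
            groups["redundant"].append(feature)
    return groups


# Illustrative usage with made-up feature names and scores.
example = two_gate_screen(["f0", "f1", "f2", "f3"], [0.9, 0.45, 0.1, 0.6])
```

The three groups can then be handled differently in a later fusion step; how TSDNets weights or combines them is not specified in this outline, so that step is left out of the sketch.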
Semantic guided decoupling module (SGDM)
Feature fusion
Feature visualization for CFSM and SGDM
Experiments and results
Datasets
Evaluation metrics
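The tables in this section report OA, AA, and Kappa. Assuming these carry their usual meanings (overall accuracy, average per-class accuracy, and Cohen's kappa coefficient), they can be computed from a confusion matrix as in the sketch below. The function name `classification_metrics` and the example matrix are illustrative and do not come from the paper.

```python
def classification_metrics(confusion):
    """Return (OA, AA, kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i that were
    predicted as class j.
    """
    n_classes = len(confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n_classes))
                for j in range(n_classes)]
    total = sum(row_sums)
    correct = sum(confusion[i][i] for i in range(n_classes))

    # Overall accuracy: fraction of all samples classified correctly.
    oa = correct / total

    # Average accuracy: mean per-class recall, skipping classes with no samples.
    recalls = [confusion[i][i] / row_sums[i]
               for i in range(n_classes) if row_sums[i] > 0]
    aa = sum(recalls) / len(recalls)

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_e = sum(row_sums[k] * col_sums[k] for k in range(n_classes)) / (total * total)
    kappa = 1.0 if p_e == 1 else (oa - p_e) / (1 - p_e)
    return oa, aa, kappa


# Made-up two-class example: 50 samples per class.
oa, aa, kappa = classification_metrics([[40, 10], [5, 45]])
```

On imbalanced data, AA and Kappa can differ markedly from OA, since OA is dominated by the largest classes while AA weights every class equally.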
Comparison to other state-of-the-art methods
Model | ChinaSet OA | ChinaSet AA | ChinaSet Kappa | COVID-19 OA | COVID-19 AA | COVID-19 Kappa | ISIC OA | ISIC AA | ISIC Kappa | Parameters (ChinaSet) | FLOPs (ChinaSet) |
---|---|---|---|---|---|---|---|---|---|---|---|
DenseNet-121 | 0.8863 | 0.8902 | 0.7730 | 0.9818 | 0.9822 | 0.9726 | 0.7540 | 0.6008 | 0.5059 | 7.2 M | 14.9 M |
ResNet-50 | 0.8566 | 0.8655 | 0.7122 | 0.9870 | 0.9874 | 0.9804 | 0.7605 | 0.6190 | 0.4929 | 23.5 M | 47.4 M |
VGG-16 | 0.8712 | 0.8730 | 0.7426 | 0.9779 | 0.9783 | 0.9668 | 0.7202 | 0.5146 | 0.3863 | 14.8 M | 29.6 M |
Xception | 0.8787 | 0.8870 | 0.7580 | 0.9844 | 0.9847 | 0.9765 | 0.7404 | 0.5044 | 0.4424 | 21.2 M | 42.8 M |
InceptionV3 | 0.8712 | 0.8750 | 0.7427 | 0.9831 | 0.9837 | 0.9746 | 0.7852 | 0.6053 | 0.5670 | 22.0 M | 44.2 M |
AlexNet | 0.8787 | 0.8870 | 0.7580 | 0.9792 | 0.9795 | 0.9687 | 0.7973 | 0.6533 | 0.5835 | 4.3 M | 8.7 M |
ReLSNet | 0.8787 | 0.8870 | 0.7580 | 0.9870 | 0.9874 | 0.9804 | 0.7888 | 0.6418 | 0.5878 | 18.6 M | 37.1 M |
SRC-MT | 0.8939 | 0.8892 | 0.7882 | 0.9870 | 0.9874 | 0.9804 | 0.7958 | 0.6795 | 0.5792 | 25.9 M | 52.1 M |
Transformer | 0.8712 | 0.8750 | 0.7427 | 0.9870 | 0.9874 | 0.9804 | 0.7864 | 0.6506 | 0.5824 | 16.8 M | 33.7 M |
Swin-Transformer | 0.8787 | 0.8870 | 0.7580 | 0.9883 | 0.9889 | 0.9824 | 0.8012 | 0.6931 | 0.5903 | 28.9 M | 57.4 M |
TSDNets | 0.9167 | 0.9238 | 0.8336 | 0.9883 | 0.9889 | 0.9824 | 0.8044 | 0.7144 | 0.6005 | 32.3 M | 65.2 M |
Ablation study
Effects of features
Dataset | Model | OA | AA | Kappa |
---|---|---|---|---|
COVID-19 | \(Model\_nof_{cg}\) | 0.9850 | 0.9844 | 0.9766 |
COVID-19 | \(Model\_nof_{1}\) | 0.9825 | 0.9818 | 0.9727 |
COVID-19 | \(Model\_nof_{2}\) | 0.9810 | 0.9805 | 0.9707 |
COVID-19 | \(Model\_nof_{3}\) | 0.9733 | 0.9727 | 0.9590 |
COVID-19 | TSDNets | 0.9889 | 0.9883 | 0.9824 |
ChinaSet | \(Model\_nof_{cg}\) | 0.9084 | 0.9015 | 0.8034 |
ChinaSet | \(Model\_nof_{1}\) | 0.8623 | 0.8560 | 0.7126 |
ChinaSet | \(Model\_nof_{2}\) | 0.8966 | 0.8864 | 0.7732 |
ChinaSet | \(Model\_nof_{3}\) | 0.8715 | 0.8636 | 0.7278 |
ChinaSet | TSDNets | 0.9238 | 0.9167 | 0.8336 |
ISIC | \(Model\_nof_{cg}\) | 0.7962 | 0.6850 | 0.5927 |
ISIC | \(Model\_nof_{1}\) | 0.7903 | 0.6605 | 0.5641 |
ISIC | \(Model\_nof_{2}\) | 0.7913 | 0.6492 | 0.5833 |
ISIC | \(Model\_nof_{3}\) | 0.7686 | 0.5553 | 0.5314 |
ISIC | TSDNets | 0.8044 | 0.7144 | 0.6006 |

Dataset | Thresholds | OA | AA | Kappa |
---|---|---|---|---|
COVID-19 | \(T_1=0.5\), \(T_2=0.3\), \(T_3=0.5\) | 0.9876 | 0.9870 | 0.9805 |
COVID-19 | \(T_1=0.6\), \(T_2=0.2\), \(T_3=0.5\) | 0.9813 | 0.9805 | 0.9707 |
COVID-19 | \(T_1=0.6\), \(T_2=0.2\), \(T_3=0.4\) | 0.9863 | 0.9857 | 0.9785 |
COVID-19 | \(T_1=0.6\), \(T_2=0.3\), \(T_3=0.6\) | 0.9848 | 0.9844 | 0.9766 |
COVID-19 | \(T_1=0.6\), \(T_2=0.4\), \(T_3=0.5\) | 0.9850 | 0.9844 | 0.9766 |
COVID-19 | \(T_1=0.7\), \(T_2=0.5\), \(T_3=0.5\) | 0.9812 | 0.9805 | 0.9707 |
ChinaSet | \(T_1=0.5\), \(T_2=0.3\), \(T_3=0.5\) | 0.8931 | 0.8864 | 0.7731 |
ChinaSet | \(T_1=0.6\), \(T_2=0.2\), \(T_3=0.5\) | 0.8804 | 0.8788 | 0.7573 |
ChinaSet | \(T_1=0.6\), \(T_2=0.2\), \(T_3=0.4\) | 0.8968 | 0.8939 | 0.7881 |
ChinaSet | \(T_1=0.6\), \(T_2=0.3\), \(T_3=0.6\) | 0.8968 | 0.8939 | 0.7881 |
ChinaSet | \(T_1=0.6\), \(T_2=0.4\), \(T_3=0.5\) | 0.8903 | 0.8864 | 0.7730 |
ChinaSet | \(T_1=0.7\), \(T_2=0.5\), \(T_3=0.5\) | 0.8799 | 0.8787 | 0.7577 |
ISIC | \(T_1=0.5\), \(T_2=0.3\), \(T_3=0.5\) | 0.6174 | 0.7757 | 0.5371 |
ISIC | \(T_1=0.6\), \(T_2=0.2\), \(T_3=0.5\) | 0.6134 | 0.7767 | 0.5563 |
ISIC | \(T_1=0.6\), \(T_2=0.2\), \(T_3=0.4\) | 0.6398 | 0.8034 | 0.5883 |
ISIC | \(T_1=0.6\), \(T_2=0.3\), \(T_3=0.6\) | 0.6533 | 0.8120 | 0.5933 |
ISIC | \(T_1=0.6\), \(T_2=0.4\), \(T_3=0.5\) | 0.6189 | 0.7857 | 0.5703 |
ISIC | \(T_1=0.7\), \(T_2=0.5\), \(T_3=0.5\) | 0.6573 | 0.7868 | 0.5512 |