Introduction
- Performance comparison of three classification methods: an SVM classifier using Oriented FAST and Rotated BRIEF (ORB) features, transfer learning with VGG16 and InceptionV3, and a capsule network trained from scratch.
- An analysis of the effects of data augmentation, network complexity, fine-tuned convolutional layers, and other overfitting-prevention mechanisms on the classification of a small chest X-ray dataset via transfer learning of CNNs.
Literature review
ORB and SVM application on medical image classification
CNN on medical image classification
Capsule neural network on medical image classification
Experimental design
Dataset
Class | Training dataset | Testing dataset
---|---|---
Normal | 1349 (25.7%) | 234 (37.5%)
Bacteria | 2538 (48.5%) | 242 (38.7%)
Virus | 1345 (25.7%) | 148 (23.7%)
Total | 5232 (100%) | 624 (100%)
Environment setup
Hardware
Software
Data augmentation design
Augmentation model | Augmentation parameters |
---|---|
Aug0 | No Aug |
Aug1 | Rotation range = 0.05, shear range = 0.05, zoom range = 0.05, horizontal flip = True, vertical flip = True
Aug2 | Rotation range = 3, width shift range = 0.05, height shift range = 0.05, shear range = 0.05, zoom range = 0.05, fill mode = 'constant', cval = 0.0, horizontal flip = True, vertical flip = True
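These parameter names map one-to-one onto Keras's ImageDataGenerator. A minimal sketch of the Aug2 pipeline follows; the rescaling, target size, and directory layout are assumptions, not part of the table:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Aug2 settings from the table above; Aug1 uses rotation_range=0.05 and
# drops the shift and fill-mode arguments.
aug2 = ImageDataGenerator(
    rotation_range=3,
    width_shift_range=0.05,
    height_shift_range=0.05,
    shear_range=0.05,
    zoom_range=0.05,
    fill_mode='constant',
    cval=0.0,
    horizontal_flip=True,
    vertical_flip=True,
    rescale=1.0 / 255,  # assumed normalization, not listed in the table
)

# Hypothetical directory layout: one sub-folder per class.
train_gen = aug2.flow_from_directory(
    'chest_xray/train', target_size=(224, 224),
    batch_size=32, class_mode='categorical')
```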
ORB and SVM experiment design
Transfer learning experiment design
Configuration | Classifier layers | VGG16 training parameters | InceptionV3 training parameters
---|---|---|---
Model1 | GAP → FC(4096) → FC(4096) → Softmax | 18,890,754 | 25,182,210
Model2 | GAP → Softmax | 1026 | 4098
Model3 | GAP → FC(512) → Dropout(0.5) → FC(256) → Dropout(0.5) → FC(128) → Dropout(0.5) → Softmax | 427,138 | 1,213,570
Model4 | GAP → FC(512) → Dropout(0.5) → Softmax | 263,682 | 1,050,114
Model5 | GAP → FC(512) → Dropout(0.5) → FC(512) → Dropout(0.5) → FC(256) → Dropout(0.5) → Softmax | 657,154 | 1,443,586
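As a concrete example, a minimal sketch of how Model3 attaches to a frozen VGG16 base in Keras is given below; the input size and optimizer are assumptions, but the trainable-parameter count reproduces the 427,138 listed above for a two-class softmax:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights='imagenet', include_top=False,
             input_shape=(224, 224, 3))
base.trainable = False  # only the new classifier head is trained

x = layers.GlobalAveragePooling2D()(base.output)   # GAP
for units in (512, 256, 128):                      # FC -> Dropout stack
    x = layers.Dense(units, activation='relu')(x)
    x = layers.Dropout(0.5)(x)
out = layers.Dense(2, activation='softmax')(x)     # two-class softmax

model3 = models.Model(base.input, out)
model3.compile(optimizer='adam', loss='categorical_crossentropy',
               metrics=['accuracy'])  # optimizer choice is an assumption
```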
Configuration | Description
---|---
ConvLayer Model1 | Best classification model with the last ConvLayer unfrozen
ConvLayer Model2 | Smaller classification model with the last ConvLayer unfrozen
ConvLayer Model3 | The better of the previous two models with the last two ConvLayers unfrozen
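Unfreezing is a per-layer flag in Keras. A sketch for the single-layer case, assuming the VGG16 base from the previous sketch ('block5_conv3' is the last convolutional layer in Keras's VGG16 naming):

```python
from tensorflow.keras.applications import VGG16

base = VGG16(weights='imagenet', include_top=False,
             input_shape=(224, 224, 3))

# Freeze everything, then unfreeze only the last convolutional layer;
# ConvLayer Model3 would add 'block5_conv2' to the set.
for layer in base.layers:
    layer.trainable = layer.name in {'block5_conv3'}
```

The model must be recompiled after changing the trainable flags for them to take effect.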
Test | Configuration
---|---
Test1 | Aug0 |
Test2 | Aug1 |
Test3 | Aug1 with 64 feature maps and 64 input size |
Test4 | Aug1 with 64 feature maps and 128 input size |
Test5 | Aug1 with 32 feature maps and 64 input size |
Test6 | Aug1 with 32 feature maps and 128 input size |
Test7 | Aug1 with 32 feature maps and 48 input size |
Test8 | Aug1 with 24 feature maps and 64 input size |
Test9 | Aug1 with 16 feature maps and 64 input size |
Test10 | Aug1 with 32 feature maps, 64 input size and half primary capsule dimension (4)
Test11 | Aug1 with 32 feature maps, 64 input size, half primary capsule dimension (4) and half capsule channels (16)
Test12 | Aug1 with 32 feature maps, 64 input size, half primary capsule dimension (4) and one-quarter capsule channels (8)
Test13 | Aug1 with 24 feature maps, 64 input size, half primary capsule dimension (4) and half capsule channels (16)
Test14 | Aug1 with 32 feature maps, 64 input size, half primary capsule dimension (4) and half capsule channels (16), with more images by augmentation (10,000)
Capsule neural network design
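The knobs varied in Tests 3-14 (Conv1 feature maps, input resolution, primary-capsule dimension, and capsule channels) map onto a handful of constructor arguments in a Sabour-style CapsNet front end. A minimal sketch, with kernel sizes and strides taken from the original CapsNet as assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

def squash(s, axis=-1, eps=1e-7):
    # Squashing non-linearity (Sabour et al., 2017): short vectors shrink
    # toward zero, long vectors approach unit length.
    sq_norm = tf.reduce_sum(tf.square(s), axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / tf.sqrt(sq_norm + eps)

def primary_caps_frontend(input_size=64, feature_maps=32,
                          caps_dim=8, caps_channels=32):
    # The four knobs varied in Tests 3-14; e.g. Test 11 corresponds to
    # feature_maps=32, input_size=64, caps_dim=4, caps_channels=16.
    inp = layers.Input(shape=(input_size, input_size, 1))
    x = layers.Conv2D(feature_maps, 9, activation='relu')(inp)    # Conv1
    x = layers.Conv2D(caps_channels * caps_dim, 9, strides=2)(x)  # PrimaryCaps
    x = layers.Reshape((-1, caps_dim))(x)  # group activations into capsules
    x = layers.Lambda(squash)(x)           # capsule-wise squash
    return tf.keras.Model(inp, x)  # DigitCaps and routing would follow
```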
Experimental results
Aug algorithms | Total training images | VGG16 accuracy |
---|---|---|
Aug0 | 5232 | 0.882 |
Aug1 | 5232 | 0.898 |
Aug2 | 5232 | 0.895 |
Aug1 | 10,000 | 0.902 |
Aug2 | 10,000 | 0.879 |
ORB and SVM classification
Augmentation | Accuracy |
---|---|
No Aug | 0.74 |
Aug 20,000 images | 0.776 |
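A sketch of the ORB-plus-SVM pipeline, assuming a bag-of-visual-words encoding; the vocabulary size, keypoint budget, and RBF kernel are assumptions and may differ from the configuration used here:

```python
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.svm import SVC

K = 256                              # assumed visual-vocabulary size
orb = cv2.ORB_create(nfeatures=500)  # assumed keypoint budget per image

def orb_descriptors(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = orb.detectAndCompute(img, None)
    return desc  # may be None for near-textureless images

def bovw_histogram(desc, vocab):
    # Encode one image as a normalized histogram of visual words.
    hist = np.zeros(K, dtype=np.float32)
    if desc is not None:
        words = vocab.predict(desc.astype(np.float32))
        np.add.at(hist, words, 1.0)
        hist /= hist.sum()
    return hist

def fit(train_paths, labels):
    descs = [d for p in train_paths
             if (d := orb_descriptors(p)) is not None]
    vocab = MiniBatchKMeans(n_clusters=K).fit(
        np.vstack(descs).astype(np.float32))
    X = np.array([bovw_histogram(orb_descriptors(p), vocab)
                  for p in train_paths])
    return vocab, SVC(kernel='rbf').fit(X, labels)
```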
Transfer learning classification
Model | VGG16 accuracy | InceptionV3 accuracy
---|---|---
Model1 | 0.881 | 0.629 |
Model2 | 0.631 | 0.818 |
Model3 | 0.898 | 0.875 |
Model4 | 0.873 | 0.857 |
Model5 | 0.885 | 0.869 |
Model | VGG16 accuracy
---|---|
Model3 with last unfrozen ConvLayer | 0.883 |
Model2 with last unfrozen ConvLayer | 0.924 |
Model2 with last two unfrozen ConvLayers | 0.9
Configuration | VGG16 accuracy
---|---|
Model2 with last unfrozen ConvLayer, lr 0.0009 and lr decay 0.8 | 0.9 |
Model2 with last unfrozen ConvLayer, lr 0.001 and lr decay 0.5 | 0.873 |
Model2 with last unfrozen ConvLayer and 20,000 augmented images | 0.871
Model2 with last unfrozen ConvLayer, lr 0.0005, lr decay 0.5 and 10,000 augmented images | 0.902
Model3 with last unfrozen ConvLayer drop rate 0.7 | 0.885 |
Model3 with drop rate 0.7 | 0.906 |
Model3 with drop rate 0.7 and 20,000 augmented images | 0.922
Model3 with drop rate 0.7 and 30,000 augmented images | 0.922
Model2 with last unfrozen ConvLayer, batch normalization layer, drop rate 0.5 | 0.912
Model2 with last unfrozen ConvLayer, batch normalization layer, drop rate 0.5 and 20,000 augmented images | 0.906
Model2 with last unfrozen ConvLayer, batch normalization layer, dropout 0.7, FC layer, dropout 0.5 | 0.916
Model2 with last unfrozen ConvLayer, batch normalization layer, dropout 0.7, FC layer, dropout 0.5 and 20,000 augmented images | 0.875
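The strongest rows pair the unfrozen last ConvLayer with batch normalization and heavier dropout in the head. A sketch of the "Model2 with last unfrozen ConvLayer, batch normalization layer, dropout 0.7, FC layer, dropout 0.5" configuration; the layer ordering and FC width are my reading of the row, not confirmed by the source:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights='imagenet', include_top=False,
             input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = (layer.name == 'block5_conv3')  # last ConvLayer only

x = layers.GlobalAveragePooling2D()(base.output)
x = layers.BatchNormalization()(x)              # "batch normalization layer"
x = layers.Dropout(0.7)(x)                      # "dropout 0.7"
x = layers.Dense(512, activation='relu')(x)     # "FC layer" (width assumed)
x = layers.Dropout(0.5)(x)                      # "dropout 0.5"
out = layers.Dense(2, activation='softmax')(x)
model = models.Model(base.input, out)
```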
Capsule neural network
No. | Configuration | CapsNet accuracy
---|---|---|
1 | Aug0 | 0.748 |
2 | Aug1 | 0.788 |
3 | Aug1 with 64 feature maps and 64 input size | 0.737 |
4 | Aug1 with 64 feature maps and 128 input size | 0.627 |
5 | Aug1 with 32 feature maps and 64 input size | 0.798 |
6 | Aug1 with 32 feature maps and 128 input size | 0.784 |
7 | Aug1 with 32 feature maps and 48 input size | 0.756 |
8 | Aug1 with 24 feature maps and 64 input size | 0.798 |
9 | Aug1 with 16 feature maps and 64 input size | 0.765 |
10 | Aug1 with 32 feature maps, 64 input size and half primary capsule dimension (4) | 0.811
11 | Aug1 with 32 feature maps, 64 input size, half primary capsule dimension (4) and half capsule channels (16) | 0.825
12 | Aug1 with 32 feature maps, 64 input size, half primary capsule dimension (4) and one-quarter capsule channels (8) | 0.752
13 | Aug1 with 24 feature maps, 64 input size, half primary capsule dimension (4) and half capsule channels (16) | 0.825
14 | Aug1 with 32 feature maps, 64 input size, half primary capsule dimension (4) and half capsule channels (16), with more images by augmentation (10,000) | 0.788
Verification on the OCT dataset
No. | Model | Accuracy |
---|---|---|
1 | Model2 with last unfrozen ConvLayer | 0.934
2 | Model3 | 0.828
3 | InceptionV3 Model2 | 0.791
4 | Model2 with last two unfrozen ConvLayers | 0.921
5 | VGG16 with last unfrozen ConvLayer → FC(4096) → Dropout(0.7) → FC(2048) → Dropout(0.5) | 0.954
6 | VGG16 with last unfrozen ConvLayer → FC(4096) → Dropout(0.7) → FC(2048) → Dropout(0.7) → FC(2048) → Dropout(0.5) | 0.937
7 | VGG16 with last unfrozen ConvLayer → FC(4096) → Dropout(0.8) → FC(2048) → Dropout(0.7) | 0.938
Discussion
The effects of data augmentation
Model | Accuracy without augmentation | Accuracy with augmentation |
---|---|---|
ORB and SVM | 0.74 | 0.776 |
VGG16 | 0.883 | 0.923 |
InceptionV3 | 0.844 | 0.875
CapsNet | 0.774 | 0.856
The findings on fine-tuning of transfer learning
Learning rate | Decay rate | Training images | Dropout 1 | Dropout 2 | BN layer | VGG16 accuracy
---|---|---|---|---|---|---|
0.001 | 0.9 | 20,000 | 0.5 | NA | No | 0.871
0.001 | 0.5 | 5232 | 0.5 | NA | No | 0.873
0.001 | 0.9 | 20,000 | 0.7 | 0.5 | Yes | 0.875
0.0009 | 0.8 | 5232 | 0.5 | NA | No | 0.9
0.0005 | 0.5 | 10,000 | 0.5 | NA | No | 0.902
0.001 | 0.9 | 20,000 | 0.7 | NA | Yes | 0.906
0.001 | 0.9 | 5232 | 0.5 | NA | Yes | 0.912
0.001 | 0.9 | 5232 | 0.7 | 0.5 | Yes | 0.916
0.001 | 0.9 | 5232 | 0.5 | NA | No | 0.924
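Reading "decay rate" as a per-epoch multiplicative factor (an assumption; the source does not spell out the schedule), the sweep above can be reproduced with a one-line Keras callback:

```python
import tensorflow as tf

# Start at lr = 0.001 and multiply by the decay rate (here 0.9) each epoch.
schedule = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch, lr: 0.001 * (0.9 ** epoch))

# model.fit(train_gen, epochs=20, callbacks=[schedule])
```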
Model | Training images | Dropout 1 | VGG16 accuracy
---|---|---|---|
Model3 with last unfrozen ConvLayer | 5232 | 0.7 | 0.885
Model3 | 5232 | 0.5 | 0.898
Model3 | 5232 | 0.7 | 0.906
Model3 | 20,000 | 0.7 | 0.922 |
Model3 | 30,000 | 0.7 | 0.922 |
The findings on the capsule network
Horizontal comparison
Model | Accuracy (normal vs pneumonia) | Specificity | Recall | Accuracy (bacteria vs virus) | Specificity | Recall
---|---|---|---|---|---|---
Baseline | 0.776 | 0.809 | 0.776 | 0.643 | 0.64 | 0.585 |
VGG16 [34]a | 0.923 | 0.926 | 0.923 | 0.923 | 0.909 | 0.85 |
VGG16 [38]b | 0.938 | 0.944 | 0.938 | 0.915 | 0.917 | 0.879 |
Inception V3 | 0.869 | 0.854 | 0.869 | 0.851 | 0.86 | 0.779 |
CapsNet | 0.824 | 0.846 | 0.824 | 0.862 | 0.875 | 0.785 |
State-of-the-art [3]c | 0.928 | 0.901 | 0.932 | 0.907 | 0.909 | 0.886