1 Introduction
2 The depthwise separable convolutions
3 The proposed method
Layers | Out size | K size | Stride | N | Out channels (Method 1) | Out channels (Method 2) |
---|---|---|---|---|---|---|
Input | 32 × 32 | | | | | |
Unit 1 | 16 × 16 | 3 × 3, 1 × 1 | 1 | 32 | 32 | 64 |
Unit 2 | 8 × 8 | 3 × 3, 1 × 1 | 1 | 64 | 64 | 128 |
Unit 3 | 4 × 4 | 3 × 3, 1 × 1 | 1 | 128 | 128 | 256 |
Fc1 | 384 | | | | | |
Fc2 | 192 | | | | | |
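
As a quick consistency check on the network structure table, the sketch below traces the spatial size and channel count of each unit. It is illustrative only: the names `Unit`, `check_halving`, and `flattened_features` are hypothetical, and two points are assumptions that the table does not state. First, each unit is assumed to halve the spatial size through some downsampling step (the listed convolution stride is 1). Second, the Unit 3 output is assumed to be flattened before Fc1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    name: str
    out_size: int      # spatial side length listed in the table
    n: int             # the table's N column
    out_method1: int   # output channels for Method 1
    out_method2: int   # output channels for Method 2


# Rows copied from the network structure table.
UNITS = [
    Unit("Unit 1", 16, 32, 32, 64),
    Unit("Unit 2", 8, 64, 64, 128),
    Unit("Unit 3", 4, 128, 128, 256),
]

INPUT_SIZE = 32  # 32 x 32 input


def check_halving(input_size, units):
    """Return True if every unit outputs half the side length of its input."""
    size = input_size
    for unit in units:
        if unit.out_size * 2 != size:
            return False
        size = unit.out_size
    return True


def flattened_features(unit, method):
    """Feature count if the unit output were flattened (assumption: feeds Fc1)."""
    channels = unit.out_method1 if method == 1 else unit.out_method2
    return unit.out_size * unit.out_size * channels


if __name__ == "__main__":
    print(check_halving(INPUT_SIZE, UNITS))   # True
    print(flattened_features(UNITS[-1], 1))   # 4 * 4 * 128 = 2048
    print(flattened_features(UNITS[-1], 2))   # 4 * 4 * 256 = 4096
```

Under these assumptions, Method 2 doubles the channel count at every unit relative to Method 1 while the spatial sizes stay the same.
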

Network structure | GoogLeNet | AlexNet | MobileNet | Method 1 and Method 2 |
---|---|---|---|---|
Number of training parameters | N + N + N*9 + N + N*5*5 + N | N*3*3 | N*1*1 + 3*3 | 3*3*3 + 2*N*1*1 |
If N = 32 | 1216 | 288 | 41 | 91 |
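
The per-unit formulas in the parameter table can be evaluated directly. The sketch below does so; the function names `params_per_unit` and `total_params` are hypothetical. Summing the per-unit counts over several units is an assumption about how the totals in the experiment tables were obtained, not something the table states.

```python
def params_per_unit(n):
    """Evaluate the parameter formulas from the table for a given N."""
    return {
        "GoogLeNet": n + n + n * 9 + n + n * 5 * 5 + n,
        "AlexNet": n * 3 * 3,
        "MobileNet": n * 1 * 1 + 3 * 3,
        "Methods 1 and 2": 3 * 3 * 3 + 2 * n * 1 * 1,
    }


def total_params(ns):
    """Sum the per-unit counts over a list of N values (assumed aggregation)."""
    totals = {}
    for n in ns:
        for name, count in params_per_unit(n).items():
            totals[name] = totals.get(name, 0) + count
    return totals


if __name__ == "__main__":
    print(params_per_unit(32))
    print(total_params([32, 64, 128]))
    print(total_params([32, 64]))
```

For N = 32 this returns 1216, 288, 41, and 91, matching the "If N = 32" row. Summed over N = 32, 64, 128, it gives 2016 (AlexNet), 251 (MobileNet), and 529 (proposed methods), which agree with the counts reported for CIFAR-10, SVHN, and Tiny ImageNet. Summed over N = 32, 64, it gives 864, 114, and 246, which agree with the MNIST table. The same sum does not reproduce the reported GoogLeNet total of 6640, so that figure presumably follows a different accounting.
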
4 Experiments
4.1 Experiments on MNIST
Iterations: 5000; batch_size: 128

Methods | Acc (%) | Number of training parameters |
---|---|---|
GoogLeNet | 97.98 | 6640 |
AlexNet | 97.80 | 864 |
MobileNet | 94.59 | 114 |
Our method 1 | 98.99 | 246 |
Our method 2 | 99.03 | 246 |
4.2 Experiments on CIFAR-10
Iterations: 5000; batch_size: 128

Method | Acc (%) | Number of training parameters |
---|---|---|
GoogLeNet | 76.5 | 6640 |
AlexNet | 75.7 | 2016 |
MobileNet | 65.6 | 251 |
Our method 1 | 72.4 | 529 |
Our method 2 | 71.1 | 529 |
4.3 Experiments on the SVHN
Iterations: 10000; batch_size: 128

Method | Acc (%) | Number of training parameters |
---|---|---|
GoogLeNet | 92.3 | 6640 |
AlexNet | 92.0 | 2016 |
MobileNet | 90.8 | 251 |
Our method 1 | 91.6 | 529 |
Our method 2 | 91.3 | 529 |
4.4 Experiments on Tiny ImageNet
Iterations: 42000; batch_size: 256

Method | Acc (%) | Number of training parameters |
---|---|---|
GoogLeNet | 48.26 | 6640 |
AlexNet | 44.25 | 2016 |
MobileNet | 34.53 | 251 |
Our method 1 | 43.19 | 529 |
Our method 2 | 41.80 | 529 |