Introduction
Related work
Criterion-based pruning
NAS-based pruning
Attention mechanism
Methodology
Approach overview
Attention mechanism
Network pruning policy
Training objectives
Architectural design
Extension to multi-model pruning
Experiments
Experimental settings
Databases
Performance metrics
Network architecture
Training setting

Depth | Method | Acc. (%) | Acc.\(\downarrow \) (%) | FLOPs\(\downarrow \) (%) |
---|---|---|---|---|
56 | DCP-WOL | 93.12 | 0.47 | 53.2 |
56 | DCP-L2-norm | 93.28 | 0.31 | 53.4 |
56 | DCP-L1-norm | 93.34 | 0.25 | 53.1 |
56 | DCP-A | 93.56 | 0.03 | 53.9 |

Depth | Method | Acc. (%) | Acc.\(\downarrow \) (%) | FLOPs\(\downarrow \) (%) |
---|---|---|---|---|
32 | SFP [13] | 92.08 | 0.55 | 41.5 |
32 | MFIS [4] | 92.45 | 0.18 | 41.5 |
32 | Ours | 92.43 | 0.20 | 47.0 |
32 | TAS [9] | 93.16 | 0.73 | 49.4 |
32 | LFPC [12] | 92.12 | 0.51 | 52.6 |
32 | FPGM [14] | 92.82 | \(-\)0.19 | 53.2 |
32 | MFIS [4] | 92.14 | 0.49 | 53.2 |
32 | Ours | 92.37 | 0.26 | 55.4 |
56 | PFEC [21] | 91.31 | 1.75 | 27.6 |
56 | Ours | 93.58 | 0.01 | 44.7 |
56 | CS [15] | 93.31 | 0.40 | 50.0 |
56 | SFP [13] | 92.26 | 1.33 | 52.6 |
56 | FPGM [14] | 92.89 | 0.70 | 52.6 |
56 | MFIS [4] | 93.27 | 0.32 | 52.6 |
56 | TAS [9] | 93.69 | 0.77 | 52.7 |
56 | LFPC [12] | 93.34 | 0.25 | 52.9 |
56 | Ours | 93.56 | 0.03 | 53.9 |
110 | PFEC [21] | 92.94 | 0.61 | 38.6 |
110 | SFP [13] | 93.38 | 0.30 | 40.8 |
110 | Ours | 94.20 | \(-\)0.52 | 42.7 |
110 | FPGM [14] | 93.85 | \(-\)0.17 | 52.3 |
110 | MFIS [4] | 94.01 | \(-\)0.33 | 52.3 |
110 | TAS [9] | 94.33 | 0.64 | 53.0 |
110 | LFPC [12] | 93.79 | \(-\)0.11 | 60.3 |
110 | Ours | 94.24 | \(-\)0.56 | 55.5 |

Model | FLOPs\(\downarrow \) (%) | Semantic segmentation mIoU (higher is better) | Semantic segmentation Pixel Acc (higher is better) | \(\Delta \mathcal {T}\) (%) (higher is better) |
---|---|---|---|---|
Uniform baseline | – | 26.6 | 57.9 | – |
Uniform baseline | 49.7 | 25.0 | 57.0 | \(-\)3.8 |
Ours | 50.7 | 27.1 | 58.6 | +1.5 |
Uniform baseline | 60.2 | 26.0 | 59.0 | \(-\)0.2 |
Ours | 60.2 | 26.6 | 58.4 | +0.4 |
Uniform baseline | 69.7 | 25.7 | 57.3 | \(-\)2.2 |
Ours | 70.0 | 26.2 | 58.7 | \(-\)0.1 |
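
The \(\Delta \mathcal {T}\) values in the segmentation table are consistent with the mean relative change of mIoU and Pixel Acc with respect to the unpruned row. The sketch below checks that reading on two rows. The definition is an assumption inferred from the reported numbers, not one stated here, and the helper name `delta_t` is hypothetical.

```python
def delta_t(baseline: dict, pruned: dict) -> float:
    """Mean relative change (%) over all metrics, each assumed higher-is-better.

    Assumed definition, inferred from the table values.
    """
    changes = [
        (pruned[name] - baseline[name]) / baseline[name] * 100.0
        for name in baseline
    ]
    return sum(changes) / len(changes)


# Values copied from the table: unpruned row, then the two rows near 50% FLOPs reduction.
unpruned = {"mIoU": 26.6, "pixel_acc": 57.9}
ours_50 = {"mIoU": 27.1, "pixel_acc": 58.6}
uniform_50 = {"mIoU": 25.0, "pixel_acc": 57.0}

ours_delta = round(delta_t(unpruned, ours_50), 1)
uniform_delta = round(delta_t(unpruned, uniform_50), 1)
print(ours_delta, uniform_delta)
```

Under this assumption the two rows reproduce the reported +1.5 and \(-\)3.8.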

Depth | Method | Baseline Top-1 Acc. (%) | Baseline Top-5 Acc. (%) | Pruned Top-1 Acc. (%) | Pruned Top-5 Acc. (%) | Top-1 Acc.\(\downarrow \) (%) | Top-5 Acc.\(\downarrow \) (%) | FLOPs\(\downarrow \) (%) |
---|---|---|---|---|---|---|---|---|
50 | SFP [13] | 76.15 | 92.87 | 62.14 | 84.60 | 14.01 | 8.27 | 41.8 |
50 | LSTM-SEP [6] | 76.12 | 93.00 | – | – | 0.90 | 0.27 | 43.0 |
50 | TAS [9] | – | – | 76.20 | 93.07 | 1.26 | 0.48 | 43.5 |
50 | Ours | 76.15 | 92.87 | 75.66 | 92.51 | 0.49 | 0.36 | 50.9 |
50 | MetaPruning [30] | 76.60 | – | 75.40 | – | 1.20 | – | 51.2 |
50 | CS [15] | 76.13 | – | 75.56 | – | 0.56 | 0.36 | 51.3 |
50 | FPGM [14] | 76.15 | 92.87 | 74.83 | 92.32 | 1.32 | 0.55 | 53.5 |
50 | MFIS [4] | 76.15 | 92.87 | 75.23 | 92.50 | 0.92 | 0.37 | 53.5 |
50 | LFPC [12] | 76.15 | 92.87 | 74.46 | 92.04 | 1.69 | 0.83 | 60.8 |
50 | Ours | 76.15 | 92.87 | 74.61 | 92.18 | 1.54 | 0.69 | 60.9 |
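
For rows that report both baseline and pruned accuracy, the Top-1 and Top-5 drop columns of the depth-50 table match baseline minus pruned accuracy in percentage points. A minimal sketch of that arithmetic follows. It assumes this definition, which is read off the numbers rather than stated here, and the helper name `accuracy_drop` is hypothetical.

```python
def accuracy_drop(baseline_acc: float, pruned_acc: float) -> float:
    """Accuracy drop in percentage points; a negative value means a gain."""
    # Round to two decimals to match the precision used in the table.
    return round(baseline_acc - pruned_acc, 2)


# "Ours" row at 50.9% FLOPs reduction, values copied from the table.
top1_drop = accuracy_drop(76.15, 75.66)
top5_drop = accuracy_drop(92.87, 92.51)
print(top1_drop, top5_drop)
```

The same sign convention would explain the negative entries in the CIFAR tables, where the pruned model is more accurate than its baseline.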

Depth | Method | CIFAR-10 Acc. (%) | CIFAR-100 Acc. (%) | CIFAR-10 Acc.\(\downarrow \) (%) | CIFAR-100 Acc.\(\downarrow \) (%) | ALL Acc.\(\downarrow \) (%) | CIFAR-10 FLOPs\(\downarrow \) (%) | CIFAR-100 FLOPs\(\downarrow \) (%) |
---|---|---|---|---|---|---|---|---|
32 | FPGM [14] | 92.75 | – | 0.43 | – | – | 41.5 | – |
32 | FPGM [14] | – | 69.44 | – | 0.84 | – | – | 41.5 |
32 | MFIS [4] | 92.96 | 70.12 | 0.22 | 0.16 | 0.19 | 41.5 | 41.5 |
32 | Ours | 91.02 | 75.47 | 1.61 | \(-\)5.19 | \(-\)1.79 | 42.6 | 42.1 |
56 | FPGM [14] | 93.55 | – | 0.21 | – | – | 52.6 | – |
56 | FPGM [14] | – | 70.51 | – | 1.28 | – | – | 52.6 |
56 | MFIS [4] | 93.60 | 71.16 | 0.16 | 0.63 | 0.40 | 52.6 | 52.6 |
56 | Ours | 91.67 | 76.78 | 1.92 | \(-\)4.52 | \(-\)1.30 | 54.4 | 54.6 |
110 | FPGM [14] | 94.22 | – | \(-\)0.17 | – | – | 52.3 | – |
110 | FPGM [14] | – | 72.80 | – | 1.08 | – | – | 52.3 |
110 | MFIS [4] | 94.22 | 73.04 | \(-\)0.17 | 0.84 | 0.34 | 52.3 | 52.3 |
110 | Ours | 92.50 | 76.48 | 1.18 | \(-\)2.60 | \(-\)0.71 | 55.9 | 51.5 |
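
The ALL column in the table above agrees with the arithmetic mean of the CIFAR-10 and CIFAR-100 accuracy drops. The short check below illustrates this on the depth-32 rows. The averaging rule is an assumption inferred from the reported values, and `combined_drop` is a hypothetical helper name.

```python
def combined_drop(drop_cifar10: float, drop_cifar100: float) -> float:
    """Mean of the two per-dataset accuracy drops (assumed definition of ALL)."""
    return round((drop_cifar10 + drop_cifar100) / 2, 2)


# Depth-32 rows copied from the table; negative drops are accuracy gains.
ours_all = combined_drop(1.61, -5.19)
mfis_all = combined_drop(0.22, 0.16)
print(ours_all, mfis_all)
```

Read this way, a negative ALL entry means that the gain on one dataset outweighs the loss on the other.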