Published in: Neural Processing Letters 2/2020

13.11.2019

Parameters Sharing in Residual Neural Networks

Authors: Dawei Dai, Liping Yu, Hui Wei


Abstract

Deep neural networks (DNN) have achieved great success in machine learning due to their powerful ability to learn and represent knowledge. However, such models often have massive numbers of trainable parameters, which impose a heavy resource burden in practice. Reducing the number of parameters while preserving competitive performance is therefore a critical task in the field of DNN. In this paper, we focus on convolutional neural networks that contain many repeated, identically structured convolutional layers. Residual networks and their variants are widely used because they make deeper models easier to train. A basic block of such a model contains two convolutional layers and therefore commonly has two layers of trainable parameters. We instead used only one layer of trainable parameters per block, so that the two convolutional layers in a block shared a single set of trainable parameters. We performed extensive experiments on different architectures of the Residual Net with trainable-parameter sharing on the CIFAR-10, CIFAR-100, and ImageNet datasets. We found that the models with shared parameters obtained lower error on the training datasets and recognition accuracy very close to that of the original models (within 0.5%), while their parameter counts were reduced by more than one third compared to the originals.
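The paper does not include code; the following is an illustrative sketch, assuming a PyTorch-style implementation of a basic two-convolution residual block with an identity shortcut. The class name `SharedResidualBlock` and the exact placement of batch normalization are assumptions for illustration, not the authors' implementation. The key idea from the abstract is that one weight tensor is registered once and applied by both convolutions, roughly halving the convolutional parameters of the block:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedResidualBlock(nn.Module):
    """Residual block whose two 3x3 convolutions share one weight tensor (illustrative)."""

    def __init__(self, channels: int):
        super().__init__()
        # A single trainable convolution kernel, reused by both conv layers.
        self.weight = nn.Parameter(torch.empty(channels, channels, 3, 3))
        nn.init.kaiming_normal_(self.weight)
        # Separate BatchNorm layers, so each position keeps its own statistics.
        self.bn1 = nn.BatchNorm2d(channels)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Both convolutions apply the same self.weight tensor.
        out = F.relu(self.bn1(F.conv2d(x, self.weight, padding=1)))
        out = self.bn2(F.conv2d(out, self.weight, padding=1))
        return F.relu(out + x)  # identity shortcut

block = SharedResidualBlock(16)
# One 16x16x3x3 kernel plus two BatchNorms (weight + bias each):
n_params = sum(p.numel() for p in block.parameters())
y = block(torch.randn(1, 16, 8, 8))
```

An unshared block of the same shape would carry two such kernels (2 x 2304 convolutional weights instead of 2304), which is consistent with the abstract's report of a parameter reduction of more than one third at the model level once the unshared stem and classifier layers are counted.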


Metadata
Title
Parameters Sharing in Residual Neural Networks
Authors
Dawei Dai
Liping Yu
Hui Wei
Publication date
13.11.2019
Publisher
Springer US
Published in
Neural Processing Letters / Issue 2/2020
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-019-10143-4
