
Excitation Dropout: Encouraging Plasticity in Deep Neural Networks

Authors: Andrea Zunino, Sarah Adel Bargal, Pietro Morerio, Jianming Zhang, Stan Sclaroff, Vittorio Murino

Published in: International Journal of Computer Vision | Issue 4/2021


Abstract

We propose a guided dropout regularizer for deep networks based on the evidence of a network prediction, defined as the firing of neurons in specific paths. In this work, we utilize the evidence at each neuron to determine the probability of dropout, rather than dropping out neurons uniformly at random as in standard dropout. In essence, we drop out with higher probability those neurons which contribute more to decision making at training time. This approach penalizes high-saliency neurons that are most relevant for model prediction, i.e., those having stronger evidence. By dropping such high-saliency neurons, the network is forced to learn alternative paths in order to maintain loss minimization, resulting in plasticity-like behavior, a characteristic also observed in human brains. We demonstrate better generalization ability, increased utilization of network neurons, and higher resilience to network compression using several metrics over four image/video recognition benchmarks.
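To make the mechanism concrete, the snippet below is a minimal, illustrative sketch of evidence-guided dropout for one layer. It assumes a generic non-negative per-neuron evidence score (e.g., obtained from Excitation Backprop or another attribution method); the mapping from evidence to dropout probability used here (`base_rate * n * p_ev`, clipped to [0, 1]) is a stand-in, not necessarily the exact formulation of the paper.

```python
import numpy as np

def excitation_guided_dropout(activations, evidence, base_rate=0.5, rng=None):
    """Illustrative evidence-guided dropout for one layer (sketch, not the
    authors' exact formulation).

    activations: 1-D array of neuron outputs for a single sample.
    evidence:    non-negative per-neuron evidence/saliency scores.
    base_rate:   average dropout probability the mask should roughly match.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = activations.shape[0]
    # Normalize evidence to a distribution over the layer's neurons.
    p_ev = evidence / (evidence.sum() + 1e-12)
    # Dropout probability grows with evidence: neurons carrying more of the
    # prediction's evidence are dropped more often.
    p_drop = np.clip(base_rate * n * p_ev, 0.0, 1.0)
    keep = rng.random(n) >= p_drop
    # Inverted-dropout rescaling so the expected activation is preserved.
    scale = np.where(p_drop < 1.0, 1.0 / (1.0 - p_drop + 1e-12), 0.0)
    return activations * keep * scale
```

In practice, the evidence would be recomputed per training sample from a forward/backward pass, and the resulting mask applied at the layer where standard dropout would normally be used; at test time no neurons are dropped.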

Metadata
Title
Excitation Dropout: Encouraging Plasticity in Deep Neural Networks
Authors
Andrea Zunino
Sarah Adel Bargal
Pietro Morerio
Jianming Zhang
Stan Sclaroff
Vittorio Murino
Publication date
09-01-2021
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 4/2021
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-020-01422-y
