
2018 | Original Paper | Book Chapter

Neural Network Encapsulation

Authors: Hongyang Li, Xiaoyang Guo, Bo Dai, Wanli Ouyang, Xiaogang Wang

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing


Abstract

A capsule is a collection of neurons that represents different variants of a pattern in the network. The routing scheme ensures that only certain capsules in the higher layer, namely those that resemble their lower counterparts, are activated. However, the computational complexity becomes a bottleneck for scaling up to larger networks, since lower capsules need to correspond to each and every higher capsule. To resolve this limitation, we approximate the routing process with two branches: a master branch that collects primary information from its direct contact in the lower layer, and an aide branch that replenishes the master based on pattern variants encoded in other lower capsules. Compared with the previous iterative and unsupervised routing scheme, these two branches communicate in a fast, supervised, single-pass fashion, which decreases the complexity and runtime of the model by a large margin. Motivated by routing's goal of making higher capsules agree with lower capsules, we extend the mechanism to compensate for the rapid loss of information across nearby layers. We devise a feedback agreement unit that sends higher capsules back as feedback; it can be regarded as an additional regularization on the network. The feedback agreement is achieved by measuring the optimal transport divergence between two distributions (lower and higher capsules). Such an add-on yields a unanimous gain in both capsule and vanilla networks. Our proposed EncapNet performs favorably against previous state-of-the-art methods on CIFAR10/100, SVHN, and a subset of ImageNet.
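To make the single-pass approximation concrete, here is a minimal PyTorch sketch of a master/aide layer. The class name MasterAideRouting is ours; the per-capsule transformation matrices, the leave-one-out mean pooling in the aide branch, the additive fusion, and the assumption of equal numbers of lower and higher capsules are all illustrative choices, not the exact EncapNet design:

```python
import torch
import torch.nn as nn

class MasterAideRouting(nn.Module):
    """One-pass, two-branch approximation of capsule routing (sketch only)."""

    def __init__(self, num_caps, dim):
        super().__init__()
        # Master branch: each higher capsule reads its directly connected
        # lower capsule through its own transformation matrix W_j.
        self.master = nn.Parameter(0.01 * torch.randn(num_caps, dim, dim))
        # Aide branch: a shared map that replenishes the master with the
        # pattern variants encoded in the other lower capsules.
        self.aide = nn.Linear(dim, dim)

    def forward(self, u):
        # u: (batch, num_caps, dim) lower-layer capsules.
        # Direct contact: v_j = W_j u_j, with no iterative routing.
        master = torch.einsum('jde,bje->bjd', self.master, u)
        # Leave-one-out mean over the remaining lower capsules.
        others = (u.sum(dim=1, keepdim=True) - u) / (u.size(1) - 1)
        aide = self.aide(others)
        # Fast, supervised, single-pass fusion of the two branches.
        return master + aide

v = MasterAideRouting(num_caps=8, dim=16)(torch.randn(2, 8, 16))
print(v.shape)  # torch.Size([2, 8, 16])
```

Because both branches are plain feed-forward maps trained with the usual supervised loss, the cost of one forward pass replaces the multiple unsupervised routing iterations of [10, 11].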


Footnotes
1
Equivariance is the detection of feature patterns that can transform into each other.
 
2
In some literature, e.g., [14, 15], it is called the probability measure and is commonly denoted as \(\mu\) or \(\nu\); a coupling is the joint distribution (measure). We use distribution and measure interchangeably in the following context. \(\text {Prob}(\mathcal {U})\) is the set of probability distributions over a metric space \(\mathcal {U}\).
 
3
The term Sinkhorn is used in this paper in two senses: one indicates the computation of \(P\) via Sinkhorn iterates; the other refers to the revised OT divergence.
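For reference, a generic Sinkhorn iteration that computes a coupling \(P\) and the resulting transport cost is sketched below. This is the standard entropy-regularized formulation [16, 17], not the paper's revised divergence; the regularization strength eps and the fixed iteration count are illustrative choices:

```python
import numpy as np

def sinkhorn_ot(mu, nu, C, eps=0.1, n_iters=100):
    """Entropy-regularized optimal transport via Sinkhorn iterates.

    mu, nu: probability vectors (e.g., the two capsule distributions);
    C: pairwise ground-cost matrix; eps: entropic regularization strength.
    """
    K = np.exp(-C / eps)                 # Gibbs kernel
    b = np.ones_like(nu)
    for _ in range(n_iters):             # alternating marginal projections
        a = mu / (K @ b)
        b = nu / (K.T @ a)
    P = a[:, None] * K * b[None, :]      # coupling whose marginals match mu, nu
    return np.sum(P * C)                 # transport cost <P, C>

# Toy usage: divergence between two histograms on a 1-D grid.
x = np.linspace(0.0, 1.0, 5)
C = (x[:, None] - x[None, :]) ** 2
mu = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
nu = np.array([0.3, 0.3, 0.2, 0.1, 0.1])
print(sinkhorn_ot(mu, nu, C))
```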
 
References
1. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998)
2. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS, pp. 1106–1114 (2012)
3. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: ICLR (2015)
4. Szegedy, C., et al.: Going deeper with convolutions. In: CVPR (2015)
5. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
6. Li, H., Liu, Y., Ouyang, W., Wang, X.: Zoom out-and-in network with map attention decision for region proposal and object detection. Int. J. Comput. Vis. (IJCV), 1–14 (2018)
7. Li, H., et al.: Do we really need more training data for object localization. In: ICIP (2017)
8. Chi, Z., Li, H., Lu, H., Yang, M.H.: Dual deep network for visual tracking. IEEE Trans. Image Process. 26(4), 2005–2015 (2017)
10. Sabour, S., Frosst, N., Hinton, G.: Dynamic routing between capsules. In: NIPS (2017)
11. Hinton, G.E., Sabour, S., Frosst, N.: Matrix capsules with EM routing. In: ICLR (2018)
12. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: ICML (2015)
13. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: ICML, pp. 807–814 (2010)
15.
16. Cuturi, M.: Sinkhorn distances: lightspeed computation of optimal transport. In: NIPS (2013)
17. Sinkhorn, R.: A relationship between arbitrary positive matrices and doubly stochastic matrices. Ann. Math. Stat. 35(2), 876–879 (1964)
18. Gretton, A., Borgwardt, K., Rasch, M.J., Scholkopf, B., Smola, A.J.: A kernel method for the two-sample problem. In: NIPS (2007)
19. Salimans, T., Zhang, H., Radford, A., Metaxas, D.: Improving GANs using optimal transport. In: ICLR (2018)
20. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)
21. Wang, D., Liu, Q.: An optimization view on dynamic routing between capsules. Submitted to ICLR Workshop (2018)
22. Li, M.J., Ng, M.K., Cheung, Y.M., Huang, J.Z.: Agglomerative fuzzy k-means clustering algorithm with selection of number of clusters. IEEE Trans. Knowl. Data Eng. 20, 1519–1534 (2008)
23. Shahroudnejad, A., Mohammadi, A., Plataniotis, K.N.: Improved explainability of capsule networks: relevance path by agreement. arXiv preprint arXiv:1802.10204 (2018)
24. Bahadori, M.T.: Spectral capsule networks. In: ICLR Workshop (2018)
25. Mnih, V., Heess, N., Graves, A., Kavukcuoglu, K.: Recurrent models of visual attention. In: NIPS (2014)
26. Stollenga, M.F., Masci, J., Gomez, F., Schmidhuber, J.: Deep networks with internal selective attention through feedback connections. In: NIPS, pp. 3545–3553 (2014)
28. Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Technical report (2009)
29. Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop (2011)
30. Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. (IJCV) 115(3), 211–252 (2015)
31. Zagoruyko, S., Komodakis, N.: Wide residual networks. In: BMVC (2016)
33. Snoek, J., et al.: Scalable Bayesian optimization using deep neural networks. In: ICML (2015)
34. Clevert, D.A., Unterthiner, T., Hochreiter, S.: Fast and accurate deep network learning by exponential linear units. arXiv preprint arXiv:1511.07289 (2015)
36. Liang, M., Hu, X.: Recurrent convolutional neural network for object recognition. In: CVPR (2015)
37. Agostinelli, F., Hoffman, M., Sadowski, P., Baldi, P.: Learning activation functions to improve deep neural networks. In: ICLR Workshop (2015)
38.
39. Lin, M., Chen, Q., Yan, S.: Network in network. In: ICLR (2014)
40. Goodfellow, I.J., Warde-Farley, D., Mirza, M., Courville, A., Bengio, Y.: Maxout networks. In: ICML (2013)
Metadata
Title
Neural Network Encapsulation
Authors
Hongyang Li
Xiaoyang Guo
Bo Dai
Wanli Ouyang
Xiaogang Wang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-01252-6_16
