
2023 | Original Paper | Book Chapter

Capsule Networks as Generative Models

Authors: Alex B. Kiefer, Beren Millidge, Alexander Tschantz, Christopher L. Buckley

Published in: Active Inference

Publisher: Springer Nature Switzerland


Abstract

Capsule networks are a neural network architecture specialized for visual scene recognition. Features and pose information are extracted from a scene and then dynamically routed through a hierarchy of vector-valued nodes called ‘capsules’ to create an implicit scene graph, with the ultimate aim of learning vision directly as inverse graphics. Despite these intuitions, however, capsule networks are not formulated as explicit probabilistic generative models; moreover, the routing algorithms typically used are ad hoc and primarily motivated by algorithmic intuition. In this paper, we derive an alternative capsule routing algorithm using iterative inference under sparsity constraints. We then introduce an explicit probabilistic generative model for capsule networks based on the self-attention operation in transformer networks and show how it is related to a variant of predictive coding networks using von Mises-Fisher (VMF) circular Gaussian distributions.
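
The link between attention-style routing and VMF distributions can be illustrated with a minimal sketch. This is not the chapter's algorithm. It assumes unit-norm capsule votes and parent directions, a concentration `kappa` shared by all parents, and uniform priors over parents. Under those assumptions, the softmax of scaled dot products equals the posterior responsibilities of a mixture of VMF components, because the VMF density is proportional to exp(kappa * mu^T x) and the normalizer cancels. All names here (`vmf_routing_weights`, `update_parents`, `votes`, `parents`, `kappa`) are hypothetical.

```python
import numpy as np

def normalize(x, axis=-1, eps=1e-12):
    """Project vectors onto the unit sphere."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def vmf_routing_weights(votes, parents, kappa=5.0):
    """Soft assignment of each vote to each parent.

    votes:   (n_children, d) array of lower-level capsule vectors
    parents: (n_parents, d) array of higher-level capsule directions
    Returns an (n_children, n_parents) array whose rows sum to 1.
    """
    votes = normalize(votes)
    parents = normalize(parents)
    # Scaled dot products play the role of attention logits.
    logits = kappa * votes @ parents.T
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def update_parents(votes, weights):
    """Re-estimate each parent as the weighted mean direction of its votes."""
    return normalize(weights.T @ normalize(votes))

# Illustrative data only: 6 child capsules, 3 parents, 4 dimensions.
rng = np.random.default_rng(0)
votes = rng.normal(size=(6, 4))
parents = rng.normal(size=(3, 4))

# A few rounds of iterative routing: assign votes, then update parents.
for _ in range(3):
    weights = vmf_routing_weights(votes, parents)
    parents = update_parents(votes, weights)
```

Raising `kappa` sharpens the assignments toward hard, sparse routing; lowering it spreads each vote more evenly across parents.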


Metadata
Title: Capsule Networks as Generative Models
Authors: Alex B. Kiefer, Beren Millidge, Alexander Tschantz, Christopher L. Buckley
Copyright year: 2023
DOI: https://doi.org/10.1007/978-3-031-28719-0_14
