2019 | Original Paper | Book Chapter

13. Deep Learning and Modelling of Audio-, Visual-, and Multimodal Audio-Visual Data in Brain-Inspired SNN

Written by: Nikola K. Kasabov

Published in: Time-Space, Spiking Neural Networks and Brain-Inspired Artificial Intelligence

Publisher: Springer Berlin Heidelberg

Abstract

This chapter presents methods for processing audio, visual, and integrated audio-visual information using brain-inspired SNN architectures such as NeuCube. Case studies are presented for the recognition of short musical pieces, fast-moving object recognition, age-invariant face identification, moving-digit recognition, and others.

Metadata
Title
Deep Learning and Modelling of Audio-, Visual-, and Multimodal Audio-Visual Data in Brain-Inspired SNN
Written by
Nikola K. Kasabov
Copyright year
2019
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-57715-8_13