
2016 | OriginalPaper | Chapter

Reward-Based Learning of a Memory-Required Task Based on the Internal Dynamics of a Chaotic Neural Network

Authors : Toshitaka Matsuki, Katsunari Shibata

Published in: Neural Information Processing

Publisher: Springer International Publishing


Abstract

We expect that dynamic higher functions such as "thinking" emerge through growth from exploration in a reinforcement learning (RL) framework that uses a chaotic neural network (NN). In this framework, the chaotic internal dynamics is used for exploration, which eliminates the need to supply external exploration noise. A special RL method for this framework, in which "traces" are introduced, has been proposed. Meanwhile, reservoir computing has shown an excellent ability to learn dynamic patterns, and Hoerzer et al. showed that such learning can be driven by rewards and exploration noise instead of explicit teacher signals. In this paper, aiming to introduce this learning ability into our RL framework, we show that the memory-required task in the work of Hoerzer et al. can be learned without giving exploration noise, by utilizing the chaotic internal dynamics, while the exploration level is adjusted flexibly and autonomously. The task can also be learned using "traces", although some problems remain.
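The learning scheme the abstract refers to (Hoerzer et al.'s reward-modulated Hebbian readout learning, with the chaotic reservoir's own fluctuations taking the place of injected exploration noise) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the network size, gain, time constants, learning rate, the toy sine target, and the omission of output feedback into the reservoir are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200                    # reservoir size (assumed)
g = 1.5                    # recurrent gain > 1 puts the network in a chaotic regime
dt, tau = 0.1, 1.0         # Euler step and neuron time constant (assumed)
T = 5000                   # simulation steps
eta = 5e-4                 # readout learning rate (assumed)
alpha = 0.8                # decay of the low-pass traces

# Random recurrent weights scaled into the chaotic regime
J = g * rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))
w = rng.normal(0.0, 0.1 / np.sqrt(N), N)   # trainable linear readout
x = rng.normal(0.0, 0.5, N)                # reservoir state

def target(t):
    # Toy periodic target signal (hypothetical, not the paper's task)
    return np.sin(2 * np.pi * t / 100.0)

z_bar, p_bar = 0.0, 0.0    # low-pass traces of output and performance
n_updates = 0
for t in range(T):
    r = np.tanh(x)
    z = w @ r                        # readout; no injected noise -- the
                                     # chaotic fluctuations do the exploring
    x += dt / tau * (-x + J @ r)     # leaky reservoir dynamics
    p = -(z - target(t)) ** 2        # instantaneous performance
    M = 1.0 if p > p_bar else 0.0    # binary reward-modulation signal
    if M:
        n_updates += 1
    w += eta * M * (z - z_bar) * r   # reward-modulated Hebbian update
    z_bar = alpha * z_bar + (1 - alpha) * z
    p_bar = alpha * p_bar + (1 - alpha) * p
```

The key design point is that the weight update combines a Hebbian term, the deviation of the output from its recent average, with a modulatory signal that is 1 only when current performance beats its running average; exploration comes for free from the chaotic recurrent dynamics rather than from an external noise source.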


Literature
1. Shibata, K., Okabe, Y.: Reinforcement learning when visual signals are directly given as inputs. In: Proceedings of ICNN 1997, vol. 3, pp. 1716–1720 (1997)
2. Shibata, K.: Emergence of intelligence through reinforcement learning with a neural network. In: Mellouk, A. (ed.) Advances in Reinforcement Learning, pp. 99–120. InTech, Rijeka (2011)
3. Shibata, K., Utsunomiya, H.: Discovery of pattern meaning from delayed rewards by reinforcement learning with a recurrent neural network. In: Proceedings of IJCNN, pp. 1445–1452 (2011)
4. Shibata, K., Goto, K.: Emergence of flexible prediction-based discrete decision making and continuous motion generation through actor-Q-learning. In: Proceedings of ICDL-Epirob, ID 15 (2013)
5. Sawatsubashi, Y., et al.: Emergence of discrete and abstract state representation in continuous input task. In: Robot Intelligence Technology and Applications, pp. 13–22 (2012)
6. Shibata, K., Sakashita, Y.: Reinforcement learning with internal-dynamics-based exploration using a chaotic neural network. In: Proceedings of International Joint Conference on Neural Networks (IJCNN) (2015)
7. Goto, Y., Shibata, K.: Emergence of higher exploration in reinforcement learning using a chaotic neural network. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds.) ICONIP 2016. LNCS, pp. 40–48. Springer, Heidelberg (2016)
8. Jaeger, H.: The "echo state" approach to analysing and training recurrent neural networks. GMD report 148, p. 43 (2001)
9. Maass, W., Natschläger, T., Markram, H.: Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Comput. 14(11), 2531–2560 (2002)
10. Sussillo, D., Abbott, L.F.: Generating coherent patterns of activity from chaotic neural networks. Neuron 63(4), 544–557 (2009)
11. Hoerzer, G.M., Legenstein, R., Maass, W.: Emergence of complex computational structures from chaotic neural networks through reward-modulated Hebbian learning. Cereb. Cortex 24(3), 677–690 (2014)
Metadata
Title
Reward-Based Learning of a Memory-Required Task Based on the Internal Dynamics of a Chaotic Neural Network
Authors
Toshitaka Matsuki
Katsunari Shibata
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-46687-3_42
