
2016 | OriginalPaper | Book chapter

Risk Sensitive Reinforcement Learning Scheme Is Suitable for Learning on a Budget

Written by: Kazuyoshi Kato, Koichiro Yamauchi

Published in: Neural Information Processing

Publisher: Springer International Publishing


Abstract

Risk-sensitive reinforcement learning (risk-sensitive RL) has been studied by many researchers. These methods are based on a prospect method, which imitates the human value function. Although they are mainly intended to imitate human behavior, their engineering significance has received little discussion. In this paper, we show that risk-sensitive RL is useful for online learning machines whose resources are limited. In such a learner, part of the learned memory must be removed to make room for recording a new, important instance. The experimental results show that risk-sensitive RL is superior to normal RL in this setting. This may suggest that, because the human brain is also built from a limited number of neurons, humans employ a risk-sensitive value function for learning.
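
To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a piecewise-linear, prospect-style transform of the temporal-difference error (negative errors weigh more than positive ones) and a value table with a fixed budget that evicts the least recently updated entry. The names `utility`, `BudgetedQTable`, `gain_weight`, `loss_weight` and the eviction rule are all hypothetical choices for illustration; the chapter itself may use a different utility function and a different pruning strategy.

```python
from collections import OrderedDict


def utility(td_error, gain_weight=0.5, loss_weight=1.0):
    """Prospect-style transform (assumed form): losses weigh more than gains."""
    if td_error >= 0:
        return gain_weight * td_error
    return loss_weight * td_error


class BudgetedQTable:
    """Value table that stores at most `budget` entries (illustrative only)."""

    def __init__(self, budget):
        self.budget = budget
        # Keys are (state, action) pairs, ordered from least to most
        # recently updated.
        self.q = OrderedDict()

    def get(self, key):
        # Entries that are missing or were evicted read as 0.0.
        return self.q.get(key, 0.0)

    def update(self, key, target, lr=0.1):
        # Make room for a new entry by dropping the least recently updated
        # one (an assumed eviction rule, not taken from the chapter).
        if key not in self.q and len(self.q) >= self.budget:
            self.q.popitem(last=False)
        old = self.q.get(key, 0.0)
        # Risk-sensitive step: the TD error passes through the utility
        # before it is applied to the stored value.
        self.q[key] = old + lr * utility(target - old)
        self.q.move_to_end(key)
```

With `budget=2`, updating a third distinct key evicts the oldest one, and a negative error of a given size moves the stored value twice as far as a positive error of the same size under the default weights.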


Metadata
Title
Risk Sensitive Reinforcement Learning Scheme Is Suitable for Learning on a Budget
Written by
Kazuyoshi Kato
Koichiro Yamauchi
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-46675-0_23