
25.09.2017

A Brain-Inspired Decision Making Model Based on Top-Down Biasing of Prefrontal Cortex to Basal Ganglia and Its Application in Autonomous UAV Explorations

Authors: Feifei Zhao, Yi Zeng, Guixiang Wang, Jun Bai, Bo Xu

Published in: Cognitive Computation | Issue 2/2018


Abstract

Decision making is a fundamental ability for intelligent agents (e.g., humanoid robots and unmanned aerial vehicles). During the decision making process, agents can improve their strategy for interacting with a dynamic environment through reinforcement learning. Many state-of-the-art reinforcement learning models, such as Q-learning and Actor-Critic algorithms, deal with a relatively small number of state-action pairs and work best with discrete states. In practice, however, the states in many scenarios are continuous and hard to discretize properly, so better autonomous decision making methods are needed. Inspired by the mechanism of decision making in the human brain, we propose a general computational model, named the prefrontal cortex-basal ganglia (PFC-BG) algorithm. The proposed model draws on the biological reinforcement learning pathway and its mechanisms from the following perspectives: (1) dopamine signals continuously update reward-relevant information for both the basal ganglia and working memory in the prefrontal cortex; (2) contextual reward information is maintained in working memory, which has a top-down biasing effect on reinforcement learning in the basal ganglia. The proposed model separates continuous states into smaller distinguishable states and introduces a continuous reward function for each state to obtain reward information at different times. To verify the performance of our model, we apply it to several UAV decision making experiments, such as avoiding obstacles and flying through a window or a door, and the experiments support the effectiveness of the model. Compared with traditional Q-learning and Actor-Critic algorithms, the proposed model is more biologically inspired and makes decisions more accurately and faster.
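The abstract only summarizes the mechanism, but its main ingredients (discretizing continuous states, a dopamine-like prediction-error signal, and a working-memory reward context that biases action selection top-down) can be illustrated concretely. The minimal Python sketch below is an assumption-laden illustration of those ingredients only; the class name PFCBGSketch, the parameters alpha_bg, alpha_wm, gamma, and bias_weight, and the specific update rules are hypothetical and are not the authors' published implementation.

import numpy as np

class PFCBGSketch:
    """Toy sketch: fast basal-ganglia action values plus a slower
    working-memory reward context that biases action selection top-down.
    Illustrative only; not the authors' code."""

    def __init__(self, n_states, n_actions,
                 alpha_bg=0.1, alpha_wm=0.02, gamma=0.9, bias_weight=0.5):
        self.q = np.zeros((n_states, n_actions))   # basal-ganglia action values
        self.wm = np.zeros((n_states, n_actions))  # PFC working-memory reward context
        self.alpha_bg = alpha_bg                   # fast learning rate (BG)
        self.alpha_wm = alpha_wm                   # slower, more persistent update (WM)
        self.gamma = gamma                         # discount factor
        self.bias_weight = bias_weight             # strength of the top-down bias

    @staticmethod
    def discretize(observation, bins):
        # Separate a continuous observation (e.g., distance to an obstacle)
        # into one of a small set of distinguishable states by binning;
        # with k bin edges this yields states 0..k.
        return int(np.digitize(observation, bins))

    def select_action(self, state, epsilon=0.1):
        # Epsilon-greedy choice over BG values biased by the WM reward context.
        if np.random.rand() < epsilon:
            return np.random.randint(self.q.shape[1])
        return int(np.argmax(self.q[state] + self.bias_weight * self.wm[state]))

    def update(self, state, action, reward, next_state):
        # A dopamine-like prediction error updates both the BG values and,
        # more slowly, the working-memory context.
        delta = reward + self.gamma * np.max(self.q[next_state]) - self.q[state, action]
        self.q[state, action] += self.alpha_bg * delta
        self.wm[state, action] += self.alpha_wm * delta
        return delta

# Hypothetical usage: 10 discretized states, 3 candidate maneuvers.
agent = PFCBGSketch(n_states=10, n_actions=3)
state = agent.discretize(1.7, bins=np.linspace(0.0, 5.0, 9))  # 9 edges -> states 0..9
action = agent.select_action(state)

The separation of a fast value update from a slower, more persistent working-memory context is one simple way to realize the top-down biasing effect described in the abstract; the paper itself may implement this differently.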


Metadata
Title
A Brain-Inspired Decision Making Model Based on Top-Down Biasing of Prefrontal Cortex to Basal Ganglia and Its Application in Autonomous UAV Explorations
Authors
Feifei Zhao
Yi Zeng
Guixiang Wang
Jun Bai
Bo Xu
Publication date
25.09.2017
Publisher
Springer US
Published in
Cognitive Computation / Issue 2/2018
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-017-9511-3
