
Open Access 28-02-2019

Peers’ Experience Learning for Developmental Robots

Authors: Ruixuan Wei, Qirui Zhang, Zhuofan Xu

Published in: International Journal of Social Robotics | Issue 1/2020


Abstract

Humans have a fundamental ability to learn from others’ experience for their own use; humanoid robots do not. Several attempts have been made to address specific situations in the evolution and study of developmental robots, but such attempts have limitations, e.g., learning from others’ experience is overlooked. The present article proposes a peers’ experience learning method. It first reviews the evolution and development of developmental robots as revealed by typical studies, which moved from humanlike to developmental. These terms are then reconsidered from the humanoid robot’s viewpoint, particularly with respect to two developmental principles: the verification principle and the embodiment principle. Next, a conceptual model of peers’ experience learning is proposed based on these principles, and the simulation results show that robots can “copy” peers’ experience to cognize and develop autonomously. Finally, a general discussion and proposals for addressing future issues are given.

1 Introduction

Peers’ experience learning (PEL) helps humans grasp contexts they have never encountered before, which is important for humans’ mental development. It is also regarded as an essential requirement of future intelligent robots. For humanoid robots, PEL has recently been addressed in brief studies from the perspective of mental development [1], and several attempts have been made to address specific contexts of PEL (e.g., [2, 3]). Designers aim to identify PEL behaviors in humans and to transplant the mechanism to humanoid robots, and many efforts have been made toward constructing a more diligent and more humanlike robot.
Elman et al. [4] attempted to define “development” to better clarify an approach for designing developmental robots (DR). However, their definitions are not precise and do not seem to be supported by simulations or experimental results. Weng [5] therefore presented an implemented developmental algorithm based on image analysis, and further proposed a developmental model [6] for agents, enabling developmental robots to learn internally via interaction with people.
Before long, the number of designs and applications integrating developmental algorithms increased dramatically, bringing DR closer to real scientific practice [7]. Remarkable examples include the anticipating-rewards robot proposed by Blanchard and Canamero [8], the developmental framework of Meng and Lee [9], SS-RICS by Kelley [10], and LOC based on ACT-R by Zelazo [11]. These works incorporated many concepts necessary for human intelligence, providing diverse ways to advance DR according to brain mechanisms and biobehavioral studies. However, humanlike interactive learning remains a complex problem.
The importance of interactive learning is discussed by Asada [3], who together with his team constructed a conceptual model based on an infant robot, providing more authentic experience-sharing experiments [5, 12, 13]. Their work mainly focuses on humans’ affective developmental processes via synthetic or constructive approaches, but PEL is overlooked, especially transplanting and verifying peers’ experience based on self-other distinction.
Figure 1 shows a schematic depiction of evolution and development in the context of DR thus far. The horizontal axis is time and the vertical axis indicates the “developmental level”. The trajectory begins with cognitive and humanlike robots, moves through self-developmental robots, and ends with interactive and developmental robots. Figure 1 thus shows how research has progressed over time.
According to Stoytchev [14], there are two basic principles of DR: the verification principle and the embodiment principle. PEL is thus expected to be achieved by interacting with other, experienced robots to engraft peers’ experience, and furthermore by verifying that experience knowledge against the robot’s own embodiment so that it becomes self-serving. Asada et al. [15] have advocated that the key technologies of DR are interaction and development; however, such work has not been adequately precise from a biobehavioral and brain-science perspective.
The present paper proposes a DR based on PEL, in order to better understand experience-sharing processes through synthetic and constructive approaches, especially regarding transplanting peers’ experience as one’s own experience knowledge.
The rest of the article is organized as follows. Section 2 introduces PEL from the human perspective. Section 3 provides a conceptual model of DR based on PEL. Simulation results are presented in Sect. 4. Finally, the conclusions and future developments are discussed.

2 Peers’ Experience Learning of Humans

Humans have a fundamental capacity to learn from others, which means they share experience even though everyone is unique. Suppose someone needs to cross a labyrinthine environment with many narrow paths. He asks peers who have experienced the environment before for more information, and they may warn him to watch out for the narrow paths because they are difficult to pass. He then compares himself with his peers to ensure a successful passage, and once he passes, the experience becomes his own. This is a hypothetical scenario, but similar ones occur constantly in human PEL. PEL between humans is a necessary means of everyday social communication, and it paves the way for human development [16].
In the mammalian brain, feedback connections play important roles in PEL [17]. Hideyuki et al. [18] point out that PEL obtained through social interaction with a variety of individuals uniquely modulates the activity of brain networks. Moreover, they found two mental dimensions in the brain when one interacts with others (see Fig. 2): one represents “mind-holderness” (red in Fig. 2), in which humans borrow the experiences of others through interaction, while the other represents “mind-readerness” (blue in Fig. 2), in which humans use others’ experiences for reference.
As a result, humans first “read” peers’ experience, then justify its reasonableness and verify its feasibility by themselves, and finally “hold” the justified and verified experience for their own use [19]. The “mind-readerness” part and the “mind-holderness” part are thus both important to human PEL, and a PEL mechanism of humans can be derived from brain-science studies [20], as shown in Fig. 3. In Fig. 3, the black flow represents the PEL process: (1) Learn. Humans must have a way to learn peers’ experience. (2) Sense. Humans use their eyes or hands, to see or to touch, so as to sense the environment [21]. (3) Preprocess. The sensed information about obstacles is massive and unclassified, so it must be classified and preprocessed. (4) Think. Humans use the brain to decide whether to adopt peers’ experience, so as to generate action strategies and to develop the brain itself. (5) Pre-act. The action strategies generated by thinking cannot be directly decoded and executed by the limbs, so they must be clarified and transformed into body language. (6) Act. The limbs follow the commands from the brain to execute the actions.

3 A Brain-Like PEL Model for Developmental Robots

3.1 Conceptual Architecture for DR

We adhere to the developmental principles as outlined by Stoytchev [14] to meet the requirements of DR. The relevant points are as follows:
(a)
The verification principle: a DR can learn from and maintain peers’ experience knowledge only to the extent that it can verify that knowledge (tried-and-true knowledge) itself.
 
(b)
The embodiment principle: robots may be nearly identical at birth, but they are distinct and self-specifying from each other. A DR should therefore be able to remodify peers’ experience on the basis of its own embodiment parameters.
 
Based on these two principles and the human PEL mechanism, the DR architecture can be obtained as shown in Fig. 4, where the PEL of a DR is described by the black flow: (1) Learn. The DR transplants peers’ experience via a uniform protocol between robots. (2) Sense. The DR uses onboard sensors to sense the surrounding environment. (3) Preprocess. The massive, unclassified information is classified and preprocessed. (4) Brain. This part derives proper action strategies and stores them so that the DR develops itself. (5) Pre-act. The action strategies cannot be directly decoded and executed by the robot, so they are clarified and transformed into commands. (6) Act. The DR follows the commands to act in the environment, as sketched below.
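To make the flow concrete, the six stages can be read as one control loop. The following skeleton is only an illustrative sketch: every class, method, and stub body is our own placeholder for the component described above, not the authors’ implementation.

```python
# Skeleton of the six-stage PEL flow of Fig. 4. Every name and stub body
# here is an illustrative placeholder, not the authors' implementation.
class PELLoop:
    def __init__(self, embodiment):
        self.E = embodiment   # own embodiment parameters (fixed)
        self.brain = []       # experience store; develops over time

    def learn(self, peer_experience):      # (1) transplant via uniform protocol
        self.brain.append(peer_experience)

    def sense(self, world):                # (2) read onboard sensors
        return world.get("readings", [])

    def preprocess(self, raw):             # (3) classify mass, unclassified data
        return [r for r in raw if r is not None]

    def think(self, observation):          # (4) derive and store a strategy
        strategy = {"obs": observation, "known": len(self.brain)}
        self.brain.append(strategy)        # development: keep own experience
        return strategy

    def pre_act(self, strategy):           # (5) strategy -> executable commands
        return [("follow", node) for node in strategy["obs"]]

    def act(self, commands):               # (6) execute in the environment
        return commands

    def tick(self, world, peer_experience=None):
        """One pass through stages (1)-(6)."""
        if peer_experience is not None:
            self.learn(peer_experience)
        observation = self.preprocess(self.sense(world))
        return self.act(self.pre_act(self.think(observation)))
```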

3.2 Peers’ Experience Learning

PEL is conducted at discrete time instants (\( t = 0,1,2, \ldots \)) according to the following definitions:
Definition 1
A robot agent \( M \) may have several sensors (including exteroceptive and interoceptive sensors that sense stimuli from the external and internal environment; see, e.g., Weng [22]) and effectors, whose sensory and control signals at time \( t \) are collectively denoted by \( {\varvec{S}}\left( t \right) \) and \( {\varvec{C}}\left( t \right) \). \( M \)’s embodiment is denoted by \( {\varvec{E}} \) (it is not time-varying) and peers’ experience by \( {\varvec{P}}\left( t \right) \).
Definition 2
\( M \) has a “brain” denoted by \( {\varvec{B}}\left( t \right) \), and the time-varying state-update function \( f_{t} \) updates \( {\varvec{B}}\left( t \right) \) at each time \( t \) based on: (1) the sensory input \( {\varvec{S}}\left( t \right) \), (2) the control output \( {\varvec{C}}\left( t \right) \), (3) peers’ experience knowledge \( {\varvec{P}}\left( t \right) \), (4) the embodiment parameters \( {\varvec{E}} \), and (5) the current “brain” \( {\varvec{B}}\left( t \right) \):
$$ {\varvec{B}}\left( {t + 1} \right) = f_{t} \left( {{\varvec{S}}\left( t \right),{\varvec{C}}\left( t \right),{\varvec{P}}\left( t \right),{\varvec{E}},{\varvec{B}}\left( t \right)} \right) $$
(1)
Definition 3
Control signal \( {\varvec{C}}\left( {t + 1} \right) \) is generated by the action generation function \( g_{t} \) based on \( {\varvec{B}}\left( {t + 1} \right) \):
$$ {\varvec{C}}\left( {t + 1} \right) = g_{t} \left( {{\varvec{B}}\left( {t + 1} \right)} \right) $$
(2)
As can be seen, a DR not only matches the habit of human cognition and behavior (thinking before acting), but also implies that a robot agent \( M \) cannot have two separate phases for learning and performing (it learns while performing).
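Read as pseudocode, Eqs. (1) and (2) describe one step of an agent loop. A minimal Python sketch follows; \( f_t \) and \( g_t \) are passed in as placeholders for whatever concrete update and policy a designer chooses:

```python
from typing import Any, Callable, Tuple

# One developmental step per Eqs. (1) and (2): the brain B is updated from
# sensation S, the last control C, peers' experience P, and the fixed
# embodiment E; the next control is then generated from the new brain alone.
def developmental_step(
    f_t: Callable[[Any, Any, Any, Any, Any], Any],  # state-update fn, Eq. (1)
    g_t: Callable[[Any], Any],                      # action-generation fn, Eq. (2)
    S: Any, C: Any, P: Any, E: Any, B: Any,
) -> Tuple[Any, Any]:
    B_next = f_t(S, C, P, E, B)  # Eq. (1): update the "brain"
    C_next = g_t(B_next)         # Eq. (2): act only from the updated brain
    return B_next, C_next
```

Because every call both updates the brain and emits a control signal, there is no separate training phase: the agent learns while performing, exactly as stated above.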
As a PEL example, consider a robot that has to go through a labyrinthine environment and finally succeeds, accumulating a large amount of information. Certainly not all of this information should be transplanted and borne in mind; other robots need only the necessary information, i.e., as little as possible. We therefore introduce “nodes knowledge” to express peers’ experience.
Definition 4
Nodes knowledge: the labyrinthine environment is modeled as three kinds of nodes. (1) Turning nodes \( N_{t} \): crossings where the robot has to follow a circular-arc trajectory. (2) Straight nodes \( N_{s} \): straight stretches of the path. (3) Key nodes \( N_{k} \): places the robot has to reach or pass by, i.e., target nodes.
Peers’ experience is thus represented as \( {\varvec{P}} = \left( {{\varvec{N}}_{t} ,{\varvec{N}}_{s} ,{\varvec{N}}_{k} ,{\varvec{E}}} \right) \). To the robot that receives it, peers’ experience is not time-varying at first; however, the robot develops it into its own knowledge as time goes by, so we write \( P\left( t \right) \) in consideration of development over time. As shown in Fig. 5, the three kinds of nodes knowledge and the embodiment together comprise peers’ experience.
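As an illustration only, \( P = (N_t, N_s, N_k, E) \) could be encoded as a small record type; the field and type names below are ours, not the paper’s:

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Node = Tuple[float, float]  # a node reduced to a 2-D position for illustration

@dataclass(frozen=True)
class Embodiment:
    v_l_min: float  # minimum left-wheel velocity
    v_l_max: float  # maximum left-wheel velocity
    v_r_min: float  # minimum right-wheel velocity
    v_r_max: float  # maximum right-wheel velocity
    r: float        # robot radius (the robot is assumed circular)

@dataclass(frozen=True)
class PeerExperience:
    """P = (N_t, N_s, N_k, E): three kinds of nodes knowledge plus embodiment."""
    turning_nodes: FrozenSet[Node]   # N_t: crossings followed on circular arcs
    straight_nodes: FrozenSet[Node]  # N_s: straight stretches of the path
    key_nodes: FrozenSet[Node]       # N_k: targets to reach or pass by
    embodiment: Embodiment           # E of the peer that produced the experience
```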

3.3 Brain

The DR aims not simply at transplanting peers’ experience but, more challengingly, at building a paradigm that provides a new understanding of how to design a humanoid robot that interacts with and learns from others so as to enrich its brain.
Roughly speaking, the brain of a DR consists of two phases that together produce its own experience, moving from social development to individual development: first justifying whether peers’ experience is proper, then verifying whether it is feasible.

3.4 PEL Justification

Unacquainted with an environment known only from peers, a person may ask, “If I were in his situation, what would I do?” Peers’ experience is thus justified on the basis of individual characteristics, which psychology calls “mental rehearsal” [23, 24] (Fig. 6).
PEL justification is a mental rehearsal of peers’ experience, which divides \( P\left( t \right) \) into feasible nodes \( P_{fr} \left( t \right) \) and infeasible nodes \( P_{ir} \left( t \right) \): \( P\left( t \right) = P_{fr} \left( t \right) + P_{ir} \left( t \right) \).
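A sketch of such a rehearsal filter follows. The width-based feasibility test is our assumption for illustration; the paper does not specify how feasibility is decided:

```python
from typing import Dict, List, Tuple

# Mental rehearsal as a pure filter: split peers' nodes into feasible and
# infeasible parts, P(t) = P_fr(t) + P_ir(t), by checking each node against
# the receiving robot's own embodiment. The clearance test is illustrative.
def justify(
    peer_nodes: Dict[Tuple[float, float], float],  # node -> min passage width
    own_radius: float,
    margin: float = 0.05,
) -> Tuple[List[Tuple[float, float]], List[Tuple[float, float]]]:
    feasible, infeasible = [], []
    for node, width in peer_nodes.items():
        # a circular robot of radius r needs at least 2r (+ margin) of clearance
        if width >= 2.0 * own_radius + margin:
            feasible.append(node)    # P_fr(t)
        else:
            infeasible.append(node)  # P_ir(t)
    return feasible, infeasible
```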

3.5 PEL Verification

PEL justification binds peers’ experience knowledge only to mental rehearsal; however, “knowledge starts with practice”. Peers’ experience must therefore be verified in the real environment without any other information being given.
Consider the “brain” state of the robot, denoted by the state vector \( {\varvec{B}}\left( t \right) \); as Eq. (1) implies, it is a random process, closely related to Markov decision processes (MDP) [25]. Taking the uncertainty in states into account, the state-update function \( f_{t} \) becomes:
$$ p\left( {{\varvec{B}}\left( {t + 1} \right) = {\varvec{B}}^{\prime} \mid {\varvec{S}}\left( t \right),{\varvec{C}}\left( t \right),{\varvec{P}}\left( t \right),{\varvec{E}},{\varvec{B}}\left( t \right) = {\varvec{B}}} \right) $$
(3)
and the action generation function \( g_{t} \):
$$ p\left( {{\varvec{C}}\left( {t + 1} \right) = {\varvec{C}} \mid {\varvec{B}}\left( {t + 1} \right) = {\varvec{B}}^{\prime}} \right) $$
(4)
where \( p\left( \cdot \right) \) denotes probability. Though both peers’ experience and the verified information can be held by the “brain”, only the verified information is passed to the action generation function \( g_{t} \).
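Under this MDP reading, verification can be sketched as executing the justified nodes in the real environment and keeping only those whose traversal succeeds. The fixed Bernoulli success probability below is a stand-in for the unknown transition probabilities of Eq. (3), not the paper’s model:

```python
import random
from typing import Iterable, List

# Verification in the real environment: attempt each mentally justified node
# and keep only those whose traversal actually succeeds.
def verify(justified_nodes: Iterable, p_success: float = 0.9,
           seed: int = 0) -> List:
    rng = random.Random(seed)
    verified = []
    for node in justified_nodes:
        if rng.random() < p_success:  # stochastic outcome of real execution
            verified.append(node)     # only verified knowledge may feed g_t
    return verified
```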

3.6 Brain

The DR needs to form its own “brain” to memorize high-dimensional experience as well as peers’ experience. One may tell others one’s own experience as well as experience heard from others, so as to provide more information. Our proposed “brain” is constructed from the global states:
$$ {\varvec{B}} = \left( {{\varvec{P}}_{pe} ,{\varvec{E}}_{pe} ,{\varvec{P}}_{oe} ,{\varvec{E}}_{oe} } \right) = \left( {\left( {{\varvec{PE}}} \right)_{pe} ,\left( {{\varvec{PE}}} \right)_{oe} } \right) $$
(5)
where \( {\varvec{P}}_{pe} \) and \( {\varvec{P}}_{oe} \) are peers’ experience and own experience, \( {\varvec{E}}_{pe} \) and \( {\varvec{E}}_{oe} \) are peers’ embodiment and own embodiment.
However, experience accumulation over time induces a severe problem: “data disaster”. Because peers’ experience spreads from one robot to another, \( {\varvec{P}}_{oe} \) and \( {\varvec{E}}_{oe} \) are appended to \( {\varvec{P}}_{pe} \) and \( {\varvec{E}}_{pe} \) at each transfer. The brain after the \( k \)-th transplant is:
$$ {\varvec{B}} = \left( {\left( {{\varvec{PE}}} \right)_{pe}^{1} ,\left( {{\varvec{PE}}} \right)_{pe}^{2} , \ldots ,\left( {{\varvec{PE}}} \right)_{pe}^{k} ,\left( {{\varvec{PE}}} \right)_{oe} } \right) $$
(6)
As we can see, the “brain” may encounter a data disaster when \( k \) grows far beyond its initial value. Inspired by a new hierarchical statistical modeling method [26], the “brain” dimension is reduced using an incremental hierarchical discriminating regression (IHDR) tree [27], as Fig. 7 shows, which groups similar features into feature clusters so as to improve convergence performance.
Different peers’ experiences can thus be combined to reduce the data in the brain:
$$ {\varvec{B}}' = \left( {\mathop \cup \limits_{i = 1}^{k} {\varvec{P}}_{pe}^{i} ,\mathop \cup \limits_{i = 1}^{k} {\varvec{E}}_{pe}^{i} ,{\varvec{P}}_{oe} ,{\varvec{E}}_{oe} } \right) $$
(7)
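Equation (7) replaces the unbounded concatenation of Eq. (6) with set unions over all transplanted experiences. A direct sketch follows; the dictionary encoding with “nodes”/“embodiment” keys is our placeholder, not the paper’s data layout:

```python
from typing import Dict, List, Set, Tuple

# Merge k transplanted peer experiences per Eq. (7): instead of storing every
# (PE)_pe^i separately as in Eq. (6), which grows without bound, take the
# union of their node sets and of their embodiment parameter tuples.
def merge_brain(
    peer_experiences: List[Dict[str, Set]],
    own: Dict[str, Set],
) -> Dict[str, Set]:
    merged_nodes: Set[Tuple[float, float]] = set()
    merged_embodiments: Set[Tuple] = set()
    for pe in peer_experiences:
        merged_nodes |= pe["nodes"]             # union of P_pe^i
        merged_embodiments |= pe["embodiment"]  # union of E_pe^i
    return {
        "peer_nodes": merged_nodes,
        "peer_embodiments": merged_embodiments,
        "own_nodes": own["nodes"],            # P_oe, kept separate
        "own_embodiment": own["embodiment"],  # E_oe, kept separate
    }
```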

4 Experiments and Discussion

4.1 Embodiment for DR

Actions cannot be performed in the absence of embodiment; a robot must have some way to affect the world, i.e., it must have a body [28–30]. Damasio [31] holds the view that “the brain is the body’s captive audience”. In other words, all commands must be applicable to the properties of the robot’s own body. Robots are created uniquely; they may have different properties (e.g., velocity and acceleration) even with the same morphological structure.
In a robot agent \( M \), the embodiment state \( {\varvec{E}} \) is represented distributedly by the states \( e_{i} \): \( {\varvec{E}} = \left( {e_{1} ,e_{2} ,e_{3} , \ldots ,e_{n} } \right) \). Consider a differential-drive mobile robot (see Fig. 4); its embodiment can be described as \( {\varvec{E}} = \left( {v_{\hbox{min} }^{l} ,v_{\hbox{max} }^{l} ,v_{\hbox{min} }^{r} ,v_{\hbox{max} }^{r} ,r} \right) \), where \( v^{l} \) and \( v^{r} \) are the velocities of the left and right wheels and \( r \) is the robot’s radius (assuming the robot is circular). The two wheel speeds control the robot’s motion: when they are equal, the robot moves in a straight line; when they differ, the trajectory follows a circular arc.
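For concreteness, standard differential-drive kinematics under these embodiment parameters can be sketched as follows. We assume the wheels sit on the robot’s rim, so the track width equals \( 2r \) (the paper does not state the track width):

```python
import math

# Standard differential-drive kinematics: equal wheel speeds give straight
# motion, unequal speeds trace a circular arc. The wheels are assumed to sit
# on the robot's rim, so the track width is 2r (an assumption of ours).
def step_pose(x: float, y: float, theta: float,
              v_l: float, v_r: float, r: float, dt: float):
    v = 0.5 * (v_l + v_r)            # forward speed of the robot centre
    omega = (v_r - v_l) / (2.0 * r)  # turn rate for track width 2r
    if abs(omega) < 1e-9:            # equal speeds: straight-line motion
        return (x + v * dt * math.cos(theta),
                y + v * dt * math.sin(theta),
                theta)
    # unequal speeds: arc about the instantaneous centre of curvature
    R = v / omega                    # signed arc radius
    theta_new = theta + omega * dt
    x_new = x + R * (math.sin(theta_new) - math.sin(theta))
    y_new = y - R * (math.cos(theta_new) - math.cos(theta))
    return x_new, y_new, theta_new
```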

4.2 PEL Algorithm Based on DR

The PEL approach is implemented and tested in eight challenging scenarios, which are classified into two categories (see Table 1). In addition, the simulations of the last two scenarios were repeated in different environments to verify the effectiveness of the proposed algorithm. For each case, the state of the “brain” obtained is measured as an indicator of how the PEL algorithm develops.
Table 1
Two categories of eight scenarios

Category      | Scenario           | PEL flow sequence
Two robots    | 2-ascending        | Small to big
              | 2-descending       | Big to small
Three robots  | 3-ascending        | Small to middle, then middle to big
              | 3-descending       | Big to middle, then middle to small
              | 3-small-descending | Small to big, then big to middle
              | 3-big-ascending    | Big to small, then small to middle
              | 3-mid-ascending    | Middle to small, then small to big
              | 3-mid-descending   | Middle to big, then big to small
Robot trajectories for the 2-ascending and 2-descending scenarios are shown in Fig. 8; later positions are drawn on top of earlier ones. Comparing the two scenarios shows that a DR can transplant peers’ experience for its own use; furthermore, once a robot holds justified peers’ experience, trajectories are obtained without any path-planning algorithm.
The trajectories of the robots in the 3-ascending and 3-descending scenarios are shown in Figs. 9 and 10. Considering the state of the “brain” (see Figs. 9b, 10b; the brain is quantified by the number of trajectory nodes), the two scenarios are quite opposite: the “brain” in the 3-ascending scenario grows over time while that in the 3-descending scenario stays the same. This is because the IHDR algorithm effectively reduces the “brain” dimension, improving robustness and making PEL easier to implement and develop incrementally.
The 3-small-descending and 3-big-ascending scenarios of Fig. 11, together with the 3-mid-ascending and 3-mid-descending scenarios of Fig. 12, represent out-of-order PEL flow sequences. For a more detailed comparison, the simulations in Fig. 12 were repeated in different environments with three different values of the robot radius. The results show that a robot can learn from its predecessors, turn PEL into its own knowledge, and then verify that knowledge in the environment to develop its “brain”. Furthermore, although the brains of robot 2 and robot 3 are almost the same (see Fig. 12b; they hold the same number of trajectory nodes), robot 1 follows robot 3. This is because a safety factor is taken into consideration: when a robot holds two different peers’ experiences with the same number of trajectory nodes, it prefers the larger, safer one (Fig. 13).

4.3 Comparison with Other Algorithms

In order to validate the feasibility and effectiveness of the PEL algorithm based on DR, particle swarm optimization (PSO) [32] and ant colony (AC) optimization [33] were chosen for comparison. All simulations were conducted in Matlab 2012a on an Intel Core i5.
Suppose that before entering the labyrinth (both length and width are 100 m), the whole environment is unknown and the robot has to find a way out as fast as possible. The simulation results are shown in Fig. 14a and the corresponding computation times in Fig. 14b.
Figure 14a shows that PEL, PSO, and AC can all obtain a safe path; however, their computation times differ, as Fig. 14b shows. Though the sampled times fluctuate, they remain stable within limits. The mean computation times of AC, PSO, and PEL are 0.148 s, 0.117 s, and 0.072 s respectively: PEL consumes the least time, PSO is second, and AC needs the most. This is because PSO and AC must continuously re-plan the trajectory when they encounter a complex environment.
Even when entering the same environment again, PSO and AC must repeat the previous calculation because they cannot learn to develop themselves. The extra computation time causes instability in the robots (the more time consumed, the more unstable the robot). The PEL algorithm, by contrast, obtains a safe path directly and dispenses with repeated calculation in similar environments by referring to peers’ experience.

5 Conclusions

In terms of humans, we have argued how a DR can follow a developmental pathway similar to natural PEL. After reviewing the terminology from a biobehavioral perspective, a conceptual constructive model for acquiring peers’ experience and turning it to the robot’s own use has been proposed. Some concluding remarks follow.
1.
The “brain” of the agent is closed after birth, which means \( {\varvec{B}}\left( t \right) \) cannot be altered directly by human programmers. Instead, it can only be updated through interaction with the outer environment according to Eq. (1).
 
2.
The DR has been proposed, and a conceptual constructive model for PEL has been devised in parallel with self-other cognitive development, on the basis of peers’ experience justification and verification.
 
3.
The proposed constructive model is expected to shed new light on our understanding of PEL for DR, which can be directly reflected in the design of the “brain”.
 
Still, several issues need attention and further investigation, including practical studies and studies in dynamic environments.

Acknowledgements

We thank the anonymous reviewers for their careful review and helpful suggestions that greatly improved the manuscript. We thank Kai Zhou for suggesting improvements after reading early versions of this manuscript. This work was supported by Grant 61573373 of National Natural Science Foundation of China.

Compliance with Ethical Standards

Conflict of interest

All the authors in this paper declare that they have no conflict of interest.

Ethical Approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Literature
1. Riek LD, Robinson P (2009) Affective-centered design for interactive robots. In: Proceedings of the AISB symposium on new frontiers in human-robot interaction
2. Leite I, Martinho C, Paiva A (2013) Social robots for long-term interaction: a survey. Int J Soc Robot 5:291–308
3. Asada M (2015) Towards artificial empathy: how can artificial empathy follow the developmental pathway of natural empathy? Int J Soc Robot 7:19–33
4. Elman JL, Bates EA, Johnson MH, Karmiloff-Smith A, Parisi D, Plunkett K (1997) Rethinking innateness: a connectionist perspective on development. MIT Press, Cambridge
5. Weng J (1998) Learning in image analysis and beyond: development. In: Visual communication and image processing. Marcel Dekker, New York
6. Weng J, McClelland J, Pentland A, Sporns O, Stockman I, Sur M, Thelen E (2001) Autonomous mental development by robots and animals. Science 291(5504):599–600
7. Oudeyer PY, Kaplan F, Hafner V (2005) The playground experiment: task independent development of a curious robot. In: Proceedings of the AAAI spring symposium workshop on developmental robotics, Stanford, California
8. Blanchard AJ, Canamero L (2006) Anticipating rewards in continuous time and space: a case study in developmental robotics. In: Workshop on anticipatory behavior in adaptive learning systems. Springer, Berlin, Heidelberg, pp 267–284
9. Meng Q, Lee MH (2005) Novelty and habituation: the driving forces in early stage learning for developmental robotics. In: Biomimetic neural learning for intelligent robots. Springer, Berlin, pp 315–332
10. Kelley TD (2006) Developing a psychologically inspired cognitive architecture for robotic control: the symbolic and sub-symbolic robotic intelligence control system (SS-RICS). Int J Adv Robot Syst 3(1):219–222
11. Zelazo PD (2004) The development of conscious control in childhood. Trends Cognit Sci 8(1):12–17
12. Asada M, Nagai Y, Ishihara H (2012) Why not artificial sympathy? In: Proceedings of the international conference on social robotics. Springer, Berlin, Heidelberg, pp 278–287
13. Asada M, Hosoda K, Kuniyoshi Y, Ishiguro H, Inui T, Yoshikawa Y, Ogino M, Yoshida C (2009) Cognitive developmental robotics: a survey. IEEE Trans Auton Ment Dev 1(1):12–34
14. Stoytchev A (2009) Some basic principles of developmental robotics. IEEE Trans Auton Ment Dev 1(2):122–130
15. Asada M, MacDorman KF, Ishiguro H, Kuniyoshi Y (2001) Cognitive developmental robotics as a new paradigm for the design of humanoid robots. Robot Auton Syst 37(1):185–193
16. Decety J (2011) The neuroevolution of empathy. Ann N Y Acad Sci 1231(1):35–45
17. Zhou H, Friedman HS, von der Heydt R (2000) Coding of border ownership in monkey visual cortex. J Neurosci 20:6594–6611
18. Hideyuki T, Kazunori T, Tomoyo M et al (2014) Different impressions of other agents obtained through social interaction uniquely modulate dorsal and ventral pathway activities in the social human brain. Cortex 58:289–300
19. Glassman RB (1977) How can so little brain hold so much knowledge? Applicability of the principle of natural selection to mental processes. Psychol Rec 27(2):393–415
20. Baglio F, Castelli I, Alberoni M et al (2016) Theory of mind in amnestic mild cognitive impairment: an FMRI study. J Alzheimers Dis 29(1):25–34
21. Bonekamp D, Smith MA, Zhu H et al (2010) Quantitative SENSE-MRSI of the human brain. Magn Reson Imaging 28(3):305–313
22. Weng J (2002) A theory for mentally developing robots. In: Proceedings of the 2nd international conference on development and learning (ICDL’02). IEEE, pp 131–140
23. Jeannerod M, Decety J (1995) Mental motor imagery: a window into the representational stages of actions. Curr Opin Neurobiol 5(6):727–732
24. Cisek P, Kalaska JF (2004) Neural correlates of mental rehearsal in dorsal premotor cortex. Nature 431(7011):993–996
26. Rochat P (2003) Five levels of self-awareness as they unfold early in life. Conscious Cognit 12(4):717–731
27. Weng J (2004) Developmental robotics: theory and experiments. Int J Humanoid Robot 1(2):199–236
28. Varela F, Thompson E, Rosch E (1991) The embodied mind: cognitive science and human experience. MIT Press, Cambridge
29. Brooks R, Stein L (1994) Building brains for bodies. Auton Robots 1(1):7–25
30. Clark A (1997) Being there: putting brain, body, and world together again. MIT Press, Cambridge
31. Damasio AR (1994) Descartes’ error: emotion, reason and the human brain. Gosset/Putnam Press, New York
32. Foo JL, Knutzon J, Kalivarapu V, Oliver J, Winer E (2009) Path planning of unmanned aerial vehicles using B-splines and particle swarm optimization. J Aerosp Comput Inf Commun 6:271–290
33. Chen X, Zhao YL, Han JD (2010) An improved ant colony optimization algorithm for robotic path planning. Control Theory Appl 27(6):821–825
Metadata
Title: Peers’ Experience Learning for Developmental Robots
Authors: Ruixuan Wei, Qirui Zhang, Zhuofan Xu
Publication date: 28-02-2019
Publisher: Springer Netherlands
Published in: International Journal of Social Robotics, Issue 1/2020
Print ISSN: 1875-4791
Electronic ISSN: 1875-4805
DOI: https://doi.org/10.1007/s12369-019-00531-0
