2013 | Book

Intrinsically Motivated Learning in Natural and Artificial Systems

Edited by: Gianluca Baldassarre, Marco Mirolli

Publisher: Springer Berlin Heidelberg

About this book

It has become clear to researchers in robotics and adaptive behaviour that current approaches are yielding systems with limited autonomy and capacity for self-improvement. To learn autonomously and in a cumulative fashion is one of the hallmarks of intelligence, and we know that higher mammals engage in exploratory activities that are not directed to pursue goals of immediate relevance for survival and reproduction but are instead driven by intrinsic motivations such as curiosity, interest in novel stimuli or surprising events, and interest in learning new behaviours. The adaptive value of such intrinsically motivated activities lies in the fact that they allow the cumulative acquisition of knowledge and skills that can be used later to accomplish fitness-enhancing goals. Intrinsic motivations continue during adulthood, and in humans they underlie lifelong learning, artistic creativity, and scientific discovery, while they are also the basis for processes that strongly affect human well-being, such as the sense of competence, self-determination, and self-esteem.

This book has two aims: to present the state of the art in research on intrinsically motivated learning, and to identify the related scientific and technological open challenges and most promising research directions. The book introduces the concept of intrinsic motivation in artificial systems, reviews the relevant literature, offers insights from the neural and behavioural sciences, and presents novel tools for research. The book is organized into six parts: the chapters in Part I give general overviews on the concept of intrinsic motivations, their function, and possible mechanisms for implementing them; Parts II, III, and IV focus on three classes of intrinsic motivation mechanisms, those based on predictors, on novelty, and on competence; Part V discusses mechanisms that are complementary to intrinsic motivations; and Part VI introduces tools and experimental frameworks for investigating intrinsic motivations.

The contributing authors are among the pioneers carrying out fundamental work on this topic, drawn from related disciplines such as artificial intelligence, robotics, artificial life, evolution, machine learning, developmental psychology, cognitive science, and neuroscience. The book will be of value to graduate students and academic researchers in these domains, and to engineers engaged with the design of autonomous, adaptive robots.

Table of Contents

Frontmatter
Intrinsically Motivated Learning Systems: An Overview
Abstract
This chapter introduces the field of intrinsically motivated learning systems and illustrates the content, objectives, and organisation of the book. The chapter first expands on the concept of intrinsic motivations, then introduces a taxonomy of three classes of intrinsic-motivation mechanisms (based on predictors, on novelty detection, and on competence), and finally introduces and reviews the various contributions of the book. The contributions are organised into six parts. The contributions of the first part provide general overviews on the concept of intrinsic motivations, the possible mechanisms that may implement them, and the functions that they can play. The contributions of the second, third, and fourth parts focus on the three classes of the aforementioned intrinsic-motivation mechanisms. The contributions of the fifth part discuss mechanisms that are complementary to intrinsic motivations. The contributions of the sixth part introduce tools and experimental paradigms that can be used to investigate intrinsic motivations.
Gianluca Baldassarre, Marco Mirolli

General Overviews on Intrinsic Motivations

Frontmatter
Intrinsic Motivation and Reinforcement Learning
Abstract
Psychologists distinguish between extrinsically motivated behavior, which is behavior undertaken to achieve some externally supplied reward, such as a prize, a high grade, or a high-paying job, and intrinsically motivated behavior, which is behavior done for its own sake. Is an analogous distinction meaningful for machine learning systems? Can we say of a machine learning system that it is motivated to learn, and if so, is it possible to provide it with an analog of intrinsic motivation? Despite the fact that a formal distinction between extrinsic and intrinsic motivation is elusive, this chapter argues that the answer to both questions is assuredly “yes” and that the machine learning framework of reinforcement learning is particularly appropriate for bringing learning together with what in animals one would call motivation. Despite the common perception that a reinforcement learning agent’s reward has to be extrinsic because the agent has a distinct input channel for reward signals, reinforcement learning provides a natural framework for incorporating principles of intrinsic motivation.
Andrew G. Barto
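
As a minimal illustration of the framing in this abstract (and only that: the sketch below is a generic construction, not the chapter's own model), an intrinsic reward can be fed into an ordinary reinforcement-learning update simply by adding it to the extrinsic reward before computing the temporal-difference error. All names and constants are illustrative.

from collections import defaultdict

ALPHA, GAMMA, BETA = 0.1, 0.95, 0.5   # learning rate, discount factor, intrinsic-reward weight
Q = defaultdict(float)                # Q[(state, action)] -> estimated value

def q_update(state, action, next_state, actions, r_ext, r_int):
    # Total reward is the externally supplied reward plus a weighted,
    # internally generated (intrinsic) reward such as a novelty bonus.
    r_total = r_ext + BETA * r_int
    best_next = max(Q[(next_state, a)] for a in actions)
    td_error = r_total + GAMMA * best_next - Q[(state, action)]
    Q[(state, action)] += ALPHA * td_error
    return td_error

The weight BETA simply controls how strongly the internally generated signal shapes behaviour relative to the externally supplied reward.
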
Functions and Mechanisms of Intrinsic Motivations: The Knowledge Versus Competence Distinction
Abstract
Mammals, and humans in particular, are endowed with an exceptional capacity for cumulative learning. This capacity crucially depends on the presence of intrinsic motivations, that is, motivations that are directly related not to an organism’s survival and reproduction but rather to its ability to learn. Recently, there have been a number of attempts to model and reproduce intrinsic motivations in artificial systems. Different kinds of intrinsic motivations have been proposed both in psychology and in machine learning and robotics: some are based on the knowledge of the learning system, while others are based on its competence. In this contribution, we discuss the distinction between knowledge-based and competence-based intrinsic motivations with respect to both the functional roles that motivations play in learning and the mechanisms by which those functions are implemented. In particular, after arguing that the principal function of intrinsic motivations consists in allowing the development of a repertoire of skills (rather than of knowledge), we suggest that at least two different sub-functions can be identified: (a) discovering which skills might be acquired and (b) deciding which skill to train when. We propose that in biological organisms, knowledge-based intrinsic motivation mechanisms might implement the former function, whereas competence-based mechanisms might underlie the latter one.
Marco Mirolli, Gianluca Baldassarre
Exploration from Generalization Mediated by Multiple Controllers
Abstract
Intrinsic motivation involves internally governed drives for exploration, curiosity, and play. These shape subjects over the course of development and beyond, leading them to explore in order to learn, to expand the actions they are capable of performing, and to acquire skills that can be useful in future domains. We adopt a utilitarian view of this learning process, treating it in terms of exploration bonuses that arise from distributions over the structure of the world that imply potential benefits from generalizing knowledge and skills to subsequent environments. We discuss how functionally and architecturally different controllers may realize these bonuses in different ways.
Peter Dayan
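
For readers who want a concrete, if simplistic, picture of an exploration bonus, the sketch below uses a common count-based bonus. The chapter itself derives its bonuses from distributions over the structure of the world rather than from raw visit counts, so the formula here is a stand-in for illustration only.

import math
from collections import defaultdict

visit_counts = defaultdict(int)
BONUS_SCALE = 1.0   # illustrative constant

def exploration_bonus(state):
    # Rarely visited states receive a larger bonus; the bonus shrinks as the
    # state becomes familiar. The bonus is added to the environmental reward
    # before learning, biasing the agent towards unexplored regions.
    visit_counts[state] += 1
    return BONUS_SCALE / math.sqrt(visit_counts[state])
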

Prediction-Based Intrinsic Motivation Mechanisms

Frontmatter
Maximizing Fun by Creating Data with Easily Reducible Subjective Complexity
Abstract
The Formal Theory of Fun and Creativity (1990–2010) [Schmidhuber, J.: Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE Trans. Auton. Mental Dev. 2(3), 230–247 (2010b)] describes principles of a curious and creative agent that never stops generating nontrivial & novel & surprising tasks and data. Two modules are needed: a data encoder and a data creator. The former encodes the growing history of sensory data as the agent is interacting with its environment; the latter executes actions shaping the history. Both learn. The encoder continually tries to encode the created data more efficiently, by discovering new regularities in it. Its learning progress is the wow-effect or fun or intrinsic reward of the creator, which maximizes future expected reward, being motivated to invent skills leading to interesting data that the encoder does not yet know but can easily learn with little computational effort. I have argued that this simple formal principle explains science and art and music and humor.
Jürgen Schmidhuber
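
The learning-progress principle summarised above can be reduced to a toy form: the intrinsic reward is the drop in the encoder's prediction error on freshly created data after one learning step. The tiny linear predictor and constants below are illustrative assumptions, not the theory's prescribed implementation.

import numpy as np

LEARNING_RATE = 0.01   # illustrative constant
weights = None         # parameters of a small online linear predictor

def intrinsic_reward(x_batch, y_batch):
    # Reward = reduction in squared prediction error on the new data after one
    # gradient step of the predictor ("learning progress", not raw error).
    global weights
    if weights is None:
        weights = np.zeros(x_batch.shape[1])
    error_before = np.mean((x_batch @ weights - y_batch) ** 2)
    grad = 2 * x_batch.T @ (x_batch @ weights - y_batch) / len(y_batch)
    weights = weights - LEARNING_RATE * grad
    error_after = np.mean((x_batch @ weights - y_batch) ** 2)
    return max(0.0, error_before - error_after)

Data that is already compressible (no progress possible) or purely random (no progress achievable) yields no reward, which is exactly what makes the creator seek out learnable regularities.
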
The Role of the Basal Ganglia in Discovering Novel Actions
Abstract
Our interest is in the neural circuitry which supports the discovery and encoding of novel actions. We discuss the significant existing literature which identifies the basal ganglia, a complex of subcortical nuclei, as important in both the selection of actions and in reinforcement learning. We discuss the complementarity of these problems of action selection and action learning. Two basic mechanisms of biasing action selection are identified: (a) adjusting the relative strengths of competing inputs and (b) adjusting the relative sensitivity of the receiver of reinforced inputs. We discuss the particular importance of the phasic dopamine signal in the basal ganglia and its proposed role in conveying a reward prediction error. Temporal constraints of this signal limit the information it can convey to immediately surprising sensory events, thus—we argue—making it inappropriate to convey information regarding the economic value of actions (as proposed by the reward prediction error hypothesis). Rather, we suggest this signal is ideal to support the identification of novel actions and their encoding via the biasing of future action selection.
Peter Redgrave, Kevin Gurney, Tom Stafford, Martin Thirkettle, Jen Lewis
Action Discovery and Intrinsic Motivation: A Biologically Constrained Formalisation
Abstract
We introduce a biologically motivated, formal framework or “ontology” for dealing with many aspects of action discovery which we argue is an example of intrinsically motivated behaviour (as such, this chapter is a companion to that by Redgrave et al. in this volume). We argue that action discovery requires an interplay between separate internal forward models of prediction and inverse models mapping outcomes to actions. The process of learning actions is driven by transient changes in the animal’s policy (repetition bias) which is, in turn, a result of unpredicted, phasic sensory information (“surprise”). The notion of salience as value is introduced and broken down into contributions from novelty (or surprise), immediate reward acquisition, or general task/goal attainment. Many other aspects of biological action discovery emerge naturally in our framework which aims to guide future modelling efforts in this domain.
Kevin Gurney, Nathan Lepora, Ashvin Shah, Ansgar Koene, Peter Redgrave
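
A toy rendering of the surprise-driven repetition bias described in this abstract might look as follows; the thresholds, the multiplicative form of the bias, and its decay are assumptions made for illustration, not the chapter's formal ontology.

SURPRISE_THRESHOLD = 0.5   # illustrative constants
BIAS_BOOST, BIAS_DECAY = 2.0, 0.9
repetition_bias = {}       # action -> multiplicative preference used at selection time

def on_outcome(action, predicted_outcome, observed_outcome, distance):
    # A forward model supplies predicted_outcome; `distance` measures how badly
    # it was predicted. Surprising (poorly predicted) outcomes transiently boost
    # the tendency to repeat the action that preceded them, giving the learner
    # more samples of the new action-outcome contingency.
    surprise = distance(predicted_outcome, observed_outcome)
    if surprise > SURPRISE_THRESHOLD:
        repetition_bias[action] = repetition_bias.get(action, 1.0) * BIAS_BOOST
    for a in repetition_bias:
        # all biases decay back towards 1, so the repetition bias is transient
        repetition_bias[a] = 1.0 + BIAS_DECAY * (repetition_bias[a] - 1.0)
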

Novelty-Based Intrinsic Motivation Mechanisms

Frontmatter
Novelty Detection as an Intrinsic Motivation for Cumulative Learning Robots
Abstract
Novelty detection is an inherent part of intrinsic motivations and constitutes an important research issue for the effective and long-term operation of intelligent robots designed to learn, act and make decisions based on their cumulative knowledge and experience. Our approach to novelty detection is from the perspective that the robot ignores perceptions that are already known, but is able to identify anything different. This is achieved by developing biologically inspired novelty detectors based on habituation. Habituation is a type of non-associative learning used to describe the behavioural phenomenon of decreased responsiveness of a cognitive organism to a recently and frequently presented stimulus, and it has been observed in a number of biological organisms. This chapter first considers the relationship between intrinsic motivations and novelty detection and outlines some works on intrinsic motivations. It then presents a critical review of the methods of novelty detection published by the authors. A brief summary of some key recent surveys in the field is then provided. Finally, key open challenges that need to be considered in the design of novelty detection filters for cumulative learning tasks are discussed.
Ulrich Nehmzow, Yiannis Gatsoulis, Emmett Kerr, Joan Condell, Nazmul Siddique, T. Martin McGinnity
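
The habituation idea underlying these detectors can be sketched in a few lines: each stimulus evokes a response whose strength decays with repeated presentation and slowly recovers otherwise, and a stimulus counts as novel while its response is still above threshold. The discrete update and constants below are illustrative, not the authors' published model.

Y0, TAU, RECOVERY, THRESHOLD = 1.0, 5.0, 0.02, 0.5   # illustrative constants
efficacy = {}   # stimulus id -> current (habituated) response strength

def is_novel(stimulus):
    # The presented stimulus habituates (its response decays towards a low
    # steady state); every other known stimulus dishabituates slightly
    # (its response recovers towards the resting value Y0).
    y = efficacy.get(stimulus, Y0)
    efficacy[stimulus] = y + RECOVERY * (Y0 - y) - y / TAU
    for s in efficacy:
        if s != stimulus:
            efficacy[s] += RECOVERY * (Y0 - efficacy[s])
    return y > THRESHOLD   # novel while the response is still strong
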
Novelty and Beyond: Towards Combined Motivation Models and Integrated Learning Architectures
Abstract
For future intrinsically motivated agents to combine multiple intrinsic motivation or behavioural components, there is a need to identify fundamental units of motivation models that can be reused and combined to produce more complex agents. This chapter reviews three existing models of intrinsic motivation (novelty, interest, and competence-seeking motivation) that are based on the neural network framework of a real-time novelty detector. Four architectures are discussed that combine basic units of the intrinsic motivation functions in different ways. This chapter concludes with a discussion of future directions for combined motivation models and integrated learning architectures.
Kathryn E. Merrick
The Hippocampal-VTA Loop: The Role of Novelty and Motivation in Controlling the Entry of Information into Long-Term Memory
Abstract
The role of dopamine has been strongly implicated in reward processes, but recent work shows an additional role as a signal that promotes the stable incorporation of novel information into long-term hippocampal memory. Indeed, dopamine neurons, in addition to being activated by reward, can be activated by novelty in the absence of reward. The computation of novelty is thought to occur in the hippocampus and is carried to the dopamine cells of the VTA through a polysynaptic pathway. Although a picture of novelty-dependent processes in the VTA and hippocampus is beginning to emerge, many aspects of the process remain unclear. Here, we will consider several issues: (1) What is the relationship of novelty signals coming to the VTA from the superior colliculus, as compared to those that come from the hippocampus? (2) Can dopamine released by a reward enhance the learning of novel information? (3) Is there an interaction between motivational signals and hippocampal novelty signals? (4) What are the properties of the axons that generate dopamine release in the hippocampus? We close with a discussion of some of the outstanding open issues in this field.
Nonna Otmakhova, Emrah Duzel, Ariel Y. Deutch, John Lisman

Competence-Based Intrinsic Motivation Mechanisms

Frontmatter
Deciding Which Skill to Learn When: Temporal-Difference Competence-Based Intrinsic Motivation (TD-CB-IM)
Abstract
Intrinsic motivations can be defined by contrasting them to extrinsic motivations. Extrinsic motivations drive the learning of behavior directed to satisfy basic needs related to the organism's survival and reproduction. Intrinsic motivations, instead, are motivations that serve the evolutionary function of acquiring knowledge (e.g., the capacity to predict) and competence (i.e., the capacity to do) in the absence of extrinsic motivations: this knowledge and competence can be later exploited for producing behaviors that enhance biological fitness. Knowledge-based intrinsic motivation mechanisms (KB-IM), usable for guiding learning on the basis of the level or change of knowledge, have been widely modeled and studied. Instead, competence-based intrinsic motivation mechanisms (CB-IM), usable for guiding learning on the basis of the level or improvement of competence, have been much less investigated. The goal of this chapter is twofold. First, it aims to clarify the nature and possible roles of CB-IM mechanisms for learning, in particular in relation to the cumulative acquisition of a repertoire of skills. Second, it aims to review a specific CB-IM mechanism, the Temporal-Difference Competence-Based Intrinsic Motivation (TD-CB-IM). TD-CB-IM measures the improvement rate of skill acquisition on the basis of the Temporal-Difference learning signal (TD error) that is used in several reinforcement learning (RL) models. The effectiveness of the mechanism is supported by reviewing and discussing in depth the results of experiments in which the TD-CB-IM mechanism is successfully exploited by a hierarchical RL model controlling a simulated navigating robot to decide when to train different skills in different environmental conditions.
Gianluca Baldassarre, Marco Mirolli
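
A schematic reduction of the TD-CB-IM idea (not the chapter's full hierarchical reinforcement-learning architecture) is a high-level selector that treats the TD error generated while a skill is being trained as an intrinsic reward, and therefore keeps choosing the skills whose competence is currently improving fastest. The skill names and constants below are hypothetical.

import math, random

SELECTOR_LR = 0.1                                          # illustrative constant
skill_value = {"reach": 0.0, "grasp": 0.0, "place": 0.0}   # hypothetical skill names

def choose_skill(temperature=0.3):
    # Softmax selection over each skill's estimated learning progress.
    names = list(skill_value)
    prefs = [math.exp(skill_value[n] / temperature) for n in names]
    return random.choices(names, weights=prefs)[0]

def update_selector(skill, td_errors):
    # The intrinsic reward for having practised `skill` is the mean magnitude
    # of the TD errors its learner produced: large errors mean competence is
    # still changing, so the skill is worth practising further.
    progress = sum(abs(e) for e in td_errors) / max(1, len(td_errors))
    skill_value[skill] += SELECTOR_LR * (progress - skill_value[skill])
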
Intrinsically Motivated Affordance Discovery and Modeling
Abstract
In this chapter, we argue that a single intrinsic motivation function for affordance discovery can guide long-term learning in robot systems. To this end, we provide a novel definition of "affordance" as the latent potential for the closed-loop control of environmental stimuli perceived by sensors. Specifically, the proposed intrinsic motivation function rewards the discovery of such control affordances. We will demonstrate how this function has been used by a humanoid robot to learn a number of general-purpose control skills that address many different tasks. These skills, for example, include strategies for finding, grasping, and placing simple objects. We further show how this same intrinsic reward function is used to direct the robot to build stable models of when the environment affords these skills.
Stephen Hart, Roderic Grupen

Mechanisms Complementary to Intrinsic Motivations

Frontmatter
Intrinsically Motivated Learning of Real-World Sensorimotor Skills with Developmental Constraints
Abstract
Open-ended exploration and learning in the real world is a major challenge of developmental robotics. Three properties of real-world sensorimotor spaces provide important conceptual and technical challenges: unlearnability, high dimensionality, and unboundedness. In this chapter, we argue that exploration in such spaces needs to be constrained and guided by several combined developmental mechanisms. While intrinsic motivation, that is, curiosity-driven learning, is a key mechanism to address this challenge, it has to be complemented and integrated with other developmental constraints, in particular: sensorimotor primitives and embodiment, task space representations, maturational processes (i.e., adaptive changes of the embodied sensorimotor apparatus), and social guidance. We illustrate and discuss the potential of such an integration of developmental mechanisms in several robot learning experiments.
Pierre-Yves Oudeyer, Adrien Baranes, Frédéric Kaplan
Investigating the Origins of Intrinsic Motivation in Human Infants
Abstract
One of the earliest behaviors driven by intrinsic motivation is visual exploration. In this chapter, I highlight how the development of this capacity is influenced not only by changes in the brain that take place after birth but also by the acquisition of oculomotor skill. To provide a context for interpreting these developmental changes, I then survey three theoretical perspectives that are available for explaining how and why visual exploration develops. Next, I describe work on the development of perceptual completion, which offers a case study on the development of visual exploration and the role of oculomotor skill. I conclude by discussing a number of challenges and open questions that are suggested by this work.
Matthew Schlesinger

Tools for Research on Intrinsic Motivations

Frontmatter
A Novel Behavioural Task for Researching Intrinsic Motivations
Abstract
We present a novel behavioural task for the investigation of how actions are added to an agent's repertoire. In the task, free exploration of the range of possible movements with a manipulandum, such as a joystick, is recorded. A subset of these movements triggers a reinforcing signal. Our interest is in how those elements of total behaviour which cause an unexpected outcome are identified and stored. This process is necessarily prior to the attachment of value to different actions [Redgrave, P., Gurney, K.: The short-latency dopamine signal: A role in discovering novel actions? Nat. Rev. Neurosci. 7(12), 967–975 (2006)]. The task allows for critical tests of reinforcement prediction error theories [e.g. Schultz, W., Dayan, P., Montague, P.: A neural substrate of prediction and reward. Science 275, 1593–1599 (1997)], as well as providing a window on a number of other issues in action learning. The task provides a paradigm where the exploratory motive drives learning, and as such we view it as in the tradition of Thorndike [Animal intelligence (1911)]. Our task is easily scalable in difficulty, is adaptable across species and provides a rich set of behavioural measures throughout the action-learning process. Targets can be defined in spatial, temporal or kinematic/gestural terms, and the task also allows the concatenation of actions to be investigated. Action learning requires integration across spatial, kinematic and temporal dimensions. The task affords insight into these (and into the process of integration).
Tom Stafford, Tom Walton, Len Hetherington, Martin Thirkettle, Kevin Gurney, Peter Redgrave
The “Mechatronic Board”: A Tool to Study Intrinsic Motivations in Humans, Monkeys, and Humanoid Robots
Abstract
In this chapter the design and fabrication of a new mechatronic platform (called “mechatronic board”) for behavioural analysis of children, non-human primates, and robots are presented and discussed. The platform is the result of a multidisciplinary design approach which merges indications coming from neuroscientists, psychologists, primatologists, roboticists, and bioengineers, with the main goal of studying learning mechanisms driven by intrinsic motivations and curiosity. This chapter firstly introduces the main requirements of the platform, coming from the different needs of the experiments involving the different types of participants. Then, it provides a detailed analysis of the main features of the mechatronic board, focusing on its key aspects which allow the study of intrinsically motivated learning in children and non-human primates. Finally, it shows some preliminary results on curiosity-driven learning coming from pilot experiments involving children, capuchin monkeys, and a computational model of the behaviour of these organisms tested with a humanoid robot (the iCub robot). These experiments investigate the capacity of children, capuchin monkeys, and a computational model implemented on the iCub robot to learn action-outcome contingencies on the basis of intrinsic motivations.
Fabrizio Taffoni, Domenico Formica, Giuseppina Schiavone, Maria Scorcia, Alessandra Tomassetti, Eugenia Polizzi di Sorrentino, Gloria Sabbatini, Valentina Truppa, Francesco Mannella, Vincenzo Fiore, Marco Mirolli, Gianluca Baldassarre, Elisabetta Visalberghi, Flavio Keller, Eugenio Guglielmelli
The iCub Platform: A Tool for Studying Intrinsically Motivated Learning
Abstract
Intrinsically motivated robots are machines designed to operate for long periods of time, performing tasks for which they have not been programmed. These robots make extensive use of explorative, often unstructured actions in search of opportunities to learn and extract information from the environment. Research in this field faces challenges that require advances not only in algorithms but also in experimental platforms. The iCub is a humanoid platform that was designed to support research in cognitive systems. We review in this chapter the chief characteristics of the iCub robot, devoting particular attention to those aspects that make the platform particularly suitable to the study of intrinsically motivated learning. We provide details on the software architecture, the mechanical design, and the sensory system. We report examples of experiments and software modules to show how the robot can be programmed to obtain complex behaviors involving the interaction with the environment. The goal of this chapter is to illustrate the potential impact of the iCub on the scientific community at large and, in particular, on the field of intrinsically motivated learning.
Lorenzo Natale, Francesco Nori, Giorgio Metta, Matteo Fumagalli, Serena Ivaldi, Ugo Pattacini, Marco Randazzo, Alexander Schmitz, Giulio Sandini
Metadata
Title
Intrinsically Motivated Learning in Natural and Artificial Systems
Edited by
Gianluca Baldassarre
Marco Mirolli
Copyright year
2013
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-32375-1
Print ISBN
978-3-642-32374-4
DOI
https://doi.org/10.1007/978-3-642-32375-1