
2006 | Book

Learning and Adaption in Multi-Agent Systems

First International Workshop, LAMAS 2005, Utrecht, The Netherlands, July 25, 2005, Revised Selected Papers

Edited by: Karl Tuyls, Pieter Jan 't Hoen, Katja Verbeeck, Sandip Sen

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

This book contains selected and revised papers of the International Workshop on Learning and Adaptation in Multi-Agent Systems (LAMAS 2005), held at the AAMAS 2005 Conference in Utrecht, The Netherlands, July 26. An important aspect of multi-agent systems (MASs) is that the environment evolves over time, not only due to external environmental changes but also due to agent interactions. For this reason it is important that an agent can learn from experience and adapt its knowledge so that it can make rational decisions and act autonomously in this changing environment. Machine learning techniques for single-agent frameworks are well established: agents operate in uncertain environments and must be able to learn and act autonomously. The task is more complex, however, when the agent interacts with other agents that have potentially different capabilities and goals. The single-agent case is structurally different from the multi-agent case because of the added dimension of dynamic interactions between the adaptive agents. Multi-agent learning, i.e., the ability of the agents to learn how to cooperate and compete, therefore becomes crucial in many domains. Autonomous agents and multi-agent systems (AAMAS) is an emerging multi-disciplinary area encompassing computer science, software engineering, biology, and the cognitive and social sciences. A theoretical framework in which the rationality of learning and interacting agents can be understood is still under development for MASs, although there have been promising first results.

Table of Contents

Frontmatter
An Overview of Cooperative and Competitive Multiagent Learning
Abstract
The study of multi-agent systems (MASs) is an area of distributed artificial intelligence that emphasizes the joint behaviors of agents with some degree of autonomy and the complexities arising from their interactions. The research on MASs is intensifying, as supported by a growing number of conferences, workshops, and journal papers. In this survey we give an overview of multi-agent learning research in a spectrum of areas, including reinforcement learning, evolutionary computation, game theory, complex systems, agent modeling, and robotics.
MASs range in their description from cooperative to competitive in nature. To muddy the waters, competitive systems can show apparent cooperative behavior, and vice versa. In practice, agents can show a wide range of behaviors in a system that may fit either label, cooperative or competitive, depending on the circumstances. In this survey, we discuss current work on cooperative and competitive MASs and aim to make the distinctions and overlap between the two approaches more explicit.
Lastly, this paper summarizes the papers of the First International Workshop on Learning and Adaptation in MAS (LAMAS), hosted at the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS'05), and places that work within the above survey.
Pieter Jan ’t Hoen, Karl Tuyls, Liviu Panait, Sean Luke, J. A. La Poutré
Multi-robot Learning for Continuous Area Sweeping
Abstract
As mobile robots become increasingly autonomous over extended periods of time, opportunities arise for their use on repetitive tasks. We define and implement behaviors for a class of such tasks that we call continuous area sweeping tasks. A continuous area sweeping task is one in which a group of robots must repeatedly visit all points in a fixed area, possibly with non-uniform frequency, as specified by a task-dependent cost function. Examples of problems that need continuous area sweeping are trash removal in a large building and routine surveillance. We present a formulation for this problem and an initial algorithm to address it. The approach is analyzed formally and is fully implemented and tested, both in simulation and on physical robots.
Mazda Ahmadi, Peter Stone
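The abstract above describes sweeping a fixed area with visit frequencies driven by a task-dependent cost function. As a minimal, purely illustrative sketch (not the authors' algorithm), one simple target-selection rule is to send a robot to the cell whose cost-weighted idleness is largest; the data layout and function name below are assumptions.

```python
def next_target(cells, now):
    """Pick the cell whose priority-weighted 'idleness' (time since last visit)
    is largest, so that high-cost regions are swept more frequently.

    cells: list of dicts like {"pos": (x, y), "cost": w, "last_visit": t}
    now:   current time
    """
    return max(cells, key=lambda c: c["cost"] * (now - c["last_visit"]))["pos"]
```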
Learning Automata as a Basis for Multi Agent Reinforcement Learning
Abstract
In this paper we summarize some important theoretical results from the domain of Learning Automata. We start with single-stage, single-agent learning schemes and gradually extend the setting to multi-stage, multi-agent systems. We argue that the theory of Learning Automata is an ideal basis on which to build multi-agent learning algorithms.
Ann Nowé, Katja Verbeeck, Maarten Peeters
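A central building block in the learning automata literature referenced above is the classic linear reward-inaction (L_R-I) update, in which the probability of the chosen action is reinforced after a favorable outcome and nothing changes after an unfavorable one. The sketch below illustrates that standard scheme only; it is not a reproduction of the algorithms in the paper, and the parameter names are illustrative.

```python
import random

def lri_update(probs, chosen, success, lam=0.1):
    """Linear reward-inaction (L_R-I) update: on a favorable (binary) outcome,
    shift probability mass toward the chosen action; on failure, do nothing."""
    if success:
        for a in range(len(probs)):
            if a == chosen:
                probs[a] += lam * (1.0 - probs[a])
            else:
                probs[a] *= (1.0 - lam)
    return probs

def choose_action(probs):
    """Sample an action index according to the automaton's action probabilities."""
    return random.choices(range(len(probs)), weights=probs, k=1)[0]
```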
Learning Pareto-optimal Solutions in 2x2 Conflict Games
Abstract
Multiagent learning literature has investigated iterated two-player games to develop mechanisms that allow agents to learn to converge on Nash Equilibrium strategy profiles. Such equilibrium configurations imply that no player has the motivation to unilaterally change its strategy. Often, in general sum games, a higher payoff can be obtained by both players if one chooses not to respond myopically to the other player. By developing mutual trust, agents can avoid immediate best responses that will lead to a Nash Equilibrium with lesser payoff. In this paper we experiment with agents who select actions based on expected utility calculations that incorporate the observed frequencies of the actions of the opponent(s). We augment these stochastically greedy agents with an interesting action revelation strategy that involves strategic declaration of one’s commitment to an action to avoid worst-case, pessimistic moves. We argue that in certain situations, such apparently risky action revelation can indeed produce better payoffs than a non-revealing approach. In particular, it is possible to obtain Pareto-optimal Nash Equilibrium outcomes. We improve on the outcome efficiency of a previous algorithm and present results over the set of structurally distinct two-person two-action conflict games where the players’ preferences form a total order over the possible outcomes. We also present results on a large number of randomly generated payoff matrices of varying sizes and compare the payoffs of strategically revealing learners to payoffs at Nash equilibrium.
Stéphane Airiau, Sandip Sen
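The agents described above select actions by expected utility computed from the observed frequencies of the opponent's actions (a fictitious-play-style rule). The sketch below shows only that selection step for a 2x2 game under assumed names and data shapes; the action-revelation strategy that is the paper's main contribution is not modeled here.

```python
import numpy as np

def expected_utility_action(payoff, opp_counts):
    """Pick the row action maximizing expected payoff against the empirical
    distribution of the opponent's observed column actions.

    payoff: 2x2 matrix, payoff[i][j] = row player's payoff for joint action (i, j)
    opp_counts: observed counts of the opponent's actions, e.g. [3, 7]
    """
    counts = np.asarray(opp_counts, dtype=float)
    if counts.sum() > 0:
        freqs = counts / counts.sum()
    else:
        freqs = np.full(len(counts), 1.0 / len(counts))  # uniform prior before any observation
    expected = np.asarray(payoff, dtype=float) @ freqs
    return int(np.argmax(expected))
```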
Unifying Convergence and No-Regret in Multiagent Learning
Abstract
We present a new multiagent learning algorithm, RVσ(t), that builds on an earlier version, ReDVaLeR. ReDVaLeR could guarantee (a) convergence to best response against stationary opponents and either (b) constant bounded regret against arbitrary opponents, or (c) convergence to Nash equilibrium policies in self-play. But it makes two strong assumptions: (1) that it can distinguish between self-play and otherwise non-stationary agents and (2) that all agents know their portions of the same equilibrium in self-play. We show that the adaptive learning rate of RVσ(t) that is explicitly dependent on time can overcome both of these assumptions. Consequently, RVσ(t) theoretically achieves (a') convergence to near-best response against eventually stationary opponents, (b') no-regret payoff against arbitrary opponents and (c') convergence to some Nash equilibrium policy in some classes of games, in self-play. Each agent now needs to know its portion of any equilibrium, and does not need to distinguish among non-stationary opponent types. This is also the first successful attempt (to our knowledge) at convergence of a no-regret algorithm in the Shapley game.
Bikramjit Banerjee, Jing Peng
Implicit Coordination in a Network of Social Drivers: The Role of Information in a Commuting Scenario
Abstract
One of the major research directions in multi-agent systems is dedicated to learning how to coordinate and to whether individual agents' decisions can lead to globally optimal, or at least acceptable, solutions. Our long-term goal is to study the effect of several types of information on the decision process of the individual agents. The present paper addresses the simulation of agents' decision-making regarding route choice, and the role of an information component. This information can be provided by group colleagues, by acquaintances from other groups (small-world), or by route guidance. In addition, we study the role of agents lying about their choices. We compare these scenarios and conclude that information (from some kind of source) is beneficial in general: lying helps only to a certain extent, and route guidance is the best type of information.
Ana L. C. Bazzan, Manuel Fehler, Franziska Klügl
Multiagent Traffic Management: Opportunities for Multiagent Learning
Abstract
Traffic congestion is one of the leading causes of lost productivity and decreased standard of living in urban settings. In previous work published at AAMAS, we have proposed a novel reservation-based mechanism for increasing throughput and decreasing delays at intersections [3]. In more recent work, we have provided a detailed protocol by which two different classes of agents (intersection managers and driver agents) can use this system [4]. We believe that the domain created by this mechanism and protocol presents many opportunities for multiagent learning on the parts of both classes of agents. In this paper, we identify several of these opportunities and offer a first-cut approach to each.
Kurt Dresner, Peter Stone
Dealing with Errors in a Cooperative Multi-agent Learning System
Abstract
This paper presents some methods of dealing with the problem of cooperative learning in a multi-agent system in error-prone environments. A system is developed that learns by reinforcement and is robust to errors that can come from the agents' sensors, from another agent that shares wrong information, or even from the communication channel.
Constança Oliveira e Sousa, Luis Custódio
The Success and Failure of Tag-Mediated Evolution of Cooperation
Abstract
Use of tags to limit partner selection for playing has been shown to produce stable cooperation in agent populations playing the Prisoner's Dilemma (PD) game. There is, however, a lack of understanding of how and why tags facilitate such cooperation. We start with an empirical investigation that identifies the key dynamics that result in sustainable cooperation in PD. Sufficiently long tags are needed to achieve this effect. A theoretical analysis shows that multiple simulation parameters including tag length, mutation rate and population size will have significant effect on sustaining cooperation. Experiments partially validate these observations. Additionally, we claim that tags only promote mimicking and not coordinated behavior in general, i.e., tags can promote cooperation only if cooperation requires identical actions from all group members. We illustrate the failure of the tag model to sustain cooperation by experimenting with domains where agents need to take complementary actions to maximize payoff.
Austin McDonald, Sandip Sen
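In tag models of the kind discussed above, agents typically carry a bit-string tag and interact only with partners whose tags are sufficiently similar. The sketch below is a generic illustration of that matching step, not the paper's exact model; the tag length and tolerance parameter are assumptions.

```python
import random

TAG_LENGTH = 32  # "sufficiently long" tags per the abstract; the exact length is illustrative

def random_tag(length=TAG_LENGTH):
    """Create a random bit-string tag for a new agent."""
    return [random.randint(0, 1) for _ in range(length)]

def tags_match(tag_a, tag_b, tolerance=0):
    """Two agents are allowed to interact (e.g. play PD together) only if their
    tags differ in at most `tolerance` bit positions; with tolerance 0 the tags
    must be identical."""
    hamming = sum(a != b for a, b in zip(tag_a, tag_b))
    return hamming <= tolerance
```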
An Adaptive Approach for the Exploration-Exploitation Dilemma and Its Application to Economic Systems
Abstract
Learning agents have to deal with the exploration-exploitation dilemma. The choice between exploration and exploitation is very difficult in dynamic systems, in particular in large-scale ones such as economic systems. Recent research shows that there is neither an optimal nor a unique solution for this problem. In this paper, we propose an adaptive approach based on meta-rules to adapt the choice between exploration and exploitation. This new adaptive approach relies on the variations of the performance of the agents. To validate the approach, we apply it to economic systems and compare it to two adaptive methods originally proposed by Wilson: one local and one global. Moreover, we compare different exploration strategies and focus on their influence on the performance of the agents.
Lilia Rejeb, Zahia Guessoum, Rym M’Hallah
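The abstract above ties the exploration-exploitation choice to variations in agent performance. A minimal sketch of that idea, assuming an epsilon-greedy setting, is a meta-rule that raises the exploration rate when performance degrades and lowers it when performance improves; this is an illustration of the general mechanism, not the authors' specific meta-rules, and all parameter names are assumptions.

```python
def adapt_exploration(epsilon, prev_perf, curr_perf,
                      step=0.05, eps_min=0.01, eps_max=0.5):
    """Meta-rule sketch: explore more when performance drops,
    exploit more when performance improves."""
    if curr_perf < prev_perf:
        epsilon = min(eps_max, epsilon + step)   # performance fell: explore more
    elif curr_perf > prev_perf:
        epsilon = max(eps_min, epsilon - step)   # performance rose: exploit more
    return epsilon
```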
Efficient Reward Functions for Adaptive Multi-rover Systems
Abstract
This chapter focuses on deriving reward functions that allow multiple agents to co-evolve efficient control policies that maximize a system-level reward in noisy and dynamic environments. The solution we present is based on agent rewards satisfying two crucial properties. First, the agent reward function and the global reward function have to be aligned, that is, an agent maximizing its agent-specific reward should also maximize the global reward. Second, the agent has to receive sufficient "signal" from its reward, that is, an agent's action should have a large influence over its agent-specific reward. Agents using rewards with these two properties will evolve the correct policies quickly. This hypothesis is tested in episodic and non-episodic, continuous-space multi-rover environments where rovers evolve to maximize a global reward function over all rovers. The environments are dynamic (i.e., they change over time), noisy, and restrict communication between agents. We show that a control policy evolved using agent-specific rewards satisfying the above properties outperforms policies evolved using global rewards by up to 400%. More notably, in the presence of a larger number of rovers, or rovers with noisy and communication-limited sensors, the proposed method outperforms the global reward by a higher percentage than in noise-free conditions with a small number of rovers.
Kagan Tumer, Adrian Agogino
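The two properties named above (alignment with the global reward and a strong agent-specific signal) are the motivation for difference rewards, where each agent is credited with the change in the global reward caused by its own presence: D_i = G(z) - G(z_-i). The sketch below shows that counterfactual computation in a generic form, assuming a caller-supplied global evaluation function; the data layout and names are assumptions, not the chapter's implementation.

```python
def difference_reward(joint_state, agent_id, global_reward, null_value=None):
    """Difference-reward sketch: D_i = G(z) - G(z_-i), i.e. the global reward
    minus the global reward with agent i's contribution removed (replaced by
    a fixed null action). Such a reward stays aligned with G while giving each
    agent a strong signal about its own effect.

    joint_state:   dict mapping agent_id -> that agent's action/contribution
    global_reward: callable evaluating a joint state to a scalar
    """
    g_full = global_reward(joint_state)
    counterfactual = dict(joint_state)
    counterfactual[agent_id] = null_value  # null out agent i's contribution
    g_without_i = global_reward(counterfactual)
    return g_full - g_without_i
```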
Multi-agent Relational Reinforcement Learning
Explorations in Multi-state Coordination Tasks
Abstract
In this paper we report on using a relational state space in multi-agent reinforcement learning. There is growing evidence in the reinforcement learning research community that a relational representation of the state space has many benefits over a propositional one. Complex tasks such as planning or information retrieval on the web can be represented more naturally in relational form. Yet this relational structure has not been exploited for multi-agent reinforcement learning tasks and has so far only been studied in a single-agent context. In this paper we explore the powerful possibilities of using Relational Reinforcement Learning (RRL) in complex multi-agent coordination tasks. More precisely, we consider an abstract multi-state coordination problem, which can be seen as a variation and extension of repeated stateless Dispersion Games. Our approach shows that RRL allows a complex state space in a multi-agent environment to be represented more compactly and allows for fast convergence of the learning agents. Moreover, with this technique, agents are able to build complex interactive models (in the sense of learning from an expert), to predict what other agents will do, and to generalize over this model. This makes it possible to solve complex multi-agent planning tasks, in which agents need to be adaptive and learn, with more powerful tools.
Tom Croonenborghs, Karl Tuyls, Jan Ramon, Maurice Bruynooghe
Multi-type ACO for Light Path Protection
Abstract
Backup trees (BTs) are a promising approach to network protection in optical networks. BTs allow us to protect a group of working paths against single network failures, while reserving only a minimum amount of network capacity for backup purposes. The process of constructing a set of working paths together with a backup tree is computationally very expensive, however. In this paper we propose a multi-agent approach based on ant colony optimization (ACO) for solving this problem. ACO algorithms use a set of relatively simple agents that model the behavior of real ants. In our algorithm multiple types of ants are used. Ants of the same type collaborate, but are in competition with the ants of other types. The idea is to let each type find a path in the network that is disjoint with that of other types. We also demonstrate a preliminary version of this algorithm in a series of simple experiments.
Peter Vrancx, Ann Nowé, Kris Steenhaut
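The abstract above describes multiple ant types that each deposit their own pheromone and try to find mutually disjoint paths. As a hedged illustration of that idea (not the authors' algorithm), the edge-selection sketch below weights an edge by the ant's own-type pheromone and a heuristic value, in the usual ACO style, and discounts edges already marked by other types; the repulsion term and all parameter names are assumptions.

```python
import random

def choose_next_edge(edges, ant_type, alpha=1.0, beta=2.0, gamma=1.0):
    """Multi-type ACO step (illustrative): an ant favors edges with high
    own-type pheromone and high heuristic value, and avoids edges that
    carry pheromone from other ant types.

    edges: list of dicts like
        {"node": n, "pheromone": {type_id: level, ...}, "heuristic": h}
    """
    weights = []
    for e in edges:
        own = e["pheromone"].get(ant_type, 0.0)
        others = sum(v for t, v in e["pheromone"].items() if t != ant_type)
        attract = (own ** alpha) * (e["heuristic"] ** beta)
        repel = (1.0 + others) ** gamma          # penalize edges used by other types
        weights.append(attract / repel + 1e-12)  # keep every weight strictly positive
    return random.choices(edges, weights=weights, k=1)[0]["node"]
```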
Backmatter
Metadata
Title
Learning and Adaption in Multi-Agent Systems
Edited by
Karl Tuyls
Pieter Jan 't Hoen
Katja Verbeeck
Sandip Sen
Copyright Year
2006
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-540-33059-2
Print ISBN
978-3-540-33053-0
DOI
https://doi.org/10.1007/11691839
