Coevolution of finite automata with errors

Ioannou, Christos A.

doi:10.1007/s00191-013-0325-5

Coevolution of finite automata with errors

Regular Article
Published: 06 August 2013

Volume 24, pages 541–571, (2014)
Cite this article

Journal of Evolutionary Economics Aims and scope Submit manuscript

Christos A. Ioannou¹

252 Accesses
8 Citations
Explore all metrics

Abstract

We use a genetic algorithm to simulate the evolution of error-prone finite automata in the repeated Prisoner’s Dilemma game. In particular, the automata are subjected to implementation and perception errors. The computational experiments examine whether and how the distribution of outcomes and genotypes of the coevolved automata change with different levels of errors. We find that the complexity of the automata is decreasing in the probability of errors. Furthermore, the prevailing structures tend to exhibit low reciprocal cooperation and low tolerance to defections as the probability of errors increases. In addition, by varying the error level, the study identifies a threshold. Below the threshold, the prevailing structures are closed-loop (history-dependent) and diverse, which impedes any inferential projections on the superiority of a particular automaton. However, at and above the threshold, the prevailing structures converge to the open-loop (history-independent) automaton Always-Defect (ALLD). Finally, we find that perception errors are more detrimental than implementation errors to the fitness of the automata. These resultsshow that the evolution of cooperative automata is considerably weaker than expected.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Characteristic Features of the Sustainable Strategies in the Evolvable Iterated Prisoners’ Dilemma

Finding a Prototype Form of Sustainable Strategies for the Iterated Prisoners Dilemma

Automata representation of successful strategies for social dilemmas

Article Open access 07 August 2020

Notes

Oscar Volij (2002) provides a theoretical framework which confirms that the only evolutionary stable strategy is ALLD when agents’ preferences are lexicographic (first, according to the limit of the means criterion, and second, according to the complexity of the automaton).
Abreu and Rubinstein (1988) defined the complexity of a strategy as the size of the minimal automaton implementing it, while Banks and Sundaram (1990) argued that the traditional number-of-states measure of complexity of an automaton neglects some essential features, such as informational requirements at a state. They proposed, instead, a criterion of complexity that takes into account both the size (number of states) and transitional structure of a machine. Under this proposition, they proved that the resulting Nash equilibria of the machine-game are now trivial: the machines recommend actions in every period that are invariably stage-game Nash equilibria.
We conjecture that similar results hold for other reciprocal strategies, such as Win-Stay, Lose-Shift. Win-Stay, Lose-Shift, also known as Pavlov, repeats the previous action if the resulting payoff has met the aspiration level, and changes otherwise.
To be consistent with the conventional notation in Markov chains, the word “outcome” is replaced with the word “state”.
A Markov transition matrix is a matrix used to describe the transitions of a Markov chain. Each of its entries is a nonnegative real number representing a probability. This is not to be confused with a transition function in the context of finite automata, which maps every two-tuple of state and opponent’s action to a state. (The concept of a finite automaton will be formally defined in Section 4).
The limit value of the payoff for 𝜖 → 0 and δ → 0 cannot be computed, as the transition matrix is no longer irreducible. Therefore, the stationary distribution Π is no longer uniquely defined.
The payoffs for TFT and ALLD for each possible value of x are provided in the Appendix.
We are grateful to an anonymous referee, who laid down the foundations to this analysis, and for pointing out the illusory nature of the process.
The transition function depends only on the present state and the other agent’s action. This formalization fits the natural description of a strategy as agent i’s plan of action in all possible circumstances that are consistent with agent i’s plans. However, the notion of a game-theoretic strategy for agent i requires the specification of an action for every possible history, including those that are inconsistent with agent i’s plan of action. To formulate the game-theoretic strategy, one would have to construct the transition function such that τ ⁱ: Q ⁱ × A → Q ⁱinstead of τ ⁱ: Q ⁱ × A ^{− i} → Q ⁱ.
A more general definition would postulate that the automaton of agent i commits an implementation error with probability 𝜖, when, for any given state q, the automaton’s output function returns the action f ⁱ(q) with probability 1 − 𝜖 and draws another action a ⁱ ∈ A ⁱ ∖ f ⁱ(q) randomly and uniformly otherwise. However, since the action space in the PD game consists of only two actions, the former definition suffices.
To test the robustness of the results with respect to the size selection, computational experiments with automata that held 16 states were also conducted. The computations performed confirm that the results are robust.
For a more-detailed discussion of genetic algorithms, see An Introduction to Genetic Algorithms (Mitchell 1998). No previous knowledge of genetic algorithms is required to understand the description of this genetic algorithm.
Given this layout, there are 2⁵⁹possible structures. Yet, the total number of unique structures is much less since some of the automata are isomorphic. For example, every two-state automaton, such as TFT, can be renamed in any of \((P^{8}_{2}) (2^{42})\) ways.
Bergin and Lipman (1996) indicate that mutation rates represent either experimentation or computational errors. In our context, mutation rates represent experimentation. In addition, we assume that the mutation rate is constant across strings, across agents and over time.
A higher mutation rate increases the variation in the population; as a result, in such an environment, the automaton ALLD is hugely favored. The computational simulations performed with higher rates of mutation ( 8% and 12% independent chance of mutation per element, which implies an expectation of 2 and 3 elements-mutation per string, respectively) confirm this claim. On the other hand, the choice of a 4% independent chance of mutation is a conservative one that allows forms of innovation to appear in the structure of the automata while controlling for the variation in the population.
The average payoff of a given generation t is calculated as the sample average of the payoffs of the thirty members of the population and over the thirty simulations conducted for each treatment.
This definition does not mandate that the opponent is part of the current population. Thus, we circumvent the problem that some states may not be reached, given the set of opponents. We clarify further our definition with an example. Suppose that, in some generation, t, the population consists of one-state automata that Always-Cooperate (ALLC) and one automaton that cooperates unless the opponent defects for seven consecutive periods, after which the automaton defects forever. Given our definition, such an automaton has eight accessible states. The ALLCs have one accessible state. We thank an anonymous referee for bringing to our attention the necessity of this clarification.
The “temptation” payoff is the payoff upon unilaterally defecting in the PD game.
Table 7 in the Appendix provides the frequency of wins of ALLD, ALLC, and Other automata for each computational treatment.
Table 8 in the Appendix provides the frequencies for each computational treatment for the perception errors, and Panel B provides the frequencies for each computational treatment for the implementation errors.
Due to its generosity, NAF lost in its pairwise play with every one of its opponents. In contrast to NAF’s pattern, VIGILANT, the strategy that placed dead last in the tournament, beat every one of its partners in bilateral play. VIGILANT was a highly provocable and unforgiving strategy that retaliated sharply if it inferred that its partner was playing anything less than maximal cooperation.
Recall that the “sucker’s” payoff is the payoff upon unilaterally cooperating in the PD game and the “temptation” payoff is the payoff upon unilaterally defecting.
The Negative Binomial Distribution model assumes that purchases follow a Poisson process with a purchase parameter λ, that customers’ lifetimes follow an exponential distribution with a dropout rate parameter μ, and that, across customers, purchase and dropout rates are distributed according to a gamma distribution.
Once the automaton enters an absorbing state, it remains there for the duration of the game-play. Trigger automata have absorbing states as well as open-loop automata.
We thank an anonymous referee for suggesting this vital direction.

References

Abreu D, Rubinstein A (1988) The structure of nash equilibrium in repeated games with finite automata. Econometrica 56:1259–1282
Article Google Scholar
Aumann R (1981) Survey of repeated games. In: Essays in game theory and mathematical economics in honor of Oskar Morgenstern. Mannheim, Bibliographisches Institut
Google Scholar
Axelrod R (1984) The evolution of cooperation. Basic Books, New York
Google Scholar
Axelrod R (1987) The evolution of strategies in the iterated prisoner’s dilemma. In: Davis L (ed) Genetic algorithms and simulated annealing. London, California, Pitman, Morgan Kaufman, Los Altos, pp 32–41
Axelrod R, Dion D (1988) The further evolution of cooperation. Science 242:1385–1390
Article Google Scholar
Banks JS, Sundaram RK (1990) Repeated games, finite automata, and complexity. Games Econ Behav 2:97–117
Article Google Scholar
Bendor J, Kramer R, Stout S (1991) When in doubt... cooperation in a noisy prisoner’s dilemma. J Confl Resolut 35:691–719
Article Google Scholar
Ben-Shoham A, Serrano R, Volij O (2004) The evolution of exchange. J Econ Theory 114:310–328
Article Google Scholar
Bergin J, Lipman BL (1996) Evolution with state-dependent mutations. Econometrica 64:943–956
Article Google Scholar
Binmore KG, Samuelson L (1992) Evolutionary stability in repeated games played by finite automata. J Econ Theory 57:278–305
Article Google Scholar
Boyd R, Loberbaum J (1987) No pure strategy is evolutionary stable in the repeated prisoner’s dilemma game. Nature 327:58–59
Article Google Scholar
Dal Bó P, Fréchette GR (2011) The evolution of cooperation in infinitely repeated games: experimental evidence. Am Econ Rev 101:411–429
Article Google Scholar
Dawes RM, Thaler RH (1988) Anomalies: cooperation. J Econ Perspect 2:187–197
Article Google Scholar
Foster DP, Young HP (1990) Stochastic evolutionary game dynamics. Theor Popul Biol 38:219–232
Article Google Scholar
Franchini E, Brito CJ, Artioli GG (2012) Weight loss in combat sports: physiological, psychological and performance effects. J Int Soc Sports Nutr 9:52
Article Google Scholar
Fudenberg D, Maskin E (1986) The folk theorem in repeated games with discounting and with incomplete information. Econometrica 54:533–554
Article Google Scholar
Gigerenzer G, Gaissmaier W (2011) Heuristic decision making. Annu Rev Psychol 62:451–482
Article Google Scholar
Glover F, Laguna M (1993) Tabu search. In: CR Reeves (ed) Modern heuristic techniques for combinatorial problems. Blackwell Scientific Publishing, Oxford, p 70
Google Scholar
Goldberg D (1989) Genetic algorithms in search optimization, and machine learning. Addison-Wesley Publishing Company, New York
Google Scholar
Hamo Y, Markovitch S (2005) The compset algorithm for subset selection. In: Proceedings of the nineteenth international joint conference for artificial intelligence, p 728–733
Heiner R (1983) The origins of predictable behavior. Am Econ Rev 73:560–595
Google Scholar
Holland J (1975) Adaptation in natural and artificial systems. MIT Press, Cambridge
Google Scholar
Ioannou C, Qi S, Rustichini A (2012) Group outcomes and group behavior. Working Paper, University of Southampton
Kandori M, Mailath G, Rob R (1993) Learning, mutation, and long run equilibria in games. Econometrica 61:29–56
Article Google Scholar
Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 220:671–680
Article Google Scholar
Marks RE (1992) Breeding hybrid strategies: optimal behavior for oligopolists. J Evol Econ 2:17–38
Article Google Scholar
Miller J (1996) The coevolution of automata in the repeated prisoner’s dilemma. J Econ Behav Organ 29:87–112
Article Google Scholar
Mitchell M (1998) An introduction to genetic algorithms. MIT Press, Cambridge
Google Scholar
Moore E (1956) Gedanken experiments on sequential machines. Ann Math Stud 34:129–153
Google Scholar
Nowak MA, Sigmund K (1992) Tit-For-Tat in heterogeneous populations. Nature 355:250–253
Article Google Scholar
Nowak MA, Sigmund K (1993) A strategy of win-stay, lose-shift that outperforms Tit-For-Tat in prisoner’s dilemma. Nature 364:56–58
Article Google Scholar
Neyman A (1985) Bounded complexity justifies cooperation in the finitely repeated prisoner’s dilemma. Econ Lett 19:227–229
Article Google Scholar
Rubinstein A (1986) Finite automata play the repeated prisoner’s dilemma. J Econ Theory 39:83–96
Article Google Scholar
Rust J, Miller JH, Palmer R (1994) Characterizing effective trading strategies. J Econ Dyn Control 18:61–96
Article Google Scholar
Selten R, Mitzkewitz G, Uhlich R (1997) Duopoly strategies programmed by experienced traders. Econometrica 65:517–555
Article Google Scholar
Steen SN, Brownell KD (1990) Patterns of weight loss and regain in wrestlers: has the tradition changed? Med Sci Sports Exerc 22:762–768
Article Google Scholar
Volij O (2002) In defense of defect. Games Econ Behav 39:309–321
Article Google Scholar
Vega-Redondo F (1997) The evolution of walrasian behaviour. Econometrica 65:375–384
Article Google Scholar
Wubben M, von Wangenheim F (2008) Instant customer base analysis: managerial heuristics often get it right. J Mark 72:82–93
Article Google Scholar

Download references

Acknowledgments

I am indebted to Aldo Rustichini and Ket Richter for their continuous support and invaluable discussions. I would also like to thank John H. Miller, Scott E. Page, Larry Blume, David Rahman, Itai Sher, Ichiro Obara, Snigdhansu Chatterjee, Ioannis Nompelis, Shi Qi, Miltos Makris, Maria Kyriacou, Kostas Biliouris, and Panayotis Mertikopoulos as well as the seminar participants at the University of Vienna, Santa Fe Institute, University of Minnesota, and Washington University for their suggestions. Finally, I am grateful to the editor, Uwe Cantner, and two anonymous referees for their thoughtful and detailed comments, which significantly improved the paper. Errors are mine.

Author information

Authors and Affiliations

Department of Economics, University of Southampton, Southampton, SO17 1BJ, UK
Christos A. Ioannou

Authors

Christos A. Ioannou
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Christos A. Ioannou.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Ioannou, C.A. Coevolution of finite automata with errors. J Evol Econ 24, 541–571 (2014). https://doi.org/10.1007/s00191-013-0325-5

Download citation

Published: 06 August 2013
Issue Date: July 2014
DOI: https://doi.org/10.1007/s00191-013-0325-5

Keywords

JEL Classifications

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Coevolution of finite automata with errors

Abstract

Access this article

Similar content being viewed by others

Characteristic Features of the Sustainable Strategies in the Evolvable Iterated Prisoners’ Dilemma

Finding a Prototype Form of Sustainable Strategies for the Iterated Prisoners Dilemma

Automata representation of successful strategies for social dilemmas

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

JEL Classifications

Navigation

Coevolution of finite automata with errors

Abstract

Access this article

Similar content being viewed by others

Characteristic Features of the Sustainable Strategies in the Evolvable Iterated Prisoners’ Dilemma

Finding a Prototype Form of Sustainable Strategies for the Iterated Prisoners Dilemma

Automata representation of successful strategies for social dilemmas

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

JEL Classifications

Search

Navigation