Skip to main content
Top

2018 | OriginalPaper | Chapter

The Sharer’s Dilemma in Collective Adaptive Systems of Self-interested Agents

Authors : Lenz Belzner, Kyrill Schmid, Thomy Phan, Thomas Gabor, Martin Wirsing

Published in: Leveraging Applications of Formal Methods, Verification and Validation. Distributed Systems

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

In collective adaptive systems (CAS), adaptation can be implemented by optimization wrt. utility. Agents in a CAS may be self-interested, while their utilities may depend on other agents’ choices. Independent optimization of agent utilities may yield poor individual and global reward due to locally interfering individual preferences. Joint optimization may scale poorly, and is impossible if agents cannot expose their preferences due to privacy or security issues.
In this paper, we study utility sharing for mitigating this issue. Sharing utility with others may incentivize individuals to consider choices that are locally suboptimal but increase global reward. We illustrate our approach with a utility sharing variant of distributed cross entropy optimization. Empirical results show that utility sharing increases expected individual and global payoff in comparison to optimization without utility sharing.
We also investigate the effect of greedy defectors in a CAS of sharing, self-interested agents. We observe that defection increases the mean expected individual payoff at the expense of sharing individuals’ payoff. We empirically show that the choice between defection and sharing yields a fundamental dilemma for self-interested agents in a CAS.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Footnotes
1
We can account for the change of signature of \(u_i\) by extending the action space \(A_i\) of each agent accordingly: \(A_{s, i} = A_i \times \mathbb {R}, A_s = \times _{i \in N} A_{s, i}\).
 
Literature
1.
go back to reference Tan, M.: Multi-agent reinforcement learning: Independent vs. cooperative agents. In: Proceedings of the Tenth International Conference on Machine Learning, pp. 330–337 (1993) Tan, M.: Multi-agent reinforcement learning: Independent vs. cooperative agents. In: Proceedings of the Tenth International Conference on Machine Learning, pp. 330–337 (1993)
2.
go back to reference Hillston, J., Pitt, J., Wirsing, M., Zambonelli, F.: Collective adaptive systems: qualitative and quantitative modelling and analysis (dagstuhl seminar 14512). In: Dagstuhl Reports, vol. 4. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2015) Hillston, J., Pitt, J., Wirsing, M., Zambonelli, F.: Collective adaptive systems: qualitative and quantitative modelling and analysis (dagstuhl seminar 14512). In: Dagstuhl Reports, vol. 4. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2015)
4.
go back to reference Foerster, J., Nardelli, N., Farquhar, G., Torr, P., Kohli, P., Whiteson, S., et al.: Stabilising experience replay for deep multi-agent reinforcement learning. arXiv preprint arXiv:1702.08887 (2017) Foerster, J., Nardelli, N., Farquhar, G., Torr, P., Kohli, P., Whiteson, S., et al.: Stabilising experience replay for deep multi-agent reinforcement learning. arXiv preprint arXiv:​1702.​08887 (2017)
5.
go back to reference Phan, T., Belzner, L., Gabor, T., Schmid, K.: Leveraging statistical multi-agent online planning with emergent value function approximation. In: Proceedings of the 17th Conference on Autonomous Agents and Multi Agent Systems, International Foundation for Autonomous Agents and Multiagent Systems (2018) Phan, T., Belzner, L., Gabor, T., Schmid, K.: Leveraging statistical multi-agent online planning with emergent value function approximation. In: Proceedings of the 17th Conference on Autonomous Agents and Multi Agent Systems, International Foundation for Autonomous Agents and Multiagent Systems (2018)
6.
go back to reference Leibo, J.Z., Zambaldi, V., Lanctot, M., Marecki, J., Graepel, T.: Multi-agent reinforcement learning in sequential social dilemmas. In: Proceedings of the 16th Conference on Autonomous Agents and Multi Agent Systems, International Foundation for Autonomous Agents and Multiagent Systems, pp. 464–473 (2017) Leibo, J.Z., Zambaldi, V., Lanctot, M., Marecki, J., Graepel, T.: Multi-agent reinforcement learning in sequential social dilemmas. In: Proceedings of the 16th Conference on Autonomous Agents and Multi Agent Systems, International Foundation for Autonomous Agents and Multiagent Systems, pp. 464–473 (2017)
7.
go back to reference Perolat, J., Leibo, J.Z., Zambaldi, V., Beattie, C., Tuyls, K., Graepel, T.: A multi-agent reinforcement learning model of common-pool resource appropriation. In: Advances in Neural Information Processing Systems, pp. 3646–3655 (2017) Perolat, J., Leibo, J.Z., Zambaldi, V., Beattie, C., Tuyls, K., Graepel, T.: A multi-agent reinforcement learning model of common-pool resource appropriation. In: Advances in Neural Information Processing Systems, pp. 3646–3655 (2017)
8.
go back to reference Brundage, M., et al.: The malicious use of artificial intelligence: forecasting, prevention, and mitigation. arXiv preprint arXiv:1802.07228 (2018) Brundage, M., et al.: The malicious use of artificial intelligence: forecasting, prevention, and mitigation. arXiv preprint arXiv:​1802.​07228 (2018)
9.
go back to reference Lerer, A., Peysakhovich, A.: Maintaining cooperation in complex social dilemmas using deep reinforcement learning. arXiv preprint arXiv:1707.01068 (2017) Lerer, A., Peysakhovich, A.: Maintaining cooperation in complex social dilemmas using deep reinforcement learning. arXiv preprint arXiv:​1707.​01068 (2017)
10.
go back to reference Van der Hoek, W., Wooldridge, M.: Multi-agent systems. Found. Artif. Intell. 3, 887–928 (2008)CrossRef Van der Hoek, W., Wooldridge, M.: Multi-agent systems. Found. Artif. Intell. 3, 887–928 (2008)CrossRef
11.
go back to reference Fioretto, F., Pontelli, E., Yeoh, W.: Distributed constraint optimization problems and applications: a survey. arXiv preprint arXiv:1602.06347 (2016) Fioretto, F., Pontelli, E., Yeoh, W.: Distributed constraint optimization problems and applications: a survey. arXiv preprint arXiv:​1602.​06347 (2016)
13.
go back to reference Silver, D., et al.: Mastering the game of go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)CrossRef Silver, D., et al.: Mastering the game of go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)CrossRef
14.
go back to reference Silver, D., et al.: Mastering chess and shogi by self-play with a general reinforcement learning algorithm. arXiv preprint arXiv:1712.01815 (2017) Silver, D., et al.: Mastering chess and shogi by self-play with a general reinforcement learning algorithm. arXiv preprint arXiv:​1712.​01815 (2017)
15.
go back to reference Silver, D., et al.: Mastering the game of go without human knowledge. Nature 550(7676), 354 (2017)CrossRef Silver, D., et al.: Mastering the game of go without human knowledge. Nature 550(7676), 354 (2017)CrossRef
16.
go back to reference Anthony, T., Tian, Z., Barber, D.: Thinking fast and slow with deep learning and tree search. In: Advances in Neural Information Processing Systems, pp. 5366–5376 (2017) Anthony, T., Tian, Z., Barber, D.: Thinking fast and slow with deep learning and tree search. In: Advances in Neural Information Processing Systems, pp. 5366–5376 (2017)
17.
go back to reference Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. In: Machine Learning Proceedings, pp. 157–163. Elsevier (1994) Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. In: Machine Learning Proceedings, pp. 157–163. Elsevier (1994)
18.
go back to reference Foerster, J., Assael, I.A., de Freitas, N., Whiteson, S.: Learning to communicate with deep multi-agent reinforcement learning. In: Advances in Neural Information Processing Systems, pp. 2137–2145 (2016) Foerster, J., Assael, I.A., de Freitas, N., Whiteson, S.: Learning to communicate with deep multi-agent reinforcement learning. In: Advances in Neural Information Processing Systems, pp. 2137–2145 (2016)
19.
go back to reference Tampuu, A., et al.: Multiagent cooperation and competition with deep reinforcement learning. PLoS ONE 12(4), e0172395 (2017)CrossRef Tampuu, A., et al.: Multiagent cooperation and competition with deep reinforcement learning. PLoS ONE 12(4), e0172395 (2017)CrossRef
20.
go back to reference Sodomka, E., Hilliard, E., Littman, M., Greenwald, A.: Coco-Q: learning in stochastic games with side payments. In: International Conference on Machine Learning, pp. 1471–1479 (2013) Sodomka, E., Hilliard, E., Littman, M., Greenwald, A.: Coco-Q: learning in stochastic games with side payments. In: International Conference on Machine Learning, pp. 1471–1479 (2013)
21.
go back to reference Peysakhovich, A., Lerer, A.: Prosocial learning agents solve generalized stag hunts better than selfish ones. arXiv preprint arXiv:1709.02865 (2017) Peysakhovich, A., Lerer, A.: Prosocial learning agents solve generalized stag hunts better than selfish ones. arXiv preprint arXiv:​1709.​02865 (2017)
23.
go back to reference Kroese, D.P., Rubinstein, R.Y., Cohen, I., Porotsky, S., Taimre, T.: Cross-entropy method. In: Encyclopedia of Operations Research and Management Science, pp. 326–333. Springer, New York (2013) Kroese, D.P., Rubinstein, R.Y., Cohen, I., Porotsky, S., Taimre, T.: Cross-entropy method. In: Encyclopedia of Operations Research and Management Science, pp. 326–333. Springer, New York (2013)
24.
go back to reference Schelling, T.C.: Hockey helmets, concealed weapons, and daylight saving: a study of binary choices with externalities. J. Confl. Resolut. 17(3), 381–428 (1973)CrossRef Schelling, T.C.: Hockey helmets, concealed weapons, and daylight saving: a study of binary choices with externalities. J. Confl. Resolut. 17(3), 381–428 (1973)CrossRef
25.
go back to reference Foerster, J.N., Chen, R.Y., Al-Shedivat, M., Whiteson, S., Abbeel, P., Mordatch, I.: Learning with opponent-learning awareness. arXiv preprint arXiv:1709.04326 (2017) Foerster, J.N., Chen, R.Y., Al-Shedivat, M., Whiteson, S., Abbeel, P., Mordatch, I.: Learning with opponent-learning awareness. arXiv preprint arXiv:​1709.​04326 (2017)
26.
go back to reference Rabinowitz, N.C., Perbet, F., Song, H.F., Zhang, C., Eslami, S., Botvinick, M.: Machine theory of mind. arXiv preprint arXiv:1802.07740 (2018) Rabinowitz, N.C., Perbet, F., Song, H.F., Zhang, C., Eslami, S., Botvinick, M.: Machine theory of mind. arXiv preprint arXiv:​1802.​07740 (2018)
27.
go back to reference Sandholm, T.W., Crites, R.H.: Multiagent reinforcement learning in the iterated prisoner’s dilemma. Biosystems 37(1–2), 147–166 (1996)CrossRef Sandholm, T.W., Crites, R.H.: Multiagent reinforcement learning in the iterated prisoner’s dilemma. Biosystems 37(1–2), 147–166 (1996)CrossRef
Metadata
Title
The Sharer’s Dilemma in Collective Adaptive Systems of Self-interested Agents
Authors
Lenz Belzner
Kyrill Schmid
Thomy Phan
Thomas Gabor
Martin Wirsing
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-03424-5_16

Premium Partner