Skip to main content
Erschienen in: Autonomous Agents and Multi-Agent Systems 1/2022

01.04.2022

Privacy preserving planning in multi-agent stochastic environments

verfasst von: Tommy Hefner, Guy Shani, Roni Stern

Erschienen in: Autonomous Agents and Multi-Agent Systems | Ausgabe 1/2022

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Collaborative privacy preserving planning (cppp) gained much attention in the past decade. cppp aims to create solutions for multi agent planning problems where cooperation is required to achieve an efficient solution, without exposing information that the agent considers private in the process. To date, cppp has focused on domains with deterministic action effects. However, in real-world problems action effects are often non-deterministic, and actions can have multiple possible effects with varying probabilities. In this paper, we introduce Stochastic cppp (scppp), which is an extension of cppp to domains with stochastic action effects. We show how scppp can be modeled as a Markov decision process (mdp) and how the value-iteration algorithm can be adapted to solve it. This adaptation requires extending value-iteration to support multiple agents and privacy. Then, we present two adaptions of the real-time dynamic programming (rtdp) algorithm, a popular algorithm for solving mdps, designed to solve scppp problems. The first rtdp adaptation, called distributed rtdp (drtdp), yields identical behavior to applying rtdp in a centralized manner on the joint problem. To preserve privacy, drtdp uses a message passing mechanism adopted from the mafs algorithm. The second rtdp adaptation is an approximation of drtdp called public synchronization rtdp (ps-rtdp). ps-rtdp differs from drtdp mainly in its message passing mechanism, where ps-rtdp sends significantly fewer messages than drtdp. We experimented on domains adapted from the deterministic cppp literature by adding different stochastic effects to different actions. The results show that ps-rtdp can reduce the amount of messages compared to drtdp by orders of magnitude thus improving run-time, while producing policies with similar expected costs.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Hefner, T., Stern, R., & Shani, G. (2020). Privacy preserving planning in stochastic environments. In J. C. Beck, O. Buffet, J. Hoffmann, E. Karpas, & S. Sohrabi, (Eds.). Proceedings of the Thirtieth International Conference on Automated Planning and Scheduling, Nancy, France, October 26–30 (pp. 258–262). AAAI Press. https://aaai.org/ojs/index.php/ICAPS/article/view/6669 Hefner, T., Stern, R., & Shani, G. (2020). Privacy preserving planning in stochastic environments. In J. C. Beck, O. Buffet, J. Hoffmann, E. Karpas, & S. Sohrabi, (Eds.). Proceedings of the Thirtieth International Conference on Automated Planning and Scheduling, Nancy, France, October 26–30 (pp. 258–262). AAAI Press. https://​aaai.​org/​ojs/​index.​php/​ICAPS/​article/​view/​6669
2.
Zurück zum Zitat Brafman, R. I., & Domshlak, C. (2008). From one to many: Planning for loosely coupled multi-agent systems. In ICAPS (pp. 28–35). Brafman, R. I., & Domshlak, C. (2008). From one to many: Planning for loosely coupled multi-agent systems. In ICAPS (pp. 28–35).
3.
Zurück zum Zitat Bellman, R. (1966). Dynamic programming. Science, 153(3731), 34–37.CrossRef Bellman, R. (1966). Dynamic programming. Science, 153(3731), 34–37.CrossRef
4.
Zurück zum Zitat Kolobov, A. . (2012). Planning with Markov decision processes: An AI perspective. Synthesis Lectures on Artificial Intelligence and Machine Learning, 6(1), 1–210.MATH Kolobov, A. . (2012). Planning with Markov decision processes: An AI perspective. Synthesis Lectures on Artificial Intelligence and Machine Learning, 6(1), 1–210.MATH
5.
Zurück zum Zitat Barto, A. G., Bradtke, S. J., & Singh, S. P. (1995). Learning to act using real-time dynamic programming. Artificial Intelligence, 72(1–2), 81–138.CrossRef Barto, A. G., Bradtke, S. J., & Singh, S. P. (1995). Learning to act using real-time dynamic programming. Artificial Intelligence, 72(1–2), 81–138.CrossRef
6.
Zurück zum Zitat Nissim, R., & Brafman, R. I. (2014). Distributed heuristic forward search for multi-agent planning. JAIR, 51, 293–332.MathSciNetCrossRef Nissim, R., & Brafman, R. I. (2014). Distributed heuristic forward search for multi-agent planning. JAIR, 51, 293–332.MathSciNetCrossRef
7.
Zurück zum Zitat Štolba, M., Komenda, A., & Kovacs, D. L. (2015). Competition of distributed and multiagent planners (codmap). In The International Planning Competition (WIPC-15), (p. 24). Štolba, M., Komenda, A., & Kovacs, D. L. (2015). Competition of distributed and multiagent planners (codmap). In The International Planning Competition (WIPC-15), (p. 24).
8.
Zurück zum Zitat Brafman, R. I., & Domshlak, C. (2013). On the complexity of planning for agent teams and its implications for single agent planning. Artificial Intelligence, 198, 52–71.MathSciNetCrossRef Brafman, R. I., & Domshlak, C. (2013). On the complexity of planning for agent teams and its implications for single agent planning. Artificial Intelligence, 198, 52–71.MathSciNetCrossRef
9.
Zurück zum Zitat Brafman, R. I. (2015). A privacy preserving algorithm for multi-agent planning and search. In IJCAI (pp. 1530–1536). Brafman, R. I. (2015). A privacy preserving algorithm for multi-agent planning and search. In IJCAI (pp. 1530–1536).
10.
Zurück zum Zitat Tožička, J., Štolba, M., & Komenda, A. (2017). The limits of strong privacy preserving multi-agent planning. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 27). Tožička, J., Štolba, M., & Komenda, A. (2017). The limits of strong privacy preserving multi-agent planning. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 27).
11.
Zurück zum Zitat Maliah, S., Shani, G., & Stern, R. (2018). Action dependencies in privacy-preserving multi-agent planning. Autonomous Agents and Multi-Agent Systems, 32(6), 779–821.CrossRef Maliah, S., Shani, G., & Stern, R. (2018). Action dependencies in privacy-preserving multi-agent planning. Autonomous Agents and Multi-Agent Systems, 32(6), 779–821.CrossRef
12.
Zurück zum Zitat Maliah, S., Shani, G., & Brafman, R. I. (2016a). Online macro generation for privacy preserving planning. In Twenty-Sixth International Conference on Automated Planning and Scheduling. Maliah, S., Shani, G., & Brafman, R. I. (2016a). Online macro generation for privacy preserving planning. In Twenty-Sixth International Conference on Automated Planning and Scheduling.
13.
Zurück zum Zitat Wu, F., Zilberstein, S., & Chen, X. (2018). Privacy-preserving policy iteration for decentralized pomdps. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI), New Orleans, USA (pp. 4759–4766). Wu, F., Zilberstein, S., & Chen, X. (2018). Privacy-preserving policy iteration for decentralized pomdps. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI), New Orleans, USA (pp. 4759–4766).
14.
Zurück zum Zitat Štolba, M., Tožička, J., & Komenda, A. (2018). Quantifying privacy leakage in multi-agent planning. TOIT, 18(3), 28.CrossRef Štolba, M., Tožička, J., & Komenda, A. (2018). Quantifying privacy leakage in multi-agent planning. TOIT, 18(3), 28.CrossRef
15.
Zurück zum Zitat Štolba, M., Fišer, D., & Komenda, A. (2019). Privacy leakage of search-based multi-agent planning algorithms. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 29, pp. 482–490). Štolba, M., Fišer, D., & Komenda, A. (2019). Privacy leakage of search-based multi-agent planning algorithms. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 29, pp. 482–490).
16.
Zurück zum Zitat Guestrin, C., Koller, D., & Parr, R. (2001). Max-norm projections for factored mdps. InIn IJCAI (Vol. 17, pp. 673–680). Guestrin, C., Koller, D., & Parr, R. (2001). Max-norm projections for factored mdps. InIn IJCAI (Vol. 17, pp. 673–680).
17.
Zurück zum Zitat Bellman, R. (1957). A Markovian decision process. Journal of Mathematics and Mechanics, 6, 679–684.MathSciNetMATH Bellman, R. (1957). A Markovian decision process. Journal of Mathematics and Mechanics, 6, 679–684.MathSciNetMATH
18.
Zurück zum Zitat Puterman, M. L. (2014). Markov decision processes: Discrete stochastic dynamic programming. Wiley.MATH Puterman, M. L. (2014). Markov decision processes: Discrete stochastic dynamic programming. Wiley.MATH
19.
Zurück zum Zitat Trevizan, F. W., Teichteil-Königsbuch, F., & Thiébaux, S. (2017). Efficient solutions for stochastic shortest path problems with dead ends. In UAI. Trevizan, F. W., Teichteil-Königsbuch, F., & Thiébaux, S. (2017). Efficient solutions for stochastic shortest path problems with dead ends. In UAI.
20.
Zurück zum Zitat Du, Wenliang., & Zhan, Zhijun. (2002). A practical approach to solve secure multi-party computation problems. In Proceedings of the 2002 workshop on New security paradigms, pp 127–135. Du, Wenliang., & Zhan, Zhijun. (2002). A practical approach to solve secure multi-party computation problems. In Proceedings of the 2002 workshop on New security paradigms, pp 127–135.
21.
Zurück zum Zitat Bertsekas, D. (1982). Distributed dynamic programming. IEEE Transactions on Automatic Control, 27(3), 610–616.MathSciNetCrossRef Bertsekas, D. (1982). Distributed dynamic programming. IEEE Transactions on Automatic Control, 27(3), 610–616.MathSciNetCrossRef
22.
Zurück zum Zitat Witzig, J., Beckenbach, I., Eifler, L., Fackeldey, K., Gleixner, A., Grever, A., & Weber, M. (2018). Mixed-integer programming for cycle detection in nonreversible Markov processes. Multiscale Modeling & Simulation, 16(1), 248–265.MathSciNetCrossRef Witzig, J., Beckenbach, I., Eifler, L., Fackeldey, K., Gleixner, A., Grever, A., & Weber, M. (2018). Mixed-integer programming for cycle detection in nonreversible Markov processes. Multiscale Modeling & Simulation, 16(1), 248–265.MathSciNetCrossRef
24.
Zurück zum Zitat Papadimitriou, C. H., & Tsitsiklis, J. N. (1987). The complexity of Markov decision processes. Mathematics of Operations Research, 12(3), 441–450.MathSciNetCrossRef Papadimitriou, C. H., & Tsitsiklis, J. N. (1987). The complexity of Markov decision processes. Mathematics of Operations Research, 12(3), 441–450.MathSciNetCrossRef
25.
Zurück zum Zitat Bertsekas, D. P., & Tsitsiklis, J. N. (1991). An analysis of stochastic shortest path problems. Mathematics of Operations Research, 16(3), 580–595.MathSciNetCrossRef Bertsekas, D. P., & Tsitsiklis, J. N. (1991). An analysis of stochastic shortest path problems. Mathematics of Operations Research, 16(3), 580–595.MathSciNetCrossRef
27.
Zurück zum Zitat Štolba, M., & Komenda, A. (2017). The Madla planner: Multi-agent planning by combination of distributed and local heuristic search. Artificial Intelligence, 252, 175–210.MathSciNetCrossRef Štolba, M., & Komenda, A. (2017). The Madla planner: Multi-agent planning by combination of distributed and local heuristic search. Artificial Intelligence, 252, 175–210.MathSciNetCrossRef
28.
Zurück zum Zitat Maliah, S., Shani, G., & Stern, R. (2016b). Stronger privacy preserving projections for multi-agent planning. In the International Conference on Automated Planning and Scheduling (ICAPS) (pp. 221–229). Maliah, S., Shani, G., & Stern, R. (2016b). Stronger privacy preserving projections for multi-agent planning. In the International Conference on Automated Planning and Scheduling (ICAPS) (pp. 221–229).
29.
Zurück zum Zitat Maliah, S., Shani, G., & Stern, R. (2016c). Stronger privacy preserving projections for multi-agent planning. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 26). Maliah, S., Shani, G., & Stern, R. (2016c). Stronger privacy preserving projections for multi-agent planning. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 26).
30.
Zurück zum Zitat Maliah, S., Shani, G., & Stern, R. (2014). Privacy preserving landmark detection. In the European Conference on Artificial Intelligence (ECAI) (pp. 597–602). Maliah, S., Shani, G., & Stern, R. (2014). Privacy preserving landmark detection. In the European Conference on Artificial Intelligence (ECAI) (pp. 597–602).
31.
32.
Zurück zum Zitat Gerevini, A. E., Lipovetzky, N., Percassi, F., Saetti, A., & Serina, I. (2019). Best-first width search for multi agent privacy-preserving planning. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 29, pp. 163–171). Gerevini, A. E., Lipovetzky, N., Percassi, F., Saetti, A., & Serina, I. (2019). Best-first width search for multi agent privacy-preserving planning. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 29, pp. 163–171).
33.
Zurück zum Zitat Gohari, P., Hale, M., & Topcu, U. (2020). Privacy-preserving policy synthesis in markov decision processes. In 2020 59th IEEE Conference on Decision and Control (CDC) (pp. 6266–6271). IEEE. Gohari, P., Hale, M., & Topcu, U. (2020). Privacy-preserving policy synthesis in markov decision processes. In 2020 59th IEEE Conference on Decision and Control (CDC) (pp. 6266–6271). IEEE.
34.
Zurück zum Zitat Venkitasubramaniam, P. (2013). Privacy in stochastic control: A markov decision process perspective. In 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton) (pp. 381–388). IEEE. Venkitasubramaniam, P. (2013). Privacy in stochastic control: A markov decision process perspective. In 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton) (pp. 381–388). IEEE.
35.
Zurück zum Zitat Boutilier, C. (1999). Sequential optimality and coordination in multiagent systems. In In IJCAI (Vol. 99, pp. 478–485). Boutilier, C. (1999). Sequential optimality and coordination in multiagent systems. In In IJCAI (Vol. 99, pp. 478–485).
36.
Zurück zum Zitat Amato, C., Chowdhary, G., Geramifard, A., Kemal Üre, N., & Kochenderfer, M. J. (2013). Decentralized control of partially observable markov decision processes. In 52nd IEEE Conference on Decision and Control (pp. 2398–2405). IEEE. Amato, C., Chowdhary, G., Geramifard, A., Kemal Üre, N., & Kochenderfer, M. J. (2013). Decentralized control of partially observable markov decision processes. In 52nd IEEE Conference on Decision and Control (pp. 2398–2405). IEEE.
37.
Zurück zum Zitat Kocsis, L., Szepesvári, C. (2006). Bandit based monte-carlo planning. In European conference on machine learning (pp. 282–293). Springer. Kocsis, L., Szepesvári, C. (2006). Bandit based monte-carlo planning. In European conference on machine learning (pp. 282–293). Springer.
38.
Zurück zum Zitat Brendan, M. H., Likhachev, M., & Gordon, G. J. (2005). Bounded real-time dynamic programming: Rtdp with monotone upper bounds and performance guarantees. In Proceedings of the 22nd international conference on Machine learning (pp. 569–576). ACM. Brendan, M. H., Likhachev, M., & Gordon, G. J. (2005). Bounded real-time dynamic programming: Rtdp with monotone upper bounds and performance guarantees. In Proceedings of the 22nd international conference on Machine learning (pp. 569–576). ACM.
39.
Zurück zum Zitat Smith, T., & Simmons, R. (2006). Focused real-time dynamic programming for mdps: Squeezing more out of a heuristic. In AAAI (pp. 1227–1232). Smith, T., & Simmons, R. (2006). Focused real-time dynamic programming for mdps: Squeezing more out of a heuristic. In AAAI (pp. 1227–1232).
40.
41.
Zurück zum Zitat Keller, T., & Eyerich, P. (2012). Prost: Probabilistic planning based on uct. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 22). Keller, T., & Eyerich, P. (2012). Prost: Probabilistic planning based on uct. In Proceedings of the International Conference on Automated Planning and Scheduling (Vol. 22).
42.
Zurück zum Zitat Bonet, B., & Geffner, H. (2006). Learning depth-first search: A unified approach to heuristic search in deterministic and non-deterministic settings, and its application to mdps. In In ICAPS (Vol. 6, pp. 142–151). Bonet, B., & Geffner, H. (2006). Learning depth-first search: A unified approach to heuristic search in deterministic and non-deterministic settings, and its application to mdps. In In ICAPS (Vol. 6, pp. 142–151).
43.
Zurück zum Zitat Stolba, M., Tozicka, J., & Komenda, A. (2018). Quantifying privacy leakage in multi-agent planning. ACM Transactions on Internet Technology, 18(3), 28:1-28:21.CrossRef Stolba, M., Tozicka, J., & Komenda, A. (2018). Quantifying privacy leakage in multi-agent planning. ACM Transactions on Internet Technology, 18(3), 28:1-28:21.CrossRef
44.
Zurück zum Zitat Stolba, M., Fiser, D., & Komenda, A. (2019). Privacy leakage of search-based multi-agent planning algorithms. In J. Benton, N. Lipovetzky, E. Onaindia, D. E. Smith, & S. Srivastava (Eds.). Proceedings of the Twenty-Ninth International Conference on Automated Planning and Scheduling, ICAPS 2018, Berkeley, CA, USA, July 11–15, 2019 (pp. 482–490). AAAI Press. Stolba, M., Fiser, D., & Komenda, A. (2019). Privacy leakage of search-based multi-agent planning algorithms. In J. Benton, N. Lipovetzky, E. Onaindia, D. E. Smith, & S. Srivastava (Eds.). Proceedings of the Twenty-Ninth International Conference on Automated Planning and Scheduling, ICAPS 2018, Berkeley, CA, USA, July 11–15, 2019 (pp. 482–490). AAAI Press.
Metadaten
Titel
Privacy preserving planning in multi-agent stochastic environments
verfasst von
Tommy Hefner
Guy Shani
Roni Stern
Publikationsdatum
01.04.2022
Verlag
Springer US
Erschienen in
Autonomous Agents and Multi-Agent Systems / Ausgabe 1/2022
Print ISSN: 1387-2532
Elektronische ISSN: 1573-7454
DOI
https://doi.org/10.1007/s10458-022-09554-w

Weitere Artikel der Ausgabe 1/2022

Autonomous Agents and Multi-Agent Systems 1/2022 Zur Ausgabe

Premium Partner