
Published in: Autonomous Agents and Multi-Agent Systems

01.01.2016

Strategic advice provision in repeated human-agent interactions

Authors: Amos Azaria, Ya’akov Gal, Sarit Kraus, Claudia V. Goldman

Published in: Autonomous Agents and Multi-Agent Systems | Issue 1/2016


Abstract

This paper addresses the problem of automated advice provision in scenarios that involve repeated interactions between people and computer agents. This problem arises in many applications, such as route selection systems, office assistants, and climate control systems. To succeed in such settings, agents must reason about how their advice influences people’s future actions or decisions over time. This work models such scenarios as a family of repeated bilateral interactions called “choice selection processes”, in which humans and computer agents may share certain goals but are essentially self-interested. We propose a social agent for advice provision (SAP) for such environments that generates advice using a social utility function which weighs the sum of the individual utilities of both agent and human participants. The SAP agent models human choice selection using hyperbolic discounting and samples the model to infer the best weights for its social utility function. We demonstrate the effectiveness of SAP in two separate domains, which vary in the complexity of modeling human behavior as well as in the information that is available to people when they need to decide whether to accept the agent’s advice. In both domains, we evaluated SAP in extensive empirical studies involving hundreds of human subjects. SAP was compared to agents using alternative models of choice selection processes informed by behavioral economics and psychological models of decision-making. Our results show that in both domains, the SAP agent was able to outperform the alternative models. This work demonstrates the efficacy of combining computational methods with behavioral economics to model how people reason about machine-generated advice, and presents a general methodology for agent design in such repeated advice settings.
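
The two ingredients named in the abstract, hyperbolic discounting and a weighted social utility, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The discounting formula `value / (1 + k * delay)` is the standard hyperbolic form. Everything else is an assumption made for illustration: the `Option` structure, the function names, the convex combination `w * agent + (1 - w) * human`, and the use of discounting to average past outcomes.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Option:
    """One candidate piece of advice (hypothetical structure)."""
    name: str
    agent_utility: float  # benefit to the advising agent
    human_utility: float  # benefit to the person receiving the advice


def hyperbolic_discount(value: float, delay: float, k: float) -> float:
    """Standard hyperbolic discounting: value / (1 + k * delay)."""
    if delay < 0 or k < 0:
        raise ValueError("delay and k must be non-negative")
    return value / (1.0 + k * delay)


def perceived_value(outcomes: Sequence[float], k: float) -> float:
    """Assumed human model: a weighted average of past outcomes
    (ordered oldest to newest) in which older outcomes count less."""
    if not outcomes:
        raise ValueError("need at least one past outcome")
    newest = len(outcomes) - 1
    weights = [hyperbolic_discount(1.0, newest - i, k)
               for i in range(len(outcomes))]
    return sum(w * x for w, x in zip(weights, outcomes)) / sum(weights)


def social_utility(option: Option, w: float) -> float:
    """Assumed form: w weights the agent, (1 - w) weights the human."""
    return w * option.agent_utility + (1.0 - w) * option.human_utility


def choose_advice(options: Sequence[Option], w: float) -> Option:
    """Pick the option with the highest social utility for weight w."""
    return max(options, key=lambda o: social_utility(o, w))
```

For example, given options A (agent 10, human 2), B (agent 6, human 7) and C (agent 1, human 9), `choose_advice` returns A for `w = 1.0`, B for `w = 0.5` and C for `w = 0.0`. In the spirit of the abstract, an agent would try several values of `w` against a sampled human model and keep the one that performs best, but that search is not shown here.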


We use the term “world state” to disambiguate the states of an MDP from those of a selection process.

This method is more common in POMDPs; however, since our state space is very large, we use it as well.

This model does not require an additional parameter for the actual cost for the receiver (\(c_R(a,v)\)), since \(c_R(a,v)\) is already a linear combination of the comfort level and the energy consumption.

In fact, the exact equivalent of the road selection domain would be to assume that the user sets a cost for each possible combination of heat load and power level. However, such an assumption would result in too many arms, most of which would not be sampled or would be sampled only once, and thus would not yield a good human model.


Title: Strategic advice provision in repeated human-agent interactions
Authors: Amos Azaria, Ya’akov Gal, Sarit Kraus, Claudia V. Goldman
Publication date: 01.01.2016
Publisher: Springer US
Published in: Autonomous Agents and Multi-Agent Systems / Issue 1/2016
Print ISSN: 1387-2532
Electronic ISSN: 1573-7454
DOI: https://doi.org/10.1007/s10458-015-9284-6
