Abstract
Assessing and understanding intelligent agents is a difficult task for users who lack an AI background. “Explainable AI” (XAI) aims to address this problem, but what should be in an explanation? One route toward answering this question is to turn to theories of how humans try to obtain information they seek. Information Foraging Theory (IFT) is one such theory. In this article, we present a series of studies using IFT: the first investigates how expert explainers supply explanations in the RTS domain, the second investigates what explanations domain experts demand from agents in the RTS domain, and the last focuses on how both populations try to explain a state-of-the-art AI. Our results show that RTS environments like StarCraft offer so many options that change so rapidly that foraging tends to be very costly. Foragers attempted to manage such costs through “satisficing” approaches that reduce cognitive load, such as focusing more on What information than on Why information, using language strategically to communicate a lot of nuanced information in a few words, and optimizing their environment when possible to make their most valuable information patches readily available. Further, when a real AI entered the picture, even very experienced domain experts had difficulty understanding and judging some of the AI’s unconventional behaviors. Finally, our results reveal ways Information Foraging Theory can inform future XAI interactive explanation environments, and also how XAI can inform IFT.
Index Terms
- The Shoutcasters, the Game Enthusiasts, and the AI: Foraging for Explanations of Real-time Strategy Players