ABSTRACT
What a machine learns from a user's behavior can be thought of as a program, and like any program, it may need to be debugged. Giving users a way to debug it matters: without the ability to fix errors, users may find the learned program's errors too damaging to trust it. We present a new approach that enables end users to debug a learned program. We then use an early prototype of this approach in a formative study to determine where and when debugging issues arise, both in general and separately for males and females. The results suggest opportunities to make machine-learned programs more effective tools.
Index Terms
- Fixing the program my computer learned: barriers for end users, challenges for the machine