Published in: Knowledge and Information Systems 2/2017

07-12-2016 | Regular Paper

Evaluating intelligent knowledge systems: experiences with a user-adaptive assistant agent

Authors: Pauline M. Berry, Thierry Donneau-Golencer, Khang Duong, Melinda Gervasio, Bart Peintner, Neil Yorke-Smith

Abstract

This article examines experiences in evaluating a user-adaptive personal assistant agent designed to assist a busy knowledge worker in time management. We examine the managerial and technical challenges of designing adequate evaluation and the tension of collecting adequate data without a fully functional, deployed system. The CALO project was a seminal multi-institution effort to develop a personalized cognitive assistant. It included a significant attempt to rigorously quantify learning capability, which this article discusses for the first time, and ultimately the project led to multiple spin-outs including Siri. Retrospection on negative and positive experiences over the 6 years of the project underscores best practice in evaluating user-adaptive systems. Lessons for knowledge system evaluation include: the interests of multiple stakeholders, early consideration of evaluation and deployment, layered evaluation at system and component levels, characteristics of technology and domains that determine the appropriateness of controlled evaluations, implications of ‘in-the-wild’ versus variations of ‘in-the-lab’ evaluation, and the effect of technology-enabled functionality and its impact upon existing tools and work practices. In the conclusion, we discuss—through the lessons illustrated from this case study of intelligent knowledge system evaluation—how development and infusion of innovative technology must be supported by adequate evaluation of its efficacy.

Footnotes
1
While PTIME can be seen as a type of recommender system, evaluating a task-oriented adaptive system such as PTIME differs significantly from evaluating a classical recommender system, due to the generative, incremental, and dynamic nature of the recommendation task.
 
Metadata
Title
Evaluating intelligent knowledge systems: experiences with a user-adaptive assistant agent
Authors
Pauline M. Berry
Thierry Donneau-Golencer
Khang Duong
Melinda Gervasio
Bart Peintner
Neil Yorke-Smith
Publication date
07-12-2016
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2017
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-016-1011-3