2015 | OriginalPaper | Book chapter

Genetic Programming for Feature Selection and Question-Answer Ranking in IBM Watson

Authors: Urvesh Bhowan, D. J. McCloskey

Published in: Genetic Programming

Publisher: Springer International Publishing

Abstract

IBM Watson is an intelligent open-domain question answering system capable of finding correct answers to natural language questions in real time. Watson uses machine learning over a large heterogeneous feature set derived from many distinct natural language processing algorithms to identify correct answers. This paper develops a Genetic Programming (GP) approach for feature selection in Watson by evolving ranking functions to order the candidate answers that Watson generates. We leverage GP's automatic feature selection mechanisms to identify Watson's key features through the learning process. Our experiments show that GP can evolve relatively simple ranking functions that use far fewer features from the original Watson feature set while achieving performance comparable to Watson's. This methodology can help Watson implementers better identify key components in an otherwise large and complex system for development, troubleshooting, and/or customer- or domain-specific enhancements.
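The idea of a ranking function that doubles as a feature selector can be illustrated with a small sketch. This is not the authors' implementation: the expression-tree encoding, the function set, the feature names (`passage_score`, `type_match`, `popularity`), and the candidate data are all hypothetical and chosen only for illustration. The sketch scores each candidate answer with a fixed expression tree, sorts candidates by score, and reads off which features the tree actually references; in a GP run, the tree would be evolved rather than written by hand.

```python
import operator

def pdiv(a, b):
    """Protected division, a common GP convention (assumed here, not from the paper)."""
    return a / b if abs(b) > 1e-9 else 1.0

# Hypothetical function set for the expression trees.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": pdiv}

def evaluate(tree, features):
    """Evaluate an expression tree on one candidate's feature dictionary."""
    if isinstance(tree, str):              # terminal: a feature name
        return features[tree]
    if isinstance(tree, (int, float)):     # terminal: a numeric constant
        return float(tree)
    op, left, right = tree                 # internal node: (operator, left, right)
    return OPS[op](evaluate(left, features), evaluate(right, features))

def used_features(tree):
    """Return the set of feature names the tree references (the implicit selection)."""
    if isinstance(tree, str):
        return {tree}
    if isinstance(tree, (int, float)):
        return set()
    _, left, right = tree
    return used_features(left) | used_features(right)

def rank(tree, candidates):
    """Order candidate answers by descending ranking-function score."""
    return sorted(candidates,
                  key=lambda c: evaluate(tree, c["features"]),
                  reverse=True)

# Hand-written stand-in for an evolved tree: passage_score * 2.0 + type_match.
tree = ("+", ("*", "passage_score", 2.0), "type_match")

# Made-up candidate answers with made-up feature values.
candidates = [
    {"answer": "A", "features": {"passage_score": 0.2, "type_match": 0.9, "popularity": 0.5}},
    {"answer": "B", "features": {"passage_score": 0.7, "type_match": 0.1, "popularity": 0.9}},
    {"answer": "C", "features": {"passage_score": 0.4, "type_match": 0.8, "popularity": 0.1}},
]

ranked = [c["answer"] for c in rank(tree, candidates)]
selected = sorted(used_features(tree))
print(ranked)    # candidate answers, best first
print(selected)  # features the tree uses; "popularity" is never referenced
```

Because the tree never references `popularity`, that feature is effectively deselected, which is the sense in which a compact ranking function also identifies a small subset of key features.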

Footnotes
1
The ground-truth dictionary is manually created and curated by the Watson development team.
 
Metadata
Title
Genetic Programming for Feature Selection and Question-Answer Ranking in IBM Watson
Authors
Urvesh Bhowan
D. J. McCloskey
Copyright Year
2015
DOI
https://doi.org/10.1007/978-3-319-16501-1_13