Published in: International Journal of Speech Technology 2/2012

01-06-2012

The Construction-Integration framework: a means to diminish bias in LSA-based call routing

Authors: Guillermo Jorge-Botana, Ricardo Olmos, Alejandro Barroso



Abstract

Semantic technology is commonly used for two purposes in the field of IVR (Interactive Voice Response). The first is to correct the output of voice recognition devices based on coherence with a context. The second is to perform what is referred to as “call routing”, requiring technology that categorizes utterances and returns a list of the most credible routes. Our paper focuses on the latter, aiming to use the Latent Semantic Analysis (LSA henceforth) computational model (Deerwester et al. in J. Am. Soc. Inf. Sci. 41:391–407, 1990) together with the Construction-Integration model (C-I henceforth), a psycholinguistically motivated algorithm (Kintsch in Int. J. Psychol. 33(6):411–420, 1998), to interpret, manage and successfully route user requests in an efficient and reliable manner. By efficient we mean that training is unnecessary when the destination model is altered, and exhaustive labeling of all utterances is not required, concentrating instead only on some sample destinations. By reliable we mean that the construction-integration algorithm attenuates the risks from intra-destination variability and word saliency. Technical and theoretical aspects are discussed. In addition, some destination assignment methods are tested and debated.

Footnotes
1
Folding-In is specified later.
 
2
The matrix of contexts is generically known in Information Retrieval as the matrix of documents. To name the context window accurately, we prefer to call the contexts “utterances”.
 
3
Strictly speaking, in LSA training these labels do not need to be exhaustive, so they can originate from different means of classification or even different corpora. In extreme cases, a sufficiently large corpus might even be trained without labels. Testing this last hypothesis is one of the aims of the present study.
 
4
The space is set up following the method of Cox and Shahshahani (2001), with a matrix built from terms and utterances rather than from terms and grouped categories, as in Chu-Carroll and Carpenter (1999). The rows of the occurrence matrix were terms (and also labels in the labeled condition) and the columns were utterances.
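The layout described in this footnote can be sketched as follows. This is only a minimal illustration using raw counts: the function name `build_occurrence_matrix`, the whitespace tokenization, the `LABEL_` row prefix and the sample utterances are assumptions made for the example, not details taken from the paper.

```python
from collections import Counter


def build_occurrence_matrix(utterances, labels=None):
    """Return (row_names, matrix): one row per term (or label), one column per utterance."""
    columns = []
    for i, utterance in enumerate(utterances):
        # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
        counts = Counter(utterance.lower().split())
        if labels is not None:
            # Labeled condition: the destination label becomes an extra row entry.
            counts["LABEL_" + labels[i]] += 1
        columns.append(counts)
    row_names = sorted({name for col in columns for name in col})
    # A Counter returns 0 for missing keys, so absent terms yield zero cells.
    matrix = [[col[name] for col in columns] for name in row_names]
    return row_names, matrix


# Hypothetical utterances and destination labels.
utterances = ["pay my bill", "check my credit"]
rows, matrix = build_occurrence_matrix(utterances)
labeled_rows, labeled_matrix = build_occurrence_matrix(
    utterances, labels=["billing", "credit_info"]
)
```

In the unlabeled call the rows are terms only; in the labeled call each utterance column also carries a 1 in the row of its label.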
 
5
Besides the absence of labels, this decrease in the size of the matrix is due to single-term utterances (for example “credit”) being excluded from the training; this is undoubtedly a disadvantage of an LSA model without labels.
 
6
We chose such a dimensionalization based on the assumptions made in some previous studies. In those studies it has been suggested that the optimal number of dimensions for specific domain corpora does not have to be extremely low, sometimes even approaching the 300 dimensions recommended by Landauer and Dumais (1997) for general domain corpora (see Jorge-Botana et al. 2010b). Some of the most recent studies simply use 300 dimensions (Wild et al. 2011).
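For orientation, the number of dimensions discussed here corresponds to the rank k retained when standard LSA (Deerwester et al. 1990) truncates the singular value decomposition of the occurrence matrix. The notation below is generic LSA notation assumed for illustration, not notation taken from the paper; X is the m × n matrix of terms by utterances.

```latex
X = U \Sigma V^{\top}, \qquad
X \approx X_k = U_k \Sigma_k V_k^{\top}, \qquad
1 \le k \le \operatorname{rank}(X)
```

Here U_k and V_k keep the first k left and right singular vectors and Σ_k the k largest singular values, so choosing 300 dimensions means setting k = 300.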
 
7
These models must cover all the functionality of the service.
 
8
Because call utterances are shorter and simpler than propositions in colloquial language, the algorithm we use is not exactly the original Construction-Integration algorithm. The integration step proposed by Kintsch is a spreading activation algorithm that iterates until the net is stable (that is, until a cycle in which the change in mean activation falls below a parameterized value), whereas our algorithm is a “one-shot” mechanism: the activation of each node is calculated from the connections it receives. Another difference from the C-I algorithm as proposed by Kintsch is that we consider only words, not propositions or situations. The original C-I is more complete and fine-grained, but our mechanism is sufficient for our purposes and may be programmed more flexibly, because it follows an OOP (Object Oriented Programming) paradigm, with classes such as net, layer, node, connection, etc., instead of the iterative vector-by-matrix multiplication of the original (see Kintsch and Welsch 1991 for details of the original conception).
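A “one-shot” update over node and connection objects might look like the sketch below. The footnote does not give the activation formula, so the plain weighted sum used here is an assumption, as are the class fields, the function names and the example words and weights.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A word node with an activation value and its incoming connections."""
    name: str
    activation: float = 0.0
    incoming: list = field(default_factory=list)


@dataclass
class Connection:
    """A weighted link arriving from `source` (weight assumed to be a cosine)."""
    source: Node
    weight: float


def connect(source, target, weight):
    """Register a connection from `source` into `target`."""
    target.incoming.append(Connection(source, weight))


def one_shot_activation(layer):
    """Set each node's activation once, from the connections it receives.

    Assumption: activation is a plain weighted sum of the source activations.
    There is no loop that repeats until the net is stable.
    """
    # Compute all new values first so the result does not depend on node order.
    new_values = [
        sum(c.weight * c.source.activation for c in node.incoming)
        for node in layer
    ]
    for node, value in zip(layer, new_values):
        node.activation = value
    return {node.name: node.activation for node in layer}


# Hypothetical two-layer net: utterance words feed candidate terms.
pay, bill = Node("pay", 1.0), Node("bill", 1.0)
invoice, weather = Node("invoice"), Node("weather")
connect(pay, invoice, 0.5)
connect(bill, invoice, 0.25)
connect(pay, weather, 0.0)
result = one_shot_activation([invoice, weather])
```

An iterative integration step would instead repeat the update in a loop and stop once the mean change in activation drops below a threshold.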
 
9
The cosines are calculated using the previously trained semantic space; in other words, each of the terms to be compared is represented by a vector in this space. Any term vector can then be compared with any other term vector using the cosine.
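As a minimal sketch of this comparison, the cosine of two term vectors can be computed as below. The three-dimensional vectors and the names `credit` and `loan` are made up for illustration; vectors from a trained space would have many more dimensions.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical term vectors in a tiny 3-dimensional space.
credit = [0.9, 0.1, 0.0]
loan = [0.8, 0.2, 0.1]
similarity = cosine(credit, loan)
```

The cosine depends only on the direction of the vectors, not on their length, so rescaling a vector leaves its similarities unchanged.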
 
Literature
Bellegarda, J. R. (2000). Exploiting latent semantic information in statistical language modeling. Proceedings of the IEEE, 88(8), 1279–1296.
Chu-Carroll, J., & Carpenter, B. (1999). Vector-based natural language call routing. Computational Linguistics, 25(3), 361–388.
Cox, S., & Shahshahani, B. (2001). A comparison of some different techniques for vector based call-routing. In Proceedings of 7th European conf. on speech communication and technology, Aalborg.
Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41, 391–407.
Foltz, P. W. (1996). Latent semantic analysis for text-based research. Behavior Research Methods, Instruments, & Computers, 28(2), 197–202.
Haley, D. T., Thomas, P., De Roeck, A., & Petre, M. (2005). A research taxonomy for latent semantic analysis based educational applications. Technical Report no. 2005/09, Open University.
Haley, D. T., Thomas, P., Petre, P., & De Roeck, A. (2007). Seeing the whole picture: comparing computer assisted assessment systems using LSA-based systems as an example. Technical Report Number 2007/07, Open University.
Jones, M. P., & Martin, J. H. (1997). Contextual spelling correction using latent semantic analysis. In Proceedings of the fifth conference on applied natural language processing (pp. 163–176).
Jorge-Botana, G., Olmos, R., & León, J. A. (2009). Using LSA and the predication algorithm to improve extraction of meanings from a diagnostic corpus. Spanish Journal of Psychology, 12(2), 424–440.
Jorge-Botana, G., León, J. A., Olmos, R., & Escudero, I. (2010a). Latent semantic analysis parameters for essay evaluation using small-scale corpora. Journal of Quantitative Linguistics, 17(1), 1–29.
Jorge-Botana, G., León, J. A., Olmos, R., & Hassan-Montero, Y. (2010b). Visualizing polysemy using LSA and the predication algorithm. Journal of the American Society for Information Science and Technology, 61(8), 1706–1724.
Jorge-Botana, G., León, J. A., Olmos, R., & Escudero, I. (2011). The representation of polysemy through vectors: some building blocks for constructing models and applications with LSA. International Journal of Continuing Engineering Education and Life-Long Learning, 21(4).
Kintsch, W. (1998). The representation of knowledge in minds and machines. International Journal of Psychology, 33(6), 411–420.
Kintsch, W. (2000). Metaphor comprehension: a computational theory. Psychonomic Bulletin & Review, 7, 257–266.
Kintsch, W. (2002). On the notions of theme and topic in psychological process models of text comprehension. In M. Louwerse & W. van Peer (Eds.), Thematics: interdisciplinary studies (pp. 157–170). Amsterdam: Benjamins.
Kintsch, W. (2007). Meaning in context. In T. K. Landauer, D. McNamara, S. Dennis, & W. Kintsch (Eds.), Handbook of latent semantic analysis (pp. 89–105). Mahwah: Erlbaum.
Kintsch, W. (2008). Symbol systems and perceptual representations. In M. de Vega, A. M. Glenberg, & A. C. Graesser (Eds.), Symbols and embodiment: debates on meaning and cognition (pp. 145–164). Oxford: Oxford University Press.
Kintsch, W., & Bowles, A. (2002). Metaphor comprehension: what makes a metaphor difficult to understand? Metaphor and Symbol, 17, 249–262.
Kintsch, W., & Welsch, D. (1991). The construction-integration model: a framework for studying memory for text. In W. E. Hockley & S. Lewandowsky (Eds.), Relating theory and data: essays on human memory in honor of Bennet B. Murdock (pp. 367–385). Hillsdale: Erlbaum.
Kintsch, W., Patel, V., & Ericsson, K. A. (1999). The role of long-term working memory in text comprehension. Psychologia, 42, 186–198.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: the latent semantic analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104, 211–240.
Li, L., & Chou, W. (2002). Improving latent semantic indexing based classifier with information gain. In Proceedings of the 7th international conference on spoken language processing, ICSLP-2002, Denver, Colorado, USA, September 16–20, 2002 (pp. 1141–1144).
Lim, B. P., Ma, B., & Li, H. (2005). Using semantic context to improve voice keyword mining. In Proceedings of the international conference on Chinese computing (ICCC 2005), Singapore, 21–23 March 2005.
Louwerse, M. M. (2008). Embodied representations are encoded in language. Psychonomic Bulletin & Review, 15, 838–844.
Manning, C. D., & Schütze, H. (1999). Foundations of statistical natural language processing. Cambridge: MIT Press.
Nakov, P., Popova, A., & Mateev, P. (2001). Weight functions impact on LSA performance. In Proceedings of the recent advances in natural language processing conference (RANLP 2001), Tzigov Chark, Bulgaria.
Olmos, R., León, J. A., Jorge-Botana, G., & Escudero, I. (2009). New algorithms assessing short summaries in expository texts using latent semantic analysis. Behavior Research Methods, 41(3), 944–950.
Quesada, J. (2008). Latent problem solving analysis (LPSA): a computational theory of representation in complex, dynamic problem solving tasks. PhD thesis, Psychology, University of Granada.
Salton, G., & McGill, M. J. (1983). Introduction to modern information retrieval. New York: McGraw-Hill.
Serafin, R., & Di Eugenio, B. (2004). FLSA: extending latent semantic analysis with features for dialogue act classification. In Proceedings of ACL04, 42nd annual meeting of the association for computational linguistics, Barcelona, Spain, July.
Shi, Y. (2008). An investigation of linguistic information for speech recognition error detection. PhD thesis, University of Maryland, Baltimore County, Baltimore.
Tyson, N., & Matula, V. C. (2004). Improved LSI-based natural language call routing using speech recognition confidence scores. In Proceedings of EMNLP.
Wild, F., Haley, D., & Bülow, K. (2011). Using latent-semantic analysis and network analysis for monitoring conceptual development. Journal for Language Technology and Computational Linguistics, 26(1), 9–21.
Metadata
Title: The Construction-Integration framework: a means to diminish bias in LSA-based call routing
Authors: Guillermo Jorge-Botana, Ricardo Olmos, Alejandro Barroso
Publication date: 01-06-2012
Publisher: Springer US
Published in: International Journal of Speech Technology, Issue 2/2012
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-012-9129-5
