2020 | OriginalPaper | Chapter

Experiments with Cross-Language Speech Retrieval for Lower-Resource Languages

Authors: Suraj Nair, Anton Ragni, Ondrej Klejch, Petra Galuščáková, Douglas Oard

Published in: Information Retrieval Technology

Publisher: Springer International Publishing

Abstract

Cross-language speech retrieval systems face a cascade of errors due to transcription and translation ambiguity. Using 1-best speech recognition and 1-best translation in such a scenario could adversely affect recall if those 1-best system guesses are not correct. Accurately representing transcription and translation probabilities could therefore improve recall, although possibly at some cost in precision. The difficulty of the task is exacerbated when working with languages for which limited resources are available, since both recognition and translation probabilities may be less accurate in such cases. This paper explores the combination of expected term counts from recognition with expected term counts from translation to perform cross-language speech retrieval in which the queries are in English and the spoken content to be retrieved is in Tagalog or Swahili. Experiments were conducted using two query types, one focused on term presence and the other focused on topical retrieval. Overall, the results show that significant improvements in ranking quality result from modeling transcription and translation ambiguity, even in lower-resource settings, and that adapting the ranking model to specific query types can yield further improvements.
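
As a rough illustration of what combining expected term counts from recognition and translation might look like, the following minimal sketch sums recognition posteriors per term and then weights them by translation probabilities. It is not the chapter's implementation: the function names, the input formats, and all probabilities are assumptions made up for this example.

```python
from collections import defaultdict


def expected_asr_counts(hypotheses):
    """Sum the posterior probability of each recognized term over a document.

    `hypotheses` is a list of (term, posterior) pairs, such as might be read
    off a lattice or confusion network; this format is an assumption.
    """
    counts = defaultdict(float)
    for term, posterior in hypotheses:
        counts[term] += posterior
    return dict(counts)


def expected_query_term_count(translations, asr_counts):
    """Expected count of one English query term in a spoken document.

    `translations` maps candidate document-language terms to hypothetical
    translation probabilities for the query term. Each expected recognition
    count is weighted by its translation probability and the results summed.
    """
    return sum(p * asr_counts.get(term, 0.0) for term, p in translations.items())


# Toy Swahili example; every probability below is invented for illustration.
hypotheses = [("maji", 0.9), ("mji", 0.1), ("safi", 0.6), ("maji", 0.5)]
asr_counts = expected_asr_counts(hypotheses)

# Hypothetical translation table for the English query term "water".
translations = {"maji": 0.8, "mto": 0.2}
score = expected_query_term_count(translations, asr_counts)
print(round(score, 2))
```

Keeping the two steps separate means the recognition counts can be computed once per document and reused for every query term. A 1-best system corresponds to the special case where every posterior and every translation probability is either 0 or 1.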

Footnotes
1
MATERIAL is an acronym for Machine Translation for English Retrieval of Information in Any Language [21].
 
2
In the MATERIAL program these are referred to as conceptual and simple queries, but we prefer to refer to them as topical and lexical in keeping with the way those terms are used in information retrieval and natural language processing, respectively. Some topical and lexical queries also contain additional clues (e.g., synonyms or hypernyms) to guide the interpretation of query terms, but we do not make use of these additional clues in our experiments.
 
Literature
1.
Can, D., Saraclar, M.: Lattice indexing for spoken term detection. IEEE Trans. Audio Speech Lang. Process. 19(8), 2338–2347 (2011)
2.
Chelba, C., et al.: Retrieval and browsing of spoken content. IEEE Signal Process. Mag. 25(3), 39–49 (2008)
3.
Chen, G., et al.: Using proxies for OOV keywords in the keyword search task. In: ASRU, pp. 416–421 (2013)
4.
Darwish, K., Oard, D.: Probabilistic structured query methods. In: SIGIR, pp. 338–344 (2003)
5.
Fiscus, J., Doddington, G.: Topic detection and tracking evaluation overview. In: Allan, J. (ed.) Topic Detection and Tracking. The Information Retrieval Series, vol. 12, pp. 17–31. Springer, Boston (2002)
6.
Hull, D.: Using structured queries for disambiguation in cross-language information retrieval. In: AAAI Symposium on Cross-Language Text and Speech Retrieval (1997)
7.
Karakos, D., et al.: Score normalization and system combination for improved keyword spotting. In: ASRU, pp. 210–215 (2013)
8.
Kim, S., et al.: Combining lexical and statistical translation evidence for cross-language information retrieval. JASIST 66(1), 23–39 (2015)
9.
Lee, L.S., Chen, B.: Spoken document understanding and organization. IEEE Signal Process. Mag. 22(5), 42–60 (2005)
10.
Lee, L.S., Pan, Y.C.: Voice-based information retrieval—how far are we from the text-based information retrieval? In: ASRU, pp. 26–43 (2009)
11.
Makhoul, J., et al.: Speech and language technologies for audio indexing and retrieval. Proc. IEEE 88(8), 1338–1353 (2000)
12.
Mamou, J., et al.: Developing keyword search under the IARPA Babel program. In: Afeka Speech Processing Conference (2013)
13.
McNamee, P., Mayfield, J.: Comparing cross-language query expansion techniques by degrading translation resources. In: SIGIR, pp. 159–166 (2002)
15.
Och, F., Ney, H.: A systematic comparison of various statistical alignment models. Comput. Linguist. 29(1), 19–51 (2003)
17.
Pirkola, A.: The effects of query structure and dictionary setups in dictionary-based cross-language information retrieval. In: SIGIR, pp. 55–63 (1998)
18.
Ragni, A., Gales, M.: Automatic speech recognition system development in the ‘wild’. In: ICSA, pp. 2217–2221 (2018)
19.
Riedhammer, K., et al.: A study on LVCSR and keyword search for Tagalog. In: INTERSPEECH, pp. 2529–2533 (2013)
20.
Robertson, S.: Okapi at TREC-7: automatic ad hoc, filtering, VLC and interactive track. In: TREC (1998)
22.
Saraclar, M., Sproat, R.: Lattice-based search for spoken utterance retrieval. In: NAACL (2004)
23.
Strohman, T., et al.: Indri: a language model-based search engine for complex queries. In: International Conference on Intelligence Analysis (2005)
24.
Tur, G., De Mori, R.: Spoken Language Understanding: Systems for Extracting Semantic Information from Speech. Wiley, New York (2011)
25.
Wang, J., Oard, D.: Matching meaning for cross-language information retrieval. Inf. Process. Manag. 48(4), 631–653 (2012)
26.
Wegmann, S., et al.: The TAO of ATWV: probing the mysteries of keyword search performance. In: ASRU, pp. 192–197 (2013)
27.
Weintraub, M.: Keyword-spotting using SRI’s DECIPHER large-vocabulary speech-recognition system. In: ICASSP, vol. 2, pp. 463–466 (1993)
29.
Xu, J., Weischedel, R.: Cross-lingual information retrieval using hidden Markov models. In: EMNLP, pp. 95–103 (2000)
30.
Zbib, R., et al.: Neural-network lexical translation for cross-lingual IR from text and speech. In: SIGIR (2019)
Metadata
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-42835-8_13