Published in: International Journal of Speech Technology 1/2022

5 November 2021

A method for constructing Korean spontaneous spoken language corpus based on an imitation of abbreviated and transformed particles

Authors: Hyok-Chol Ri, Chol Kim, Mok-Ran Jo

Abstract

In this paper, we propose a method for constructing a language corpus by imitating the abbreviated and transformed particles that are a distinctive feature of spontaneous spoken Korean. Because it is not practical to collect enough spoken transcripts to train a spoken-style language model, the proposed approach generates spoken-style text from written-style text such as newspapers, based on how the pronunciation of typical particles varies with speaking style. The method rests on a statistical analysis of particles that serve the same function in both written and spoken language. We analyze the grammatical functions and pronunciation features of the particles that distinguish written from spoken language, and select abbreviated and transformed particles whose pronunciation features are typical of spoken language. We then replace the particles in written-style text with these abbreviated and transformed forms, according to the correspondence between written and spoken particles, which yields spoken-style text. A language model trained on the resulting spoken-style text significantly reduced the word error rate (WER) on spontaneous speech.
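The replacement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `to_spoken_style`, the particle table, and its probabilities are hypothetical, and the input is assumed to be already segmented into (stem, particle) pairs by some morphological analyzer. The abstract states only that written particles are replaced by their spoken counterparts on the basis of a statistical analysis; sampling each replacement from a weighted table is one plausible reading of that.

```python
import random

# Hypothetical written-to-spoken particle table. The pairs are common
# colloquial counterparts in Korean; the probabilities are made up for
# illustration and are NOT taken from the paper.
SPOKEN_VARIANTS = {
    "에게": [("한테", 0.7), ("에게", 0.3)],
    "와": [("하고", 0.6), ("와", 0.4)],
    "과": [("하고", 0.6), ("과", 0.4)],
}


def to_spoken_style(eojeols, variants=SPOKEN_VARIANTS, rng=None):
    """Turn written-style text into spoken-style text.

    eojeols:  list of (stem, particle) pairs; particle is "" if absent.
    variants: maps a written particle to (spoken form, probability) options.
    rng:      optional random.Random instance for reproducible sampling.
    """
    if rng is None:
        rng = random.Random(0)
    words = []
    for stem, particle in eojeols:
        options = variants.get(particle)
        if options:
            # Sample one spoken form according to the table's weights.
            forms, weights = zip(*options)
            particle = rng.choices(forms, weights=weights, k=1)[0]
        words.append(stem + particle)
    return " ".join(words)


if __name__ == "__main__":
    # "gave a book to a friend", pre-segmented by hand for this example.
    sentence = [("친구", "에게"), ("책", "을"), ("주었다", "")]
    print(to_spoken_style(sentence))
```

Keeping the written form as one of the options, rather than always substituting, lets the generated corpus retain both variants in proportions that a real analysis of spoken data would supply.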

Metadata
Title
A method for constructing Korean spontaneous spoken language corpus based on an imitation of abbreviated and transformed particles
Authors
Hyok-Chol Ri
Chol Kim
Mok-Ran Jo
Publication date
5 November 2021
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 1/2022
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-021-09937-6