2018 | Original Paper | Book Chapter

Restoring Punctuation and Capitalization Using Transformer Models

Authors: Andris Vāravs, Askars Salimbajevs

Published in: Statistical Language and Speech Processing

Publisher: Springer International Publishing

Abstract

Restoring punctuation and capitalization in the output of an automatic speech recognition (ASR) system greatly improves readability and increases the number of downstream applications. We present a Transformer-based method for restoring punctuation and capitalization in Latvian and English, following the established approach of using neural machine translation (NMT) models. NMT methods pose a challenge here because the length of the predicted sequence does not always match the length of the input sequence. We offer two solutions to this problem: simply cutting or padding the target sequence by force, and a more sophisticated method based on attention alignment. Our approach achieves new state-of-the-art results for Latvian and competitive results for English.
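The simpler of the two solutions, cutting or padding the target sequence by force, can be illustrated with a minimal sketch. It assumes that the model emits one label per input token; the function name `force_length`, the label names `"U"` and `"O"`, and the choice of `"O"` as the padding label are hypothetical and are not taken from the paper.

```python
from typing import List


def force_length(predicted: List[str], source_len: int, pad_label: str = "O") -> List[str]:
    """Cut or pad a predicted label sequence to exactly source_len items.

    Assumption: one output label per input token. "O" is a hypothetical
    label meaning "no punctuation or case change".
    """
    if source_len < 0:
        raise ValueError("source_len must be non-negative")
    if len(predicted) >= source_len:
        # Prediction is too long (or already fits): drop the surplus labels.
        return predicted[:source_len]
    # Prediction is too short: fill the missing positions with the neutral label.
    return predicted + [pad_label] * (source_len - len(predicted))


tokens = ["hello", "world", "how", "are", "you"]

# A prediction that is two labels too long is cut to five labels.
too_long = force_length(["U", "O", "U", "O", "O", "O", "O"], len(tokens))

# A prediction that is three labels too short is padded to five labels.
too_short = force_length(["U", "O"], len(tokens))
```

Cutting and padding guarantee a one-to-one mapping between input tokens and output labels, but any label shifted by an earlier insertion or deletion stays misplaced, which is presumably what motivates the alignment-based alternative.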

Footnotes
1
Europarl results were not in the print version of the paper, but they can be found at https://github.com/ottokart/punctuator2.
Metadata
Title
Restoring Punctuation and Capitalization Using Transformer Models
Authors
Andris Vāravs
Askars Salimbajevs
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-00810-9_9