
2015 | OriginalPaper | Chapter

The Multi-level Approach to Speech Corpora Annotation for Automatic Speech Recognition

Authors: Igor Glavatskih, Tatyana Platonova, Valeria Rogozhina, Anna Shirokova, Anna Smolina, Mikhail Kotov, Anna Ovsyannikova, Sergey Repalov, Mikhail Zulkarneev

Published in: Speech and Computer

Publisher: Springer International Publishing


Abstract

This paper briefly summarizes a multi-level approach to annotating audio files, with the emphasis on developing annotation rules. First, general requirements are outlined, followed by a list of more specific markers that may or may not be included in a particular rule set, depending on the practical task. Next, the software tools used for creating and spell-checking annotations are described, and an example of a database built with the multi-level approach is given. Finally, the use of tag sorting in ASR training and testing is discussed.


Metadata
Copyright Year
2015
DOI
https://doi.org/10.1007/978-3-319-23132-7_54
