2021 | OriginalPaper | Chapter

On the Use of Phonotactic Vector Representations with FastText for Language Identification

Authors: David Romero, Christian Salamea

Published in: Conversational Dialogue Systems for the Next Decade

Publisher: Springer Singapore

Abstract

This paper explores an improved way to learn word vector representations for language identification (LID). We follow a phonotactic approach in which phoneme sequences are grouped into phonotactic units (phone-grams) that incorporate context information. To account for the morphology of phone-grams, we use sub-word information (lower-order n-grams) to learn phone-gram embeddings with FastText. These embeddings are used as input to an i-Vector framework to train a multiclass logistic classifier. We compare our approach with a LID system whose phone-gram embeddings are learned with Skipgram, which does not use sub-word information, taking Cavg as the evaluation metric. Incorporating sub-word information into the phone-gram embeddings significantly improves on the results obtained with embeddings that ignore the structure of phone-grams. Furthermore, we show that our system provides complementary information to an acoustic system, improving it through the fusion of both systems.
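As a rough illustration of what sub-word information means for phone-grams, the sketch below builds phone-grams from a phoneme sequence and then lists the lower-order n-grams of one phone-gram, wrapped in the `<` and `>` boundary markers that FastText uses for words. It is not the authors' implementation: the function names `phone_grams` and `subword_ngrams`, the toy phoneme sequence, and the assumption that each phoneme is encoded as a single character are all hypothetical choices made for this example.

```python
from typing import List


def phone_grams(phonemes: List[str], order: int) -> List[str]:
    """Slide a window of `order` phonemes over the sequence and join each window.

    Assumes every phoneme is encoded as one character (an assumption of this
    sketch), so character n-grams of the result are lower-order phone n-grams.
    """
    return ["".join(phonemes[i:i + order])
            for i in range(len(phonemes) - order + 1)]


def subword_ngrams(unit: str, min_n: int, max_n: int) -> List[str]:
    """Return FastText-style character n-grams of `unit`.

    The unit is wrapped in '<' and '>' so that n-grams at the start and end
    of the unit are distinguishable from those in the middle.
    """
    wrapped = f"<{unit}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


# Hypothetical phoneme sequence, one character per phoneme.
phonemes = ["a", "b", "a", "k"]
trigrams = phone_grams(phonemes, 3)
print(trigrams)                            # ['aba', 'bak']
print(subword_ngrams(trigrams[0], 2, 3))   # ['<a', 'ab', 'ba', 'a>', '<ab', 'aba', 'ba>']
```

In FastText, the vector of a unit is built from the vectors of such n-grams, so two phone-grams that share lower-order n-grams also share part of their representation. A plain Skipgram model treats each phone-gram as an indivisible symbol instead.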

Title: On the Use of Phonotactic Vector Representations with FastText for Language Identification
Authors: David Romero, Christian Salamea
Publisher: Springer Singapore
Book: Conversational Dialogue Systems for the Next Decade
Print ISBN: 978-981-15-8394-0

Electronic ISBN: 978-981-15-8395-7

Copyright Year: 2021
DOI: https://doi.org/10.1007/978-981-15-8395-7_25
