Published in: KI - Künstliche Intelligenz 4/2018

10-09-2018 | Dissertation and Habilitation Abstracts

Multitask and Multilingual Modelling for Lexical Analysis

Author: Johannes Bjerva

Abstract

In Natural Language Processing (NLP), one traditionally considers a single task (e.g. part-of-speech tagging) for a single language (e.g. English) at a time. However, recent work has shown that it can be beneficial to take advantage of relatedness between tasks, as well as between languages. In this work I examine the concept of relatedness and explore how it can be utilised to build NLP models that require less manually annotated data. A large selection of NLP tasks is investigated for a substantial language sample comprising 60 languages. The results show potential for joint multitask and multilingual modelling, and hint at linguistic insights which can be gained from such models.

KI - Künstliche Intelligenz

The scientific journal "KI – Künstliche Intelligenz" is the official journal of the division for artificial intelligence within the "Gesellschaft für Informatik e.V." (GI), the German Informatics Society, with contributions from throughout the field of artificial intelligence.

Footnotes
1. In NLP words are commonly represented by embedding them in a vector space, typically with 64–256 dimensions. These representations are learnt by predicting contexts in large text corpora, such that words occurring in similar contexts are close to one another, which is useful since such words tend to have similar meanings (i.e. distributional semantics).
2. SemTags: [1, 9]. POS: UD1.3 (universaldependencies.org).
3. This can be done by learning multilingual word embeddings, in which, e.g., the words dialects and Dialekten are close to one another.
4. Bi-directional RNNs are frequently used in NLP. One advantage of this is that one can use both the preceding and succeeding contexts of a word when predicting its tag.
5. Evaluation of a model trained on one language on a test instance for an unobserved language.

Literature
1. Abzianidze L, Bjerva J, Evang K, Haagsma H, van Noord R, Ludmann P, Nguyen DD, Bos J (2017) The parallel meaning bank: towards a multilingual corpus of translations annotated with compositional meaning representations. In: EACL, pp 242–247
2. Bjerva J (2016) Byte-based language identification with deep convolutional networks. In: VarDial3, pp 119–125
4. Bjerva J (2017) Will my auxiliary tagging task help? Estimating auxiliary tasks effectivity in multi-task learning. In: NoDaLiDa, pp 216–220
5. Bjerva J, Augenstein I (2018) From phonology to syntax: unsupervised linguistic typology at different levels with language embeddings. In: NAACL-HLT
6. Bjerva J, Augenstein I (2018) Tracking typological features of Uralic languages in distributed language representations. In: IWCLUL
7. Bjerva J, Bos J, Van der Goot R, Nissim M (2014) The meaning factory: Formal semantics for recognizing textual entailment and determining semantic similarity. In: SemEval 2014, pp 642–646
8. Bjerva J, Östling R (2017) Cross-lingual learning of semantic textual similarity with multilingual word representations. In: NoDaLiDa, pp 211–215
9. Bjerva J, Plank B, Bos J (2016) Semantic tagging with deep residual networks. In: COLING, pp 3531–3541
10. Bos J, Basile V, Evang K, Venhuizen NJ, Bjerva J (2017) The Groningen meaning bank. Springer, Dordrecht, pp 463–496
12. de Lhoneux M, Bjerva J, Augenstein I, Søgaard A (2018) Parameter sharing between dependency parsers for related languages. In: EMNLP

Metadata
Title: Multitask and Multilingual Modelling for Lexical Analysis
Author: Johannes Bjerva
Publication date: 10-09-2018
Publisher: Springer Berlin Heidelberg
Published in: KI - Künstliche Intelligenz / Issue 4/2018
Print ISSN: 0933-1875
Electronic ISSN: 1610-1987
DOI: https://doi.org/10.1007/s13218-018-0557-5
