
2018 | Original Paper | Book Chapter

Overview of Character-Based Models for Natural Language Processing

Authors: Heike Adel, Ehsaneddin Asgari, Hinrich Schütze

Published in: Computational Linguistics and Intelligent Text Processing

Publisher: Springer International Publishing


Abstract

Character-based models are becoming increasingly popular for a variety of natural language processing tasks, especially due to the success of neural networks. They make it possible to model text sequences directly, without the need for tokenization, and can therefore simplify the traditional preprocessing pipeline. This paper provides an overview of character-based models for a variety of natural language processing tasks. We group existing work into three categories: tokenization-based approaches, bag-of-n-gram models, and end-to-end models. For each category, we present prominent examples of studies, with a particular focus on recent character-based deep learning work.
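To make the distinction between the last two categories concrete, the following Python sketch contrasts a bag-of-character-n-grams representation with the direct character-index encoding that end-to-end neural models typically consume. It is not taken from the chapter; the trigram window, boundary marker, and alphabet are arbitrary choices for illustration.

  from collections import Counter

  def char_ngrams(text, n=3):
      # Bag-of-n-grams view: count overlapping character n-grams (trigrams by default).
      padded = "#" + text + "#"  # "#" is an arbitrary boundary marker
      return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

  def char_indices(text, alphabet="abcdefghijklmnopqrstuvwxyz .-!"):
      # End-to-end view: map each character to an integer id (0 = unknown character).
      lookup = {c: i + 1 for i, c in enumerate(alphabet)}
      return [lookup.get(c, 0) for c in text.lower()]

  print(char_ngrams("tokenization"))     # Counter({'#to': 1, 'tok': 1, 'oke': 1, ...})
  print(char_indices("Yahoo! flights"))  # [25, 1, 8, 15, 15, 30, 27, 6, 12, 9, 7, 8, 20, 19]

Tokenization-based approaches, by contrast, first split the text into words and only then compute character-level features for each token.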


Footnotes
1
There are also difficult cases in English, such as “Yahoo!” or “San Francisco-Los Angeles flights”.
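As a purely illustrative sketch (hypothetical, not from the paper) of why such cases are hard, a naive whitespace tokenizer splits the second example in a way that keeps neither city name intact:

  text = "San Francisco-Los Angeles flights"
  print(text.split())  # ['San', 'Francisco-Los', 'Angeles', 'flights']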
 
2
In our view, morpheme-based models are not true instances of character-level models as linguistically motivated morphological segmentation is an equivalent step to tokenization, but on a different level. We therefore do not cover most work on morphological segmentation in this paper.
 
Metadata
Title
Overview of Character-Based Models for Natural Language Processing
Authors
Heike Adel
Ehsaneddin Asgari
Hinrich Schütze
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-77113-7_1
