
2019 | OriginalPaper | Chapter

5. Distributed Representations

Authors: Uday Kamath, John Liu, James Whitaker

Published in: Deep Learning for NLP and Speech Recognition

Publisher: Springer International Publishing


Abstract

In this chapter, we introduce the notion of word embeddings, which serve as core representations of text in deep learning approaches. We start with the distributional hypothesis and explain how it can be leveraged to form semantic representations of words. We discuss common distributional semantic models, including word2vec and GloVe, and their variants. We address the shortcomings of embedding models and describe their extension to document and concept representations. Finally, we discuss several applications to natural language processing tasks and present a case study focused on language modeling.
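
To make the embedding idea concrete, here is a minimal illustrative sketch (not drawn from the chapter itself; the toy corpus and hyperparameters are hypothetical) that trains skip-gram word2vec vectors with the gensim library, assuming gensim >= 4.0:

# Train skip-gram word2vec embeddings on a tiny, hypothetical corpus.
from gensim.models import Word2Vec

# Toy corpus: each document is a list of tokens.
corpus = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "man", "walks", "in", "the", "city"],
    ["the", "woman", "walks", "in", "the", "city"],
]

model = Word2Vec(
    corpus,
    vector_size=50,  # dimensionality of the embedding space
    window=2,        # context window on each side of the target word
    min_count=1,     # keep every token in this tiny corpus
    sg=1,            # 1 = skip-gram architecture (0 = CBOW)
    negative=5,      # negative sampling with 5 noise words per pair
    epochs=200,      # many passes, since the corpus is tiny
    workers=1,       # single worker for (approximate) reproducibility
    seed=42,
)

# Words that share contexts ("king"/"queen") should land near each other.
print(model.wv.most_similar("king", topn=3))

Because "king" and "queen" occur in near-identical contexts above, the distributional hypothesis predicts (and the trained vectors reflect) that their embeddings lie close together in the vector space.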


Metadata

Title: Distributed Representations
Authors: Uday Kamath, John Liu, James Whitaker
Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-030-14596-5_5
