
2018 | Original Paper | Book Chapter

4. Word Embedding for Understanding Natural Language: A Survey

Authors: Yang Li, Tao Yang

Published in: Guide to Big Data Applications

Publisher: Springer International Publishing


Abstract

Word embedding, in which semantic and syntactic features are captured from unlabeled text data, is a basic procedure in Natural Language Processing (NLP). The extracted features can then be organized in a low-dimensional space. Representative word embedding approaches include the Probabilistic Language Model, the Neural Network Language Model, and Sparse Coding. State-of-the-art methods such as skip-gram with negative sampling, noise-contrastive estimation, matrix factorization, and hierarchical structure regularizers are applied to solve these models. Most of this literature learns word embeddings from observed counts and co-occurrence statistics. The increasing scale of data, the sparsity of data representations, word position, and training speed are the main challenges in designing word embedding algorithms. In this survey, we first introduce the motivation and background of word embedding. We then present methods of text representation as preliminaries, along with existing word embedding approaches such as the Neural Network Language Model and the Sparse Coding approach, together with their evaluation metrics. Finally, we summarize the applications of word embedding and discuss future directions.
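To make the count-based family of methods mentioned above concrete, the following is a minimal illustrative sketch (not a method from the chapter itself): it builds a word-word co-occurrence matrix from a toy corpus and factorizes it with truncated SVD, in the spirit of LSA-style matrix-factorization embeddings. The corpus, window size, and dimensionality are arbitrary choices for demonstration.

```python
import numpy as np

# Toy corpus; real embeddings are trained on large unlabeled text collections.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat and a dog played",
]

# Build the vocabulary and a word-to-index map.
tokens = [s.split() for s in corpus]
vocab = sorted({w for sent in tokens for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts within a +/-2-word context window.
window = 2
C = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                C[idx[w], idx[sent[j]]] += 1

# Factorize the log-smoothed count matrix with truncated SVD, keeping the
# top-k singular directions as k-dimensional word vectors.
k = 3
U, S, Vt = np.linalg.svd(np.log1p(C))
embeddings = U[:, :k] * S[:k]

print(embeddings.shape)  # one k-dimensional vector per vocabulary word
```

Skip-gram with negative sampling reaches a related factorization implicitly by optimizing a predictive objective over (word, context) pairs rather than decomposing explicit counts.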


Metadata
Title
Word Embedding for Understanding Natural Language: A Survey
Authors
Yang Li
Tao Yang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-53817-4_4
