
2018 | Original Paper | Book Chapter

Word Vector Computation Based on Implicit Expression


Abstract

Word vectors and topic models can, to some extent, support semantic information retrieval. However, several problems remain. (1) Antonyms show high similarity when clustered with word vectors. (2) The number of named entities of each kind, such as person names, location names, and organization names, is unbounded, while the number of occurrences of any specific named entity in a corpus is limited; as a result, the vectors for these named entities are not fully trained. To overcome these problems, this paper proposes a word vector computation model based on implicit expression. Words with the same meaning are implicitly expressed based on a dictionary and part of speech. With this implicit expression, the sparsity of the corpus is reduced, and word vectors are trained more thoroughly.
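The preprocessing idea in the abstract can be sketched as a token-mapping step applied before word-vector training: mentions of any named entity of one type collapse onto a shared placeholder, and dictionary synonyms collapse onto one canonical form, so sparse mentions pool their contexts. The dictionaries and placeholder names below are hypothetical illustrations, not the paper's actual resources.

```python
# Hypothetical entity dictionary: each surface form maps to a
# type-level placeholder token shared by all entities of that type.
ENTITY_DICT = {
    "alice": "<PERSON>",
    "bob": "<PERSON>",
    "london": "<LOCATION>",
    "paris": "<LOCATION>",
    "acme": "<ORGANIZATION>",
}

# Hypothetical synonym dictionary: words with the same meaning are
# mapped onto a single canonical form.
SYNONYM_DICT = {
    "begin": "start",
    "commence": "start",
}

def implicit_express(tokens):
    """Replace each token by its entity placeholder or canonical synonym.

    Tokens found in neither dictionary pass through unchanged.
    """
    out = []
    for tok in tokens:
        key = tok.lower()
        if key in ENTITY_DICT:
            out.append(ENTITY_DICT[key])
        else:
            out.append(SYNONYM_DICT.get(key, key))
    return out

sentence = "Alice will commence work in London".split()
print(implicit_express(sentence))
# ['<PERSON>', 'will', 'start', 'work', 'in', '<LOCATION>']
```

After this substitution, a standard word-vector model (e.g. skip-gram) would be trained on the rewritten corpus; the placeholder tokens then receive far more training contexts than any individual entity mention would.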


Metadata
Title
Word Vector Computation Based on Implicit Expression
Authors
Xinzhi Wang
Hui Zhang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-67071-3_12
