2018 | OriginalPaper | Book chapter

Topic-Bigram Enhanced Word Embedding Model

Authors: Qi Yang, Ruixuan Li, Yuhua Li, Qilei Liu

Published in: Neural Information Processing

Publisher: Springer International Publishing

Abstract

In this paper, we propose a novel model that exploits topic relevance to enhance word embedding learning. We leverage the hidden topic-bigram model to build topic relevance matrices, then learn the Topic-Bigram Word Embedding (TBWE) by aggregating the context together with the corresponding topic-bigram information. The topic relevance weights are updated simultaneously with the word embeddings during training. To verify the validity and accuracy of the model, we conduct experiments on a word analogy task and a word similarity task. The results show that the TBWE model achieves better performance on both tasks.
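The abstract does not say how the two evaluation tasks are scored. As a rough illustration only, the sketch below shows one common way such tasks are computed for any set of word vectors: cosine similarity for word similarity, and the vector-offset method for word analogy. The function names `cosine` and `solve_analogy` and the toy two-dimensional vectors are hypothetical and are not taken from the paper; the TBWE evaluation protocol may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset method.

    Returns the vocabulary word (other than a, b, c) whose vector is
    closest in cosine similarity to emb[b] - emb[a] + emb[c].
    """
    target = [vb - va + vc for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(emb[w], target))

# Toy embeddings invented for this illustration (assumption, not paper data).
toy = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [0.0, 3.0],
}

answer = solve_analogy(toy, "man", "woman", "king")
sim_related = cosine(toy["king"], toy["queen"])
sim_unrelated = cosine(toy["king"], toy["apple"])
```

In a word similarity benchmark, scores like `sim_related` would typically be compared against human ratings with a rank correlation; that step is omitted here to keep the sketch short.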

Metadata
Title
Topic-Bigram Enhanced Word Embedding Model
Authors
Qi Yang
Ruixuan Li
Yuhua Li
Qilei Liu
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-04182-3_7