2016 | OriginalPaper | Chapter

Learning Phrase Representations Based on Word and Character Embeddings

Authors: Jiangping Huang, Donghong Ji, Shuxin Yao, Wenzhi Huang, Bo Chen

Published in: Neural Information Processing

Publisher: Springer International Publishing

Abstract

Most phrase embedding methods treat a phrase as a basic unit and learn its embedding from the phrase's external contexts, ignoring the internal structure of its words and characters. In some languages, such as Chinese, a phrase is usually composed of several words or characters and carries rich internal information, and its semantic meaning is related to the meanings of the words or characters that compose it. Taking Chinese as an example, we therefore propose a joint word and character embedding model for learning phrase representations. To disambiguate words and characters and to address the issue of non-compositional phrases, we present multiple-prototype word and character embeddings and an effective phrase selection method. We evaluate the proposed model on phrase similarity computation and analogical reasoning. The empirical results show that our model outperforms baseline methods that ignore internal word and character information.
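The abstract does not state how word and character embeddings are combined, so the following is only a minimal sketch of the general idea, assuming a plain average of three parts: the phrase's own context-learned vector, the mean of its word vectors, and the mean of its character vectors. The function names (`mean_vector`, `compose_phrase`, `cosine`) and the toy 2-dimensional vectors are hypothetical and are not taken from the chapter. Multiple-prototype embeddings and the phrase selection method are not modelled here.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def compose_phrase(phrase_vec, word_vecs, char_vecs):
    """Assumed composition (not the authors' formula): average the phrase's
    external-context vector with the mean word vector and the mean
    character vector of its components."""
    parts = [phrase_vec, mean_vector(word_vecs), mean_vector(char_vecs)]
    return mean_vector(parts)


def cosine(u, v):
    """Cosine similarity, a common score for phrase similarity tasks."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors, invented purely for illustration.
phrase_a = compose_phrase([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0]])
phrase_b = compose_phrase([0.0, 1.0], [[0.0, 1.0]], [[0.0, 1.0]])
similarity = cosine(phrase_a, phrase_b)
```

Under this assumption a phrase with a rare or missing context vector still receives information from its components. For non-compositional phrases, which the abstract handles through phrase selection, one would presumably fall back on the context-learned vector alone.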

Metadata
Title
Learning Phrase Representations Based on Word and Character Embeddings
Authors
Jiangping Huang
Donghong Ji
Shuxin Yao
Wenzhi Huang
Bo Chen
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-46681-1_65
