2017 | OriginalPaper | Chapter

Representation Learning of Multiword Expressions with Compositionality Constraint

Authors: Minglei Li, Qin Lu, Yunfei Long

Published in: Knowledge Science, Engineering and Management

Publisher: Springer International Publishing

Abstract

Representations of multiword expressions (MWEs) are currently learned either from the context external to the MWE, following the distributional hypothesis, or from the representations of its component words through a composition function, following the compositional hypothesis. However, a distributional method treats an MWE as an indivisible unit and ignores its component words. Distributional methods also suffer from data sparseness, which is especially severe for MWEs. A compositional method, on the other hand, can fail when an MWE is non-compositional. In this paper, we propose a hybrid method that learns the representation of an MWE from both its external context and its component words under a compositionality constraint. Instead of simply combining the two kinds of information, we use a compositionality measure from lexical semantics as the constraint. The main idea is to learn MWE representations based on a weighted linear combination of external context and component words, where the weight is based on the compositionality of the MWE. Evaluation on three datasets shows that the hybrid method is more robust and can improve the learned representations.
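
The weighted linear combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' training procedure, which the abstract does not specify. It is a post-hoc interpolation between two already-learned vectors, under stated assumptions: the function names, the use of averaging as the composition function, and the convention that a weight `alpha` in [0, 1] stands for the compositionality score are all hypothetical choices made for illustration.

```python
def compose_average(word_vectors):
    """Assumed composition function: element-wise average of the component word vectors."""
    n = len(word_vectors)
    return [sum(dims) / n for dims in zip(*word_vectors)]


def hybrid_representation(context_vec, component_vecs, alpha):
    """Interpolate between a context-based and a composed MWE vector.

    context_vec    -- vector learned from the MWE's external context (distributional view)
    component_vecs -- vectors of the MWE's component words (compositional view)
    alpha          -- hypothetical compositionality weight in [0, 1];
                      1.0 trusts the composed vector fully, 0.0 trusts the context vector fully
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    composed = compose_average(component_vecs)
    return [alpha * c + (1.0 - alpha) * d for c, d in zip(composed, context_vec)]


# Toy two-dimensional vectors, invented for illustration only.
context_vec = [0.0, 1.0]
component_vecs = [[1.0, 0.0], [1.0, 0.0]]

# A weakly compositional MWE (alpha = 0.25) stays close to its context-based vector.
print(hybrid_representation(context_vec, component_vecs, 0.25))
```

Under this sketch, a highly compositional expression would receive a weight near 1 and lean on its component words, while an idiomatic expression would receive a weight near 0 and lean on its external context.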

Metadata
Title
Representation Learning of Multiword Expressions with Compositionality Constraint
Authors
Minglei Li
Qin Lu
Yunfei Long
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-63558-3_43
