Published in: Knowledge and Information Systems 1/2019

11 December 2018 | Regular Paper

Improving short-text representation in convolutional networks by dependency parsing

Authors: Siheng Zhang, Wensheng Zhang, Jinghao Niu

Abstract

Automatic question answering (QA) is an inevitable trend for future search engines. Question classification and text retrieval, two essential steps of QA, both require algorithms that capture the semantic information and syntactic structure of natural language. This paper proposes dependency-based convolutional networks for learning sentence representations. First, a dependency layer maps the discrete depth of each word in the dependency tree of a sentence into a continuous real-valued space. The mapped values then serve as weights on the word vectors, and convolutional kernels are employed as feature extractors for the specific downstream task. The proposed method allows convolutional networks to take advantage of the greater representational power of dependency structure. Experiments on three tasks (text classification, duplicate classification, and text-pair ranking) confirm the advantages of our model.
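
The pipeline outlined in the abstract (word depth in the dependency tree, a continuous weight per word, weighted word vectors, convolutional feature extraction) can be illustrated with a minimal NumPy sketch. The abstract does not specify the form of the dependency layer, so the sketch assumes a learnable scalar per depth squashed by a sigmoid; the function names, the head-index encoding of the tree, and the ReLU with max-over-time pooling are likewise illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def word_depths(heads):
    """Depth of each word in a dependency tree.
    heads[i] is the index of word i's parent, or -1 for the root (assumed encoding)."""
    depths = []
    for i in range(len(heads)):
        d, j = 0, i
        while heads[j] != -1:  # walk up to the root, counting edges
            j = heads[j]
            d += 1
        depths.append(d)
    return depths

def dependency_weights(depths, depth_table):
    """Assumed dependency layer: look up one scalar per depth and apply a sigmoid.
    Depths beyond the table are clipped to the last entry."""
    idx = np.minimum(np.asarray(depths), len(depth_table) - 1)
    return 1.0 / (1.0 + np.exp(-depth_table[idx]))

def conv_max_pool(x, kernels):
    """Convolution over word positions, ReLU, then max-over-time pooling.
    x: (n_words, dim); kernels: (n_filters, width, dim); needs n_words >= width."""
    width = kernels.shape[1]
    windows = np.stack([x[i:i + width] for i in range(x.shape[0] - width + 1)])
    feats = np.einsum("pwd,fwd->pf", windows, kernels)  # (positions, n_filters)
    return np.maximum(feats, 0.0).max(axis=0)

def encode(embeddings, heads, depth_table, kernels):
    """Weight each word vector by its depth-derived weight, then extract features."""
    w = dependency_weights(word_depths(heads), depth_table)
    return conv_max_pool(embeddings * w[:, None], kernels)

# Toy usage: a hand-written 4-word tree (not parser output) and random parameters.
rng = np.random.default_rng(0)
toy_heads = [1, -1, 1, 2]
sentence_vec = encode(rng.normal(size=(4, 5)), toy_heads,
                      np.zeros(3), rng.normal(size=(6, 2, 5)))
```

In a trained model the entries of `depth_table` and `kernels` would be learned jointly with the task-specific output layer; here they are random or zero placeholders, and `sentence_vec` is simply a fixed-length vector with one entry per filter.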

Footnotes
5. http://nlp.stanford.edu/sentiment/ The data are provided with phrase-level annotation; however, to enable a direct comparison of learned sentence representations, we do not use the phrase-level annotation here.
11. More details: (1) for CR, the dataset used by Kim [18] is a subset of ours; (2) for SST2/SST5, Kim [18] and other state-of-the-art methods obtained higher performance by using phrase-level annotation, which is outside our scope.

References
1. Ferrucci DA (2012) Introduction to “This is Watson”. IBM J Res Dev 56(3/4):1:1–1:15
2. Lally A, Prager JM, McCord MC et al (2012) Question analysis: how Watson reads a clue. IBM J Res Dev 56(3/4):2:1–2:14
3. Chu-Carroll J, Fan J, Schlaefer N, Zadrozny W (2012) Textual resource acquisition and engineering. IBM J Res Dev 56(3/4):3:1–3:11
4. Loni B (2011) A survey of state-of-the-art methods on question classification. Delft University of Technology, Tech. Rep: 1–40
5. Li X, Roth D (2002) Learning question classifiers. In: Proceedings of ACL, pp 1–7
6. Wen X, Zhang Y, Liu T et al (2006) Syntactic structure parsing based Chinese question classification. J Chin Inf Process 20(2):33–39
7. Boot C, Meijman FJ (2010) Classifying health questions asked by the public using the ICPC-2 classification and a taxonomy of generic clinical questions: an empirical exploration of the feasibility. Health Commun 25(2):175–181
8. Ely JW, Osheroff JA, Gorman PN et al (2000) A taxonomy of generic clinical questions: classification study. Br Med J 321(7258):429–432
9. Bengio Y, Ducharme R, Vincent P, Jauvin C (2003) A neural probabilistic language model. J Mach Learn Res 3:1137–1155
10. Mikolov T, Karafiat M, Burget L et al (2010) Recurrent neural network based language model. In: Proceedings of Interspeech, pp 1045–1048
11. Mikolov T, Chen K, Corrado GS et al (2013) Efficient estimation of word representations in vector space. In: Proceedings of ICLR
12. Kalchbrenner N, Grefenstette E, Blunsom P (2014) A convolutional neural network for modelling sentences. In: Proceedings of ACL, pp 655–665
13. Socher R, Lin C, Manning CD et al (2011) Parsing natural scenes and natural language with recursive neural networks. In: Proceedings of ICML, pp 129–136
14. Socher R, Huval B, Manning CD et al (2012) Semantic compositionality through recursive matrix-vector spaces. In: Proceedings of EMNLP, pp 1201–1211
15. Socher R, Perelygin A, Wu JY et al (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of EMNLP, pp 1631–1642
16. Tai KS, Socher R, Manning CD (2015) Improved semantic representations from tree-structured long short-term memory networks. In: Proceedings of ACL, pp 1556–1566
17. Li X, Roth D (2004) Learning question classifiers: the role of semantic information. In: Proceedings of COLING, pp 556–562
18. Kim Y (2014) Convolutional neural networks for sentence classification. In: Proceedings of EMNLP, pp 1746–1751
19. Yih W, He X, Meek C (2014) Semantic parsing for single-relation question answering. In: Proceedings of ACL, pp 643–648
20. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
21. Silva J, Coheur L, Mendes A, Wichert A (2011) From symbolic to sub-symbolic information in question classification. Artif Intell Rev 35(2):137–154
22. Hu B, Lu Z, Li H et al (2014) Convolutional neural network architectures for matching natural language sentences. In: International conference on NIPS, pp 2042–2050
23. Severyn A, Moschitti A (2015) Learning to rank short text pairs with convolutional deep neural networks. In: Proceedings of SIGIR, pp 373–382
24. Zhang D, Lee WS (2003) Question classification using support vector machines. In: Proceedings of SIGIR, pp 26–32
25. Lees RB, Chomsky N (1957) Syntactic structures. Language 33(3 Part 1):375–408
26. Farabet C, Couprie C, Najman L et al (2013) Learning hierarchical features for scene labeling. IEEE Trans Pattern Anal Mach Intell 35(8):1915–1929
27. Echihabi A, Marcu D (2003) A noisy-channel approach to question answering. In: Proceedings of ACL, pp 16–23
28. Bordes A, Weston J, Usunier N (2014) Open question answering with weakly supervised embedding models. In: Joint European conference on machine learning and knowledge discovery in databases, pp 165–180
29. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: International conference on NIPS, pp 1097–1105
30. Collobert R, Weston J, Bottou L et al (2011) Natural language processing (almost) from scratch. J Mach Learn Res 12(1):2493–2537
31. Srivastava N, Hinton GE, Krizhevsky A et al (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958
32. Hu M, Liu B (2004) Mining opinion features in customer reviews. In: Proceedings of AAAI, pp 755–760
33. Ding X, Liu B, Yu PS (2008) A holistic lexicon-based approach to opinion mining. In: Proceedings of international conference on web search and data mining. ACM, pp 231–240
34. Liu Q, Gao Z, Liu B et al (2015) Automated rule selection for aspect extraction in opinion mining. In: Proceedings of IJCAI, pp 1291–1297
35. Wiebe J, Wilson T, Cardie C (2005) Annotating expressions of opinions and emotions in language. Lang Resour Eval 39(2–3):165–210
36. Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: Proceedings of ICLR, pp 1–10
37. Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: Proceedings of ICML, pp 1188–1196
38. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
39. Cho K, van Merrienboer B, Gulcehre C et al (2014) Learning phrase representations using RNN encoder–decoder for statistical machine translation. In: Proceedings of EMNLP
40. Pennington J, Socher R, Manning C (2014) GloVe: global vectors for word representation. In: Proceedings of EMNLP, pp 1532–1543
41. Manning CD, Surdeanu M, Bauer J et al (2014) The Stanford CoreNLP natural language processing toolkit. In: Meeting of the Association for Computational Linguistics: System Demonstrations
Metadata
Title: Improving short-text representation in convolutional networks by dependency parsing
Authors: Siheng Zhang, Wensheng Zhang, Jinghao Niu
Publication date: 11 December 2018
Publisher: Springer London
Published in: Knowledge and Information Systems, Issue 1/2019
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI: https://doi.org/10.1007/s10115-018-1312-9
