2016 | OriginalPaper | Book chapter

Multilevel Syntactic Parsing Based on Recursive Restricted Boltzmann Machines and Learning to Rank

Authors: Jungang Xu, Hong Chen, Shilong Zhou, Ben He

Published in: Intelligence and Security Informatics

Publisher: Springer International Publishing

Abstract

Syntactic parsing is one of the central tasks in Natural Language Processing. This paper proposes a multilevel syntactic parsing algorithm: a three-level model that combines existing, mature tools and algorithms in a new way. First, coarse-grained syntax trees are generated with a general-purpose algorithm, such as the Cocke-Younger-Kasami (CYK) algorithm based on a Probabilistic Context-Free Grammar (PCFG). Second, Recursive Restricted Boltzmann Machines (RRBM) are constructed to extract feature vectors from the syntax trees using deep learning methods. Finally, a Learning to Rank (LTR) model is trained to select the most satisfactory syntax tree, which turns the parsing problem into a typical retrieval problem. Experimental results show that the method achieves state-of-the-art performance on the syntactic parsing task.
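
To make the first level concrete, the following is a minimal sketch of probabilistic CYK parsing over a PCFG in Chomsky normal form. It is an illustration only, not the authors' implementation: the toy grammar, the lexicon, and the function name `cyk_best_parse` are all assumptions introduced here. The sketch also returns only the single most probable tree, whereas the pipeline described in the abstract would need several candidate trees for the later levels to score and rank.

```python
from collections import defaultdict

# Toy PCFG in Chomsky normal form (hypothetical, for illustration only).
# (left child, right child) -> [(parent, rule probability)]
BINARY = {
    ("NP", "VP"): [("S", 1.0)],
    ("V", "NP"): [("VP", 1.0)],
    ("Det", "N"): [("NP", 0.6)],
}
# word -> [(label, rule probability)]
LEXICON = {
    "dogs": [("NP", 0.4)],
    "chase": [("V", 1.0)],
    "the": [("Det", 1.0)],
    "cat": [("N", 1.0)],
}


def cyk_best_parse(words, root="S"):
    """Return (probability, tree) of the best parse, or None if there is none."""
    n = len(words)
    # chart[(i, j)][label] = (probability, tree) of the best analysis of words[i:j]
    chart = defaultdict(dict)

    # Fill in spans of length 1 from the lexicon.
    for i, word in enumerate(words):
        for label, p in LEXICON.get(word, []):
            chart[(i, i + 1)][label] = (p, (label, word))

    # Combine adjacent spans bottom-up, keeping the best tree per label.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for left, (p_left, t_left) in chart[(i, k)].items():
                    for right, (p_right, t_right) in chart[(k, j)].items():
                        for parent, p_rule in BINARY.get((left, right), []):
                            prob = p_rule * p_left * p_right
                            best = chart[(i, j)].get(parent, (0.0, None))[0]
                            if prob > best:
                                chart[(i, j)][parent] = (prob, (parent, t_left, t_right))

    return chart[(0, n)].get(root)


if __name__ == "__main__":
    print(cyk_best_parse("dogs chase the cat".split()))
```

The chart is filled in order of increasing span length, giving the cubic running time in sentence length for which CYK is known. Keeping the k best entries per cell instead of one would be one way to produce the multiple candidate trees that a ranking stage needs.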

Metadata
Title
Multilevel Syntactic Parsing Based on Recursive Restricted Boltzmann Machines and Learning to Rank
Authors
Jungang Xu
Hong Chen
Shilong Zhou
Ben He
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-31863-9_4