2016 | OriginalPaper | Chapter

Large-Scale Kernel-Based Language Learning Through the Ensemble Nyström Methods

Authors: Danilo Croce, Roberto Basili

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing

Abstract

Kernel methods have been used by many Machine Learning paradigms, achieving state-of-the-art performance in many Language Learning tasks. One drawback of expressive kernel functions, such as Sequence or Tree kernels, is the time and space complexity they require in both learning and classification. In this paper, the Nyström methodology is studied as a viable solution to these scalability issues. By mapping data into low-dimensional spaces that approximate the kernel space, the proposed methodology improves scalability through compact linear representations of highly structured data. Computation can also be distributed across several machines by adopting the so-called Ensemble Nyström Method. Experimental results show that accuracy comparable with state-of-the-art kernel-based methods can be obtained while reducing the required operations by orders of magnitude, enabling the adoption of datasets containing more than one million examples.
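
The chapter's own implementation is not shown on this page. As a minimal NumPy sketch of the idea the abstract describes: sample m landmark examples, evaluate the kernel against them, and project through the pseudo-inverse square root of the landmark Gram matrix, so that inner products of the resulting dense vectors approximate the kernel; several such projections, one per landmark sample, can be combined with uniform weights in the spirit of the Ensemble Nyström Method [17]. The RBF kernel, the random data, and all names here are illustrative assumptions, not the authors' code.

import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    # Gram matrix K[i, j] = exp(-gamma * ||A[i] - B[j]||^2); a stand-in for any
    # valid kernel (e.g. a precomputed Tree-kernel Gram block).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def nystrom_features(X, landmarks, kernel=rbf_kernel, eps=1e-12):
    W = kernel(landmarks, landmarks)       # m x m Gram matrix of the landmarks
    C = kernel(X, landmarks)               # n x m kernel evaluations
    # Project through W^{-1/2} so that Phi @ Phi.T approximates the full Gram
    # matrix: the compact linear representation mentioned in the abstract.
    vals, vecs = np.linalg.eigh(W)
    keep = vals > eps                      # drop near-zero eigenvalues
    W_inv_sqrt = (vecs[:, keep] / np.sqrt(vals[keep])) @ vecs[:, keep].T
    return C @ W_inv_sqrt                  # n x m dense features (rank <= m)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))            # placeholder data
# A uniform-weight ensemble in the spirit of [17]: p landmark samples, each
# projection computable on a separate machine, then combined. Scaling the
# concatenation by 1/sqrt(p) makes Phi @ Phi.T the average of the p
# individual Nystrom approximations.
p, m = 5, 100
views = [nystrom_features(X, X[rng.choice(len(X), m, replace=False)])
         for _ in range(p)]
Phi = np.hstack(views) / np.sqrt(p)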

Footnotes
1
We are referring to the PA-II version in [5].
 
2
In this work we will consider the hinge loss \(H(\boldsymbol{w}; (\tilde{\boldsymbol{x}}_t, y_t)) = \max(0, 1 - y_t \boldsymbol{w}^\top \tilde{\boldsymbol{x}}_t)\).
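For context, the PA-II version of [5] referred to in footnote 1 admits a closed-form update under this loss. A minimal sketch, with illustrative NumPy names rather than the paper's code:

import numpy as np

def pa2_step(w, x, y, C=1.0):
    # Hinge loss H(w; (x, y)) = max(0, 1 - y * w.x), as defined above.
    loss = max(0.0, 1.0 - y * (w @ x))
    # PA-II closed-form step size from [5]; C bounds update aggressiveness.
    tau = loss / (x @ x + 1.0 / (2.0 * C))
    return w + tau * y * x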
 
5
C-SVM [3] adopts a caching policy, which is ignored here for comparative purposes: in large-scale applications, such a cache may impose prohibitive space requirements.
 
6
Only sentences whose lexical unit corresponds to a verb are used in our tests.
 
Literature
1. Baker, C.F., Fillmore, C.J., Lowe, J.B.: The Berkeley FrameNet project. In: Proceedings of COLING-ACL, Montreal, Canada (1998)
2. Cancedda, N., Gaussier, É., Goutte, C., Renders, J.M.: Word-sequence kernels. J. Mach. Learn. Res. 3, 1059–1082 (2003)
3. Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 27:1–27:27 (2011)
4. Collins, M., Duffy, N.: Convolution kernels for natural language. In: Proceedings of Neural Information Processing Systems (NIPS 2001), pp. 625–632 (2001)
5. Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., Singer, Y.: Online passive-aggressive algorithms. J. Mach. Learn. Res. 7, 551–585 (2006)
6. Croce, D., Moschitti, A., Basili, R.: Structured lexical similarity via convolution kernels on dependency trees. In: Proceedings of EMNLP (2011)
7. Culotta, A., Sorensen, J.: Dependency tree kernels for relation extraction. In: Proceedings of ACL 2004, Stroudsburg, PA, USA (2004)
8. Dredze, M., Crammer, K., Pereira, F.: Confidence-weighted linear classification. In: Proceedings of ICML 2008. ACM, New York (2008)
9. Drineas, P., Mahoney, M.W.: On the Nyström method for approximating a Gram matrix for improved kernel-based learning. J. Mach. Learn. Res. 6, 2153–2175 (2005)
10. Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: LIBLINEAR: a library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008)
11. Filice, S., Castellucci, G., Croce, D., Basili, R.: Effective kernelized online learning in language processing tasks. In: de Rijke, M., Kenter, T., de Vries, A.P., Zhai, C.X., de Jong, F., Radinsky, K., Hofmann, K. (eds.) ECIR 2014. LNCS, vol. 8416, pp. 347–358. Springer, Heidelberg (2014)
12. Filice, S., Castellucci, G., Croce, D., Basili, R.: KeLP: a kernel-based learning platform for natural language processing. In: Proceedings of ACL: System Demonstrations, Beijing, China, July 2015
13. Filice, S., Croce, D., Basili, R.: A stratified strategy for efficient kernel-based learning. In: AAAI Conference on Artificial Intelligence (2015)
14. Filice, S., Croce, D., Basili, R., Zanzotto, F.M.: Linear online learning over structured data with distributed tree kernels. In: Proceedings of ICMLA 2013 (2013)
15. Hsieh, C.J., Chang, K.W., Lin, C.J., Keerthi, S.S., Sundararajan, S.: A dual coordinate descent method for large-scale linear SVM. In: Proceedings of ICML 2008, pp. 408–415. ACM, New York (2008)
16. Joachims, T., Finley, T., Yu, C.N.: Cutting-plane training of structural SVMs. Mach. Learn. 77(1), 27–59 (2009)
17. Kumar, S., Mohri, M., Talwalkar, A.: Ensemble Nyström method. In: NIPS, pp. 1060–1068. Curran Associates, Inc. (2009)
18. Kumar, S., Mohri, M., Talwalkar, A.: Sampling methods for the Nyström method. J. Mach. Learn. Res. 13, 981–1006 (2012)
19. Li, X., Roth, D.: Learning question classifiers: the role of semantic information. Nat. Lang. Eng. 12(3), 229–249 (2006)
20. Moschitti, A., Pighin, D., Basili, R.: Tree kernels for semantic role labeling. Comput. Linguist. 34, 193–224 (2008)
21. Rifkin, R., Klautau, A.: In defense of one-vs-all classification. J. Mach. Learn. Res. 5, 101–141 (2004)
22. Shalev-Shwartz, S., Singer, Y., Srebro, N.: Pegasos: primal estimated sub-gradient solver for SVM. In: Proceedings of ICML. ACM, New York (2007)
23. Shalev-Shwartz, S., Zhang, T.: Stochastic dual coordinate ascent methods for regularized loss. J. Mach. Learn. Res. 14(1), 567–599 (2013)
24. Shawe-Taylor, J., Cristianini, N.: Kernel Methods for Pattern Analysis. Cambridge University Press, New York (2004)
25. Vapnik, V.N.: Statistical Learning Theory. Wiley-Interscience, New York (1998)
26. Vedaldi, A., Zisserman, A.: Efficient additive kernels via explicit feature maps. IEEE Trans. Pattern Anal. Mach. Intell. 34(3), 480–492 (2012)
27. Wang, J., Zhao, P., Hoi, S.C.: Exact soft confidence-weighted learning. In: Proceedings of ICML 2012. ACM, New York (2012)
28. Wang, Z., Vucetic, S.: Online passive-aggressive algorithms on a budget. J. Mach. Learn. Res. Proc. Track 9, 908–915 (2010)
29. Williams, C.K.I., Seeger, M.: Using the Nyström method to speed up kernel machines. In: Proceedings of NIPS 2000 (2001)
30. Zanzotto, F.M., Dell'Arciprete, L.: Distributed tree kernels. In: Proceedings of ICML 2012 (2012)
31. Zhang, D., Lee, W.S.: Question classification using support vector machines. In: Proceedings of SIGIR 2003, pp. 26–32. ACM, New York (2003)
Metadata
Title
Large-Scale Kernel-Based Language Learning Through the Ensemble Nyström Methods
Authors
Danilo Croce
Roberto Basili
Copyright Year
2016
Publisher
Springer International Publishing
DOI
https://doi.org/10.1007/978-3-319-30671-1_8