nach oben

Erschienen in:

2019 | OriginalPaper | Buchkapitel

Selective Training: A Strategy for Fast Backpropagation on Sentence Embeddings

verfasst von : Jan Neerbek, Peter Dolog, Ira Assent

Erschienen in: Advances in Knowledge Discovery and Data Mining

Verlag: Springer International Publishing

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Representation or embedding based machine learning models, such as language models or convolutional neural networks have shown great potential for improved performance. However, for complex models on large datasets training time can be extensive, approaching weeks, which is often infeasible in practice. In this work, we present a method to reduce training time substantially by selecting training instances that provide relevant information for training. Selection is based on the similarity of the learned representations over input instances, thus allowing for learning a non-trivial weighting scheme from multi-dimensional representations. We demonstrate the efficiency and effectivity of our approach in several text classification tasks using recursive neural networks. Our experiments show that by removing approximately one fifth of the training data the objective function converges up to six times faster without sacrificing accuracy.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel Ranking Network Embedding via Adversarial Learning

Nächstes Kapitel Extracting Keyphrases from Research Papers Using Word Embeddings

code https://bitbucket.alexandra.dk/projects/TAB, data https://dataverse.harvard.edu/dataverse/enron-w-trees.

In this work we refer to recursive neural networks as RecNN to avoid name clash with RNNs.

http://www.fasb.org/jsp/FASB/Document_C/DocumentPage?cid=1218220124871.

Bengio, Y., Ducharme, R., Vincent, P., Jauvin, C.: A neural probabilistic language model. JMLR 3, 1137–1155 (2003)MATH

Coates, A., Ng, A.Y.: Learning feature representations with K-means. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade. LNCS, vol. 7700, pp. 561–580. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_30CrossRef

Collobert, R., Weston, J.: A unified architecture for natural language processing: deep neural networks with multitask learning. In: ICML, pp. 160–167 (2008)

Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P.: Natural language processing (almost) from scratch. JMLR 12, 2493–2537 (2011)MATH

Conneau, A., Kiela, D., Schwenk, H., Barrault, L., Bordes, A.: Supervised learning of universal sentence representations from natural language inference data. In: EMNLP, pp. 670–680 (2017)

Cormack, G.V., Grossman, M.R., Hedin, B., Oard, D.W.: Overview of the TREC 2010 legal track. In: TREC (2010)

Forgy, E.W.: Cluster analysis of multivariate data: efficiency versus interpretability of classification. Biometrics 21(3), 768–769 (1965)

Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: ICML, pp. 148–156 (1996)

Goller, C., Kuchler, A.: Learning task-dependent distributed representations by backpropagation through structure. In: IEEE ICNN, pp. 347–352 (1996)

10.

Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)MATH

11.

Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)CrossRef

12.

İrsoy, O., Cardie, C.: Deep recursive neural networks for compositionality in language. In: NIPS, pp. 2096–2104 (2014)

13.

Kim, Y.: Convolutional neural networks for sentence classification. In: EMNLP, pp. 1746–1751 (2014)

14.

Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)

15.

Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: ICLR (2017)

16.

Kiss, T., Strunk, J.: Unsupervised multilingual sentence boundary detection. Comput. Linguist. 32(4), 485–525 (2006)CrossRef

17.

Klein, D., Manning, C.D.: Accurate unlexicalized parsing. In: ACL, pp. 423–430 (2003)

18.

Klimt, B., Yang, Y.: The enron corpus: a new dataset for email classification research. In: Boulicaut, J.-F., Esposito, F., Giannotti, F., Pedreschi, D. (eds.) ECML 2004. LNCS (LNAI), vol. 3201, pp. 217–226. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30115-8_22CrossRef

19.

Kotsiantis, S.B.: Bagging and boosting variants for handling classifications problems: a survey. Knowl. Eng. Rev. 29(1), 78–100 (2014)CrossRef

20.

Le, Q.V., Mikolov, T.: Distributed representations of sentences and documents. In: ICML, pp. 1188–1196 (2014)

21.

Lloyd, S.: Least squares quantization in PCM. IEEE TIT 28(2), 129–137 (1982)MathSciNetMATH

22.

Loshchilov, I., Hutter, F.: Online batch selection for faster training of neural networks. In: ICLR Workshop (2016)

23.

Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: NIPS, pp. 3111–3119 (2013)

24.

Neerbek, J., Assent, I., Dolog, P.: Detecting complex sensitive information via phrase structure in recursive neural networks. In: Phung, D., Tseng, V.S., Webb, G.I., Ho, B., Ganji, M., Rashidi, L. (eds.) PAKDD 2018. LNCS (LNAI), vol. 10939, pp. 373–385. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93040-4_30CrossRef

25.

Olvera-López, J.A., Carrasco-Ochoa, J.A., Martínez-Trinidad, J.F., Kittler, J.: A review of instance selection methods. AI Rev. 34(2), 133–143 (2010)

26.

Pennington, J., Socher, R., Manning, C.D.: Glove: global vectors for word representation. In: EMNLP, pp. 1532–1543 (2014)

27.

Rush, A.M., Chopra, S., Weston, J.: A neural attention model for abstractive sentence summarization. In: EMNLP (2015)

28.

Settles, B.: Active learning. Synth. Lect. Artif. Intell. Mach. Learn. 6(1), 1–114 (2012)MathSciNetCrossRefMATH

29.

Socher, R., Huang, E.H., Pennin, J., Manning, C.D., Ng, A.Y.: Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In: NIPS, pp. 801–809 (2011)

30.

Socher, R., Manning, C.D., Ng, A.Y.: Learning continuous phrase representations and syntactic parsing with recursive neural networks. In: NIPS Deep Learning and Unsupervised Feature Learning Workshop, pp. 1–9 (2010)

31.

Socher, R., et al.: Recursive deep models for semantic compositionality over a sentiment treebank. In: EMNLP, pp. 1631–1642 (2013)

32.

Taylor, A., Marcus, M., Santorini, B.: The penn treebank: an overview. In: Abeillé, A. (ed.) Treebanks. TLTB, pp. 5–22. Springer, Heidelberg (2003). https://doi.org/10.1007/978-94-010-0201-1_1CrossRef

33.

Tomlinson, S.: Learning task experiments in the TREC 2010 legal track. In: TREC (2010)

34.

Zhao, Z., Liu, T., Li, B., Du, X.: Cluster-driven model for improved word and text embedding. In: ECAI, pp. 99–106 (2016)

Titel: Selective Training: A Strategy for Fast Backpropagation on Sentence Embeddings
verfasst von: Jan Neerbek
Peter Dolog
Ira Assent
Verlag: Springer International Publishing
Buch: Advances in Knowledge Discovery and Data Mining
Print ISBN: 978-3-030-16141-5

Electronic ISBN: 978-3-030-16142-2

Copyright-Jahr: 2019
DOI: https://doi.org/10.1007/978-3-030-16142-2_4

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"