
2016 | Original Paper | Book Chapter

Multitask Learning for Text Classification with Deep Neural Networks

Authors: Hossein Ghodrati Noushahr, Samad Ahmadi

Published in: Research and Development in Intelligent Systems XXXIII

Publisher: Springer International Publishing


Abstract

Multitask learning, the concept of solving multiple related tasks in parallel, promises better generalization than the traditional divide-and-conquer approach in machine learning: the training signals of the related tasks induce an inductive bias that helps the learner find better hypotheses. This paper reviews the concept of multitask learning and prior work on it, and presents an experimental evaluation on a large-scale text classification problem. A deep neural network is trained to classify English newswire stories by their overlapping topics in parallel, and the results are compared with the traditional approach of training a separate deep neural network for each topic. The results confirm the initial hypothesis that multitask learning improves generalization.
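
The setup the abstract describes lends itself to a short illustration: one network whose shared hidden layer feeds a separate sigmoid output per topic, so that every topic's training signal shapes the shared representation. The following sketch is an illustrative assumption, not the authors' architecture; the framework (Keras), layer sizes, dropout rate, and topic count are placeholders.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_features = 10_000  # e.g. bag-of-words vocabulary size (assumed)
n_topics = 4         # number of parallel topic tasks (assumed)

# Shared representation: every task's gradient flows through this layer.
inputs = keras.Input(shape=(n_features,))
shared = layers.Dense(512, activation="relu")(inputs)
shared = layers.Dropout(0.5)(shared)

# One independent sigmoid head per (possibly overlapping) topic.
outputs = [
    layers.Dense(1, activation="sigmoid", name=f"topic_{i}")(shared)
    for i in range(n_topics)
]

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")

# Dummy data, only to show the joint training call.
X = np.random.rand(32, n_features).astype("float32")
y = [np.random.randint(0, 2, (32, 1)).astype("float32") for _ in range(n_topics)]
model.fit(X, y, epochs=1, verbose=0)

The single-task baseline the paper compares against would instead train n_topics copies of this network, each with a single output head, so no task can benefit from the others' training signal.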

Metadata
Title
Multitask Learning for Text Classification with Deep Neural Networks
Authors
Hossein Ghodrati Noushahr
Samad Ahmadi
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-47175-4_8