Published in: Memetic Computing 4/2020

19.10.2020 | Regular Research Paper

Multi-task gradient descent for multi-task learning

Authors: Lu Bai, Yew-Soon Ong, Tiantian He, Abhishek Gupta


Abstract

Multi-Task Learning (MTL) aims to solve a group of related learning tasks simultaneously by leveraging the salutary knowledge memes shared among the tasks to improve generalization performance. Many prevalent approaches focus on designing a sophisticated cost function that integrates all the learning tasks and explores task-task relationships in a predefined manner. In contrast to these approaches, this paper proposes a novel Multi-task Gradient Descent (MGD) framework that improves the generalization performance of multiple tasks through knowledge transfer. The uniqueness of MGD lies in assuming individual task-specific learning objectives at the start, with the cost functions changing implicitly during parameter optimization based on task-task relationships. Specifically, MGD optimizes the individual cost function of each task using a reformative gradient descent iteration, in which relations to other tasks are exploited by transferring parameter values (serving as the computational representations of memes) between tasks. Theoretical analysis shows that the proposed framework is convergent under any appropriate transfer mechanism. Compared with existing MTL approaches, MGD provides a novel, easy-to-implement framework that can mitigate negative transfer during learning through asymmetric transfer. MGD has been compared with both classical and state-of-the-art approaches on multiple MTL datasets, and the competitive experimental results validate the effectiveness of the proposed algorithm.
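The coupled iteration the abstract describes — each task takes a gradient step on its own cost, then parameter values are transferred between related tasks — can be sketched as follows. The abstract does not give the paper's exact update rule, so everything concrete here is an illustrative assumption: the quadratic per-task losses, the fixed row-stochastic transfer matrix `M`, and the step-then-mix ordering.

```python
import numpy as np

def mgd(grads, x0, M, lr=0.1, iters=100):
    """A minimal sketch of a multi-task gradient descent loop.

    grads: list of per-task gradient functions, one per task
    x0:    (T, d) array of initial parameters, one row per task
    M:     (T, T) row-stochastic transfer matrix (illustrative assumption;
           M[t, s] is how strongly task t absorbs task s's parameters)
    Returns the final (T, d) parameters.
    """
    x = x0.copy()
    T = len(grads)
    for _ in range(iters):
        # Each task independently steps on its own cost function...
        stepped = np.stack([x[t] - lr * grads[t](x[t]) for t in range(T)])
        # ...then parameter values (memes) are mixed across tasks.
        x = M @ stepped
    return x

# Two related quadratic tasks with nearby optima (toy data).
targets = np.array([[1.0, 0.0], [1.2, 0.1]])
grads = [lambda w, c=c: w - c for c in targets]   # grad of 0.5*||w - c||^2
M = np.array([[0.9, 0.1],                          # mostly self-reliant,
              [0.1, 0.9]])                         # with mild transfer
x = mgd(grads, np.zeros((2, 2)), M)
```

With this mixing, each task's parameters settle near its own optimum, pulled slightly toward the related task's; an asymmetric `M` (non-symmetric rows) would let a weak task borrow from a strong one without the reverse, which is one way negative transfer can be limited.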


Metadata
Title
Multi-task gradient descent for multi-task learning
Authors
Lu Bai
Yew-Soon Ong
Tiantian He
Abhishek Gupta
Publication date
19.10.2020
Publisher
Springer Berlin Heidelberg
Published in
Memetic Computing / Issue 4/2020
Print ISSN: 1865-9284
Electronic ISSN: 1865-9292
DOI
https://doi.org/10.1007/s12293-020-00316-3
