Published in: Memetic Computing 4/2020

19.10.2020 | Regular Research Paper

Multi-task gradient descent for multi-task learning

Authors: Lu Bai, Yew-Soon Ong, Tiantian He, Abhishek Gupta


Abstract

Multi-Task Learning (MTL) aims to solve a group of related learning tasks simultaneously by leveraging the salutary knowledge memes shared among the tasks to improve generalization performance. Many prevalent approaches focus on designing a sophisticated cost function that integrates all the learning tasks and explores task-task relationships in a predefined manner. In contrast to these approaches, this paper proposes a novel Multi-task Gradient Descent (MGD) framework that improves the generalization performance of multiple tasks through knowledge transfer. The uniqueness of MGD lies in assuming individual task-specific learning objectives at the start, with the cost functions changing implicitly during parameter optimization based on task-task relationships. Specifically, MGD optimizes the individual cost function of each task using a reformative gradient descent iteration, in which relations to other tasks are exploited by transferring parameter values (serving as the computational representations of memes) between tasks. Theoretical analysis shows that the proposed framework is convergent under any appropriate transfer mechanism. Compared with existing MTL approaches, MGD provides a novel, easy-to-implement framework that can mitigate negative transfer during learning through asymmetric transfer. MGD has been compared with both classical and state-of-the-art approaches on multiple MTL datasets, and the competitive experimental results validate the effectiveness of the proposed algorithm.
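The coupled iteration the abstract describes — each task takes a gradient step on its own cost, then parameter values are transferred between related tasks — can be sketched as follows. The abstract does not give the paper's exact update rule, so everything concrete here is an illustrative assumption: the quadratic per-task losses, the fixed row-stochastic transfer matrix `M`, and the step-then-mix ordering.

```python
import numpy as np

def mgd(grads, x0, M, lr=0.1, iters=100):
    """A minimal sketch of a multi-task gradient descent loop.

    grads: list of per-task gradient functions, one per task
    x0:    (T, d) array of initial parameters, one row per task
    M:     (T, T) row-stochastic transfer matrix (illustrative assumption;
           M[t, s] is how strongly task t absorbs task s's parameters)
    Returns the final (T, d) parameters.
    """
    x = x0.copy()
    T = len(grads)
    for _ in range(iters):
        # Each task independently steps on its own cost function...
        stepped = np.stack([x[t] - lr * grads[t](x[t]) for t in range(T)])
        # ...then parameter values (memes) are mixed across tasks.
        x = M @ stepped
    return x

# Two related quadratic tasks with nearby optima (toy data).
targets = np.array([[1.0, 0.0], [1.2, 0.1]])
grads = [lambda w, c=c: w - c for c in targets]   # grad of 0.5*||w - c||^2
M = np.array([[0.9, 0.1],                          # mostly self-reliant,
              [0.1, 0.9]])                         # with mild transfer
x = mgd(grads, np.zeros((2, 2)), M)
```

With this mixing, each task's parameters settle near its own optimum, pulled slightly toward the related task's; an asymmetric `M` (non-symmetric rows) would let a weak task borrow from a strong one without the reverse, which is one way negative transfer can be limited.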


Metadata
Title
Multi-task gradient descent for multi-task learning
Authors
Lu Bai
Yew-Soon Ong
Tiantian He
Abhishek Gupta
Publication date
19.10.2020
Publisher
Springer Berlin Heidelberg
Published in
Memetic Computing / Issue 4/2020
Print ISSN: 1865-9284
Electronic ISSN: 1865-9292
DOI
https://doi.org/10.1007/s12293-020-00316-3
