Published in: Knowledge and Information Systems 2/2016

01.02.2016 | Regular Paper

Multiple task transfer learning with small sample sizes

By: Budhaditya Saha, Sunil Gupta, Dinh Phung, Svetha Venkatesh


Abstract

Prognosis, such as predicting mortality, is common in medicine. When confronted with small numbers of samples, as in rare medical conditions, the task is challenging. We propose a framework for classifying data with small sample sizes. Conceptually, our solution is a hybrid of multi-task and transfer learning: it employs data samples from source tasks as in transfer learning, but considers all tasks together as in multi-task learning. Each task is modelled jointly with other related tasks by directly augmenting its data with data from those tasks. The degree of augmentation depends on the task relatedness and is estimated directly from the data. We apply the model to three diverse real-world data sets (healthcare data, handwritten digit data and face data) and show that our method outperforms several state-of-the-art multi-task learning baselines. We also extend the model to online multi-task learning, where the model parameters are updated incrementally as new data or new tasks arrive. The novelty of our method lies in offering a hybrid multi-task/transfer learning model that exploits sharing across tasks at the data level together with joint parameter learning.
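The abstract describes augmenting each task's training data with relatedness-weighted samples from other tasks. The paper's actual model and relatedness estimator are not reproduced here; the following is a minimal illustrative sketch of the general idea, assuming logistic regression base learners and a cosine-similarity stand-in for task relatedness (both are assumptions for illustration, not the authors' method).

```python
import numpy as np

def task_relatedness(w_t, w_s):
    """Stand-in relatedness estimate: non-negative cosine similarity
    between single-task weight vectors (an assumption, not the paper's)."""
    cos = float(w_t @ w_s) / (np.linalg.norm(w_t) * np.linalg.norm(w_s) + 1e-12)
    return max(0.0, cos)

def fit_logistic(X, y, sample_weight=None, lam=0.1, iters=500, lr=0.1):
    """L2-regularised logistic regression by gradient descent,
    with optional per-sample weights."""
    n, d = X.shape
    sw = np.ones(n) if sample_weight is None else sample_weight
    w = np.zeros(d)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))          # sigmoid predictions
        grad = X.T @ (sw * (p - y)) / sw.sum() + lam * w
        w -= lr * grad
    return w

def fit_augmented(tasks, t, lam=0.1):
    """Fit task t on its own samples plus samples from every other task,
    each source sample down-weighted by the estimated task relatedness."""
    singles = [fit_logistic(X, y, lam=lam) for X, y in tasks]
    Xs, ys, ws = [], [], []
    for s, (X, y) in enumerate(tasks):
        rho = 1.0 if s == t else task_relatedness(singles[t], singles[s])
        Xs.append(X)
        ys.append(y)
        ws.append(np.full(len(y), rho))
    X_all = np.vstack(Xs)
    y_all = np.concatenate(ys)
    w_all = np.concatenate(ws)
    return fit_logistic(X_all, y_all, sample_weight=w_all, lam=lam)
```

With two related tasks of 15 samples each, the augmented fit for the target task effectively trains on up to 30 weighted samples, which is the data-level sharing the abstract refers to; unrelated tasks receive weight near zero and contribute little.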


Footnotes
1. A detailed description of the optimization methods can be found in [25].
2. Ethics approval was obtained through the university and the hospital (12/83).
 
References
1. Argyriou A, Evgeniou T, Pontil M (2008) Convex multi-task feature learning. Mach Learn 73(3):243–272
2. Argyriou A, Pontil M, Ying Y, Charles MA (2007) A spectral regularization framework for multi-task structure learning. In: Advances in neural information processing systems, pp 25–32
3. Bishop CM, Nasrabadi NM (2006) Pattern recognition and machine learning, vol 1. Springer, New York
4. Blum A, Mitchell T (1998) Combining labeled and unlabeled data with co-training. In: Proceedings of the eleventh annual conference on computational learning theory. ACM, pp 92–100
5. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122
6. Chelba C, Acero A (2006) Adaptation of maximum entropy capitalizer: little data can help a lot. Comput Speech Lang 20(4):382–399
7. Chen M, Weinberger KQ, Blitzer J (2011) Co-training for domain adaptation. In: NIPS, pp 2456–2464
8. Cover TM, Thomas JA (2012) Elements of information theory. Wiley, New York
9. Daumé III H (2009) Bayesian multitask learning with latent hierarchies. In: Proceedings of the 25th conference on uncertainty in artificial intelligence, pp 135–142
10. Duan L, Xu D, Tsang IW (2012) Domain adaptation from multiple sources: a domain-dependent regularization approach. IEEE Trans Neural Netw Learn Syst 23(3):504–518
11. Eaton E, Ruvolo PL (2013) ELLA: an efficient lifelong learning algorithm. In: Proceedings of the 30th international conference on machine learning (ICML-13), pp 507–515
12. Evgeniou A, Pontil M (2007) Multi-task feature learning. In: Advances in neural information processing systems: proceedings of the 2006 conference, vol 19. The MIT Press, p 41
13. Evgeniou T, Pontil M (2004) Regularized multi-task learning. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 109–117
14. Gao J, Fan W, Jiang J, Han J (2008) Knowledge transfer via multiple model local structure mapping. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 283–291
15. Gong P, Ye J, Zhang C (2012) Robust multi-task feature learning. In: Proceedings of the 18th ACM SIGKDD. ACM, pp 895–903
16. Gupta S, Phung D, Venkatesh S (2012) A Bayesian nonparametric joint factor model for learning shared and individual subspaces from multiple data sources. In: Proceedings of the SDM, pp 200–211
17. Gupta S, Phung D, Venkatesh S (2013) Factorial multi-task learning: a Bayesian nonparametric approach. In: Proceedings of international conference on machine learning, pp 657–665
18. Hastie T, Tibshirani R, Friedman JH (2001) The elements of statistical learning, vol 1. Springer, New York
19. Jalali A, Sanghavi S, Ruan C, Ravikumar PK (2010) A dirty model for multi-task learning. In: Neural information processing systems, pp 964–972
20. Jebara T (2004) Multi-task feature and kernel selection for SVMs. In: Proceedings of the twenty-first international conference on machine learning. ACM, p 55
21. Ji S, Ye J (2009) An accelerated gradient method for trace norm minimization. In: Proceedings of the 26th annual international conference on machine learning. ACM, pp 457–464
22. Kang Z, Grauman K, Sha F (2011) Learning with whom to share in multi-task feature learning. In: Proceedings of the 28th international conference on machine learning, pp 521–528
23. Lawrence ND, Platt JC (2004) Learning to learn with the informative vector machine. In: Proceedings of the twenty-first international conference on machine learning. ACM, p 65
24. Liu J, Ji S, Ye J (2009) Multi-task feature learning via efficient l2,1-norm minimization. In: Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence. AUAI Press, pp 339–348
25. Liu J, Chen J, Ye J (2009) Large-scale sparse logistic regression. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 547–556
27. Nesterov Y (2004) Introductory lectures on convex optimization: a basic course, vol 87. Springer, Berlin
28. Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359
29. Rai P, Daumé III H (2010) Infinite predictor subspace models for multitask learning. In: International conference on artificial intelligence and statistics, pp 613–620
30. Raudys SJ, Jain AK (1991) Small sample size effects in statistical pattern recognition: recommendations for practitioners. IEEE Trans Pattern Anal Mach Intell 13(3):252–264
31. Schmidt M (2010) Graphical model structure learning with l1-regularization. PhD thesis, The University of British Columbia
32. Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodol) 58:267–288
33. Xue Y, Liao X, Carin L, Krishnapuram B (2007) Multi-task learning for classification with Dirichlet process priors. J Mach Learn Res 8:35–63
34. Yang H, Lyu MR, King I (2013) Efficient online learning for multitask feature selection. ACM Trans Knowl Discov Data 7(2):6:1–6:27
35. Zhang Y, Yeung D-Y (2010) A convex formulation for learning task relationships in multi-task learning. In: UAI, pp 733–742
Metadata
Title
Multiple task transfer learning with small sample sizes
Authors
Budhaditya Saha
Sunil Gupta
Dinh Phung
Svetha Venkatesh
Publication date
01.02.2016
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2016
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-015-0821-z
