Published in: Knowledge and Information Systems 2/2016

01.02.2016 | Regular Paper

Multiple task transfer learning with small sample sizes

By: Budhaditya Saha, Sunil Gupta, Dinh Phung, Svetha Venkatesh


Abstract

Prognosis, such as predicting mortality, is common in medicine. When confronted with small numbers of samples, as in rare medical conditions, the task is challenging. We propose a framework for classifying data with small sample sizes. Conceptually, our solution is a hybrid of multi-task and transfer learning: it employs data samples from source tasks as in transfer learning, but considers all tasks together as in multi-task learning. Each task is modelled jointly with other related tasks by directly augmenting its data with data from those tasks. The degree of augmentation depends on the task relatedness and is estimated directly from the data. We apply the model to three diverse real-world data sets (healthcare data, handwritten digit data and face data) and show that our method outperforms several state-of-the-art multi-task learning baselines. We also extend the model to online multi-task learning, where the model parameters are updated incrementally as new data or new tasks arrive. The novelty of our method lies in offering a hybrid multi-task/transfer learning model that exploits sharing across tasks at the data level together with joint parameter learning.
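The abstract describes augmenting each task's training data with relatedness-weighted samples from other tasks. The paper's actual model and relatedness estimator are not reproduced here; the following is a minimal illustrative sketch of the general idea, assuming logistic regression base learners and a cosine-similarity stand-in for task relatedness (both are assumptions for illustration, not the authors' method).

```python
import numpy as np

def task_relatedness(w_t, w_s):
    """Stand-in relatedness estimate: non-negative cosine similarity
    between single-task weight vectors (an assumption, not the paper's)."""
    cos = float(w_t @ w_s) / (np.linalg.norm(w_t) * np.linalg.norm(w_s) + 1e-12)
    return max(0.0, cos)

def fit_logistic(X, y, sample_weight=None, lam=0.1, iters=500, lr=0.1):
    """L2-regularised logistic regression by gradient descent,
    with optional per-sample weights."""
    n, d = X.shape
    sw = np.ones(n) if sample_weight is None else sample_weight
    w = np.zeros(d)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))          # sigmoid predictions
        grad = X.T @ (sw * (p - y)) / sw.sum() + lam * w
        w -= lr * grad
    return w

def fit_augmented(tasks, t, lam=0.1):
    """Fit task t on its own samples plus samples from every other task,
    each source sample down-weighted by the estimated task relatedness."""
    singles = [fit_logistic(X, y, lam=lam) for X, y in tasks]
    Xs, ys, ws = [], [], []
    for s, (X, y) in enumerate(tasks):
        rho = 1.0 if s == t else task_relatedness(singles[t], singles[s])
        Xs.append(X)
        ys.append(y)
        ws.append(np.full(len(y), rho))
    X_all = np.vstack(Xs)
    y_all = np.concatenate(ys)
    w_all = np.concatenate(ws)
    return fit_logistic(X_all, y_all, sample_weight=w_all, lam=lam)
```

With two related tasks of 15 samples each, the augmented fit for the target task effectively trains on up to 30 weighted samples, which is the data-level sharing the abstract refers to; unrelated tasks receive weight near zero and contribute little.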


Footnotes
1. A detailed description of the optimization methods can be found in [25].
2. Ethics approval was obtained through the university and the hospital (12/83).
 
References
1. Argyriou A, Evgeniou T, Pontil M (2008) Convex multi-task feature learning. Mach Learn 73(3):243–272
2. Argyriou A, Pontil M, Ying Y, Charles MA (2007) A spectral regularization framework for multi-task structure learning. In: Advances in neural information processing systems, pp 25–32
3. Bishop CM, Nasrabadi NM (2006) Pattern recognition and machine learning, vol 1. Springer, New York
4. Blum A, Mitchell T (1998) Combining labeled and unlabeled data with co-training. In: Proceedings of the eleventh annual conference on computational learning theory. ACM, pp 92–100
5. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122
6. Chelba C, Acero A (2006) Adaptation of maximum entropy capitalizer: little data can help a lot. Comput Speech Lang 20(4):382–399
7. Chen M, Weinberger KQ, Blitzer J (2011) Co-training for domain adaptation. In: NIPS, pp 2456–2464
8. Cover TM, Thomas JA (2012) Elements of information theory. Wiley, New York
9. Daumé III H (2009) Bayesian multitask learning with latent hierarchies. In: Proceedings of the 25th conference on uncertainty in artificial intelligence, pp 135–142
10. Duan L, Xu D, Tsang IW (2012) Domain adaptation from multiple sources: a domain-dependent regularization approach. IEEE Trans Neural Netw Learn Syst 23(3):504–518
11. Eaton E, Ruvolo PL (2013) ELLA: an efficient lifelong learning algorithm. In: Proceedings of the 30th international conference on machine learning (ICML-13), pp 507–515
12. Evgeniou A, Pontil M (2007) Multi-task feature learning. In: Advances in neural information processing systems: proceedings of the 2006 conference, vol 19. The MIT Press, p 41
13. Evgeniou T, Pontil M (2004) Regularized multi-task learning. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 109–117
14. Gao J, Fan W, Jiang J, Han J (2008) Knowledge transfer via multiple model local structure mapping. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 283–291
15. Gong P, Ye J, Zhang C (2012) Robust multi-task feature learning. In: Proceedings of the 18th ACM SIGKDD. ACM, pp 895–903
16. Gupta S, Phung D, Venkatesh S (2012) A Bayesian nonparametric joint factor model for learning shared and individual subspaces from multiple data sources. In: Proceedings of the SDM, pp 200–211
17. Gupta S, Phung D, Venkatesh S (2013) Factorial multi-task learning: a Bayesian nonparametric approach. In: Proceedings of international conference on machine learning, pp 657–665
18. Hastie T, Tibshirani R, Friedman JH (2001) The elements of statistical learning, vol 1. Springer, New York
19. Jalali A, Sanghavi S, Ruan C, Ravikumar PK (2010) A dirty model for multi-task learning. In: Neural information processing systems, pp 964–972
20. Jebara T (2004) Multi-task feature and kernel selection for SVMs. In: Proceedings of the twenty-first international conference on machine learning. ACM, p 55
21. Ji S, Ye J (2009) An accelerated gradient method for trace norm minimization. In: Proceedings of the 26th annual international conference on machine learning. ACM, pp 457–464
22. Kang Z, Grauman K, Sha F (2011) Learning with whom to share in multi-task feature learning. In: Proceedings of the 28th international conference on machine learning, pp 521–528
23. Lawrence ND, Platt JC (2004) Learning to learn with the informative vector machine. In: Proceedings of the twenty-first international conference on machine learning. ACM, p 65
24. Liu J, Ji S, Ye J (2009) Multi-task feature learning via efficient l2,1-norm minimization. In: Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence. AUAI Press, pp 339–348
25. Liu J, Chen J, Ye J (2009) Large-scale sparse logistic regression. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 547–556
27. Nesterov Y (2004) Introductory lectures on convex optimization: a basic course, vol 87. Springer, Berlin
28. Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359
29. Rai P, Daumé III H (2010) Infinite predictor subspace models for multitask learning. In: International conference on artificial intelligence and statistics, pp 613–620
30. Raudys SJ, Jain AK (1991) Small sample size effects in statistical pattern recognition: recommendations for practitioners. IEEE Trans Pattern Anal Mach Intell 13(3):252–264
31. Schmidt M (2010) Graphical model structure learning with l1-regularization. PhD thesis, The University of British Columbia
32. Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodol) 58:267–288
33. Xue Y, Liao X, Carin L, Krishnapuram B (2007) Multi-task learning for classification with Dirichlet process priors. J Mach Learn Res 8:35–63
34. Yang H, Lyu MR, King I (2013) Efficient online learning for multitask feature selection. ACM Trans Knowl Discov Data 7(2):6:1–6:27
35. Zhang Y, Yeung D-Y (2010) A convex formulation for learning task relationships in multi-task learning. In: UAI, pp 733–742
Metadata
Title
Multiple task transfer learning with small sample sizes
Authors
Budhaditya Saha
Sunil Gupta
Dinh Phung
Svetha Venkatesh
Publication date
01.02.2016
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2016
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-015-0821-z
