Published in: International Journal of Machine Learning and Cybernetics 10/2019

16 August 2019 | Original Article

Online transfer learning with multiple decision trees

Authors: Yimin Wen, Yixiu Qin, Keke Qin, Xiaoxia Lu, Pingshan Liu


Abstract

Online learning techniques are widely used in fields where instances arrive one by one. In the early stage of a data stream, however, online learning models cannot achieve good classification accuracy because they have not yet collected enough instances to learn from. For example, the well-known online learning algorithm very fast decision tree (VFDT) must wait until the Hoeffding bound is satisfied before it can split a node, which leads to poor classification accuracy at the beginning of a data stream. VFDT may therefore be unsuitable for real applications that demand fast and accurate online detection. The problem becomes more serious when classifying data streams with concept drift. This paper uses transfer learning to compensate for this shortcoming of VFDT. First, a new decision tree method named VFDT-D is proposed; it caches instances in its leaf nodes to handle numerical attributes and to fit into a framework of online transfer learning (OTL). Second, a measure that considers tree path, classification accuracy, and classification confidence is proposed to evaluate the local similarity between source and target domain classifiers. Finally, a multiple-source online transfer learning algorithm named DMOTL is proposed, which takes VFDT-D as its base classifier and uses the proposed local similarity measure to select the optimal source domain classifier to assist transfer learning. Extensive experiments on several synthetic and real-world datasets demonstrate the advantage of the proposed algorithm.
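
To make the abstract's point about VFDT's slow start concrete, the following is a minimal sketch of the standard Hoeffding bound split test, epsilon = sqrt(R^2 ln(1/delta) / (2n)), as it is usually stated for Hoeffding trees. The function names and the example values (a gain gap of 0.1, range R = 1, delta = 1e-7) are illustrative assumptions, not taken from the paper, and the sketch does not reproduce VFDT-D or DMOTL.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range (R) is the range of the split criterion, delta is the
    allowed error probability, and n is the number of instances seen
    at the leaf. All names here are illustrative.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_gain: float, second_gain: float,
              value_range: float, delta: float, n: int) -> bool:
    """A leaf may split once the gap between the two best attributes
    exceeds the Hoeffding bound for the instances seen so far."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


def instances_needed(gain_gap: float, value_range: float, delta: float) -> int:
    """Smallest n for which the bound drops below a given gain gap."""
    n = math.ceil(value_range ** 2 * math.log(1.0 / delta) / (2.0 * gain_gap ** 2))
    n = max(n, 1)
    # Guard against floating-point edge cases at the boundary.
    while hoeffding_bound(value_range, delta, n) >= gain_gap:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical setting: binary classes (R = 1), strict confidence.
    print(instances_needed(gain_gap=0.1, value_range=1.0, delta=1e-7))
```

Under these assumed values a leaf has to see several hundred instances before its first split, which illustrates why a tree of this kind classifies poorly at the start of a stream and why borrowing knowledge from source domain classifiers can help in that phase.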

Metadata
Title
Online transfer learning with multiple decision trees
Authors
Yimin Wen
Yixiu Qin
Keke Qin
Xiaoxia Lu
Pingshan Liu
Publication date
16 August 2019
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 10/2019
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-019-00998-3
