research-article

Concept Drift Adaptation by Exploiting Drift Type

Published: 12 February 2024

Abstract

Concept drift is a phenomenon in which the distribution of a data stream changes over time. When this happens, model predictions become less accurate, so models built on past data need to be re-learned for the current data. Two questions must be answered when designing a re-learning strategy: which type of concept drift has occurred, and how can the drift type be used to improve re-learning performance? Existing drift detection methods are often good at determining when a drift has occurred, but few retrieve information about how the drift came to be present in the stream, which makes it difficult to determine the impact of drift type on adaptation. To fill this gap, we designed a framework based on a lazy strategy, called the Type-Driven Lazy Drift Adaptor (Type-LDA). Type-LDA first retrieves information about both how and when a drift has occurred and then uses this information to re-learn the new model. To identify the drift type, a drift type identifier is pre-trained on synthetic data with known drift types. In addition, a drift point locator finds the optimal drift point via a sharing loss. Type-LDA can therefore select the optimal point, according to the drift type, at which to re-learn the new model. Experiments validate Type-LDA on both synthetic and real-world data, and the results show that accurately identifying the drift type can improve adaptation accuracy.
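The abstract describes the Type-LDA workflow only at a high level. The Python sketch below illustrates one plausible reading of that pipeline, assuming a drift detector has already flagged a drift and buffered the surrounding data: a drift type identifier is pre-trained on synthetic streams with known drift types, and the model is then re-learned from a point chosen according to the predicted type. The class and function names (DriftTypeIdentifier, relearn), the error-rate features, and the adaptation policy are illustrative assumptions, not the authors' implementation; in particular, the sharing-loss drift point locator is not reproduced here.

    # Illustrative sketch only; names and policies below are assumptions,
    # not the paper's implementation.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    class DriftTypeIdentifier:
        """Pre-trained on synthetic streams whose drift types are known
        (e.g., 'sudden', 'gradual', 'incremental', 'recurring')."""

        def __init__(self):
            self.clf = RandomForestClassifier(n_estimators=100, random_state=0)

        @staticmethod
        def _features(error_window):
            # Summarise window-level error rates around a detected drift
            # into a fixed-length feature vector.
            w = np.asarray(error_window, dtype=float)
            return [w.mean(), w.std(), w.max() - w.min(),
                    float(np.abs(np.diff(w)).mean())]

        def fit(self, error_windows, drift_type_labels):
            # Train on synthetic drifts whose types are known by construction.
            X = np.array([self._features(w) for w in error_windows])
            self.clf.fit(X, drift_type_labels)
            return self

        def predict(self, error_window):
            # Predict the type of a newly detected drift in a real stream.
            return self.clf.predict([self._features(error_window)])[0]

    def relearn(model, X_buf, y_buf, drift_point, drift_type):
        # Re-learn from a point chosen according to the drift type: only
        # post-drift data for a sudden drift, a longer mixed window for a
        # gradual drift (a simplified, assumed policy).
        start = drift_point if drift_type == "sudden" else max(0, drift_point - 200)
        model.fit(X_buf[start:], y_buf[start:])
        return model

In a streaming loop, one would call the identifier's predict method on the recent error-rate window whenever a drift detector fires, and then pass the predicted type together with the located drift point to relearn.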


Published in

ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 4 (May 2024), 707 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3613622


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 12 February 2024
      • Online AM: 2 January 2024
      • Accepted: 15 December 2023
      • Revised: 28 September 2023
      • Received: 30 November 2022

