Concept Drift Adaptation by Exploiting Drift Type

Authors:
Jinpeng Li, School of Computer Engineering and Science, China (ORCID: 0009-0007-2572-117X)
Hang Yu, School of Computer Engineering and Science, China (ORCID: 0000-0003-3444-9992)
Zhenyu Zhang, School of Computer Engineering and Science, China (ORCID: 0000-0002-9470-7132)
Xiangfeng Luo, School of Computer Engineering and Science, China (ORCID: 0000-0002-4093-4233)
Shaorong Xie, School of Computer Engineering and Science, China (ORCID: 0000-0002-8016-9310)

ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 4, Article No. 96, pp. 1–22. https://doi.org/10.1145/3638777

Published: 12 February 2024

Abstract

Concept drift is a phenomenon in which the distribution of a data stream changes over time. When this happens, model predictions become less accurate, so models built in the past need to be re-learned on the current data. Two questions must be addressed when designing a re-learning strategy: which type of concept drift has occurred, and how to use the drift type to improve re-learning performance. Existing drift detection methods are often good at determining when drift has occurred. However, few retrieve information about how the drift came to be present in the stream, which makes it difficult to determine the impact of the drift type on adaptation. To fill this gap, we designed a framework based on a lazy strategy, called Type-Driven Lazy Drift Adaptor (Type-LDA). Type-LDA first retrieves information about both how and when a drift has occurred, and then uses this information to re-learn the new model. To identify the type of drift, a drift type identifier is pre-trained on synthetic data of known drift types. Furthermore, a drift point locator locates the optimal point of drift via a sharing loss. Hence, Type-LDA can select the optimal point, according to the drift type, from which to re-learn the new model. Experiments validate Type-LDA on both synthetic and real-world data, and the results show that accurately identifying the drift type can improve adaptation accuracy.
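The idea of choosing a re-learning start point according to the drift type can be illustrated with a small sketch. This is not the Type-LDA implementation: the pre-trained drift type identifier and the drift point locator are replaced by values the caller supplies, and everything named here (`DriftInfo`, `select_retrain_start`, `fit_mean_model`, the `"sudden"`/`"gradual"` labels, and the `transition_width` parameter) is a hypothetical stand-in. The rule that maps a drift type to a start point is likewise an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DriftInfo:
    """Hypothetical container for 'how' and 'when' a drift occurred."""
    drift_type: str            # assumed labels: "sudden" or "gradual"
    drift_point: int           # buffer index where the drift starts
    transition_width: int = 0  # assumed length of a gradual transition


def select_retrain_start(info: DriftInfo, buffer_len: int) -> int:
    """Pick the first buffered index to use for re-learning.

    Assumed rule (illustration only, not taken from the paper):
    - sudden drift: samples after the drift point follow the new concept,
      so re-learn from the drift point.
    - gradual drift: samples inside the transition mix both concepts,
      so skip the transition and re-learn from its end.
    """
    if info.drift_type == "sudden":
        start = info.drift_point
    elif info.drift_type == "gradual":
        start = info.drift_point + info.transition_width
    else:
        raise ValueError(f"unknown drift type: {info.drift_type!r}")
    # Clamp so that at least one sample is left for re-learning.
    return max(0, min(start, buffer_len - 1))


def fit_mean_model(samples: Sequence[float]) -> float:
    """Toy 'model': the mean of the samples it was trained on."""
    return sum(samples) / len(samples)


# Toy stream buffer: old concept (0.0), a mixed transition, new concept (1.0).
buffer = [0.0] * 5 + [0.0, 1.0, 0.0] + [1.0] * 4

# Pretend an identifier and a locator produced this description of the drift.
info = DriftInfo(drift_type="gradual", drift_point=5, transition_width=3)

start = select_retrain_start(info, len(buffer))
new_model = fit_mean_model(buffer[start:])
```

In this toy buffer, treating the same drift as sudden would include the mixed transition samples in the re-learning window, which is the kind of difference a type-aware start point is meant to avoid.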


Index Terms

Information systems → Information systems applications → Data mining → Data stream mining

Published in
ACM Transactions on Knowledge Discovery from Data Volume 18, Issue 4
May 2024
707 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3613622
Editor:
Jian Pei
Duke University, USA
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 12 February 2024
- Online AM: 2 January 2024
- Accepted: 15 December 2023
- Revised: 28 September 2023
- Received: 30 November 2022
Author Tags
Concept drift
data streams
drift detection
drift adaptation
Qualifiers
- research-article