Abstract
Concept drift is a phenomenon in which the distribution of a data stream changes over time. When this happens, model predictions become less accurate, so models built on past data need to be re-learned for the current data. Designing a re-learning strategy raises two questions: which type of concept drift has occurred, and how the drift type can be used to improve re-learning performance. Existing drift detection methods are often good at determining when drift has occurred, but few retrieve information about how the drift arose in the stream, which makes it difficult to determine the impact of drift type on adaptation. To fill this gap, we designed a framework based on a lazy strategy called Type-Driven Lazy Drift Adaptor (Type-LDA). Type-LDA first retrieves information about both how and when a drift has occurred, then uses this information to re-learn the new model. To identify the type of drift, a drift type identifier is pre-trained on synthetic data of known drift types. Furthermore, a drift point locator locates the optimal point of drift via a sharing loss. Hence, Type-LDA can select the optimal point, according to the drift type, from which to re-learn the new model. Experiments validate Type-LDA on both synthetic and real-world data, and the results show that accurately identifying drift type can improve adaptation accuracy.
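As a rough illustration of what "synthetic data of known drift types" could look like, the sketch below generates one-dimensional streams whose mean moves from 0 to 1 under three drift patterns. The type names (sudden, gradual, incremental), the function `make_drift_stream`, and all parameter values are assumptions made for this example. They follow common usage in the drift literature and are not taken from the abstract, which does not list the types Type-LDA uses or describe its generator.

```python
import random

# Hypothetical drift type labels; the abstract does not enumerate them.
DRIFT_TYPES = ("sudden", "gradual", "incremental")


def make_drift_stream(drift_type, n=1000, drift_point=500, width=200, seed=0):
    """Return n noisy samples whose underlying mean moves from 0 to 1.

    sudden:      the mean jumps to 1 at drift_point.
    gradual:     each sample is drawn from the old or the new concept,
                 with the new concept becoming more likely over `width` steps.
    incremental: the mean slides linearly from 0 to 1 over `width` steps.
    """
    if drift_type not in DRIFT_TYPES:
        raise ValueError(f"unknown drift type: {drift_type!r}")
    rng = random.Random(seed)
    stream = []
    for t in range(n):
        if drift_type == "sudden":
            progress = 0.0 if t < drift_point else 1.0
        else:
            # Fraction of the transition completed at time t, clipped to [0, 1].
            progress = min(1.0, max(0.0, (t - drift_point) / width))
        if drift_type == "gradual":
            mean = 1.0 if rng.random() < progress else 0.0
        else:
            mean = progress
        stream.append(rng.gauss(mean, 0.1))
    return stream


# A small labelled set that a drift type identifier could be pre-trained on.
labelled = [(make_drift_stream(kind, seed=s), kind)
            for kind in DRIFT_TYPES for s in range(3)]
```

Each stream is paired with its known drift type, so a classifier can be trained on the pairs in a supervised way. The real framework would presumably use richer features and drift patterns than this toy generator.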