Published in: International Journal of Machine Learning and Cybernetics 6/2021

03-02-2021 | Original Article

Adaptive robust local online density estimation for streaming data

Authors: Zhong Chen, Zhide Fang, Victor Sheng, Jiabin Zhao, Wei Fan, Andrea Edwards, Kun Zhang



Abstract

Accurate online density estimation is crucial to the many applications in which streaming data are prevalent. Existing online approaches to density estimation tend to lack prompt adaptability and robustness when facing concept-drifting and noisy streaming data, resulting in delayed or even deteriorated approximations. To alleviate this issue, we first propose an adaptive local online kernel density estimator (ALoKDE) for real-time density estimation on data streams. ALoKDE consists of two tightly integrated strategies: (1) a statistical test for concept drift detection and (2) an adaptive weighted local online density estimation that is applied when a drift does occur. Specifically, using a weighted form, ALoKDE seeks to provide an unbiased estimate by factoring in the statistical hallmarks of the latest learned distribution and any potential distributional changes that could be introduced by each incoming instance. A robust variant, R-ALoKDE, is further developed to handle data streams with varied types and levels of noise. Moreover, we analyze the asymptotic properties of ALoKDE and R-ALoKDE and derive their theoretical error bounds on bias, variance, MSE and MISE. Extensive comparative studies on various artificial and real-world (noisy) streaming data demonstrate the efficacy of ALoKDE and R-ALoKDE in online density estimation and real-time classification (with noise).
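
As a rough illustration of the two strategies named in the abstract, the following Python sketch pairs a drift check with a weighted online update. It is not the authors' ALoKDE algorithm: the two-sample Kolmogorov-Smirnov statistic over two sliding windows, the fixed evaluation grid, the Gaussian kernel, and every name and parameter (`OnlineGridKDE`, `window`, `slow`, `fast`, `threshold`, `simulate`) are assumptions chosen for brevity.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past x so ties are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


class OnlineGridKDE:
    """Hypothetical 1-D online density estimate stored on a fixed grid."""

    def __init__(self, grid, bandwidth, window=50, slow=0.01, fast=0.2, threshold=0.4):
        self.grid = list(grid)
        self.h = bandwidth
        self.window = window          # size of each comparison window
        self.slow = slow              # update weight while the stream looks stable
        self.fast = fast              # update weight while a drift is flagged
        self.threshold = threshold    # KS statistic above this counts as drift
        self.density = [0.0] * len(self.grid)
        self.reference = []           # older samples
        self.recent = []              # newest samples
        self.seen = 0

    def drift_detected(self):
        if len(self.reference) < self.window or len(self.recent) < self.window:
            return False
        return ks_statistic(self.reference, self.recent) > self.threshold

    def update(self, x):
        """Fold one sample into the estimate; return True if drift was flagged."""
        self.seen += 1
        self.recent.append(x)
        if len(self.recent) > self.window:
            self.reference.append(self.recent.pop(0))
            if len(self.reference) > self.window:
                self.reference.pop(0)
        drift = self.drift_detected()
        # Running average at first, then slow forgetting; faster forgetting on drift.
        w = self.fast if drift else max(self.slow, 1.0 / self.seen)
        kernel = [gaussian_kernel((g - x) / self.h) / self.h for g in self.grid]
        # Convex combination of the previous estimate and the new sample's kernel.
        self.density = [(1.0 - w) * d + w * k for d, k in zip(self.density, kernel)]
        return drift


def simulate(seed=0):
    """Feed a synthetic stream whose mean jumps from 0 to 5 halfway through."""
    rng = random.Random(seed)
    grid = [-5.0 + 0.1 * i for i in range(151)]
    est = OnlineGridKDE(grid, bandwidth=0.5)
    stream = [rng.gauss(0.0, 1.0) for _ in range(300)]
    stream += [rng.gauss(5.0, 1.0) for _ in range(300)]
    flags = [est.update(x) for x in stream]
    return est, flags
```

Calling `simulate()` returns the estimator and a per-sample list of drift flags. Because each update is a convex combination of the previous estimate and one normalized kernel, the stored density keeps integrating to roughly one over the grid, and the larger weight used while a drift is flagged lets the estimate move toward the new mode more quickly than the slow weight alone would.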

Footnotes
1
This kernel is asymptotically optimal in efficiency among all kernel functions [7].

2
“Gaussian”, “Skewed unimodal”, “Strongly skewed”, “Kurtotic unimodal”, “Outlier”, “Bimodal”, “Separated bimodal”, “Skewed bimodal”, “Trimodal”, “Claw”, “Double Claw”, “Symmetric Claw”, “Asymmetric Double Claw”, “Smooth Comb”, and “Discrete Comb”.

Literature
1.
Gama J, Žliobaitė I, Bifet A, Pechenizkiy M, Bouchachia A (2014) A survey on concept drift adaptation. ACM Comput Surv (CSUR) 46(4):44
2.
Cesa-Bianchi N, Shalev-Shwartz S, Shamir O (2011) Online learning of noisy data. IEEE Trans Inf Theory 57(12):7907–7931
3.
Deng C, Yang E, Liu T, Tao D (2019) Two-stream deep hashing with class-specific centers for supervised image search. IEEE Trans Neural Netw Learn Syst 31(6):2189–2201
4.
Procopiuc CM, Procopiuc O (2005) Density estimation for spatial data streams. In: Bauzer Medeiros C, Egenhofer MJ, Bertino E (eds) Proceedings of advances in spatial and temporal databases, Heidelberg, August 2005. Lecture notes in computer science, vol 3633. Springer, Berlin, pp 109–126
5.
Heinz C, Seeger B (2008) Cluster kernels: resource-aware kernel density estimators over streaming data. IEEE Trans Knowl Data Eng 20(7):880–893
6.
Kristan M, Leonardis A, Skočaj D (2011) Multivariate online kernel density estimation with Gaussian kernels. Pattern Recognit 44(10–11):2630–2642
7.
Qahtan A, Wang S, Zhang X (2017) KDE-Track: an efficient dynamic density estimator for data streams. IEEE Trans Knowl Data Eng 29(3):642–655
8.
Zhang P, Zhu X, Shi Y, Guo L, Wu X (2011) Robust ensemble learning for mining noisy data streams. Decis Support Syst 50(2):469–479
9.
Krawczyk B, Cano A (2018) Online ensemble learning with abstaining classifiers for drifting and noisy data streams. Appl Soft Comput 68(7):677–692
10.
Verma N, Branson K (2015) Sample complexity of learning Mahalanobis distance metrics. In: Proceedings of the advances in neural information processing systems, Montréal, December 2015, pp 2584–2592
11.
Bifet A, Gavalda R (2007) Learning from time-changing data with adaptive windowing. In: Proceedings of the SIAM international conference on data mining, Minneapolis, April 2007, pp 443–448
12.
Scott DW (2015) Multivariate density estimation: theory, practice, and visualization. Wiley, New York
13.
14.
Banerjee A, Burlina P (2010) Efficient particle filtering via sparse kernel density estimation. IEEE Trans Image Process 19(9):2480–2490
15.
Hong X, Chen S, Qatawneh A, Daqrouq K, Sheikh M, Morfeq A (2013) Sparse probability density function estimation using the minimum integrated square error. Neurocomputing 115:122–129
16.
Carbone P, Petri D, Barbé K (2017) Nonparametric probability density estimation via interpolation filtering. IEEE Trans Instrum Meas 66(4):681–690
17.
Zhang L, Lin J, Karim R (2018) Adaptive kernel density-based anomaly detection for nonlinear systems. Knowl Based Syst 139:50–63
18.
Wahid A, Rao A (2019) RKDOS: a relative kernel density-based outlier score. IETE Tech Rev 8:1–12
19.
Deng C, Liu X, Li C, Tao D (2018) Active multi-kernel domain adaptation for hyperspectral image classification. Pattern Recognit 77:306–315
20.
Yang M, Deng C, Nie F (2019) Adaptive-weighting discriminative regression for multi-view classification. Pattern Recognit 88:236–245
21.
Zhou A, Cai Z, Wei L, Qian W (2003) M-kernel merging: towards density estimation over data streams. In: Proceedings of the IEEE international conference on database systems for advanced applications, Kyoto, March 2003, pp 285–292
22.
Cao Y, He H, Man H (2012) SOMKE: kernel density estimation over data streams by sequences of self-organizing maps. IEEE Trans Neural Netw Learn Syst 23(8):1254–1268
23.
Kristan M, Leonardis A (2014) Online discriminative kernel density estimator with Gaussian kernels. IEEE Trans Cybern 44(3):355–365
24.
Qiu T, Shen F, Zhao J (2015) Local adaptive and incremental Gaussian mixture for online density estimation. In: Cao T, Lim EP, Zhou ZH, Ho TB, Cheung D, Motoda H (eds) Proceedings of the advances in knowledge discovery and data mining, Ho Chi Minh City, May 2015. Lecture notes in computer science, vol 9077. Springer, Cham, pp 418–428
25.
Wilcox R (2005) Kolmogorov–Smirnov test. Encyclopedia of biostatistics
26.
Lall A (2015) Data streaming algorithms for the Kolmogorov–Smirnov test. In: Proceedings of the IEEE international conference on big data, Santa Clara, October 2015, pp 95–104
27.
Duong T, Hazelton ML (2005) Cross-validation bandwidth matrices for multivariate kernel density estimation. Scand J Stat 32(3):485–506
28.
Raykar VC, Duraiswami R (2006) Fast optimal bandwidth selection for kernel density estimation. In: Proceedings of the SIAM international conference on data mining, Bethesda, April 2006, pp 524–528
29.
Yang C, Duraiswami R, Gumerov NA, Davis L (2003) Improved fast Gauss transform and efficient kernel density estimation. In: Proceedings of the IEEE international conference on computer vision, Nice, October 2003, pp 464
30.
31.
Foote J (2000) Automatic audio segmentation using a measure of audio novelty. In: Proceedings of the IEEE international conference on multimedia and expo, New York, July 2000, pp 452–455
32.
Losing V, Hammer B, Wersing H (2016) KNN classifier with self adjusting memory for heterogeneous concept drift. In: Proceedings of the IEEE international conference on data mining, Barcelona, December 2016, pp 291–300
33.
Boedihardjo A, Liu C, Chen F (2015) Fast adaptive kernel density estimator for data streams. Knowl Inf Syst 42(2):285–317
34.
Masud M, Gao J, Khan L, Han J, Thuraisingham B (2010) Classification and novel class detection in concept-drifting data streams under time constraints. IEEE Trans Knowl Data Eng 23(6):859–874
35.
Masud M, Chen Q, Khan L, Aggarwal C, Gao J, Han J, Srivastava A, Oza N (2012) Classification and adaptive novel class detection of feature-evolving data streams. IEEE Trans Knowl Data Eng 25(7):1484–1497
Metadata
Title
Adaptive robust local online density estimation for streaming data
Authors
Zhong Chen
Zhide Fang
Victor Sheng
Jiabin Zhao
Wei Fan
Andrea Edwards
Kun Zhang
Publication date
03-02-2021
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 6/2021
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-021-01275-y
