Self-adaptive and local strategies for a smooth treatment of drifts in data streams

  • Original Paper
  • Evolving Systems

Abstract

This paper presents a new concept for handling drifts in data streams during on-line, evolving modeling processes in a regression context. Drifts require specific attention in evolving modeling methods because they usually change the underlying data distribution, which makes previously learnt model parameters and structures outdated. Our approach comprises three new stages for appropriate drift handling: (1) drifts are not only detected but also quantified with a new extended version of the Page-Hinkley test; (2) we integrate an adaptive forgetting factor that changes over time and steers the degree of forgetting according to the current drift intensity in the data stream; (3) we introduce local forgetting factors that address different local regions of the feature space with different forgetting intensities. The latter is achieved by using a fuzzy model architecture within stream learning, whose structural components (fuzzy rules) provide a local partitioning of the feature space and ensure smooth transitions of the drift handling topology between neighboring regions. Additionally, our approach includes an early drift recognition variant that relies on divergence measures, which indicate the degree of divergence in local parts of the feature space separately, before the global model error starts to rise significantly. It can thus be seen as an attempt at drift prevention on the global model level. The new approach is successfully evaluated and compared with fixed forgetting and no forgetting on high-dimensional real-world data streams that include different types of drifts.
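To make stages (1) and (2) more concrete, the following is a minimal sketch and not the authors' method: it implements the standard Page-Hinkley test on a stream of model errors and then maps the test statistic to a forgetting factor. The extended test proposed in the paper is not reproduced here. The class and function names, the parameter values (`delta`, `threshold`, `lam_min`, `lam_max`), and the linear mapping from drift intensity to forgetting factor are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PageHinkley:
    """Standard Page-Hinkley test for an upward shift in the mean of a signal."""
    delta: float = 0.005     # tolerated magnitude of change (assumed value)
    threshold: float = 5.0   # alarm threshold (assumed value)
    n: int = 0               # number of observations seen so far
    mean: float = 0.0        # running mean of the observations
    cum: float = 0.0         # cumulative deviation m_t
    cum_min: float = 0.0     # running minimum M_t of the cumulative deviation

    def update(self, x: float) -> float:
        """Feed one observation (e.g. an absolute model error); return PH_t = m_t - M_t."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min


def forgetting_factor(ph_value: float, threshold: float,
                      lam_min: float = 0.9, lam_max: float = 1.0) -> float:
    """Hypothetical mapping: interpolate linearly between lam_max (no drift,
    no forgetting) and lam_min (strong drift, strong forgetting)."""
    intensity = min(max(ph_value / threshold, 0.0), 1.0)  # clipped to [0, 1]
    return lam_max - intensity * (lam_max - lam_min)


if __name__ == "__main__":
    ph = PageHinkley()
    # Synthetic error stream: stable errors, then an abrupt increase (a drift).
    errors = [0.1] * 200 + [1.0] * 50
    for t, err in enumerate(errors):
        stat = ph.update(err)
        lam = forgetting_factor(stat, ph.threshold)
        if stat > ph.threshold:
            print(f"drift alarm at sample {t}, forgetting factor {lam:.3f}")
            break
```

In this sketch the forgetting factor stays at `lam_max` while the error is stable and moves toward `lam_min` as the test statistic grows, so the amount of forgetting follows the drift intensity instead of being fixed in advance. A local variant in the spirit of stage (3) would keep one such statistic per fuzzy rule, which is not shown here.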


Acknowledgments

This work was funded by the German Research Foundation (DFG) and the Austrian Science Fund (FWF, contract number I328-N23). The second author also acknowledges the support of the Austrian COMET-K2 programme of the Linz Center of Mechatronics (LCM), funded by the Austrian federal government and the federal state of Upper Austria. This publication reflects only the authors’ views.

Author information

Corresponding author

Correspondence to Edwin Lughofer.

Cite this article

Shaker, A., Lughofer, E. Self-adaptive and local strategies for a smooth treatment of drifts in data streams. Evolving Systems 5, 239–257 (2014). https://doi.org/10.1007/s12530-014-9108-y

  • DOI: https://doi.org/10.1007/s12530-014-9108-y
