Skip to main content
Top
Published in: Journal of Intelligent Information Systems 2/2012

01-10-2012

A new method of mining data streams using harmony search

Authors: Zohre Karimi, Hassan Abolhassani, Hamid Beigy

Published in: Journal of Intelligent Information Systems | Issue 2/2012

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Incremental learning has been used extensively for data stream classification. Most attention on the data stream classification paid on non-evolutionary methods. In this paper, we introduce new incremental learning algorithms based on harmony search. We first propose a new classification algorithm for the classification of batch data called harmony-based classifier and then give its incremental version for classification of data streams called incremental harmony-based classifier. Finally, we improve it to reduce its computational overhead in absence of drifts and increase its robustness in presence of noise. This improved version is called improved incremental harmony-based classifier. The proposed methods are evaluated on some real world and synthetic data sets. Experimental results show that the proposed batch classifier outperforms some batch classifiers and also the proposed incremental methods can effectively address the issues usually encountered in the data stream environments. Improved incremental harmony-based classifier has significantly better speed and accuracy on capturing concept drifts than the non-incremental harmony based method and its accuracy is comparable to non-evolutionary algorithms. The experimental results also show the robustness of improved incremental harmony-based classifier.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
go back to reference Bifet, A., & Gavaldà, R. (2009a). Adaptive parameter-free learning from evolving data streams. In IDA. Bifet, A., & Gavaldà, R. (2009a). Adaptive parameter-free learning from evolving data streams. In IDA.
go back to reference Bifet A., & Gavaldà, R. (2009b). Adaptive XML tree classification on evolving data streams. In Proc. of European conference on machine learning and knowledge discovery in databases, ECML/PKDD. Bifet A., & Gavaldà, R. (2009b). Adaptive XML tree classification on evolving data streams. In Proc. of European conference on machine learning and knowledge discovery in databases, ECML/PKDD.
go back to reference Cunningham, P., Nowlan, N., Delany, S. J., & Haahr, M. (2003). A case-based approach to spam filtering that can track concept drift. Technical Report TCD-CS-2003-16, Ireland, Trinity College Dublin. Cunningham, P., Nowlan, N., Delany, S. J., & Haahr, M. (2003). A case-based approach to spam filtering that can track concept drift. Technical Report TCD-CS-2003-16, Ireland, Trinity College Dublin.
go back to reference Fan, W. (2004a). StreamMiner: A classifier ensemble-based engine to mine concept-drifting data streams. In Proc. of 2004 international conference on Very Large Data Bases (VLDB’2004) (Vol. 30, pp. 1257–1260). Toronto, Canada. Fan, W. (2004a). StreamMiner: A classifier ensemble-based engine to mine concept-drifting data streams. In Proc. of 2004 international conference on Very Large Data Bases (VLDB’2004) (Vol. 30, pp. 1257–1260). Toronto, Canada.
go back to reference Fan, W. (2004b). Systematic data selection to mine concept-drifting data stream. In Proc. of ACM SIGKDD (pp. 128–137). Seattle, Washington USA. Fan, W. (2004b). Systematic data selection to mine concept-drifting data stream. In Proc. of ACM SIGKDD (pp. 128–137). Seattle, Washington USA.
go back to reference Fogel, L. (1994). Evolutionary programming in perspective: The top-down view. In: J. M. Zurada, R. Marks II, C. Robinson (Eds.), Computational intelligence: Imitating life (pp. 135–146). Piscataway: IEEE Press. Fogel, L. (1994). Evolutionary programming in perspective: The top-down view. In: J. M. Zurada, R. Marks II, C. Robinson (Eds.), Computational intelligence: Imitating life (pp. 135–146). Piscataway: IEEE Press.
go back to reference Gama, J., Medas, P., & Rocha, R. (2004). Forest trees for on-line data. In Proc. ACM symp. applied computing (SAC’04) (pp. 632–636). Gama, J., Medas, P., & Rocha, R. (2004). Forest trees for on-line data. In Proc. ACM symp. applied computing (SAC’04) (pp. 632–636).
go back to reference Geem, Z. W., Kim, J. H., & Loganathan, G. V. (2002). A new heuristic optimization algorithm: Harmony search. Simulation, 76(2), 60–68.CrossRef Geem, Z. W., Kim, J. H., & Loganathan, G. V. (2002). A new heuristic optimization algorithm: Harmony search. Simulation, 76(2), 60–68.CrossRef
go back to reference Geem, Z. W., Tseng, C., & Park, Y. (2005). Harmony search for generalized orienteering problem: Best touring in China, Springer. Lecture Notes in Computer Science, 3412, 741–750.CrossRef Geem, Z. W., Tseng, C., & Park, Y. (2005). Harmony search for generalized orienteering problem: Best touring in China, Springer. Lecture Notes in Computer Science, 3412, 741–750.CrossRef
go back to reference Goldberg, D. (1989). Genetic algorithms in search, optimization and machine learning. Addison-Wesley. Goldberg, D. (1989). Genetic algorithms in search, optimization and machine learning. Addison-Wesley.
go back to reference Guan, S. U., & Zhucollard, F. (2005). An incremental approach to genetic-algorithm-based classification. IEEE Transactions on Systems, Man and Cybernetics, Part B–Cybernetics, 35(2), 227–239.CrossRef Guan, S. U., & Zhucollard, F. (2005). An incremental approach to genetic-algorithm-based classification. IEEE Transactions on Systems, Man and Cybernetics, Part B–Cybernetics, 35(2), 227–239.CrossRef
go back to reference Han, J., & Kamber, M. (2006). Data mining: Concepts and techniques (2nd Edn.). Morgan Kaufmann Publisher. Han, J., & Kamber, M. (2006). Data mining: Concepts and techniques (2nd Edn.). Morgan Kaufmann Publisher.
go back to reference Hashemi, S., Yang, Y., Mirzamomen, Z., & Kangavari, M. (2009). Adapted one-versus-all decision trees for data stream classification. IEEE Transactions on Knowledge and Data Engineering, 21(5), 624–637.CrossRef Hashemi, S., Yang, Y., Mirzamomen, Z., & Kangavari, M. (2009). Adapted one-versus-all decision trees for data stream classification. IEEE Transactions on Knowledge and Data Engineering, 21(5), 624–637.CrossRef
go back to reference Holland, J. H. (1986). Escaping brittleness: The possibilities of general purpose learning algorithms applied to parallel rule-based systems. In Machine learning: An artificial intelligence approach (Vol. II, pp. 593–623). Morgan Kaufmann. Holland, J. H. (1986). Escaping brittleness: The possibilities of general purpose learning algorithms applied to parallel rule-based systems. In Machine learning: An artificial intelligence approach (Vol. II, pp. 593–623). Morgan Kaufmann.
go back to reference Hulten, G., Spencer, L., & Domingos, P. (2001). Mining time-changing data streams. In F. Provost (Ed.), Knowledge discovery and data mining (pp. 97–106). AAAI Press. Hulten, G., Spencer, L., & Domingos, P. (2001). Mining time-changing data streams. In F. Provost (Ed.), Knowledge discovery and data mining (pp. 97–106). AAAI Press.
go back to reference Klinkenberg, R. (2004). Learning drifting concepts: Example selection vs. example weighting. Intelligent Data Analysis, 8(3), 281–300. Klinkenberg, R. (2004). Learning drifting concepts: Example selection vs. example weighting. Intelligent Data Analysis, 8(3), 281–300.
go back to reference Kolter, J. Z., & Maloof, M. A. (2003). Dynamic weighted majority: A new ensemble method for tracking concept drift. In Proc. Of the 3rd IEEE int. conf. on data mining ICDM-2003 (pp. 123–130). IEEE CS Press: Los Alamitos, CA.CrossRef Kolter, J. Z., & Maloof, M. A. (2003). Dynamic weighted majority: A new ensemble method for tracking concept drift. In Proc. Of the 3rd IEEE int. conf. on data mining ICDM-2003 (pp. 123–130). IEEE CS Press: Los Alamitos, CA.CrossRef
go back to reference Koza, J. (1992). Genetic programming: On the programming of computers by means of natural selection. Cambridge: MIT Press.MATH Koza, J. (1992). Genetic programming: On the programming of computers by means of natural selection. Cambridge: MIT Press.MATH
go back to reference Koza, J., & Poli, R. (2005). Genetic programming. In E. Burke & G. Kendall (Eds.), Introductory tutorials in optimization, decision support and search methodology (Chapter 5, pp. 127–164). Kluwer Press. Koza, J., & Poli, R. (2005). Genetic programming. In E. Burke & G. Kendall (Eds.), Introductory tutorials in optimization, decision support and search methodology (Chapter 5, pp. 127–164). Kluwer Press.
go back to reference Lee, K. S., & Geem, Z. W (2004). A new meta-heuristic algorithm for continues engineering optimization: Harmony search theory and practice. Computer Methods in Applied Mechanics and Engineering, 194, 3902–3933.CrossRef Lee, K. S., & Geem, Z. W (2004). A new meta-heuristic algorithm for continues engineering optimization: Harmony search theory and practice. Computer Methods in Applied Mechanics and Engineering, 194, 3902–3933.CrossRef
go back to reference Liu, J., Li, X., & Zhong, W. (2009). Ambiguous decision trees for mining concept-drifting data streams. Pattern Recognition Letters, 30, 1347–1355.CrossRef Liu, J., Li, X., & Zhong, W. (2009). Ambiguous decision trees for mining concept-drifting data streams. Pattern Recognition Letters, 30, 1347–1355.CrossRef
go back to reference Mahdavi, M., Fesanghary M., & Damangir, E. (2007). An improved harmony search algorithm for solving optimization problems. Applied Mathematics and Computation, 188, 1567–1579.MathSciNetMATHCrossRef Mahdavi, M., Fesanghary M., & Damangir, E. (2007). An improved harmony search algorithm for solving optimization problems. Applied Mathematics and Computation, 188, 1567–1579.MathSciNetMATHCrossRef
go back to reference Mukhopadhyay, A., Roy, A., Das, S., & Abraham, A. (2008). Population-variance and explorative power of harmony search: an analysis. In Proceedings of 3rd IEEE international conference on digital information management (ICDIM 2008) (pp. 13–16). London, United Kingdom. Mukhopadhyay, A., Roy, A., Das, S., & Abraham, A. (2008). Population-variance and explorative power of harmony search: an analysis. In Proceedings of 3rd IEEE international conference on digital information management (ICDIM 2008) (pp. 13–16). London, United Kingdom.
go back to reference Polikar, R., Udpa, L., Udpa, S., & Honavar, V. (2001). Learn+ +: An incremental learning algorithm for supervised neural networks. IEEE Transactions on Systems, Man and Cybernetics; Part C–Cybernetics, 31, 497–508.CrossRef Polikar, R., Udpa, L., Udpa, S., & Honavar, V. (2001). Learn+ +: An incremental learning algorithm for supervised neural networks. IEEE Transactions on Systems, Man and Cybernetics; Part C–Cybernetics, 31, 497–508.CrossRef
go back to reference Storn R., & Price, K. (1997). Differential evolution—a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11, 341–359.MathSciNetMATHCrossRef Storn R., & Price, K. (1997). Differential evolution—a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11, 341–359.MathSciNetMATHCrossRef
go back to reference Street, W., & Kim, Y. (2001). A streaming ensemble algorithm for large scale classification. In Proceeding of the seventh international conference on knowledge discovery and data mining (pp. 377–382). NY. Street, W., & Kim, Y. (2001). A streaming ensemble algorithm for large scale classification. In Proceeding of the seventh international conference on knowledge discovery and data mining (pp. 377–382). NY.
go back to reference Wang, H., Fan, W., Yu, P., & Han, J. (2003). Mining concept-drifting data streams using ensemble classifiers. In Proceedings of ACM SIGKDD international conference on knowledge discovery and data mining (SIGKDD2003) (pp. 226–235). Washington, D.C. Wang, H., Fan, W., Yu, P., & Han, J. (2003). Mining concept-drifting data streams using ensemble classifiers. In Proceedings of ACM SIGKDD international conference on knowledge discovery and data mining (SIGKDD2003) (pp. 226–235). Washington, D.C.
go back to reference Widmer G., & Kubat, M. (1996). Learning in the presence of concept drift and hidden contexts. Machine Learning, 23(1), 69–101. Widmer G., & Kubat, M. (1996). Learning in the presence of concept drift and hidden contexts. Machine Learning, 23(1), 69–101.
go back to reference Witten, I. H., & Frank, E. (1999). Data mining: Practical machine learning tools with Java implementations. San Francisco: Morgan Kaufmann. Witten, I. H., & Frank, E. (1999). Data mining: Practical machine learning tools with Java implementations. San Francisco: Morgan Kaufmann.
go back to reference Zhang, Y., & Bhattacharyya, S. (2004). Genetic programming in classifying large-scale data: An ensemble method. Information Sciences, 163, 85–101.CrossRef Zhang, Y., & Bhattacharyya, S. (2004). Genetic programming in classifying large-scale data: An ensemble method. Information Sciences, 163, 85–101.CrossRef
Metadata
Title
A new method of mining data streams using harmony search
Authors
Zohre Karimi
Hassan Abolhassani
Hamid Beigy
Publication date
01-10-2012
Publisher
Springer US
Published in
Journal of Intelligent Information Systems / Issue 2/2012
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-012-0199-2

Other articles of this Issue 2/2012

Journal of Intelligent Information Systems 2/2012 Go to the issue

Premium Partner