Skip to main content
Erschienen in: Journal of Intelligent Information Systems 2/2012

01.10.2012

A new method of mining data streams using harmony search

verfasst von: Zohre Karimi, Hassan Abolhassani, Hamid Beigy

Erschienen in: Journal of Intelligent Information Systems | Ausgabe 2/2012

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Incremental learning has been used extensively for data stream classification. Most attention on the data stream classification paid on non-evolutionary methods. In this paper, we introduce new incremental learning algorithms based on harmony search. We first propose a new classification algorithm for the classification of batch data called harmony-based classifier and then give its incremental version for classification of data streams called incremental harmony-based classifier. Finally, we improve it to reduce its computational overhead in absence of drifts and increase its robustness in presence of noise. This improved version is called improved incremental harmony-based classifier. The proposed methods are evaluated on some real world and synthetic data sets. Experimental results show that the proposed batch classifier outperforms some batch classifiers and also the proposed incremental methods can effectively address the issues usually encountered in the data stream environments. Improved incremental harmony-based classifier has significantly better speed and accuracy on capturing concept drifts than the non-incremental harmony based method and its accuracy is comparable to non-evolutionary algorithms. The experimental results also show the robustness of improved incremental harmony-based classifier.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Bifet, A., & Gavaldà, R. (2009a). Adaptive parameter-free learning from evolving data streams. In IDA. Bifet, A., & Gavaldà, R. (2009a). Adaptive parameter-free learning from evolving data streams. In IDA.
Zurück zum Zitat Bifet A., & Gavaldà, R. (2009b). Adaptive XML tree classification on evolving data streams. In Proc. of European conference on machine learning and knowledge discovery in databases, ECML/PKDD. Bifet A., & Gavaldà, R. (2009b). Adaptive XML tree classification on evolving data streams. In Proc. of European conference on machine learning and knowledge discovery in databases, ECML/PKDD.
Zurück zum Zitat Cunningham, P., Nowlan, N., Delany, S. J., & Haahr, M. (2003). A case-based approach to spam filtering that can track concept drift. Technical Report TCD-CS-2003-16, Ireland, Trinity College Dublin. Cunningham, P., Nowlan, N., Delany, S. J., & Haahr, M. (2003). A case-based approach to spam filtering that can track concept drift. Technical Report TCD-CS-2003-16, Ireland, Trinity College Dublin.
Zurück zum Zitat Fan, W. (2004a). StreamMiner: A classifier ensemble-based engine to mine concept-drifting data streams. In Proc. of 2004 international conference on Very Large Data Bases (VLDB’2004) (Vol. 30, pp. 1257–1260). Toronto, Canada. Fan, W. (2004a). StreamMiner: A classifier ensemble-based engine to mine concept-drifting data streams. In Proc. of 2004 international conference on Very Large Data Bases (VLDB’2004) (Vol. 30, pp. 1257–1260). Toronto, Canada.
Zurück zum Zitat Fan, W. (2004b). Systematic data selection to mine concept-drifting data stream. In Proc. of ACM SIGKDD (pp. 128–137). Seattle, Washington USA. Fan, W. (2004b). Systematic data selection to mine concept-drifting data stream. In Proc. of ACM SIGKDD (pp. 128–137). Seattle, Washington USA.
Zurück zum Zitat Fogel, L. (1994). Evolutionary programming in perspective: The top-down view. In: J. M. Zurada, R. Marks II, C. Robinson (Eds.), Computational intelligence: Imitating life (pp. 135–146). Piscataway: IEEE Press. Fogel, L. (1994). Evolutionary programming in perspective: The top-down view. In: J. M. Zurada, R. Marks II, C. Robinson (Eds.), Computational intelligence: Imitating life (pp. 135–146). Piscataway: IEEE Press.
Zurück zum Zitat Gama, J., Medas, P., & Rocha, R. (2004). Forest trees for on-line data. In Proc. ACM symp. applied computing (SAC’04) (pp. 632–636). Gama, J., Medas, P., & Rocha, R. (2004). Forest trees for on-line data. In Proc. ACM symp. applied computing (SAC’04) (pp. 632–636).
Zurück zum Zitat Geem, Z. W., Kim, J. H., & Loganathan, G. V. (2002). A new heuristic optimization algorithm: Harmony search. Simulation, 76(2), 60–68.CrossRef Geem, Z. W., Kim, J. H., & Loganathan, G. V. (2002). A new heuristic optimization algorithm: Harmony search. Simulation, 76(2), 60–68.CrossRef
Zurück zum Zitat Geem, Z. W., Tseng, C., & Park, Y. (2005). Harmony search for generalized orienteering problem: Best touring in China, Springer. Lecture Notes in Computer Science, 3412, 741–750.CrossRef Geem, Z. W., Tseng, C., & Park, Y. (2005). Harmony search for generalized orienteering problem: Best touring in China, Springer. Lecture Notes in Computer Science, 3412, 741–750.CrossRef
Zurück zum Zitat Goldberg, D. (1989). Genetic algorithms in search, optimization and machine learning. Addison-Wesley. Goldberg, D. (1989). Genetic algorithms in search, optimization and machine learning. Addison-Wesley.
Zurück zum Zitat Guan, S. U., & Zhucollard, F. (2005). An incremental approach to genetic-algorithm-based classification. IEEE Transactions on Systems, Man and Cybernetics, Part B–Cybernetics, 35(2), 227–239.CrossRef Guan, S. U., & Zhucollard, F. (2005). An incremental approach to genetic-algorithm-based classification. IEEE Transactions on Systems, Man and Cybernetics, Part B–Cybernetics, 35(2), 227–239.CrossRef
Zurück zum Zitat Han, J., & Kamber, M. (2006). Data mining: Concepts and techniques (2nd Edn.). Morgan Kaufmann Publisher. Han, J., & Kamber, M. (2006). Data mining: Concepts and techniques (2nd Edn.). Morgan Kaufmann Publisher.
Zurück zum Zitat Hashemi, S., Yang, Y., Mirzamomen, Z., & Kangavari, M. (2009). Adapted one-versus-all decision trees for data stream classification. IEEE Transactions on Knowledge and Data Engineering, 21(5), 624–637.CrossRef Hashemi, S., Yang, Y., Mirzamomen, Z., & Kangavari, M. (2009). Adapted one-versus-all decision trees for data stream classification. IEEE Transactions on Knowledge and Data Engineering, 21(5), 624–637.CrossRef
Zurück zum Zitat Holland, J. H. (1986). Escaping brittleness: The possibilities of general purpose learning algorithms applied to parallel rule-based systems. In Machine learning: An artificial intelligence approach (Vol. II, pp. 593–623). Morgan Kaufmann. Holland, J. H. (1986). Escaping brittleness: The possibilities of general purpose learning algorithms applied to parallel rule-based systems. In Machine learning: An artificial intelligence approach (Vol. II, pp. 593–623). Morgan Kaufmann.
Zurück zum Zitat Hulten, G., Spencer, L., & Domingos, P. (2001). Mining time-changing data streams. In F. Provost (Ed.), Knowledge discovery and data mining (pp. 97–106). AAAI Press. Hulten, G., Spencer, L., & Domingos, P. (2001). Mining time-changing data streams. In F. Provost (Ed.), Knowledge discovery and data mining (pp. 97–106). AAAI Press.
Zurück zum Zitat Klinkenberg, R. (2004). Learning drifting concepts: Example selection vs. example weighting. Intelligent Data Analysis, 8(3), 281–300. Klinkenberg, R. (2004). Learning drifting concepts: Example selection vs. example weighting. Intelligent Data Analysis, 8(3), 281–300.
Zurück zum Zitat Kolter, J. Z., & Maloof, M. A. (2003). Dynamic weighted majority: A new ensemble method for tracking concept drift. In Proc. Of the 3rd IEEE int. conf. on data mining ICDM-2003 (pp. 123–130). IEEE CS Press: Los Alamitos, CA.CrossRef Kolter, J. Z., & Maloof, M. A. (2003). Dynamic weighted majority: A new ensemble method for tracking concept drift. In Proc. Of the 3rd IEEE int. conf. on data mining ICDM-2003 (pp. 123–130). IEEE CS Press: Los Alamitos, CA.CrossRef
Zurück zum Zitat Koza, J. (1992). Genetic programming: On the programming of computers by means of natural selection. Cambridge: MIT Press.MATH Koza, J. (1992). Genetic programming: On the programming of computers by means of natural selection. Cambridge: MIT Press.MATH
Zurück zum Zitat Koza, J., & Poli, R. (2005). Genetic programming. In E. Burke & G. Kendall (Eds.), Introductory tutorials in optimization, decision support and search methodology (Chapter 5, pp. 127–164). Kluwer Press. Koza, J., & Poli, R. (2005). Genetic programming. In E. Burke & G. Kendall (Eds.), Introductory tutorials in optimization, decision support and search methodology (Chapter 5, pp. 127–164). Kluwer Press.
Zurück zum Zitat Lee, K. S., & Geem, Z. W (2004). A new meta-heuristic algorithm for continues engineering optimization: Harmony search theory and practice. Computer Methods in Applied Mechanics and Engineering, 194, 3902–3933.CrossRef Lee, K. S., & Geem, Z. W (2004). A new meta-heuristic algorithm for continues engineering optimization: Harmony search theory and practice. Computer Methods in Applied Mechanics and Engineering, 194, 3902–3933.CrossRef
Zurück zum Zitat Liu, J., Li, X., & Zhong, W. (2009). Ambiguous decision trees for mining concept-drifting data streams. Pattern Recognition Letters, 30, 1347–1355.CrossRef Liu, J., Li, X., & Zhong, W. (2009). Ambiguous decision trees for mining concept-drifting data streams. Pattern Recognition Letters, 30, 1347–1355.CrossRef
Zurück zum Zitat Mahdavi, M., Fesanghary M., & Damangir, E. (2007). An improved harmony search algorithm for solving optimization problems. Applied Mathematics and Computation, 188, 1567–1579.MathSciNetMATHCrossRef Mahdavi, M., Fesanghary M., & Damangir, E. (2007). An improved harmony search algorithm for solving optimization problems. Applied Mathematics and Computation, 188, 1567–1579.MathSciNetMATHCrossRef
Zurück zum Zitat Mukhopadhyay, A., Roy, A., Das, S., & Abraham, A. (2008). Population-variance and explorative power of harmony search: an analysis. In Proceedings of 3rd IEEE international conference on digital information management (ICDIM 2008) (pp. 13–16). London, United Kingdom. Mukhopadhyay, A., Roy, A., Das, S., & Abraham, A. (2008). Population-variance and explorative power of harmony search: an analysis. In Proceedings of 3rd IEEE international conference on digital information management (ICDIM 2008) (pp. 13–16). London, United Kingdom.
Zurück zum Zitat Omran, M. G. H., & Mahdavi, M. (2008). Global-best harmony search. Applied Mathematics and Computation, 198, 643–656.MathSciNetMATHCrossRef Omran, M. G. H., & Mahdavi, M. (2008). Global-best harmony search. Applied Mathematics and Computation, 198, 643–656.MathSciNetMATHCrossRef
Zurück zum Zitat Polikar, R., Udpa, L., Udpa, S., & Honavar, V. (2001). Learn+ +: An incremental learning algorithm for supervised neural networks. IEEE Transactions on Systems, Man and Cybernetics; Part C–Cybernetics, 31, 497–508.CrossRef Polikar, R., Udpa, L., Udpa, S., & Honavar, V. (2001). Learn+ +: An incremental learning algorithm for supervised neural networks. IEEE Transactions on Systems, Man and Cybernetics; Part C–Cybernetics, 31, 497–508.CrossRef
Zurück zum Zitat Storn R., & Price, K. (1997). Differential evolution—a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11, 341–359.MathSciNetMATHCrossRef Storn R., & Price, K. (1997). Differential evolution—a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11, 341–359.MathSciNetMATHCrossRef
Zurück zum Zitat Street, W., & Kim, Y. (2001). A streaming ensemble algorithm for large scale classification. In Proceeding of the seventh international conference on knowledge discovery and data mining (pp. 377–382). NY. Street, W., & Kim, Y. (2001). A streaming ensemble algorithm for large scale classification. In Proceeding of the seventh international conference on knowledge discovery and data mining (pp. 377–382). NY.
Zurück zum Zitat Wang, H., Fan, W., Yu, P., & Han, J. (2003). Mining concept-drifting data streams using ensemble classifiers. In Proceedings of ACM SIGKDD international conference on knowledge discovery and data mining (SIGKDD2003) (pp. 226–235). Washington, D.C. Wang, H., Fan, W., Yu, P., & Han, J. (2003). Mining concept-drifting data streams using ensemble classifiers. In Proceedings of ACM SIGKDD international conference on knowledge discovery and data mining (SIGKDD2003) (pp. 226–235). Washington, D.C.
Zurück zum Zitat Widmer G., & Kubat, M. (1996). Learning in the presence of concept drift and hidden contexts. Machine Learning, 23(1), 69–101. Widmer G., & Kubat, M. (1996). Learning in the presence of concept drift and hidden contexts. Machine Learning, 23(1), 69–101.
Zurück zum Zitat Witten, I. H., & Frank, E. (1999). Data mining: Practical machine learning tools with Java implementations. San Francisco: Morgan Kaufmann. Witten, I. H., & Frank, E. (1999). Data mining: Practical machine learning tools with Java implementations. San Francisco: Morgan Kaufmann.
Zurück zum Zitat Zhang, Y., & Bhattacharyya, S. (2004). Genetic programming in classifying large-scale data: An ensemble method. Information Sciences, 163, 85–101.CrossRef Zhang, Y., & Bhattacharyya, S. (2004). Genetic programming in classifying large-scale data: An ensemble method. Information Sciences, 163, 85–101.CrossRef
Metadaten
Titel
A new method of mining data streams using harmony search
verfasst von
Zohre Karimi
Hassan Abolhassani
Hamid Beigy
Publikationsdatum
01.10.2012
Verlag
Springer US
Erschienen in
Journal of Intelligent Information Systems / Ausgabe 2/2012
Print ISSN: 0925-9902
Elektronische ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-012-0199-2

Weitere Artikel der Ausgabe 2/2012

Journal of Intelligent Information Systems 2/2012 Zur Ausgabe

Premium Partner