Skip to main content
Erschienen in: Soft Computing 7/2012

01.07.2012 | Original Paper

Clustering-based initialization of Learning Classifier Systems

Effects on model performance, readability and induction time

verfasst von: Fani A. Tzima, Pericles A. Mitkas, John B. Theocharis

Erschienen in: Soft Computing | Ausgabe 7/2012

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The present paper investigates whether an “informed” initialization process can help supervised LCS algorithms evolve rulesets with better characteristics, including greater predictive accuracy, shorter training times, and/or more compact knowledge representations. Inspired by previous research suggesting that the initialization phase of evolutionary algorithms may have a considerable impact on their convergence speed and the quality of the achieved solutions, we present an initialization method for the class of supervised Learning Classifier Systems (LCS) that extracts information about the structure of studied problems through a pre-training clustering phase and exploits this information by transforming it into rules suitable for the initialization of the learning process. The effectiveness of our approach is evaluated through an extensive experimental phase, involving a variety of real-world classification tasks. Obtained results suggest that clustering-based initialization can indeed improve the predictive accuracy, as well as the interpretability of the induced knowledge representations, and paves the way for further investigations of the potential of better-than-random initialization methods for LCS algorithms.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Fußnoten
1
The deletion scheme employed is adapted from the one reported in Kovacs (1999).
 
2
We assume that no expert knowledge on the classification task and/or its solution is available at the time of learning.
 
3
Code for both algorithms, as well as the clustering-based initialization process, is available upon request from the first author.
 
Literatur
Zurück zum Zitat Aguilar-Ruiz JS, Riquelme JC, Toro M (2003) Evolutionary learning of hierarchical decision rules. IEEE Trans Syst Man Cybern B 33(2):324–331CrossRef Aguilar-Ruiz JS, Riquelme JC, Toro M (2003) Evolutionary learning of hierarchical decision rules. IEEE Trans Syst Man Cybern B 33(2):324–331CrossRef
Zurück zum Zitat Alcalá-Fdez J, Sánchez L, García S, del Jesus M, Ventura S, Garrell J, Otero J, Romero C, Bacardit J, Rivas V, Fernández J, Herrera F (2009) Keel: a software tool to assess evolutionary algorithms for data mining problems. Soft Comput 13:307–318. doi:10.1007/s00500-008-0323-y CrossRef Alcalá-Fdez J, Sánchez L, García S, del Jesus M, Ventura S, Garrell J, Otero J, Romero C, Bacardit J, Rivas V, Fernández J, Herrera F (2009) Keel: a software tool to assess evolutionary algorithms for data mining problems. Soft Comput 13:307–318. doi:10.​1007/​s00500-008-0323-y CrossRef
Zurück zum Zitat Bacardit J (2004) Pittsburgh genetics-based machine learning in the data mining era: representations, generalization, and run-time. PhD thesis, Ramon Llull University, Barcelona, Catalonia, Spain Bacardit J (2004) Pittsburgh genetics-based machine learning in the data mining era: representations, generalization, and run-time. PhD thesis, Ramon Llull University, Barcelona, Catalonia, Spain
Zurück zum Zitat Bacardit J (2005) Analysis of the initialization stage of a Pittsburgh approach learning classifier system. In: Proceedings of the 2005 conference on genetic and evolutionary computation (GECCO ’05). ACM, New York, pp 1843–1850 Bacardit J (2005) Analysis of the initialization stage of a Pittsburgh approach learning classifier system. In: Proceedings of the 2005 conference on genetic and evolutionary computation (GECCO ’05). ACM, New York, pp 1843–1850
Zurück zum Zitat Bonelli P, Parodi A, Sen S, Wilson SW (1990) NEWBOOLE: a fast GBML system. In: Proceedings of the 7th international conference on machine learning. Morgan Kaufmann, San Francisco, pp 153–159 Bonelli P, Parodi A, Sen S, Wilson SW (1990) NEWBOOLE: a fast GBML system. In: Proceedings of the 7th international conference on machine learning. Morgan Kaufmann, San Francisco, pp 153–159
Zurück zum Zitat Breiman L (2002) Wald Lecture II—looking inside the black box. In: 277th meeting of the Institute of Mathematical Statistics Breiman L (2002) Wald Lecture II—looking inside the black box. In: 277th meeting of the Institute of Mathematical Statistics
Zurück zum Zitat Burke EK, Newall JP, Weare RF (1998) Initialization strategies and diversity in evolutionary timetabling. Evol Comput 6:81–103CrossRef Burke EK, Newall JP, Weare RF (1998) Initialization strategies and diversity in evolutionary timetabling. Evol Comput 6:81–103CrossRef
Zurück zum Zitat Butz MV, Wilson SW (2001) An algorithmic description of XCS. In: IWLCS ’00: revised papers from the third international workshop on advances in learning classifier systems. Springer, London, pp 253–272 Butz MV, Wilson SW (2001) An algorithmic description of XCS. In: IWLCS ’00: revised papers from the third international workshop on advances in learning classifier systems. Springer, London, pp 253–272
Zurück zum Zitat Butz MV, Kovacs T, Lanzi PL, Wilson SW (2004) Toward a theory of generalization and learning in XCS. IEEE Trans Evol Comput 8(1):28–46CrossRef Butz MV, Kovacs T, Lanzi PL, Wilson SW (2004) Toward a theory of generalization and learning in XCS. IEEE Trans Evol Comput 8(1):28–46CrossRef
Zurück zum Zitat Chou CH, Chen JN (2000) Genetic algorithms: initialization schemes and genes extraction. In: The ninth IEEE international conference on fuzzy systems, 2000 (FUZZ IEEE 2000), vol 2, pp 965–968 Chou CH, Chen JN (2000) Genetic algorithms: initialization schemes and genes extraction. In: The ninth IEEE international conference on fuzzy systems, 2000 (FUZZ IEEE 2000), vol 2, pp 965–968
Zurück zum Zitat De Jong KA (1975) An analysis of the behavior of a class of genetic adaptive systems. PhD thesis, University of Michigan, Ann Arbor, MI, USA De Jong KA (1975) An analysis of the behavior of a class of genetic adaptive systems. PhD thesis, University of Michigan, Ann Arbor, MI, USA
Zurück zum Zitat Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30MathSciNetMATH Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30MathSciNetMATH
Zurück zum Zitat Dixon PW, Corne DW, Oates MJ (2003) A ruleset reduction algorithm for the XCS learning classifier system. In: Learning classifier systems, 5th international workshop, IWLCS 2002, Granada, Spain, September 7–8, 2002, revised papers. Lecture notes in computer science, vol 2661. Springer, Berlin-Heidelberg, pp 20–29 Dixon PW, Corne DW, Oates MJ (2003) A ruleset reduction algorithm for the XCS learning classifier system. In: Learning classifier systems, 5th international workshop, IWLCS 2002, Granada, Spain, September 7–8, 2002, revised papers. Lecture notes in computer science, vol 2661. Springer, Berlin-Heidelberg, pp 20–29
Zurück zum Zitat Frank E, Witten IH (1998) Generating accurate rule sets without global optimization. In: ICML ’98: proceedings of the fifteenth international conference on machine learning. Morgan Kaufmann, San Francisco, pp 144–151 Frank E, Witten IH (1998) Generating accurate rule sets without global optimization. In: ICML ’98: proceedings of the fifteenth international conference on machine learning. Morgan Kaufmann, San Francisco, pp 144–151
Zurück zum Zitat Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32(200):675–701 Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32(200):675–701
Zurück zum Zitat Friedman M (1940) A comparison of alternative tests of significance for the problem of m rankings. Ann Math Stat 11(1):86–92MATHCrossRef Friedman M (1940) A comparison of alternative tests of significance for the problem of m rankings. Ann Math Stat 11(1):86–92MATHCrossRef
Zurück zum Zitat García S, Fernández A, Luengo J, Herrera F (2009) A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft Comput 13:959–977. doi:10.1007/s00500-008-0392-y CrossRef García S, Fernández A, Luengo J, Herrera F (2009) A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft Comput 13:959–977. doi:10.​1007/​s00500-008-0392-y CrossRef
Zurück zum Zitat Holland JH (1975) Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control and artificial intelligence. University of Michigan Press, Ann Arbor Holland JH (1975) Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control and artificial intelligence. University of Michigan Press, Ann Arbor
Zurück zum Zitat Holmes JH, Sager JA (2005) Rule discovery in epidemiologic surveillance data using EpiXCS: an evolutionary computation approach. In: Miksch S, Hunter J, Keravnou ET (eds) AIME. Lecture notes in computer science, vol 3581. Springer, Heidelberg, pp 444–452 Holmes JH, Sager JA (2005) Rule discovery in epidemiologic surveillance data using EpiXCS: an evolutionary computation approach. In: Miksch S, Hunter J, Keravnou ET (eds) AIME. Lecture notes in computer science, vol 3581. Springer, Heidelberg, pp 444–452
Zurück zum Zitat Janikow CZ (1992) Inductive learning of decision rules from attribute-based examples: a knowledge-intensive genetic algorithm approach. PhD thesis, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA Janikow CZ (1992) Inductive learning of decision rules from attribute-based examples: a knowledge-intensive genetic algorithm approach. PhD thesis, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Zurück zum Zitat Kallel L, Schoenauer M (1997) Alternative random initialization in genetic algorithms. In: Proceedings of the 7th international conference on genetic algorithms. Morgan Kaufmann, pp 268–275 Kallel L, Schoenauer M (1997) Alternative random initialization in genetic algorithms. In: Proceedings of the 7th international conference on genetic algorithms. Morgan Kaufmann, pp 268–275
Zurück zum Zitat Kang RG, Jung CY (2006) The improved initialization method of genetic algorithm for solving the optimization problem. In: King I, Wang J, Chan LW, Wang D (eds) Neural information processing. Lecture notes in computer science, vol 4234. Springer, Berlin-Heidelberg, pp 789–796 Kang RG, Jung CY (2006) The improved initialization method of genetic algorithm for solving the optimization problem. In: King I, Wang J, Chan LW, Wang D (eds) Neural information processing. Lecture notes in computer science, vol 4234. Springer, Berlin-Heidelberg, pp 789–796
Zurück zum Zitat Kovacs T (1999) Deletion schemes for classifier systems. In: Banzhaf W, Daida J, Eiben AE, Garzon MH, Honavar V, Jakiela M, Smith RE (eds) Proceedings of the genetic and evolutionary computation conference (GECCO-99). Morgan Kaufmann, San Francisco, pp. 329–336 Kovacs T (1999) Deletion schemes for classifier systems. In: Banzhaf W, Daida J, Eiben AE, Garzon MH, Honavar V, Jakiela M, Smith RE (eds) Proceedings of the genetic and evolutionary computation conference (GECCO-99). Morgan Kaufmann, San Francisco, pp. 329–336
Zurück zum Zitat Kovacs T (2002a) XCS’s strength-based twin: part I. In: Lanzi et al (2003), pp 61–80 Kovacs T (2002a) XCS’s strength-based twin: part I. In: Lanzi et al (2003), pp 61–80
Zurück zum Zitat Kovacs T (2002b) XCS’s strength-based twin: part II. In: Lanzi et al (2003), pp 81–98 Kovacs T (2002b) XCS’s strength-based twin: part II. In: Lanzi et al (2003), pp 81–98
Zurück zum Zitat Lanzi PL (2008) Learning classifier systems: then and now. Evol Intell 1(1):63–82CrossRef Lanzi PL (2008) Learning classifier systems: then and now. Evol Intell 1(1):63–82CrossRef
Zurück zum Zitat Lanzi PL, Stolzmann W, Wilson SW (eds) (2003) Learning classifier systems. 5th international workshop, IWLCS 2002, Granada, Spain, September 7–8, 2002, revised papers. Lecture notes in computer science, vol 2661. Springer, Berlin Lanzi PL, Stolzmann W, Wilson SW (eds) (2003) Learning classifier systems. 5th international workshop, IWLCS 2002, Granada, Spain, September 7–8, 2002, revised papers. Lecture notes in computer science, vol 2661. Springer, Berlin
Zurück zum Zitat Louis SJ, McDonnell J (2004) Learning with case-injected genetic algorithms. IEEE Trans Evol Comput 8(4):316–328CrossRef Louis SJ, McDonnell J (2004) Learning with case-injected genetic algorithms. IEEE Trans Evol Comput 8(4):316–328CrossRef
Zurück zum Zitat Maaranen H, Miettinen K, Mäkelä MM (2004) Quasi-random initial population for genetic algorithms. Comput Math Appl 47(12):1885–1895MathSciNetMATHCrossRef Maaranen H, Miettinen K, Mäkelä MM (2004) Quasi-random initial population for genetic algorithms. Comput Math Appl 47(12):1885–1895MathSciNetMATHCrossRef
Zurück zum Zitat Mitchell TM (1997) Machine learning. McGraw-Hill Higher Education Mitchell TM (1997) Machine learning. McGraw-Hill Higher Education
Zurück zum Zitat Nemenyi PB (1963) Distribution-free multiple comparisons. PhD thesis, Princeton University Nemenyi PB (1963) Distribution-free multiple comparisons. PhD thesis, Princeton University
Zurück zum Zitat Orriols-Puig A, Bernadó-Mansilla E (2008a) Mining imbalanced data with learning classifier systems. In: Bull L, Bernadó-Mansilla E, Holmes JH (eds) Learning classifier systems in data mining. Studies in computational intelligence, vol 125. Springer, Berlin, pp 123–145. doi:10.1007/978-3-540-78979-6_6 Orriols-Puig A, Bernadó-Mansilla E (2008a) Mining imbalanced data with learning classifier systems. In: Bull L, Bernadó-Mansilla E, Holmes JH (eds) Learning classifier systems in data mining. Studies in computational intelligence, vol 125. Springer, Berlin, pp 123–145. doi:10.​1007/​978-3-540-78979-6_​6
Zurück zum Zitat Orriols-Puig A, Bernadó-Mansilla E (2008b) Revisiting UCS: description, fitness sharing, and comparison with XCS. Learning classifier systems: 10th international workshop, IWLCS 2006, Seattle, MA, USA, July 8, 2006 and 11th international workshop, IWLCS 2007, London, UK, July 8, 2007, revised selected papers, pp 96–116. doi:10.1007/978-3-540-88138-4_6 Orriols-Puig A, Bernadó-Mansilla E (2008b) Revisiting UCS: description, fitness sharing, and comparison with XCS. Learning classifier systems: 10th international workshop, IWLCS 2006, Seattle, MA, USA, July 8, 2006 and 11th international workshop, IWLCS 2007, London, UK, July 8, 2007, revised selected papers, pp 96–116. doi:10.​1007/​978-3-540-88138-4_​6
Zurück zum Zitat Orriols-Puig A, Goldberg DE, Sastry K, Bernadó-Mansilla E (2007) Modeling XCS in class imbalances: population size and parameter settings. In: GECCO ’07: proceedings of the 9th annual conference on Genetic and evolutionary computation. ACM, New York, pp 1838–1845. doi:10.1145/1276958.1277324 Orriols-Puig A, Goldberg DE, Sastry K, Bernadó-Mansilla E (2007) Modeling XCS in class imbalances: population size and parameter settings. In: GECCO ’07: proceedings of the 9th annual conference on Genetic and evolutionary computation. ACM, New York, pp 1838–1845. doi:10.​1145/​1276958.​1277324
Zurück zum Zitat Orriols-Puig A, Casillas J, Bernadó-Mansilla E (2009) Fuzzy-UCS: a Michigan-style learning fuzzy-classifier system for supervised learning. IEEE Trans Evol Comput 13(2):260–283CrossRef Orriols-Puig A, Casillas J, Bernadó-Mansilla E (2009) Fuzzy-UCS: a Michigan-style learning fuzzy-classifier system for supervised learning. IEEE Trans Evol Comput 13(2):260–283CrossRef
Zurück zum Zitat Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Francisco Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Francisco
Zurück zum Zitat Quinlan JR (1996) Learning first-order definitions of functions. J Artif Intell Res 5:139–161MATH Quinlan JR (1996) Learning first-order definitions of functions. J Artif Intell Res 5:139–161MATH
Zurück zum Zitat Rahnamayan S, Tizhoosh HR, Salama MMA (2007) A novel population initialization method for accelerating evolutionary algorithms. Comput Math Appl 53(10):1605–1614MathSciNetMATHCrossRef Rahnamayan S, Tizhoosh HR, Salama MMA (2007) A novel population initialization method for accelerating evolutionary algorithms. Comput Math Appl 53(10):1605–1614MathSciNetMATHCrossRef
Zurück zum Zitat Ramsey CL, Grefenstette JJ (1993) Case-based initialization of genetic algorithms. In: Proceedings of the 5th international conference on genetic algorithms. Morgan Kaufmann, San Francisco, pp 84–91 Ramsey CL, Grefenstette JJ (1993) Case-based initialization of genetic algorithms. In: Proceedings of the 5th international conference on genetic algorithms. Morgan Kaufmann, San Francisco, pp 84–91
Zurück zum Zitat Tzima F, Mitkas P (2010) Comparing strength and accuracy-based supervised learning classifier systems. Technical report, Intelligent Systems and Software Engineering Labgroup, Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece, GR-541 24 Tzima F, Mitkas P (2010) Comparing strength and accuracy-based supervised learning classifier systems. Technical report, Intelligent Systems and Software Engineering Labgroup, Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece, GR-541 24
Zurück zum Zitat Tzima FA, Mitkas PA, Voukantsis D, Karatzas KD (2011) Sparse episode identification in environmental datasets: the case of air quality assessment. Expert Syst Appl 38(5):5019–5027CrossRef Tzima FA, Mitkas PA, Voukantsis D, Karatzas KD (2011) Sparse episode identification in environmental datasets: the case of air quality assessment. Expert Syst Appl 38(5):5019–5027CrossRef
Zurück zum Zitat Vavliakis KN, Symeonidis AL, Mitkas PA (2010) Towards understanding how personality, motivation, and events trigger Web user activity. In: IEEE/WIC/ACM international conference on Web intelligence and intelligent agent technology Vavliakis KN, Symeonidis AL, Mitkas PA (2010) Towards understanding how personality, motivation, and events trigger Web user activity. In: IEEE/WIC/ACM international conference on Web intelligence and intelligent agent technology
Zurück zum Zitat Venturini G (1993) Sia: a supervised inductive algorithm with genetic search for learning attributes based concepts. In: Proceedings of the European conference on machine learning. Springer, London, pp 280–296 Venturini G (1993) Sia: a supervised inductive algorithm with genetic search for learning attributes based concepts. In: Proceedings of the European conference on machine learning. Springer, London, pp 280–296
Zurück zum Zitat Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1(6):80–83CrossRef Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1(6):80–83CrossRef
Zurück zum Zitat Wilson SW (1994) ZCS: A zeroth-level classifier system. Evol Comput 2(1):1–18CrossRef Wilson SW (1994) ZCS: A zeroth-level classifier system. Evol Comput 2(1):1–18CrossRef
Zurück zum Zitat Wilson SW (1995) Classifier fitness based on accuracy. Evol Comput 3(2):149–175CrossRef Wilson SW (1995) Classifier fitness based on accuracy. Evol Comput 3(2):149–175CrossRef
Zurück zum Zitat Wilson SW (2002) Compact rulesets from XCSI. In: IWLCS ’01: revised papers from the 4th international workshop on advances in learning classifier systems. Springer, London, pp 197–210 Wilson SW (2002) Compact rulesets from XCSI. In: IWLCS ’01: revised papers from the 4th international workshop on advances in learning classifier systems. Springer, London, pp 197–210
Zurück zum Zitat Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco
Zurück zum Zitat Zhang G, Gao L, Shi Y (2011) An effective genetic algorithm for the flexible job-shop scheduling problem. Expert Syst Appl 38(4):3563–3573CrossRef Zhang G, Gao L, Shi Y (2011) An effective genetic algorithm for the flexible job-shop scheduling problem. Expert Syst Appl 38(4):3563–3573CrossRef
Metadaten
Titel
Clustering-based initialization of Learning Classifier Systems
Effects on model performance, readability and induction time
verfasst von
Fani A. Tzima
Pericles A. Mitkas
John B. Theocharis
Publikationsdatum
01.07.2012
Verlag
Springer-Verlag
Erschienen in
Soft Computing / Ausgabe 7/2012
Print ISSN: 1432-7643
Elektronische ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-012-0811-y

Weitere Artikel der Ausgabe 7/2012

Soft Computing 7/2012 Zur Ausgabe