2024 | Original Paper | Book Chapter

Measuring Similarities in Model Structure of Metaheuristic Rule Set Learners

Authors: David Pätzel, Richard Nordsieck, Jörg Hähner

Published in: Applications of Evolutionary Computation

Publisher: Springer Nature Switzerland


Abstract

We present a way to measure similarity between sets of rules for regression tasks. We identified this as an important but missing tool for investigating Metaheuristic Rule Set Learners (MRSLs), a class of algorithms that use metaheuristics such as Genetic Algorithms to solve learning tasks. Commonly used metrics based on predictive performance, such as mean absolute error, do not capture most users' actual preferences when they choose these kinds of models, since users typically aim for model interpretability (i.e. a low number of rules, meaningful rule placement, etc.) and not for low error alone. Our similarity measure is based on a form of metaheuristic-agnostic edit distance. It is meant to be used, in conjunction with a certain class of benchmark problems, for analysing and improving an as yet under-researched part of MRSL algorithms: the metaheuristic that optimizes the model's structure (i.e. the set of rule conditions). We discuss the measure's most important properties and demonstrate its applicability through experiments on the best-known MRSL, XCSF, comparing it with two non-metaheuristic Rule Set Learners, Decision Trees and Random Forests.


Footnotes
1
The training match set of a rule k is the set of training data points for which \(m(\psi _{k}; \cdot )\) is fulfilled, i.e. \(\{x \in X \mid m(\psi _{k}; x) = 1\}\).
 
2
We slightly abuse notation here and overload the matching function m so that the \(N\times \mathcal {D}_\mathcal {X}\) matrix X of training inputs, consisting of N vectors \(x_{n} \in \mathcal {X}\), can be passed to a single condition \(m(\psi ; \cdot )\), which then yields an N-vector, i.e. \(m(\psi ; X) = \left( m(\psi ; x_{n})\right) _{n=1}^{N} \in \{0, 1\}^{N}\).
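The overloaded matching function and the training match set from the two footnotes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the footnotes do not fix the form of a condition \(\psi\), so the sketch assumes axis-parallel interval conditions given as a pair of lower and upper bounds, and the names `match` and `match_set` are hypothetical.

```python
import numpy as np


def match(psi, X):
    """Matching function m(psi; .), overloaded for one input or a matrix.

    ASSUMPTION: psi is an interval condition (lower, upper), each of length D.
    X is either a single input of shape (D,) or a matrix of shape (N, D).
    Returns 0/1 for a single input, or an N-vector in {0, 1}^N for a matrix.
    """
    lower, upper = (np.asarray(b, dtype=float) for b in psi)
    X = np.asarray(X, dtype=float)
    # A point matches if every coordinate lies within its interval.
    inside = (lower <= X) & (X <= upper)
    return inside.all(axis=-1).astype(int)


def match_set(psi, X):
    """Training match set {x in X | m(psi; x) = 1}, returned as rows of X."""
    X = np.asarray(X, dtype=float)
    return X[match(psi, X) == 1]


# Three training inputs in two dimensions (made-up illustrative values).
X = np.array([[0.1, 0.2],
              [0.5, 0.5],
              [0.9, 0.4]])
psi = ([0.0, 0.0], [0.6, 0.6])

print(match(psi, X))      # N-vector of 0/1 entries, one per training input
print(match_set(psi, X))  # the rows of X that the condition matches
```

Passing the whole matrix at once avoids a Python-level loop over the N training inputs; the single-input case falls out of the same code because the reduction runs over the last axis.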
 
3
For \(N=768\) training data points, our own (not at all optimized) code took around 0.0005 s per computation of \(\delta _{X}\) (mean over all computations of \(d_{X}\) with \(N=768\) performed for Fig. 2) and correspondingly around 0.2 s for computing \(d_{X}\) for two model structures of size 20. For \(N=2000\), we measured 0.002 s per \(\delta _{X}\) computation (and correspondingly 0.8 s for size 20 model structures).
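The reported timings are consistent with one \(\delta _{X}\) evaluation per pair of rules, i.e. \(20 \cdot 20 = 400\) evaluations for two model structures of size 20. This is our reading of "correspondingly", not something the footnote states explicitly:

\[ 20 \cdot 20 \cdot 0.0005\,\text{s} = 0.2\,\text{s} \quad (N=768), \qquad 20 \cdot 20 \cdot 0.002\,\text{s} = 0.8\,\text{s} \quad (N=2000). \]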
 
Metadata
Title
Measuring Similarities in Model Structure of Metaheuristic Rule Set Learners
Authors
David Pätzel
Richard Nordsieck
Jörg Hähner
Copyright year
2024
DOI
https://doi.org/10.1007/978-3-031-56855-8_16