Published in: Soft Computing 4/2012

1 April 2012 | Original Paper

OFP_CLASS: a hybrid method to generate optimized fuzzy partitions for classification

Authors: Jose M. Cadenas, M. Carmen Garrido, Raquel Martínez, Piero P. Bonissone

Abstract

The discretization of values plays a critical role in data mining and knowledge discovery. Representing information through intervals is more concise and, at certain levels of knowledge, easier to understand than representing it by means of continuous values. In this paper, we propose a method for discretizing continuous attributes by means of fuzzy sets, which constitute a fuzzy partition of the domains of these attributes. The method carries out a fuzzy discretization of continuous attributes in two stages. In the first stage, a fuzzy decision tree proposes an initial set of crisp intervals; in the second stage, a genetic algorithm defines the membership functions and the cardinality of the partitions. After defining the fuzzy partitions, we evaluate them and compare them with existing ones from the literature.
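To make the notion of a fuzzy partition concrete, the following minimal sketch turns crisp cut points on one attribute into overlapping trapezoidal fuzzy sets whose membership degrees sum to 1 at every point of the domain. It is not the OFP_CLASS algorithm: the cut points and the fixed `overlap` width are hand-picked assumptions standing in for the intervals proposed by the tree and the membership functions tuned by the genetic algorithm, and the names `trapezoid` and `fuzzy_partition` are hypothetical.

```python
from typing import Callable, List

Membership = Callable[[float], float]


def trapezoid(a: float, b: float, c: float, d: float) -> Membership:
    """Trapezoidal membership: 1 on [b, c], linear on (a, b) and (c, d), else 0."""
    def mu(x: float) -> float:
        if b <= x <= c:
            return 1.0
        if a < x < b:
            return (x - a) / (b - a)
        if c < x < d:
            return (d - x) / (d - c)
        return 0.0
    return mu


def fuzzy_partition(lo: float, hi: float, cuts: List[float],
                    overlap: float) -> List[Membership]:
    """Build len(cuts) + 1 trapezoids over [lo, hi].

    Each crisp cut t is widened into a transition zone
    [t - overlap, t + overlap] shared by the two neighbouring sets.
    The fixed overlap is an illustrative assumption, not a tuned value.
    """
    if overlap <= 0:
        raise ValueError("overlap must be positive")
    edges = sorted(cuts)
    # Transition zones must stay inside the domain and must not collide.
    bounds = [lo] + [v for t in edges for v in (t - overlap, t + overlap)] + [hi]
    if any(x > y for x, y in zip(bounds, bounds[1:])):
        raise ValueError("cuts too close together or outside [lo, hi]")

    sets: List[Membership] = []
    left_a = left_b = lo  # the first set is flat from the lower bound
    for t in edges:
        sets.append(trapezoid(left_a, left_b, t - overlap, t + overlap))
        left_a, left_b = t - overlap, t + overlap
    sets.append(trapezoid(left_a, left_b, hi, hi))  # last set is flat up to hi
    return sets


# Two crisp cuts give three fuzzy sets; a value on a cut belongs
# equally to both neighbouring sets.
sets = fuzzy_partition(0.0, 10.0, cuts=[2.0, 5.0], overlap=0.5)
print([mu(2.0) for mu in sets])  # -> [0.5, 0.5, 0.0]
```

In this sketch a value near a cut point belongs partly to both neighbouring sets instead of being forced into one interval, which is the practical difference between a fuzzy and a crisp discretization.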

Metadata
Title
OFP_CLASS: a hybrid method to generate optimized fuzzy partitions for classification
Authors
Jose M. Cadenas
M. Carmen Garrido
Raquel Martínez
Piero P. Bonissone
Publication date
1 April 2012
Publisher
Springer-Verlag
Published in
Soft Computing / Issue 4/2012
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-011-0778-0
