Journal of Classification

18-02-2020

Spherical Classification of Data, a New Rule-Based Learning Method

Authors: Zhengyu Ma, Hong Seo Ryoo

Published in: Journal of Classification | Issue 1/2021

Abstract

This paper presents a new rule-based classification method that partitions the data under analysis into spherical patterns. The method has two strengths. First, it exploits the efficiency of distance metric-based clustering to quickly collect similar data into spherical patterns. Second, each spherical pattern is a trait shared by one type of data only, so the patterns can be used directly to classify new data. Numerical studies on public machine learning datasets from Lichman (2013), in comparison with well-established classification methods from Boros et al. (IEEE Transactions on Knowledge and Data Engineering, 12, 292–306, 2000) and the Waikato Environment for Knowledge Analysis (http://www.cs.waikato.ac.nz/ml/weka/), demonstrate both of these strengths.
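
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of single-class spherical patterns, not the authors' method. It assumes Euclidean distance, a greedy covering order, and a radius equal to the distance from a sphere's centre to the nearest point of another class; the names `build_spheres` and `classify` and the toy data are hypothetical.

```python
import math
from collections import Counter


def build_spheres(points, labels):
    """Greedily cover the training points with single-class open balls.

    Assumption (not from the paper): each sphere is centred on a training
    point that is not yet covered, and its radius is the distance to the
    nearest training point of a different class.
    """
    spheres = []  # list of (center, radius, label)
    covered = [False] * len(points)
    for i, (p, y) in enumerate(zip(points, labels)):
        if covered[i]:
            continue
        # Distance to the closest point of any other class (inf if none).
        radius = min(
            (math.dist(p, q) for q, z in zip(points, labels) if z != y),
            default=math.inf,
        )
        spheres.append((p, radius, y))
        # Every point strictly inside the ball has label y by construction.
        for j, q in enumerate(points):
            if math.dist(p, q) < radius:
                covered[j] = True
    return spheres


def classify(spheres, x):
    """Vote among the spheres containing x; otherwise use the nearest surface."""
    votes = Counter(lab for c, r, lab in spheres if math.dist(c, x) < r)
    if votes:
        return votes.most_common(1)[0][0]
    _, _, label = min(spheres, key=lambda s: math.dist(s[0], x) - s[1])
    return label


# Toy data: two well-separated groups in the plane.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
spheres = build_spheres(points, labels)
print(len(spheres))                   # 2
print(classify(spheres, (0.2, 0.2)))  # a
print(classify(spheres, (5.5, 5.5)))  # b
```

Because every sphere in this sketch contains training points of one class only, membership in a sphere can be read as a simple rule of the form "within radius r of centre c implies class y", which is the sense in which such patterns suit classification of new data.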

References

Aha, D., Kibler, D., Albert, M. (1991). Instance-based learning. Machine Learning, 6(1), 37–66.

Alexe, G., & Hammer, P.L. (2006a). Spanned patterns for the logical analysis of data. Discrete Mathematics, 154(7), 1039–1049.

Alexe, S., & Hammer, P.L. (2006b). Accelerated algorithm for pattern detection in logical analysis of data. Discrete Mathematics, 154(7), 1050–1063.

Alexe, G., Alexe, S., Bonates, T., Kogan, A. (2007). Logical analysis of data – the vision of Peter L. Hammer. Annals of Mathematics and Artificial Intelligence, 49, 265–312.

Balcan, M.-F., Blum, A., Vempala, S. (2008). A discriminative framework for clustering via similarity functions. In Proceedings of the Fortieth ACM Symposium on Theory of Computing (pp. 671–680).

Bazaraa, M., Sherali, H., Shetty, C. (2006). Nonlinear programming: theory and algorithms. New York: Wiley.

Beasley, J., & Chu, P. (1996). A genetic algorithm for the set covering problem. European Journal of Operational Research, 94, 392–404.

Bennett, K., & Mangasarian, O. (1992). Robust linear programming discrimination of two linearly inseparable sets. Optimization Methods and Software, 1, 23–34.

Bennett, K., & Mangasarian, O. (1994). Bilinear separation of two sets in n-space. Computational Optimization and Applications, 2, 207–227.

Bonates, T., Hammer, P.L., Kogan, A. (2008). Maximum patterns in datasets. Discrete Applied Mathematics, 156(6), 846–861.

Boros, E., Hammer, P.L., Ibaraki, T., Kogan, A., Mayoraz, E., Muchnik, I. (2000). An implementation of logical analysis of data. IEEE Transactions on Knowledge and Data Engineering, 12, 292–306.

Bradley, P., & Mangasarian, O. (2000). Massive data discrimination via linear support vector machines. Optimization Methods and Software, 13(1), 1–20.

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.

Chvatal, V. (1979). A greedy heuristic for the set covering problem. Mathematics of Operations Research, 4, 233–235.

Cohen, W.W. (1995). Fast effective rule induction. In Proceedings of the Twelfth International Conference on Machine Learning (pp. 115–123).

Cortes, C., & Vapnik, V. (1995). Support vector networks. Machine Learning, 20, 273–297.

Eick, C.F., Zeidat, N., Zhao, Z. (2004). Supervised clustering – algorithms and benefits. In 16th IEEE International Conference on Tools with Artificial Intelligence (pp. 774–776).

Falk, J., & Lopez-Cardona, E. (1997). The surgical separation of sets. Journal of Global Optimization, 11, 433–462.

Frank, E., & Witten, I.H. (1998). Generating accurate rule sets without global optimization. In Proceedings of the Fifteenth International Conference on Machine Learning (pp. 144–151).

Freund, Y., & Schapire, R.E. (1996). Experiments with a new boosting algorithm. In Thirteenth International Conference on Machine Learning (pp. 148–156).

Fung, G., & Mangasarian, O. (2003). Finite Newton method for Lagrangian support vector machine classification. Neurocomputing, 55, 39–55.

Guo, C., & Ryoo, H.S. (2012). Compact MILP models for optimal and Pareto-optimal LAD patterns. Discrete Applied Mathematics, 160, 2339–2348.

Guo, C., & Ryoo, H.S. (2018). On Pareto-optimal Boolean logical patterns for numerical data. Submitted for publication.

Gurobi Optimization Inc. (2017). Gurobi optimizer reference manual. http://www.gurobi.com.

Hammer, P.L., Kogan, A., Simeone, B., Szedmak, S. (2004). Pareto-optimal patterns in logical analysis of data. Discrete Applied Mathematics, 144, 79–102.

Haykin, S. (1999). Neural networks: a comprehensive foundation. Englewood Cliffs: Prentice Hall.

Hoffman, K., & Padberg, M. (1993). Solving airline crew scheduling problems by branch-and-cut. Management Science, 39(6), 657–682.

Jain, A., Murty, M., Flynn, P. (1999). Data clustering: a review. ACM Computing Surveys, 31(3), 264–323.

Jain, A. (2010). Data clustering: 50 years beyond k-means. Pattern Recognition Letters, 31, 651–666.

John, G., & Langley, P. (1995). Estimating continuous distributions in Bayesian classifiers. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (pp. 338–345).

Kim, K., & Ryoo, H.S. (2007a). Data separation via a finite number of discriminant functions: a global optimization approach. Applied Mathematics and Computation, 190(1), 476–489.

Kim, K., & Ryoo, H.S. (2007b). Nonlinear separation of data via mixed 0-1 integer and linear programming. Applied Mathematics and Computation, 193(1), 183–196.

Kim, K., & Ryoo, H.S. (2008). A LAD-based method for selecting short oligo probes for genotyping applications. OR Spectrum, 30(2), 249–268.

Kohavi, R. (1995). The power of decision tables. In Proceedings of the Eighth European Conference on Machine Learning (pp. 179–189).

Kolesar, P., & Walker, W. (1974). An algorithm for the dynamic relocation of fire companies. Operations Research, 22, 249–274.

Lichman, M. (2013). UCI machine learning repository. http://archive.ics.uci.edu/ml.

Lorena, L., & Lopes, F. (1994). A surrogate heuristic for set covering problems. European Journal of Operational Research, 79, 138–150.

Ma, Z., & Ryoo, H.S. (2012). General set covering for feature selection in data mining. Management Science and Financial Engineering, 18(2), 13–17.

Mangasarian, O. (1965). Linear and nonlinear separation of patterns by linear programming. Operations Research, 13(3), 444–452.

Mangasarian, O. (1968). Multisurface method of pattern separation. IEEE Transactions on Information Theory, 14(6), 801–807.

Mangasarian, O. (1993). Mathematical programming in neural network. ORSA Journal on Computing, 5(4), 349–360.

Platt, J. (1999). Fast training of support vector machines using sequential minimal optimization (pp. 185–208). Cambridge: MIT Press.

Quinlan, R. (1993). C4.5: Programs for machine learning. San Mateo: Morgan Kaufmann Publishers.

Ryoo, H.S., & Jang, I. (2009). MILP approach to pattern generation in logical analysis of data. Discrete Applied Mathematics, 157, 749–761.

Ullman, J. (1973). Pattern recognition techniques. London: Crane.

Vapnik, V. (1998). Statistical learning theory. New York: Wiley-Interscience.

Vapnik, V. (2000). The nature of statistical learning theory, 2nd edn. Berlin: Springer.

Wedelin, D. (1995). An algorithm for large scale 0-1 integer programming with application to airline crew scheduling. Annals of Operations Research, 57, 283–301.

Yan, K., & Ryoo, H.S. (2017a). 0-1 multilinear programming as a unifying theory for LAD pattern generation. Discrete Applied Mathematics, 218, 21–39.

Yan, K., & Ryoo, H.S. (2017b). Strong valid inequalities for Boolean logical pattern generation. Journal of Global Optimization, 69(1), 183–230.

Yan, K., & Ryoo, H.S. (2020). Cliques for multi-term linearization of 0-1 multilinear program for Boolean logical pattern generation. In Optimization of Complex Systems: Theory, Models, Algorithms and Applications, Advances in Intelligent Systems and Computing, 991, 376–386.

Title: Spherical Classification of Data, a New Rule-Based Learning Method
Authors: Zhengyu Ma, Hong Seo Ryoo
Publication date: 18-02-2020
Publisher: Springer US
Published in: Journal of Classification / Issue 1/2021
Print ISSN: 0176-4268
Electronic ISSN: 1432-1343
DOI: https://doi.org/10.1007/s00357-019-09355-z
