Skip to main content

2015 | OriginalPaper | Buchkapitel

On Measuring the Complexity of Classification Problems

verfasst von : Ana Carolina Lorena, Marcilio C. P. de Souto

Erschienen in: Neural Information Processing

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

There has been a growing interest in describing the difficulty of solving a classification problem. This knowledge can be used, among other things, to support more grounded decisions concerning data pre-processing, as well as for the development of new data-driven pattern recognition techniques. Indeed, to estimate the intrinsic complexity of a classification problem, there are a variety of measures that can be extracted from a training data set. This paper presents some of them, performing a theoretical analysis.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Antolnez, N.M.: Data complexity in supervised learning: a far-reaching implication. Ph.D. thesis, La Salle, Universitat Ramon Llull (2011) Antolnez, N.M.: Data complexity in supervised learning: a far-reaching implication. Ph.D. thesis, La Salle, Universitat Ramon Llull (2011)
2.
3.
Zurück zum Zitat Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge (2000)CrossRefMATH Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge (2000)CrossRefMATH
4.
Zurück zum Zitat Cummins, L.: Combining and choosing case base maintenance algorithms. Ph.D. thesis, National University of Ireland, Cork (2013) Cummins, L.: Combining and choosing case base maintenance algorithms. Ph.D. thesis, National University of Ireland, Cork (2013)
5.
Zurück zum Zitat Dong, M., Kothari, R.: Feature subset selection using a new definition of classificability. PRL 24, 1215–1225 (2003)CrossRefMATH Dong, M., Kothari, R.: Feature subset selection using a new definition of classificability. PRL 24, 1215–1225 (2003)CrossRefMATH
6.
Zurück zum Zitat Flores, M.J., Gámez, J.A., Martínez, A.M.: Domains of competence of the semi-naive bayesian network classifiers. Inf. Sci. 260, 120–148 (2014)MathSciNetCrossRefMATH Flores, M.J., Gámez, J.A., Martínez, A.M.: Domains of competence of the semi-naive bayesian network classifiers. Inf. Sci. 260, 120–148 (2014)MathSciNetCrossRefMATH
7.
Zurück zum Zitat Garcia, L.P.F., de Carvalho, A.C.P.L.F., Lorena, A.C.: Effect of label noise in the complexity of classification problems. Neurocomputing (accepted) (2015, in press) Garcia, L.P.F., de Carvalho, A.C.P.L.F., Lorena, A.C.: Effect of label noise in the complexity of classification problems. Neurocomputing (accepted) (2015, in press)
8.
Zurück zum Zitat Ho, T.K., Basu, M.: Complexity measures of supervised classification problems. IEEE Trans. Pattern Anal. Mach. Intell. 24(3), 289–300 (2002)CrossRef Ho, T.K., Basu, M.: Complexity measures of supervised classification problems. IEEE Trans. Pattern Anal. Mach. Intell. 24(3), 289–300 (2002)CrossRef
9.
Zurück zum Zitat Hoekstra, A., Duin, R.P.: On the nonlinearity of pattern classifiers. In: Proceedings of the 13th International Conference on Pattern Recognition, vol. 4, pp. 271–275. IEEE (1996) Hoekstra, A., Duin, R.P.: On the nonlinearity of pattern classifiers. In: Proceedings of the 13th International Conference on Pattern Recognition, vol. 4, pp. 271–275. IEEE (1996)
10.
Zurück zum Zitat Hu, Q., Pedrycz, W., Yu, D., Lang, J.: Selecting discrete and continuous features based on neighborhood decision error minimization. IEEE Trans. Syst. Man Cybern. Part B Cybern. 40(1), 137–150 (2010)CrossRef Hu, Q., Pedrycz, W., Yu, D., Lang, J.: Selecting discrete and continuous features based on neighborhood decision error minimization. IEEE Trans. Syst. Man Cybern. Part B Cybern. 40(1), 137–150 (2010)CrossRef
11.
Zurück zum Zitat Li, L., Abu-Mostafa, Y.S.: Data complexity in machine learning. Technical Report CaltechCSTR:2006.004, Caltech Computer Science (2006) Li, L., Abu-Mostafa, Y.S.: Data complexity in machine learning. Technical Report CaltechCSTR:2006.004, Caltech Computer Science (2006)
12.
Zurück zum Zitat Lorena, A.C., Costa, I.G., Spolar, N., Souto, M.C.P.: Analysis of complexity indices for classification problems: cancer gene expression data. Neurocomputing 75, 33–42 (2012)CrossRef Lorena, A.C., Costa, I.G., Spolar, N., Souto, M.C.P.: Analysis of complexity indices for classification problems: cancer gene expression data. Neurocomputing 75, 33–42 (2012)CrossRef
13.
Zurück zum Zitat Luengo, J., Herrera, F.: Shared domains of competence of approximate learning models using measures of separability of classes. Inf. Sci. 185(1), 43–65 (2012)MathSciNetCrossRef Luengo, J., Herrera, F.: Shared domains of competence of approximate learning models using measures of separability of classes. Inf. Sci. 185(1), 43–65 (2012)MathSciNetCrossRef
14.
Zurück zum Zitat Mansilla, E.B., Ho, T.K.: On classifier domains of competence. In: Proceedings of the 17th ICPR, pp. 136–139 (2004) Mansilla, E.B., Ho, T.K.: On classifier domains of competence. In: Proceedings of the 17th ICPR, pp. 136–139 (2004)
15.
Zurück zum Zitat Mollineda, R.A., Sánchez, J.S., Sotoca, J.M.: Data characterization for effective prototype selection. In: Marques, J.S., Pérez de la Blanca, N., Pina, P. (eds.) IbPRIA 2005. LNCS, vol. 3523, pp. 27–34. Springer, Heidelberg (2005) CrossRef Mollineda, R.A., Sánchez, J.S., Sotoca, J.M.: Data characterization for effective prototype selection. In: Marques, J.S., Pérez de la Blanca, N., Pina, P. (eds.) IbPRIA 2005. LNCS, vol. 3523, pp. 27–34. Springer, Heidelberg (2005) CrossRef
16.
Zurück zum Zitat Orriols-Puig, A., Maci, N., Ho, T.K.: Documentation for the data complexity library in c++. Technical report, La Salle - Universitat Ramon Llull (2010) Orriols-Puig, A., Maci, N., Ho, T.K.: Documentation for the data complexity library in c++. Technical report, La Salle - Universitat Ramon Llull (2010)
17.
Zurück zum Zitat Singh, S.: Multiresolution estimates of classification complexity. IEEE Trans. PAMI 25, 1534–1539 (2003)CrossRef Singh, S.: Multiresolution estimates of classification complexity. IEEE Trans. PAMI 25, 1534–1539 (2003)CrossRef
18.
Zurück zum Zitat Smith, M.R., Martinez, T., Giraud-Carrier, C.: An instance level analysis of data complexity. Mach. Learn. 95(2), 225–256 (2014)MathSciNetCrossRef Smith, M.R., Martinez, T., Giraud-Carrier, C.: An instance level analysis of data complexity. Mach. Learn. 95(2), 225–256 (2014)MathSciNetCrossRef
19.
Zurück zum Zitat Souto, M.C.P., Lorena, A.C., Spolar, N., Costa, I.G.: Complexity measures of supervised classification tasks: a case study for cancer gene expression data. In: Proceedings of IJCNN, pp. 1352–1358 (2010) Souto, M.C.P., Lorena, A.C., Spolar, N., Costa, I.G.: Complexity measures of supervised classification tasks: a case study for cancer gene expression data. In: Proceedings of IJCNN, pp. 1352–1358 (2010)
Metadaten
Titel
On Measuring the Complexity of Classification Problems
verfasst von
Ana Carolina Lorena
Marcilio C. P. de Souto
Copyright-Jahr
2015
DOI
https://doi.org/10.1007/978-3-319-26532-2_18

Premium Partner