Published in: Health and Technology 5/2020

01.07.2020 | Original Paper

Assessing the impact of parameters tuning in ensemble based breast Cancer classification

Authors: Ali Idri, El Ouassif Bouchra, Mohamed Hosni, Ibtissam Abnane


Abstract

Breast cancer is one of the major causes of death among women. Various decision support systems have been proposed to help oncologists diagnose their patients accurately. These systems mainly use classification techniques to categorize tumors as malignant or benign. Because no consensus has been reached on a classifier that performs best in all circumstances, ensemble-based classification, which classifies patients by combining more than one classification technique, has recently been investigated. In this paper, heterogeneous ensembles based on three well-known machine learning techniques (support vector machines, multilayer perceptrons, and decision trees) were developed and evaluated by investigating the impact of the parameter values of the ensemble members on classification performance. In particular, we compare three parameter settings: tuning by Grid Search (GS), tuning by Particle Swarm Optimization (PSO), and the default parameters of the Weka tool, referred to as the Uniform Configuration of Weka (UC-WEKA), to evaluate whether tuning the parameters of ensemble members yields more accurate breast cancer classification over four datasets obtained from the UCI Machine Learning Repository. The heterogeneous ensembles were built using majority voting as the combination rule. Overall, the results suggest that: (1) tuning single classifiers with GS or PSO yields more accurate classification than the default settings; (2) in general, ensembles classify more accurately than their single member techniques, regardless of the optimization technique used; (3) heterogeneous ensembles based on optimized single classifiers produce better results than the UC-WEKA ensembles; and (4) PSO and GS have nearly the same impact on ensemble performance.
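The two mechanisms named in the abstract, exhaustive grid search over parameter values and majority voting across ensemble members, can be illustrated with a minimal sketch. This is not the authors' Weka pipeline: the one-feature `threshold_classifier`, the toy data, and the parameter grid are all hypothetical stand-ins for the SVM, MLP, and decision tree members, and the grid search scores on the training data only for brevity.

```python
from collections import Counter
from itertools import product


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of equal-length label lists, one per ensemble member.
    With three members and two classes, ties cannot occur.
    """
    combined = []
    for sample_votes in zip(*predictions):
        label, _count = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def grid_search(make_classifier, grid, X, y):
    """Score every parameter combination exhaustively and return the best.

    make_classifier(**params) must return a function mapping a sample to a
    label. Scoring on the same data is a simplification; a real study would
    use cross-validation or a held-out set.
    """
    names = sorted(grid)
    best_params, best_score = None, -1.0
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        clf = make_classifier(**params)
        score = accuracy(y, [clf(x) for x in X])
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def threshold_classifier(threshold):
    """Hypothetical one-feature classifier standing in for a tunable member."""
    return lambda x: "malignant" if x >= threshold else "benign"


# Toy data, made up for illustration (not taken from the paper's datasets).
X = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
y = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

# Grid search over a single hypothetical parameter.
best_params, best_score = grid_search(
    threshold_classifier, {"threshold": [0.5, 2.5, 4.5, 7.5]}, X, y
)

# Three members with different settings vote on each sample.
members = [threshold_classifier(t) for t in (2.5, 4.5, 7.5)]
ensemble_pred = majority_vote([[m(x) for x in X] for m in members])
ensemble_accuracy = accuracy(y, ensemble_pred)
```

On this toy data, two of the three members misclassify at least one sample on their own, while the vote corrects those errors, which is the intuition behind combining heterogeneous classifiers.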


Metadata
Title: Assessing the impact of parameters tuning in ensemble based breast Cancer classification
Authors: Ali Idri, El Ouassif Bouchra, Mohamed Hosni, Ibtissam Abnane
Publication date: 01.07.2020
Publisher: Springer Berlin Heidelberg
Published in: Health and Technology, Issue 5/2020
Print ISSN: 2190-7188
Electronic ISSN: 2190-7196
DOI: https://doi.org/10.1007/s12553-020-00453-2
