Published in: Health and Technology 5/2020

01-07-2020 | Original Paper

Assessing the impact of parameters tuning in ensemble based breast Cancer classification

Authors: Ali Idri, El Ouassif Bouchra, Mohamed Hosni, Ibtissam Abnane


Abstract

Breast cancer is one of the major causes of death among women. Various decision support systems have been proposed to help oncologists diagnose their patients accurately. These systems mainly use classification techniques to categorize a tumor as malignant or benign. Because no consensus has been reached on which classifier performs best in all circumstances, ensemble-based classification, which classifies patients by combining more than one single classification technique, has recently been investigated. In this paper, heterogeneous ensembles based on three well-known machine learning techniques (support vector machines, multilayer perceptron, and decision trees) were developed, and the impact of the parameter values of the ensemble members on classification performance was evaluated. In particular, we compare three parameter-setting approaches: Grid Search (GS), Particle Swarm Optimization (PSO), and the default parameters of the Weka tool. The aim is to evaluate whether tuning the parameters of the ensemble members yields more accurate breast cancer classification over four datasets obtained from the UCI Machine Learning Repository. The heterogeneous ensembles of this study were built using majority voting as the combination rule. The overall results suggest that: (1) tuning single techniques with GS or PSO provides more accurate classification than the default parameters; (2) in general, ensembles classify more accurately than their single member techniques, regardless of the optimization technique used; (3) heterogeneous ensembles based on optimized single classifiers generate better results than the Uniform Configuration of Weka (UC-WEKA) ensembles; and (4) PSO and GS have roughly the same impact on the performance of the ensembles.
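The majority voting rule named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used Weka); the function name `majority_vote`, the label strings, and the example predictions are hypothetical, and the tie-breaking rule (the label seen first wins) is an assumption that the abstract does not specify.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(member_predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-classifier label lists into one label per sample."""
    if not member_predictions:
        raise ValueError("need at least one ensemble member")
    n_samples = len(member_predictions[0])
    if any(len(p) != n_samples for p in member_predictions):
        raise ValueError("all members must predict the same number of samples")

    combined = []
    # zip(*...) groups the votes cast by every member for the same sample.
    for votes in zip(*member_predictions):
        # most_common(1) returns the label with the highest count; on a tie,
        # the label encountered first (the earliest member's vote) wins.
        # This tie rule is an assumption, not taken from the paper.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three tuned members on four patients.
svm_pred = ["benign", "malignant", "malignant", "benign"]
mlp_pred = ["benign", "benign", "malignant", "malignant"]
tree_pred = ["malignant", "malignant", "malignant", "benign"]

ensemble_pred = majority_vote([svm_pred, mlp_pred, tree_pred])
print(ensemble_pred)  # ['benign', 'malignant', 'malignant', 'benign']
```

With three members and two classes a tie cannot occur, so the tie rule only matters if the sketch is reused with an even number of members or more than two classes.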

Metadata
Title: Assessing the impact of parameters tuning in ensemble based breast Cancer classification
Authors: Ali Idri, El Ouassif Bouchra, Mohamed Hosni, Ibtissam Abnane
Publication date: 01-07-2020
Publisher: Springer Berlin Heidelberg
Published in: Health and Technology, Issue 5/2020
Print ISSN: 2190-7188
Electronic ISSN: 2190-7196
DOI: https://doi.org/10.1007/s12553-020-00453-2
