Skip to main content
main-content

Tipp

Weitere Artikel dieser Ausgabe durch Wischen aufrufen

25.09.2018 | Methodologies and Application

Binary teaching–learning-based optimization algorithm with a new update mechanism for sample subset optimization in software defect prediction

Zeitschrift:
Soft Computing
Autoren:
Thanh Tung Khuat, My Hanh Le
Wichtige Hinweise
Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Abstract

Software defect prediction has gained considerable attention in recent years. A broad range of computational methods has been developed for accurate prediction of faulty modules based on code and design metrics. One of the challenges in training classifiers is the highly imbalanced class distribution in available datasets, leading to an undesirable bias in the prediction performance for the minority class. Data sampling is a widespread technique to tackle this problem. However, traditional sampling methods, which depend mainly on random resampling from a given dataset, do not take advantage of useful information available in training sets, such as sample quality and representative instances. To cope with this limitation, evolutionary undersampling methods are usually used for identifying an optimal sample subset for the training dataset. This paper proposes a binary teaching–learning- based optimization algorithm employing a distribution-based solution update rule, namely BTLBOd, to generate a balanced subset of highly valuable examples. This subset is then applied to train a classifier for reliable prediction of potentially defective modules in a software system. Each individual in BTLBOd includes two vectors: a real-valued vector generated by the distribution-based update mechanism, and a binary vector produced from the corresponding real vector by a proposed mapping function. Empirical results showed that the optimal sample subset produced by BTLBOd might ameliorate the classification accuracy of the predictor on highly imbalanced software defect data. Obtained results also demonstrated the superior performance of the proposed sampling method compared to other popular sampling techniques.

Bitte loggen Sie sich ein, um Zugang zu diesem Inhalt zu erhalten

Sie möchten Zugang zu diesem Inhalt erhalten? Dann informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 69.000 Bücher
  • über 500 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Umwelt
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe

Testen Sie jetzt 30 Tage kostenlos.

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 58.000 Bücher
  • über 300 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb

Testen Sie jetzt 30 Tage kostenlos.

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 50.000 Bücher
  • über 380 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Umwelt
  • Maschinenbau + Werkstoffe​​​​​​​​​​​​​​

Testen Sie jetzt 30 Tage kostenlos.

Literatur
Über diesen Artikel

Premium Partner

    Bildnachweise