Published in: Innovations in Systems and Software Engineering 4/2015

1 December 2015 | Original Paper

A hybrid one-class rule learning approach based on swarm intelligence for software fault prediction

Authors: Yousef Abdi, Saeed Parsa, Yousef Seyfari

Abstract

Software testing is a fundamental activity in the software development process that aims to determine the quality of software. To reduce the effort and cost of testing, defect prediction methods can use software metrics to identify fault-prone modules so that testing activities can be focused on them. Because classification rules yield models that are easy to interpret and easy for programmers and testers to use, some recent studies have built prediction models from such rules. This study presents a rule-based prediction approach that combines the kernel k-means clustering algorithm with distance-sorting-based multi-objective particle swarm optimization (DSMOPSO). Because the search space is discrete, we modified this algorithm and named the result DSMOPSO-D. We prevent the best global rules from dominating local rules by dividing the search space with the kernel k-means algorithm, and we address the imbalanced data set problem by treating imbalanced and balanced clusters differently. The model's performance was evaluated on four publicly available data sets from the PROMISE repository and compared with other machine learning and rule learning algorithms. The results show that our model performs very well, especially on large data sets.
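To make the idea of multi-objective rule learning more concrete, the following Python sketch shows how a single classification rule over software metrics might be scored on two objectives and how Pareto dominance keeps only the non-dominated rules. It is an illustration under stated assumptions, not the DSMOPSO-D algorithm itself: the abstract does not name the objectives, so sensitivity and specificity are assumed here, and the rule encoding, the metric names `loc` and `v_g`, and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Module = Dict[str, float]  # metric name -> metric value for one software module


@dataclass(frozen=True)
class Rule:
    # Hypothetical encoding: a conjunction of (metric, lower, upper) interval tests.
    conditions: Tuple[Tuple[str, float, float], ...]

    def covers(self, module: Module) -> bool:
        """Return True if the module satisfies every interval test (predicted faulty)."""
        return all(lo <= module[m] <= hi for m, lo, hi in self.conditions)


def objectives(rule: Rule, modules: List[Module], labels: List[bool]) -> Tuple[float, float]:
    """Score a rule as (sensitivity, specificity); both objectives are assumptions."""
    tp = fn = tn = fp = 0
    for module, faulty in zip(modules, labels):
        predicted = rule.covers(module)
        if faulty and predicted:
            tp += 1
        elif faulty:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """a dominates b if it is no worse on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(rules: List[Rule], modules: List[Module], labels: List[bool]) -> List[Rule]:
    """Keep the rules whose objective vector is not dominated by any other rule."""
    scored = [(r, objectives(r, modules, labels)) for r in rules]
    return [r for r, s in scored if not any(dominates(t, s) for _, t in scored)]


# Toy data, invented for illustration only (not taken from the PROMISE repository).
modules = [
    {"loc": 120, "v_g": 15},
    {"loc": 30, "v_g": 3},
    {"loc": 200, "v_g": 8},
    {"loc": 45, "v_g": 12},
]
labels = [True, False, True, False]  # True means the module is faulty

r1 = Rule((("loc", 150, 1000),))  # large modules only
r2 = Rule((("v_g", 10, 100),))    # high cyclomatic complexity
r3 = Rule((("loc", 0, 1000),))    # flags every module

front = pareto_front([r1, r2, r3], modules, labels)
```

On this toy data, `r2` is dominated by `r1` and drops out, while `r1` and `r3` remain because each is better on one objective. A swarm-based optimizer would search the rule space for such non-dominated rules rather than enumerate a fixed list as done here.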

Metadata
Title
A hybrid one-class rule learning approach based on swarm intelligence for software fault prediction
Authors
Yousef Abdi
Saeed Parsa
Yousef Seyfari
Publication date
1 December 2015
Publisher
Springer London
Published in
Innovations in Systems and Software Engineering / Issue 4/2015
Print ISSN: 1614-5046
Electronic ISSN: 1614-5054
DOI
https://doi.org/10.1007/s11334-015-0258-2
