Published in: Innovations in Systems and Software Engineering 4/2015

01.12.2015 | Original Paper

A comparative analysis between two techniques for the prediction of software defects: fuzzy and statistical linear regression

Written by: Fernando Valles-Barajas

Abstract

Software engineers must estimate the resources (time, people, and software tools, among others) needed to satisfy software project requirements; this activity is carried out in the planning phase. The estimated development time is needed to establish the cost of a software project and to assign human resources to each of its phases. Most companies fail to finish software projects on time because of a poor estimation technique or the lack of one. The estimate must account for the time spent removing the defects injected during each software phase. This paper presents a comparative analysis of two techniques for software defect estimation: fuzzy linear regression and statistical linear regression. The two techniques model uncertainty differently: statistical linear regression models it as randomness, whereas fuzzy linear regression models it as fuzziness. The main objective of the paper was to establish the kind of uncertainty associated with software defect prediction and to contrast the two prediction techniques. The analysis used the KC1 NASA data set; only six of the metrics included in KC1, together with the lines-of-code metric, were used. Descriptive statistics were first used to obtain an overview of the main characteristics of the data set. Linearity between the predictor variables and the variable of interest, the number of defects, was checked using scatter plots and Pearson’s correlation coefficient. Multicollinearity was then checked using the inter-correlations among the metrics and the variance inflation factor. Best subset regression was applied to find the most influential subset of predictor variables; this subset was later used to build the fuzzy and statistical regression models.

Linearity between the metrics and the number of defects was confirmed. No multicollinearity was detected among the predictor variables. Best subset regression found that a subset of 5 variables was the most influential. The analysis showed that the statistical regression model in general outperformed the fuzzy regression model. Defect prediction techniques should be applied carefully in order to produce quality plans. Software engineers should be familiar with a range of prediction techniques and know their strengths and weaknesses. In the KC1 data set at least, the uncertainty in the software defect prediction model is due to randomness, so it is reasonable to use statistical rather than fuzzy linear regression to build a prediction model.
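
The screening steps named in the abstract (Pearson’s correlation coefficient for linearity, the variance inflation factor for multicollinearity) can be illustrated with a minimal Python sketch. This is not the author's code. The numbers below are invented for illustration and are not taken from KC1, and the names `loc`, `branch_count`, `pearson`, `r_squared` and `vif` are hypothetical. The sketch assumes the usual definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Normal equations (X^T X) beta = X^T y; adequate for a small illustration.
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def vif(columns):
    """Variance inflation factor of each predictor: 1 / (1 - R_j^2)."""
    result = []
    for j, target in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        result.append(1.0 / (1.0 - r_squared(others, target)))
    return result

# Invented module-level measurements (NOT KC1 data): two predictors, one response.
loc = [10, 25, 40, 55, 70, 85, 100, 120]
branch_count = [2, 1, 6, 3, 9, 4, 11, 7]
defects = [1, 2, 4, 4, 7, 6, 9, 9]

print("r(loc, defects) =", pearson(loc, defects))
print("r(branch_count, defects) =", pearson(branch_count, defects))
print("VIF =", vif([loc, branch_count]))
print("R^2 of the two-predictor model =", r_squared([loc, branch_count], defects))
```

A VIF close to 1 indicates that a predictor is nearly uncorrelated with the others; a common rule of thumb treats values above roughly 5 to 10 as a sign of multicollinearity. The threshold used in the paper is not stated in the abstract.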

Metadata
Title
A comparative analysis between two techniques for the prediction of software defects: fuzzy and statistical linear regression
Written by
Fernando Valles-Barajas
Publication date
01.12.2015
Publisher
Springer London
Published in
Innovations in Systems and Software Engineering / Issue 4/2015
Print ISSN: 1614-5046
Electronic ISSN: 1614-5054
DOI
https://doi.org/10.1007/s11334-015-0256-4
