Published in: Empirical Software Engineering 6/2014

1 December 2014

Variation factors in the design and analysis of replicated controlled experiments

Three (dis)similar studies on inspections versus unit testing

Written by: Per Runeson, Andreas Stefik, Anneliese Andrews

Abstract

In formal experiments on software engineering, the number of factors that may impact an outcome is very high. Some factors are controlled and change by design, while others are either unforeseen or due to chance. This paper aims to explore how context factors change in a series of formal experiments and to identify implications for experimentation and replication practices to enable learning from experimentation. We analyze three experiments on code inspections and structural unit testing. The first two experiments use the same experimental design and instrumentation (replication), while the third, conducted by different researchers, replaces the programs and adapts defect detection methods accordingly (reproduction). Experimental procedures and location also differ between the experiments. Contrary to expectations, there are significant differences between the original experiment and the replication, as well as compared to the reproduction. Some of the differences are due to factors other than the ones designed to vary between experiments, indicating the sensitivity to context factors in software engineering experimentation. In aggregate, the analysis indicates that reducing the complexity of software engineering experiments should be considered by researchers who want to obtain reliable and repeatable empirical measures.

Metadata
Title
Variation factors in the design and analysis of replicated controlled experiments
Three (dis)similar studies on inspections versus unit testing
Written by
Per Runeson
Andreas Stefik
Anneliese Andrews
Publication date
1 December 2014
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 6/2014
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-013-9262-z
