Published in: Empirical Software Engineering, Issue 6/2023

1 November 2023

Operationalizing validity of empirical software engineering studies

Authors: Johannes Härtel, Ralf Lämmel

Abstract

Empirical software engineering studies apply methods such as linear regression, statistical tests, and correlation analysis to better understand software engineering scenarios. Assuring the validity of such methods and of the corresponding results is challenging but critical. This is reflected in the validity-related quality criteria that are part of the reviewing process for such research. However, these criteria are often hard to define operationally and are therefore hard for reviewers to judge. In this paper, we describe a new strategy for defining and communicating the validity of methods and results. We conceptually decompose a study into an empirical scenario, the method used, and the results produced. Validity can only be described as a relationship among these three parts. To make the empirical scenario fully operational, we convert informal assumptions about it into executable simulation code that uses artificial data to replace (or complement) the real data. We can then run the method on the artificial data and examine how our assumptions affect the quality of the results. Operationally, this may i) support the validity of a method if it yields a valid result, ii) threaten the validity of a method if it yields an invalid result under controversial assumptions, or iii) invalidate a method if it yields an invalid result under plausible assumptions. We encourage researchers to submit simulations as additional artifacts to the reviewing process to make such statements explicit. Judging whether a simulated scenario is plausible or controversial is subjective and may benefit from involving a reviewer. We show that existing empirical software engineering studies can benefit from such additional validation artifacts.
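
The decomposition into scenario, method, and result can be illustrated with a minimal sketch. It is not the authors' code (the paper's footnotes indicate that the authors work in R); it is a Python illustration under assumptions of our own. The hidden factor `z`, the value of `TRUE_EFFECT`, and the helper names `simulate`, `ols_slope`, and `mean_estimate` are all hypothetical. The scenario is a data-generating process with a known effect, the method is a simple least-squares slope, and the result is judged by comparing the estimate with the known effect.

```python
import random

TRUE_EFFECT = 0.5  # assumed (hypothetical) effect of x on y in the simulated scenario


def simulate(n, confounding, rng):
    """Empirical scenario as code: a hidden factor z may drive both x and y."""
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = confounding * z + rng.gauss(0, 1)
        y = TRUE_EFFECT * x + confounding * z + rng.gauss(0, 1)
        rows.append((x, y))
    return rows


def ols_slope(rows):
    """Method under test: least-squares slope of y on x, ignoring z."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    return sxy / sxx


def mean_estimate(confounding, runs=200, n=500, seed=1):
    """Result: average slope estimate over repeated simulated data sets."""
    rng = random.Random(seed)
    total = sum(ols_slope(simulate(n, confounding, rng)) for _ in range(runs))
    return total / runs


clean = mean_estimate(confounding=0.0)       # assumption: no hidden factor
confounded = mean_estimate(confounding=1.0)  # assumption: hidden factor present

print(f"true effect:                 {TRUE_EFFECT}")
print(f"estimate without confounding: {clean:.3f}")
print(f"estimate with confounding:    {confounded:.3f}")
```

In this sketch, the estimate stays close to the known effect when the hidden factor is absent, which would support the method for that scenario. When the hidden factor is present, the estimate drifts well away from the known effect. Whether that counts as a threat or as an invalidation depends on whether a reader finds the confounded scenario controversial or plausible.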

Footnotes
1
In R, random number generators are vectorized, and their names start with the letter r followed by an abbreviation of the distribution family (we will see rbinom, rnorm, and rpois).
 
2
All our reproductions of other papers are fully available online so that this paper can itself be reproduced.
 
Metadata
Title
Operationalizing validity of empirical software engineering studies
Authors
Johannes Härtel
Ralf Lämmel
Publication date
1 November 2023
Publisher
Springer US
Published in
Empirical Software Engineering, Issue 6/2023
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-023-10370-3
