Published in: Empirical Software Engineering 2/2015

1 April 2015

Do topics make sense to managers and developers?

Written by: Abram Hindle, Christian Bird, Thomas Zimmermann, Nachiappan Nagappan

Abstract

Large organizations like Microsoft tend to rely on formal requirements documentation to specify and design the software products that they develop. These documents are meant to be tightly coupled with the actual implementation of the features they describe. In this paper, we evaluate the value of high-level, topic-based requirements traceability and issue report traceability in the version control system, using Latent Dirichlet Allocation (LDA). We evaluate LDA topics with practitioners and check whether the extracted topics and trends match the perception that industrial Program Managers and Developers have of the effort put into addressing certain topics. We then replicate this study with Open Source Developers, using issue reports from issue trackers instead of requirements, which confirms our earlier industrial conclusions. We found that effort, extracted as version control commits relevant to a topic, often matched the managers' and developers' perception of what actually occurred at that time. Furthermore, we found evidence that many of the identified topics made sense to practitioners and matched their perception of what occurred. For some topics, however, practitioners had difficulty interpreting and labelling them. In summary, we investigate the high-level traceability of requirements topics and issue/bug report topics to version control commits via topic analysis, and we validate with the actual stakeholders the relevance of these topics extracted from requirements and issues.

Metadata
Title
Do topics make sense to managers and developers?
Written by
Abram Hindle
Christian Bird
Thomas Zimmermann
Nachiappan Nagappan
Publication date
1 April 2015
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 2/2015
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-014-9312-1
