Published in: Knowledge and Information Systems 2/2016

1 November 2016 | Regular Paper

A feature location approach supported by time-aware weighting of terms associated with developer expertise profiles

Authors: Sima Zamani, Sai Peck Lee, Ramin Shokripour, John Anvik

Abstract

Feature location is a frequent software maintenance activity that aims to identify the initial source code location pertinent to a software feature. Most feature location approaches are based, at least in part, on text analysis methods that originate from the natural language context. However, natural language text and the text data in software repositories have different properties, so these methods need to be adapted before they are applied to software repositories. One difference is the existence of metadata, such as developer information and time stamps, associated with the data in the repositories. Previous feature location studies have not fully considered this difference. This study proposes a feature location approach that analyzes developer expertise profiles, which contain the source code entities modified by the associated software developers, to identify the location most similar to a desired feature. The approach uses a time-aware term-weighting technique to determine the similarity. An experimental evaluation on four open-source projects shows improvements in accuracy, performance, and effectiveness of up to 55, 39, and 29 %, respectively, compared to the high-performing information retrieval methods used in feature location. Moreover, the proposed time-aware technique increases the accuracy, performance, and effectiveness of the typical term-weighting technique, tf-idf, by as much as 15, 11, and 13 %, respectively. Finally, the proposed approach outperforms our previous approach, noun-based feature location, by as much as 17 %. These experimental results demonstrate that time-aware analysis of developers’ expertise significantly improves the feature location process.
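
The abstract does not state the weighting formula, so the following is only a minimal sketch of what a time-aware variant of tf-idf could look like, not the authors' method. It assumes that each term occurrence in a profile carries a modification date, that the age of an occurrence is measured in days, and that older occurrences are discounted with an exponential decay. The function names, the `half_life_days` parameter, and the profile layout are all hypothetical.

```python
import math
from collections import defaultdict
from datetime import date


def time_aware_weights(profiles, today, half_life_days=180.0):
    """Return {profile: {term: weight}} using a recency-decayed tf-idf.

    profiles maps a profile name to a list of (term, modification_date)
    pairs. The exponential decay and half_life_days are assumptions made
    for illustration; they are not taken from the paper.
    """
    # Document frequency: the number of profiles in which each term occurs.
    doc_freq = defaultdict(int)
    for occurrences in profiles.values():
        for term in {t for t, _ in occurrences}:
            doc_freq[term] += 1

    n_profiles = len(profiles)
    weights = {}
    for name, occurrences in profiles.items():
        # Each occurrence contributes less the older it is (age in days).
        decayed_tf = defaultdict(float)
        for term, modified_on in occurrences:
            age_days = max((today - modified_on).days, 0)
            decayed_tf[term] += 0.5 ** (age_days / half_life_days)
        # Smoothed idf (+1) so terms shared by all profiles keep a weight.
        weights[name] = {
            term: tf * (math.log(n_profiles / doc_freq[term]) + 1.0)
            for term, tf in decayed_tf.items()
        }
    return weights


def rank_profiles(weights, query_terms):
    """Rank profiles by the summed weights of the query terms, best first."""
    scores = {
        name: sum(term_weights.get(t, 0.0) for t in query_terms)
        for name, term_weights in weights.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    example_profiles = {
        "alice": [("parser", date(2016, 1, 1)), ("cache", date(2014, 1, 1))],
        "bob": [("parser", date(2013, 1, 1))],
    }
    example_weights = time_aware_weights(example_profiles, date(2016, 1, 1))
    print(rank_profiles(example_weights, ["parser"]))
```

In this sketch a profile that touched a query term recently ranks above one that touched it years ago, which is the intuition behind weighting terms by time. The shape of the decay and its rate would have to be tuned per project.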

Footnotes
1
A changeset is an atomic set of changes of the source code files committed to the source code repository by a project developer during a maintenance activity [30].
 
2
The source code repository of software projects.
 
3
Note that the time difference is calculated in days.
 
8
The source code locations that were modified to fix these change requests need to be determined. Change requests whose corresponding locations could not be correctly determined were removed from this test set.
 
17
System properties: processor Intel(R) Core(TM) i5-3470 CPU, 3.20 GHz; installed memory (RAM) 12 GB.
 
References
1.
Abebe SL, Tonella P (2010) Natural language parsing of program element names for concept extraction. In: IEEE 18th international conference on program comprehension (ICPC). IEEE, pp 156–159
2.
Anvik J (2006) Automating bug report assignment. In: Proceedings of the 28th international conference on software engineering (ICSE). ACM, pp 937–940
3.
Anvik J, Hiew L, Murphy GC (2006) Who should fix this bug? In: Proceedings of the 28th international conference on software engineering, ICSE ’06, New York, NY, USA. ACM, pp 361–370. ISBN: 1-59593-375-1. doi:10.1145/1134285.1134336
4.
Bacchelli A, Lanza M, Robbes R (2010) Linking e-mails and source code artifacts. In: Proceedings of the 32nd ACM/IEEE international conference on software engineering, vol 1. ACM, pp 375–384
5.
Baeza-Yates RA, Ribeiro-Neto B (1999) Modern information retrieval. Addison-Wesley Longman Publishing Co., Inc, Boston. ISBN: 020139829X
6.
Bai J, Nie J-Y, Paradis F (2004) Using language models for text classification. In: Asia information retrieval symposium (AIRS), Beijing, China
7.
Biggerstaff TJ, Mitbander BG, Webster D (1993) The concept assignment problem in program understanding. In: Proceedings of the 15th international conference on software engineering (ICSE). IEEE Computer Society Press, pp 482–498
8.
Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
9.
Butler S, Wermelinger M, Yu Y, Sharp H (2011) Improving the tokenisation of identifier names. In: ECOOP 2011-object-oriented programming, pp 130–154
10.
Capobianco G, Lucia AD, Oliveto R, Panichella A, Panichella S (2013) Improving IR-based traceability recovery via noun-based indexing of software artifacts. J Softw Evol Process 25(7):743–762
11.
Cleary B, Exton C, Buckley J, English M (2009) An empirical analysis of information retrieval based concept location techniques in software comprehension. Empir Softw Eng 14(1):93–130
12.
Cunningham H, Maynard D, Bontcheva K, Tablan V (2002) Gate: an architecture for development of robust hlt applications. In: Proceedings of the 40th annual meeting on association for computational linguistics. Association for Computational Linguistics, pp 168–175
13.
Dit B, Moritz E, Poshyvanyk D (2012) A tracelab-based solution for creating, conducting, and sharing feature location experiments. In: 2012 IEEE 20th international conference on program comprehension (ICPC). IEEE, pp 203–208
14.
Dit B, Revelle M, Gethers M, Poshyvanyk D (2013a) Feature location in source code: a taxonomy and survey. J Softw Evol Process 25(1):53–95
15.
Dit B, Revelle M, Poshyvanyk D (2013b) Integrating information retrieval, execution and link analysis algorithms to improve feature location in software. Empir Softw Eng 18(2):277–309
16.
Gay G, Haiduc S, Marcus A, Menzies T (2009) On the use of relevance feedback in IR-based concept location. In: ICSM 2009. IEEE international conference on software maintenance (ICSM). IEEE, pp 351–360
17.
Gómez VU, Kellens A, Brichau J, D’Hondt T (2009) Time warp, an approach for reasoning over system histories. In: Proceedings of the joint international and annual ERCIM workshops on principles of software evolution (IWPSE) and software evolution (Evol) workshops. ACM, pp 79–88
18.
Hill E, Pollock L, Vijay-Shanker K (2009) Automatically capturing source code context of nl-queries for software maintenance and reuse. In: Proceedings of the 31st international conference on software engineering (ICSE). IEEE Computer Society, pp 232–242
19.
Hossen K, Kagdi HH, Poshyvanyk D (2014) Amalgamating source code authors, maintainers, and change proneness to triage change requests. In: ICPC, pp 130–141
20.
Kagdi H, Maletic JI, Sharif B (2007) Mining software repositories for traceability links. In: ICPC’07. 15th IEEE international conference on program comprehension (ICPC). IEEE, pp 145–154
21.
Kagdi H, Gethers M, Poshyvanyk D, Hammad M (2012) Assigning change requests to software developers. J Softw Evol Process 24(1):3–33
22.
Liu D, Marcus A, Poshyvanyk D, Rajlich V (2007) Feature location via information retrieval based filtering of a single scenario execution trace. In: Proceedings of the twenty-second IEEE/ACM international conference on automated software engineering. ACM, pp 234–243
23.
Lukins SK, Kraft NA, Etzkorn LH (2010) Bug localization using latent dirichlet allocation. Inf Softw Technol 52(9):972–990
24.
Manning CD, Raghavan P, Schütze H (2008) Introduction to information retrieval, vol 1. Cambridge University Press, Cambridge
25.
Petrenko M, Rajlich V, Vanciu R (2008) Partial domain comprehension in software evolution and maintenance. In: ICPC 2008. The 16th IEEE international conference on program comprehension (ICPC). IEEE, pp 13–22
26.
Poshyvanyk D, Guéhéneuc Y-G, Marcus A, Antoniol G, Rajlich V (2006) Combining probabilistic ranking and latent semantic indexing for feature identification. In: ICPC 2006. 14th IEEE international conference on program comprehension (ICPC). IEEE, pp 137–148
27.
Poshyvanyk D, Guéhéneuc Y-G, Marcus A, Antoniol G, Rajlich V (2007) Feature location using probabilistic ranking of methods based on execution scenarios and information retrieval. IEEE Trans Softw Eng 33(6):420–432
28.
Poshyvanyk D, Gethers M, Marcus A (2012) Concept location using formal concept analysis and information retrieval. ACM Trans Softw Eng Methodol (TOSEM) 21(4):23
29.
Rao S, Kak A (2011) Retrieval from software libraries for bug localization: a comparative study of generic and composite text models. In: Proceeding of the 8th working conference on mining software repositories (MSR), pp 43–52
30.
Ratanotayanon S, Choi HJ, Sim SE (2010) Using transitive changesets to support feature location. In: Proceedings of the IEEE/ACM international conference on automated software engineering. ACM, pp 341–344
31.
Ratiu D, Deissenboeck F (2007) From reality to programs and (not quite) back again. In: ICPC’07. 15th IEEE international conference on program comprehension. IEEE, pp 91–102
33.
Schuler D, Zimmermann T (2008) Mining usage expertise from version archives. In: Proceedings of the 2008 international working conference on mining software repositories. ACM, pp 121–124
34.
Servant F, Jones JA (2012) Whosefault: automatic developer-to-fault assignment through fault localization. In: 2012 34th International conference on software engineering (ICSE), pp 36–46
35.
Shepherd D, Fry ZP, Hill E, Pollock L, Vijay-Shanker K (2007) Using natural language program analysis to locate and understand action-oriented concerns. In: Proceedings of the 6th international conference on aspect-oriented software development. ACM, pp 212–224
36.
Shokripour R, Anvik J, Kasirun ZM, Zamani S (2013) Why so complicated? Simple term filtering and weighting for location-based bug report assignment recommendation. In: Proceedings of the tenth international workshop on mining software repositories. IEEE Press, pp 2–11
37.
Shokripour R, Anvik J, Kasirun ZM, Zamani S (2014) Improving automatic bug assignment using time-metadata in term-weighting. Institution of Engineering and Technology, IET
38.
Wang S, Lo D, Xing Z, Jiang L (2011) Concern localization using information retrieval: an empirical study on linux kernel. In: 18th Working conference on reverse engineering (WCRE2011). IEEE, pp 92–96
39.
Wilde N, Scully MC (1995) Software reconnaissance: mapping program features to code. J Softw Maint Res Pract 7(1):49–62
40.
Wohlin C, Runeson P, Höst M, Ohlsson MC, Regnell B, Wesslén A (2012) Experimentation in software engineering. Springer Publishing Company, Incorporated. ISBN: 3642290434, 9783642290435
41.
Zamani S, Lee SP, Shokripour R, Anvik J (2014) A noun-based approach to feature location using time-aware term-weighting. Inf Softw Technol 56(8):991–1011
42.
Zhai C, Lafferty J (2004) A study of smoothing methods for language models applied to information retrieval. ACM Trans Inf Syst (TOIS) 22(2):179–214
43.
Zhou J, Zhang H, Lo D (2012) Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports. In: 34th International conference on software engineering (ICSE). IEEE, pp 14–24
Metadata
Title
A feature location approach supported by time-aware weighting of terms associated with developer expertise profiles
Authors
Sima Zamani
Sai Peck Lee
Ramin Shokripour
John Anvik
Publication date
1 November 2016
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2016
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-015-0909-5
