
2020 | Original Paper | Book Chapter

6. Mining Source Code for Component Reuse

Authors: Themistoklis Diamantopoulos, Andreas L. Symeonidis

Published in: Mining Software Engineering Data for Software Reuse

Publisher: Springer International Publishing


Abstract

Although the development of code search engines has brought forth syntax-aware capabilities when searching for reusable components, these engines do not fully exploit the given context and do not assess the retrieved source code. As a result, several test-driven reuse systems have been developed to offer context-aware component search and further assess the retrieved components using test cases. However, most of these systems employ strict matching criteria and do not offer information concerning the flow and the dependencies of the retrieved components. In this chapter, we present Mantissa, a system designed to overcome the aforementioned limitations. Mantissa allows code searching in growing repositories, such as GitHub. The user provides the input query as a code snippet and Mantissa employs a mechanism that uses Information Retrieval techniques to return functional software components. Finally, we provide an example usage scenario for Mantissa and evaluate our system against popular search engines and test-driven reuse systems to illustrate its effectiveness.


Footnotes
1
The term “Mantissa” refers to a sacred woman in ancient Greece, who interpreted the signs sent by the gods and gave oracles. In a similar way, our system interprets the query of the user and provides suitable results.
 
2
See [7] for a systematic literature review on software engineering tasks that are facilitated using recommendation systems.
 
4
As already mentioned in the previous chapter, the Google Code Search Engine was discontinued in 2013.
 
9
Specifically, the main difference between our scoring model and the VSM is that our model is hierarchical, i.e., in our case a vector can be composed of other vectors. Our model has levels: at the class level it is practically a VSM for class vectors, while class vectors are further analyzed at the method level.
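The idea of a vector whose components can themselves be vectors can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the names `HierarchicalScore`, `Leaf` and `Composite`, the use of leaf similarity scores in [0, 1], and the choice of a normalized Euclidean norm to combine component scores.

```java
import java.util.Arrays;
import java.util.List;

final class HierarchicalScore {
    /** A component of a vector: either a plain score or a nested vector. */
    interface Component {
        double score();
    }

    /** A leaf component holding a similarity score, assumed to lie in [0, 1]. */
    static final class Leaf implements Component {
        private final double value;

        Leaf(double value) {
            this.value = value;
        }

        @Override
        public double score() {
            return value;
        }
    }

    /** A vector whose components may themselves be vectors (e.g. a class made of methods). */
    static final class Composite implements Component {
        private final List<Component> components;

        Composite(Component... components) {
            this.components = Arrays.asList(components);
        }

        @Override
        public double score() {
            if (components.isEmpty()) {
                return 0.0;
            }
            double sumOfSquares = 0.0;
            for (Component c : components) {
                double s = c.score(); // recursion: a nested vector is scored first
                sumOfSquares += s * s;
            }
            // Assumed combination rule: Euclidean norm divided by the norm of the
            // all-ones vector, so that the result stays in [0, 1].
            return Math.sqrt(sumOfSquares / components.size());
        }
    }
}
```

In this sketch a class vector could be built as `new Composite(classNameScore, methodVector1, methodVector2)`, where each method vector is itself a `Composite` of leaf scores, so the class-level score is computed only after its method-level vectors have been scored.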
 
10
According to the original definition of the Stable Marriage Problem (SMP), there are N men and N women, and every person ranks the members of the opposite sex in a strict order of preference. The problem is to find a matching between men and women such that no two people of opposite sex would both rather be matched to each other than to their current partners.
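The definition above can be made concrete with a short sketch of the classic Gale–Shapley proposal algorithm, which always finds a stable matching for this problem. This is the textbook algorithm, shown only to illustrate the definition; the footnote does not say how the chapter's system solves its matching step. The class and method names (`StableMatching`, `match`, `isStable`) and the integer encoding of people as indices 0..N-1 are assumptions of this example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

final class StableMatching {
    /**
     * menPrefs[m] lists the women in man m's order of preference (most preferred first);
     * womenPrefs[w] lists the men likewise. Returns wifeOf, where wifeOf[m] is the
     * woman matched to man m.
     */
    static int[] match(int[][] menPrefs, int[][] womenPrefs) {
        int n = menPrefs.length;
        // rank[w][m] = position of man m in woman w's list (lower is better).
        int[][] rank = new int[n][n];
        for (int w = 0; w < n; w++) {
            for (int pos = 0; pos < n; pos++) {
                rank[w][womenPrefs[w][pos]] = pos;
            }
        }
        int[] husbandOf = new int[n];
        Arrays.fill(husbandOf, -1);
        int[] next = new int[n]; // next[m] = index of the next woman m will propose to
        Deque<Integer> free = new ArrayDeque<>();
        for (int m = 0; m < n; m++) {
            free.push(m);
        }
        while (!free.isEmpty()) {
            int m = free.pop();
            int w = menPrefs[m][next[m]++];
            int current = husbandOf[w];
            if (current == -1) {
                husbandOf[w] = m;            // w was free and accepts
            } else if (rank[w][m] < rank[w][current]) {
                husbandOf[w] = m;            // w prefers m and leaves her partner
                free.push(current);
            } else {
                free.push(m);                // w rejects m; he proposes again later
            }
        }
        int[] wifeOf = new int[n];
        for (int w = 0; w < n; w++) {
            wifeOf[husbandOf[w]] = w;
        }
        return wifeOf;
    }

    /** Checks the stability condition: no man and woman prefer each other to their partners. */
    static boolean isStable(int[] wifeOf, int[][] menPrefs, int[][] womenPrefs) {
        int n = wifeOf.length;
        int[] husbandOf = new int[n];
        for (int m = 0; m < n; m++) {
            husbandOf[wifeOf[m]] = m;
        }
        for (int m = 0; m < n; m++) {
            for (int w : menPrefs[m]) {
                if (w == wifeOf[m]) {
                    break; // the remaining women are less preferred than his partner
                }
                for (int other : womenPrefs[w]) {
                    if (other == m) {
                        return false; // w also prefers m to her partner: blocking pair
                    }
                    if (other == husbandOf[w]) {
                        break;        // w prefers her partner to m
                    }
                }
            }
        }
        return true;
    }
}
```

Calling `StableMatching.match(menPrefs, womenPrefs)` and then `StableMatching.isStable(...)` on the result checks the condition stated in the footnote directly, which makes the sketch easy to validate on small inputs.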
 
15
Java supports method overloading, meaning that it allows the existence of methods with the same name that differ only in their parameters.
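A minimal illustration of overloading follows; the class `OverloadingDemo` and its `describe` methods are hypothetical names chosen for this example. The three methods share a name and are distinguished only by their parameter lists, so a method name alone does not identify a single method.

```java
final class OverloadingDemo {
    // Same name, one int parameter.
    static String describe(int value) {
        return "int:" + value;
    }

    // Same name, one String parameter.
    static String describe(String value) {
        return "String:" + value;
    }

    // Same name, two int parameters.
    static String describe(int a, int b) {
        return "int,int:" + (a + b);
    }
}
```

The compiler selects the overload from the argument types at the call site, so `describe(7)`, `describe("7")` and `describe(3, 4)` each resolve to a different method.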
 
References
1. Walker RJ (2013) Recent advances in recommendation systems for software engineering. In: Ali M, Bosse T, Hindriks KV, Hoogendoorn M, Jonker CM, Treur J (eds) Recent trends in applied artificial intelligence, vol 7906. Lecture notes in computer science. Springer, Berlin, pp 372–381
2. Nurolahzade M, Walker RJ, Maurer F (2013) An assessment of test-driven reuse: promises and pitfalls. In: Favaro J, Morisio M (eds) Safe and secure software reuse, vol 7925. Lecture notes in computer science. Springer, Berlin Heidelberg, pp 65–80
3. McIlroy MD (1968) Mass produced software components. In: Naur P, Randell B (eds) Software engineering; report of a conference sponsored by the NATO Science Committee. NATO Scientific Affairs Division, Brussels, Belgium, pp 138–155
4. Mens K, Lozano A (2014) Source code-based recommendation systems. Springer, Berlin, pp 93–130
5. Janjic W, Hummel O, Atkinson C (2014) Reuse-oriented code recommendation systems. Springer, Berlin, pp 359–386
6. Robillard M, Walker R, Zimmermann T (2010) Recommendation systems for software engineering. IEEE Softw 27(4):80–86
7. Gasparic M, Janes A (2016) What recommendation systems for software engineering recommend. J Syst Softw 113(C):101–113
8. Sahavechaphan N, Claypool K (2006) XSnippet: mining for sample code. SIGPLAN Not 41(10):413–430
9. Thummalapenta S, Xie T (2007) PARSEWeb: a programmer assistant for reusing open source code on the web. In: Proceedings of the 22nd IEEE/ACM international conference on automated software engineering, ASE ’07, pp 204–213, New York, NY, USA. ACM
10. Xie T, Pei J (2006) MAPO: mining API usages from open source repositories. In: Proceedings of the 2006 international workshop on mining software repositories, MSR ’06, pp 54–57, New York, NY, USA. ACM
11. Wei Y, Chandrasekaran N, Gulwani S, Hamadi Y (2015) Building Bing developer assistant. Technical Report MSR-TR-2015-36, Microsoft Research
12. Galenson J, Reames P, Bodik R, Hartmann B, Sen K (2014) CodeHint: dynamic and interactive synthesis of code snippets. In: Proceedings of the 36th international conference on software engineering, ICSE 2014, pp 653–663, New York, NY, USA. ACM
13. Hummel O, Janjic W, Atkinson C (2008) Code Conjurer: pulling reusable software out of thin air. IEEE Softw 25(5):45–52
14. Lemos OAL, Bajracharya SK, Ossher J, Morla RS, Masiero PC, Baldi P, Lopes CV (2007) CodeGenie: using test-cases to search and reuse source code. In: Proceedings of the 22nd IEEE/ACM international conference on automated software engineering, ASE ’07, pp 525–526, New York, NY, USA. ACM
15. Henninger S (1991) Retrieving software objects in an example-based programming environment. In: Proceedings of the 14th annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’91, pp 251–260, New York, NY, USA. ACM
16. Page L, Brin S, Motwani R, Winograd T (1999) The PageRank citation ranking: bringing order to the web. Technical Report 1999-66, Stanford InfoLab. Previous number = SIDL-WP-1999-0120
17. Michail A (2000) Data mining library reuse patterns using generalized association rules. In: Proceedings of the 22nd international conference on software engineering, ICSE ’00, pp 167–176, New York, NY, USA. ACM
18. Ye Y, Fischer G (2002) Supporting reuse by delivering task-relevant and personalized information. In: Proceedings of the 24th international conference on software engineering, ICSE ’02, pp 513–523, New York, NY, USA. ACM
19. Holmes R, Murphy GC (2005) Using structural context to recommend source code examples. In: Proceedings of the 27th international conference on software engineering, ICSE ’05, pp 117–125, New York, NY, USA. ACM
20. Mandelin D, Lin X, Bodík R, Kimelman D (2005) Jungloid mining: helping to navigate the API jungle. SIGPLAN Not 40(6):48–61
21. McMillan C, Grechanik M, Poshyvanyk D, Xie Q, Fu C (2011) Portfolio: finding relevant functions and their usage. In: Proceedings of the 33rd international conference on software engineering, ICSE ’11, pp 111–120, New York, NY, USA. ACM
22. McMillan C, Grechanik M, Poshyvanyk D, Chen F, Xie Q (2012) Exemplar: a source code search engine for finding highly relevant applications. IEEE Trans Softw Eng 38(5):1069–1087
23. Wightman D, Ye Z, Brandt J, Vertegaal R (2012) SnipMatch: using source code context to enhance snippet retrieval and parameterization. In: Proceedings of the 25th annual ACM symposium on user interface software and technology, UIST ’12, pp 219–228, New York, NY, USA. ACM
24. Zagalsky A, Barzilay O, Yehudai A (2012) Example Overflow: using social media for code recommendation. In: Proceedings of the third international workshop on recommendation systems for software engineering, RSSE ’12, pp 38–42, Piscataway, NJ, USA. IEEE Press
25. Beck K (2002) Test driven development: by example. Addison-Wesley Longman Publishing Co., Inc., Boston
26. Hummel O, Atkinson C (2004) Extreme Harvesting: test driven discovery and reuse of software components. In: Proceedings of the 2004 IEEE international conference on information reuse and integration, IRI 2004, pp 66–72
27. Janjic W, Stoll D, Bostan P, Atkinson C (2009) Lowering the barrier to reuse through test-driven search. In: Proceedings of the 2009 ICSE workshop on search-driven development-users, infrastructure, tools and evaluation, SUITE ’09, pp 21–24, Washington, DC, USA. IEEE Computer Society
28. Hummel O, Janjic W (2013) Test-driven reuse: key to improving precision of search engines for software reuse. In: Sim SE, Gallardo-Valencia RE (eds) Finding source code on the web for remix and reuse, pp 227–250. Springer, New York
29. Janjic W, Hummel O, Schumacher M, Atkinson C (2013) An unabridged source code dataset for research in software reuse. In: Proceedings of the 10th working conference on mining software repositories, MSR ’13, pp 339–342, Piscataway, NJ, USA. IEEE Press
30. Lemos OAL, Bajracharya S, Ossher J, Masiero PC, Lopes C (2009) Applying test-driven code search to the reuse of auxiliary functionality. In: Proceedings of the 2009 ACM symposium on applied computing, SAC ’09, pp 476–482, New York, NY, USA. ACM
31. Lemos OAL, Bajracharya S, Ossher J, Masiero PC, Lopes C (2011) A test-driven approach to code search and its application to the reuse of auxiliary functionality. Inf Softw Technol 53(4):294–306
32. Bajracharya S, Ngo T, Linstead E, Dou Y, Rigor P, Baldi P, Lopes C (2006) Sourcerer: a search engine for open source code supporting structure-based search. In: Companion to the 21st ACM SIGPLAN symposium on object-oriented programming systems, languages, and applications, OOPSLA ’06, pp 681–682, New York, NY, USA. ACM
33. Linstead E, Bajracharya S, Ngo T, Rigor P, Lopes C, Baldi P (2009) Sourcerer: mining and searching internet-scale software repositories. Data Min Knowl Discov 18(2):300–336
34. Krug M (2007) FAST: an Eclipse plug-in for test-driven reuse. Master’s thesis, University of Mannheim
35. Reiss SP (2009) Semantics-based code search. In: Proceedings of the 31st international conference on software engineering, ICSE ’09, pp 243–253, Washington, DC, USA. IEEE Computer Society
36. Reiss SP (2009) Specifying what to search for. In: Proceedings of the 2009 ICSE workshop on search-driven development-users, infrastructure, tools and evaluation, SUITE ’09, pp 41–44, Washington, DC, USA. IEEE Computer Society
37. Diamantopoulos T, Katirtzis N, Symeonidis A (2018) Mantissa: a recommendation system for test-driven code reuse. Unpublished manuscript
38. Rivest R (1992) The MD5 message-digest algorithm
39. Manning CD, Raghavan P, Schütze H (2008) Introduction to information retrieval. Cambridge University Press, New York
40.
41. Bird S, Klein E, Loper E (2009) Natural language processing with Python, 1st edn. O’Reilly Media, Inc., Sebastopol
42. Levenshtein VI (1966) Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady 10:707
43. Jaccard P (1901) Étude comparative de la distribution florale dans une portion des Alpes et des Jura. Bulletin de la Société Vaudoise des Sciences Naturelles 37:547–579
44. Tanimoto TT (1957) IBM Internal Report
45. Diamantopoulos T, Symeonidis AL (2015) Employing source code information to improve question-answering in Stack Overflow. In: Proceedings of the 12th working conference on mining software repositories, MSR ’15, pp 454–457, Piscataway, NJ, USA. IEEE Press
Metadata
Title
Mining Source Code for Component Reuse
Authors
Themistoklis Diamantopoulos
Andreas L. Symeonidis
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-30106-4_6
