2020 | Original Paper | Book Chapter

5. Source Code Indexing for Component Reuse

Written by: Themistoklis Diamantopoulos, Andreas L. Symeonidis

Published in: Mining Software Engineering Data for Software Reuse

Publisher: Springer International Publishing

Abstract

The momentum of the open-source community has been constantly increasing, leading to numerous tools for writing, maintaining, and sharing source code. Several code search engines have been developed to support development tasks and facilitate reuse, either directly or by functioning as information sources for code recommenders. In this chapter, we present AGORA, a code search engine that facilitates reuse at the component, snippet, and project levels. Through its Elasticsearch index, AGORA supports advanced queries (syntax-aware queries, regular expressions), while the engine also integrates with popular code hosting repositories and offers a well-designed API. We provide representative examples and a usage scenario to illustrate the functionality of AGORA, and perform a comparative analysis in a code reuse context, which indicates that AGORA provides an efficient alternative to current solutions.

Footnotes
4
The term “agora” refers to the central spot of ancient Greek city-states. Although it is roughly translated to the English term “market”, “agora” can be better viewed as an assembly, a place where people met not only to trade goods, but also to exchange ideas. It is a good fit for our search engine since we envision it as a place where developers can (freely) distribute and exchange their source code, and subsequently their ideas.
 
7
Elasticsearch supports removing stop words for several languages. In our case, the default choice of English is adequate.
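
As a rough illustration of this footnote, the following sketch builds index settings for a standard analyzer with English stop words. The analyzer name text_with_stopwords and the helper build_index_settings are hypothetical and not taken from AGORA; whether the stop word list has to be set explicitly may depend on the Elasticsearch version in use.

```python
import json


def build_index_settings(stopwords="_english_"):
    """Return index settings with a standard analyzer that drops stop words.

    "_english_" is Elasticsearch's predefined English stop word list.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # "text_with_stopwords" is a hypothetical analyzer name.
                    "text_with_stopwords": {
                        "type": "standard",
                        "stopwords": stopwords,
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that could be sent when creating an index.
    print(json.dumps(build_index_settings(), indent=2))
```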
 
8
Note that the standard analyzer initially splits the filename into two parts, the filename without extension and the extension since it splits according to punctuation.
 
9
One of 040000, 100644, 100664, 100755, 120000, or 160000 which correspond to directory, regular non-executable file, regular non-executable group-writeable file, regular executable file, symbolic link, or gitlink, respectively.
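
The mode values listed in this footnote can be read as a simple lookup table. The sketch below is only an illustration: the names GIT_MODES and describe_mode are hypothetical, and padding five-digit values such as 40000 to six digits is an assumption made here for convenience.

```python
# Mapping of the git file modes named in the footnote to their meaning.
GIT_MODES = {
    "040000": "directory",
    "100644": "regular non-executable file",
    "100664": "regular non-executable group-writeable file",
    "100755": "regular executable file",
    "120000": "symbolic link",
    "160000": "gitlink",
}


def describe_mode(mode):
    """Return a description for a git mode string, or "unknown"."""
    # Assumption: pad to six digits so that "40000" is treated as "040000".
    return GIT_MODES.get(str(mode).zfill(6), "unknown")


if __name__ == "__main__":
    for mode in ("100755", "40000", "123456"):
        print(mode, "->", describe_mode(mode))
```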
 
10
One of blob, tree, commit, or tag.
 
11
The CamelCase analyzer is quite effective for fields that contain Java types: type names are conventionally written in CamelCase, while primitives are not affected by the CamelCase tokenizer, i.e., the text “float” results in the token “float”.
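
The behavior described in this footnote can be approximated with a small regular-expression tokenizer. This is a minimal sketch under stated assumptions, not the analyzer used by AGORA or by Elasticsearch: the function name camel_case_tokens is hypothetical, and lowercasing the tokens is an assumption added here.

```python
import re

# Alternatives, in order: an acronym followed by a capitalized word ("HTTP" in
# "HTTPServer"), a capitalized or lowercase word, a trailing acronym, digits.
_CAMEL_PART = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def camel_case_tokens(text):
    """Split an identifier at CamelCase boundaries and lowercase the parts."""
    return [match.group(0).lower() for match in _CAMEL_PART.finditer(text)]


if __name__ == "__main__":
    for name in ("ArrayList", "HTTPServer", "float"):
        print(name, "->", camel_case_tokens(name))
```

A type name such as ArrayList is split into two tokens, whereas a primitive such as float contains no case boundary and passes through as a single token.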
 
Metadata
Title
Source Code Indexing for Component Reuse
Written by
Themistoklis Diamantopoulos
Andreas L. Symeonidis
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-30106-4_5