Published in: Progress in Artificial Intelligence 3/2018

28-03-2018 | Regular Paper

Emerging topics in mining software repositories

Machine learning in software repositories and datasets

Authors: Diego Güemes-Peña, Carlos López-Nozal, Raúl Marticorena-Sánchez, Jesús Maudes-Raedo


Abstract

A software process is a set of related activities that culminates in the production of a software package: specification, design, implementation, testing, evolution into new versions, and maintenance. Supporting activities include configuration and change management, quality assurance, project management, and evaluation of user experience. Software repositories are the infrastructure that supports all of these activities. They can be composed of several systems, including code change management, bug tracking, code review, build systems, release binaries, wikis, and forums. This position paper on mining software repositories presents a review and discussion of research in the field over the past decade. We also identify the machine learning strategies that have been applied, current working topics, and future challenges for improving company decision-making systems. Machine learning is defined as the process of discovering patterns in data. It can be applied to software repositories because every change is recorded as data. Companies can then use these patterns as the basis for their decision-making systems and for knowledge discovery.


Metadata
Title
Emerging topics in mining software repositories
Machine learning in software repositories and datasets
Authors
Diego Güemes-Peña
Carlos López-Nozal
Raúl Marticorena-Sánchez
Jesús Maudes-Raedo
Publication date
28-03-2018
Publisher
Springer Berlin Heidelberg
Published in
Progress in Artificial Intelligence / Issue 3/2018
Print ISSN: 2192-6352
Electronic ISSN: 2192-6360
DOI
https://doi.org/10.1007/s13748-018-0147-7
