
2019 | OriginalPaper | Chapter

Can Commit Change History Reveal Potential Fault Prone Classes? A Study on GitHub Repositories

Authors: Chun Yong Chong, Sai Peck Lee

Published in: Software Technologies

Publisher: Springer International Publishing

Abstract

Various studies have successfully used graph theory analysis to gain a high-level, abstract view of software systems, for example by constructing a call graph to visualize the dependencies among software components. The level of granularity and the information shown by the graph usually depend on the input, such as variables, methods, classes, packages, or a combination of multiple levels. However, very few studies have investigated how software evolution and change history can be used as a basis for modeling a software-based complex network. It is commonly understood that stable, well-designed source code receives fewer updates throughout the software development lifecycle, whereas badly designed code tends to be updated because of broken dependencies, high coupling, or dependencies with other classes. This paper puts forward an approach to model a commit change-based weighted complex network from historical software change and evolution data captured from GitHub repositories, with the aim of identifying potential fault prone classes. Four well-established graph centrality metrics were used as proxy metrics to discover fault prone classes. Experiments on ten open-source projects showed that using all centrality metrics together yields reasonably good precision when compared against the ground truth.
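As a rough illustration of the idea in the abstract, the sketch below builds a weighted co-change network from a commit history and ranks classes by one simple centrality measure. It is not the authors' implementation. The abstract does not name the four centrality metrics or say how edge weights are derived, so the weighting (number of commits in which two classes change together), the use of weighted degree, the function names, and the sample commits and ground truth are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def build_cochange_network(commits):
    """Return {class_a: {class_b: weight}}.

    Assumption: the weight of an edge is the number of commits
    in which both classes were changed together.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for changed in commits:
        # Every unordered pair of classes touched by the same commit gets +1.
        for a, b in combinations(sorted(set(changed)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def weighted_degree(graph):
    """Sum of incident edge weights per node (one simple centrality measure)."""
    return {node: sum(neighbours.values()) for node, neighbours in graph.items()}


def precision(flagged, ground_truth):
    """Fraction of flagged classes that appear in the ground truth."""
    flagged = set(flagged)
    if not flagged:
        return 0.0
    return len(flagged & set(ground_truth)) / len(flagged)


# Hypothetical commit history: each entry lists the classes changed by one commit.
commits = [
    ["Parser", "Lexer"],
    ["Parser", "Lexer", "Ast"],
    ["Parser", "Ast"],
    ["Logger"],  # changed alone, so it never gains an edge
]

network = build_cochange_network(commits)
scores = weighted_degree(network)

# Flag the two highest-scoring classes; ties are broken alphabetically.
top = sorted(scores, key=lambda name: (-scores[name], name))[:2]

# Hypothetical ground truth of classes known to be fault prone.
known_faulty = {"Parser", "Lexer"}
print(top, precision(top, known_faulty))
```

A class that only ever changes on its own never enters this network, which is one reason a study of this kind might combine several centrality metrics rather than rely on a single one.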

Metadata

Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-030-29157-0_12
