Published in: Automated Software Engineering 1/2022

1 May 2022

Towards automatic detection and prioritization of pre-logging overhead: a case study of hadoop ecosystem


Abstract

Pre-logging overhead is the overhead produced by pre-logging statements (PLSs), which construct the parameters of logging calls. In practice, these logging-related statements are usually guarded by conditional statements, known as logging guards, so that they execute only when necessary (e.g., during debugging). However, developers occasionally forget to add logging guards to costly PLSs. The resulting missing logging guard (MLG) issues can significantly degrade performance, particularly in high-throughput software, and can lead to critical performance issues. In this paper, (1) we conduct the first empirical study of 137 commits addressing MLG issues in five popular open-source software projects of the Hadoop ecosystem. Based on the results, we report five findings on the current practice of logging guards. (2) We devise an accurate algorithm to detect PLSs (over 95% in both precision and recall) and uncover 16 problematic partially guarded logging calls, 10 of which have been confirmed and fixed by developers. (3) We investigate two metric-based ranking approaches that use six software metrics to prioritize PLSs by their impact on performance. We find that the execution frequency of PLSs is the best-performing metric and that combining multiple software metrics can improve ranking performance (by 7.8% on average).
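
To make the terminology concrete, the following is a minimal sketch of a pre-logging statement with and without a logging guard. It is not taken from the paper or from any Hadoop project: it uses the JDK's `java.util.logging` for self-containment, and the class `LoggingGuardSketch`, the helper `summarize`, and the counter `summaryCalls` are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative sketch only; all names here are hypothetical.
class LoggingGuardSketch {
    private static final Logger LOG =
            Logger.getLogger(LoggingGuardSketch.class.getName());

    // Counts how often the costly helper runs, so the effect of the guard is visible.
    static int summaryCalls = 0;

    // Hypothetical costly helper whose only purpose is to build a logging parameter.
    static String summarize(List<Integer> blocks) {
        summaryCalls++;
        StringBuilder sb = new StringBuilder();
        for (int b : blocks) {
            sb.append(b).append(',');
        }
        return sb.toString();
    }

    // Missing logging guard: the pre-logging statement runs even when FINE is disabled.
    static void unguarded(List<Integer> blocks) {
        String summary = summarize(blocks);   // pre-logging statement (PLS)
        LOG.fine("blocks: " + summary);       // logging call
    }

    // Guarded variant: the PLS runs only if the message would actually be logged.
    static void guarded(List<Integer> blocks) {
        if (LOG.isLoggable(Level.FINE)) {     // logging guard
            String summary = summarize(blocks);
            LOG.fine("blocks: " + summary);
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO);             // FINE (debug-like) output is disabled
        List<Integer> blocks = Arrays.asList(1, 2, 3);

        unguarded(blocks);
        System.out.println("after unguarded: " + summaryCalls); // prints 1

        guarded(blocks);
        System.out.println("after guarded: " + summaryCalls);   // still prints 1
    }
}
```

With the logger set to `INFO`, the unguarded method still pays for `summarize`, while the guarded method skips it. On a hot path that difference is the pre-logging overhead described in the abstract.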


Metadata
Title
Towards automatic detection and prioritization of pre-logging overhead: a case study of hadoop ecosystem
Publication date
1 May 2022
Published in
Automated Software Engineering / Issue 1/2022
Print ISSN: 0928-8910
Electronic ISSN: 1573-7535
DOI
https://doi.org/10.1007/s10515-021-00317-7
