DOI: 10.1145/3092703.3092725
research-article
Public Access
Artifacts Available
Artifacts Evaluated & Functional

PerfRanker: prioritization of performance regression tests for collection-intensive software

Published: 10 July 2017

ABSTRACT

Performance regression testing is an important but time- and resource-consuming phase of software development. Developers need to detect performance regressions as early as possible to reduce their negative impact and fixing cost. However, conducting performance regression testing frequently (e.g., after each commit) is prohibitively expensive. To address this issue, in this paper we propose PerfRanker, the first approach to prioritizing test cases in performance regression testing for collection-intensive software, a common type of modern software that makes heavy use of collections. Our test prioritization is based on a performance impact analysis that estimates the performance impact of a given code revision on a given test execution. Our evaluation shows that our approach covers the top 3 test cases whose performance is most affected within the top 30% to 37% of the prioritized test cases, in contrast to the top 65% to 79% required by 3 baseline techniques.
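The abstract describes the approach only at a high level; the sketch below illustrates the core idea of ranking tests so that those with the largest estimated performance impact run first. This is a hypothetical illustration, not the authors' implementation: the PerformanceImpactModel interface, its estimateImpact method, and the class names are placeholders standing in for the paper's performance impact analysis.

    import java.util.Comparator;
    import java.util.List;
    import java.util.stream.Collectors;

    // Hypothetical sketch only: ranks test cases so that those with the largest
    // estimated performance impact are executed first. The impact model is a
    // placeholder for the paper's performance impact analysis.
    interface PerformanceImpactModel {
        // Estimated impact of the given code revision on this test's execution time.
        double estimateImpact(String testCase, String codeRevision);
    }

    final class ImpactBasedPrioritizer {
        private final PerformanceImpactModel model;

        ImpactBasedPrioritizer(PerformanceImpactModel model) {
            this.model = model;
        }

        // Returns the test cases sorted by descending estimated impact.
        List<String> prioritize(List<String> testCases, String codeRevision) {
            return testCases.stream()
                    .sorted(Comparator.comparingDouble(
                            (String t) -> model.estimateImpact(t, codeRevision)).reversed())
                    .collect(Collectors.toList());
        }
    }

Under such a ranking, running only the first few tests in the returned order corresponds to the evaluation criterion in the abstract: how far down the prioritized list a developer must go to cover the tests whose performance is most affected by the revision.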


Published in

ISSTA 2017: Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis
July 2017, 447 pages
ISBN: 9781450350761
DOI: 10.1145/3092703
General Chair: Tevfik Bultan
Program Chair: Koushik Sen

Copyright © 2017 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
Published: 10 July 2017


Acceptance Rates

Overall Acceptance Rate: 58 of 213 submissions, 27%
