skip to main content
research-article

Static detection of asymptotic performance bugs in collection traversals

Published:03 June 2015Publication History
Skip Abstract Section

Abstract

This paper identifies and formalizes a prevalent class of asymptotic performance bugs called redundant traversal bugs and presents a novel static analysis for automatically detecting them. We evaluate our technique by implementing it in a tool called CLARITY and applying it to widely-used software packages such as the Google Core Collections Library, the Apache Common Collections, and the Apache Ant build tool. Across 1.6M lines of Java code, CLARITY finds 92 instances of redundant traversal bugs, including 72 that have never been previously reported, with just 5 false positives. To evaluate the performance impact of these bugs, we manually repair these programs and find that for an input size of 50,000, all repaired programs are at least 2.45 faster than their original code.

References

  1. A. Aiken, S. Bugrara, I. Dillig, T. Dillig, B. Hackett, and P. Hawkins. An overview of the Saturn Project. In PASTE, pages 43–48, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. R. Z. Altucher and W. Landi. An extended form of must alias analysis for dynamic allocation. In PLDI, pages 74–84. ACM, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. M. Attariyan, M. Chow, and J. Flinn. X-ray: Automating root-cause diagnosis of performance anomalies in production software. In OSDI, pages 307–320, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. T. Ball and S. Rajamani. The SLAM project: debugging system software via static analysis. In POPL, pages 1–3, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. W. Bush, J. Pincus, and D. Sielaff. A static analyzer for finding dynamic programming errors. Software: Practice and Experience, 30 (7):775–802, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. C. Cadar, D. Dunbar, and D. R. Engler. Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, volume 8, pages 209–224, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. P. ˇ Cern`y and R. Alur. Automated analysis of Java methods for confidentiality. In CAV, pages 173–187. Springer, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. T. Chen, W. Shang, Z. Jiang, A. Hassan, M. Nasser, and P. Flora. Detecting performance anti-patterns for applications developed using object-relational mapping. In ICSE, pages 1013–1024. ACM, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. P. Cousot, R. Cousot, M. Fähndrich, and F. Logozzo. Automatic inference of necessary preconditions. In Verification, Model Checking, and Abstract Interpretation, pages 128–148. Springer, 2013.Google ScholarGoogle Scholar
  10. A. Deutsch. Interprocedural may-alias analysis for pointers: Beyond k-limiting. In PLDI, pages 230–241. ACM, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. I. Dillig, T. Dillig, and A. Aiken. Sound, complete and scalable pathsensitive analysis. In PLDI, volume 43, pages 270–280, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. I. Dillig, T. Dillig, and A. Aiken. Fluid updates: Beyond strong vs. weak updates. In ESOP, pages 246–266. 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. I. Dillig, T. Dillig, and A. Aiken. Precise reasoning for programs using containers. In ACM SIGPLAN Notices, volume 46, pages 187–200. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. I. Dillig, T. Dillig, A. Aiken, and M. Sagiv. Precise and compact modular procedure summaries for heap manipulating programs. In PLDI, pages 567–577. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. A. Diwan, M. Hauswirth, T. Mytkowicz, and P. F. Sweeney. Traceanalyzer: a system for processing performance traces. Software: Practice and Experience, 41(3):267–282, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. B. Dufour, B. G. Ryder, and G. Sevitsky. A scalable technique for characterizing the usage of temporaries in framework-intensive Java applications. In FSE, pages 59–70, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. S. J. Fink, E. Yahav, N. Dor, G. Ramalingam, and E. Geay. Effective typestate verification in the presence of aliasing. ACM Transactions on Software Engineering and Methodology (TOSEM), 17(2):9, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. P. Godefroid, N. Klarlund, and K. Sen. DART: directed automated random testing. In ACM SIGPLAN Notices, volume 40, pages 213– 223. ACM, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. S. F. Goldsmith, A. S. Aiken, and D. S. Wilkerson. Measuring empirical computational complexity. In FSE, pages 395–404, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. S. Gulwani. Speed: Symbolic complexity bound analysis. In Computer Aided Verification, pages 51–62. Springer, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. S. Gulwani, K. K. Mehra, and T. Chilimbi. Speed: precise and efficient static estimation of program computational complexity. In ACM SIGPLAN Notices, volume 44, pages 127–139. ACM, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. T. A. Henzinger, R. Jhala, R. Majumdar, and G. Sutre. Software verification with BLAST. In Model Checking Software, pages 235– 239. Springer, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. J. Hoffmann, K. Aehlig, and M. Hofmann. Multivariate amortized resource analysis. In ACM SIGPLAN Notices, volume 46, pages 357– 370. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. S. Jagannathan, P. Thiemann, S. Weeks, and A. Wright. Single and loving it: Must-alias analysis for higher-order languages. In PLDI, pages 329–341. ACM, 1998.Google ScholarGoogle Scholar
  25. G. Jin, L. Song, X. Shi, J. Scherpelz, and S. Lu. Understanding and detecting real-world performance bugs. ACM SIGPLAN Notices, 47 (6):77–88, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. J. Knoop, L. Kovács, and J. Zwirchmayr. Symbolic loop bound computation for WCET analysis. In Perspectives of Systems Informatics, pages 227–242. Springer, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. W. Landi and B. G. Ryder. A safe approximate algorithm for interprocedural aliasing. ACM SIGPLAN Notices, 27(7):235–248, 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Y. Liu, C. Xu, and S. Cheung. Characterizing and detecting performance bugs for smartphone applications. In ICSE, pages 1013–1024, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. D. Marinov and S. Khurshid. TestEra: a novel framework for automated testing of Java programs. In 16th IEEE Conference on Automated Software Engineering, page 22, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. M. Naik and A. Aiken. Conditional must not aliasing for static race detection. ACM SIGPLAN Notices, 42(1):327–338, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. K. Nguyen and G. Xu. Cachetor: Detecting cacheable data to remove bloat. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, pages 268–278. ACM, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. A. Nistor, L. Song, D. Marinov, and S. Lu. Toddler: Detecting performance problems via similar memory-access patterns. In ICSE, pages 562–571. IEEE, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. K. Sen, D. Marinov, and G. Agha. CUTE: a concolic unit testing engine for c. In Proceedings of the 10th European Software Engineering Conference, pages 263–272, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. O. Shacham, M. Vechev, and E. Yahav. Chameleon: adaptive selection of collections. In ACM SIGPLAN Notices, volume 44, pages 408–418. ACM, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. A. Shankar, M. Arnold, and R. Bodik. Jolt: lightweight dynamic analysis and removal of object churn. In ACM SIGPLAN Notices, volume 43, pages 127–142. ACM, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. L. Song and S. Lu. Statistical debugging for real-world performance problems. In OOPSLA, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. M. Sridharan and R. Bod´ık. Refinement-based context-sensitive points-to analysis for Java. ACM SIGPLAN Notices, 41(6):387–400, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. B. Steensgaard. Points-to analysis in almost linear time. In PLDI, pages 32–41. ACM, 1996.Google ScholarGoogle Scholar
  39. R. Vallée-Rai, P. Co, E. Gagnon, L. Hendren, P. Lam, and V. Sundaresan. Soot-a Java bytecode optimization framework. In Proceedings of the 1999 Conference of the Centre for Advanced Studies on Collaborative research, page 13. IBM Press, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. R. P. Wilson and M. S. Lam. Efficient context-sensitive pointer analysis for c programs. ACM SIGPLAN Notices, 30(6):1–12, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. G. Xu. CoCo: sound and adaptive replacement of Java collections. In ECOOP 2013–Object-Oriented Programming, pages 1–26. Springer, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. G. Xu and A. Rountev. Detecting inefficiently-used containers to avoid bloat. ACM SIGPLAN Notices, 45(6):160–173, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. G. Xu, M. Arnold, N. Mitchell, A. Rountev, and G. Sevitsky. Go with the flow: profiling copies to find runtime bloat. In ACM SIGPLAN Notices, volume 44, pages 419–430. ACM, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. G. Xu, M. Arnold, A. Rountev, and G. Sevitsky. Finding low utility data structures. In PLDI, pages 174–186. ACM, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. X. Yu, S. Han, D. Zhang, and T. Xie. Comprehending performance from real-world execution traces: A device-driver case. In ASPLOS, pages 193–206. ACM, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. S. Zaman, B. Adams, and A. E. Hassan. A qualitative study on performance bugs. In Mining Software Repositories. ACM, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Static detection of asymptotic performance bugs in collection traversals

                Recommendations

                Comments

                Login options

                Check if you have access through your login credentials or your institution to get full access on this article.

                Sign in

                Full Access

                • Published in

                  cover image ACM SIGPLAN Notices
                  ACM SIGPLAN Notices  Volume 50, Issue 6
                  PLDI '15
                  June 2015
                  630 pages
                  ISSN:0362-1340
                  EISSN:1558-1160
                  DOI:10.1145/2813885
                  • Editor:
                  • Andy Gill
                  Issue’s Table of Contents
                  • cover image ACM Conferences
                    PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation
                    June 2015
                    630 pages
                    ISBN:9781450334686
                    DOI:10.1145/2737924

                  Copyright © 2015 ACM

                  Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                  Publisher

                  Association for Computing Machinery

                  New York, NY, United States

                  Publication History

                  • Published: 3 June 2015

                  Check for updates

                  Qualifiers

                  • research-article

                PDF Format

                View or Download as a PDF file.

                PDF

                eReader

                View online with eReader.

                eReader