Abstract
This paper identifies and formalizes a prevalent class of asymptotic performance bugs called redundant traversal bugs and presents a novel static analysis for automatically detecting them. We evaluate our technique by implementing it in a tool called CLARITY and applying it to widely-used software packages such as the Google Core Collections Library, the Apache Common Collections, and the Apache Ant build tool. Across 1.6M lines of Java code, CLARITY finds 92 instances of redundant traversal bugs, including 72 that have never been previously reported, with just 5 false positives. To evaluate the performance impact of these bugs, we manually repair these programs and find that for an input size of 50,000, all repaired programs are at least 2.45 faster than their original code.
- A. Aiken, S. Bugrara, I. Dillig, T. Dillig, B. Hackett, and P. Hawkins. An overview of the Saturn Project. In PASTE, pages 43–48, 2007. Google ScholarDigital Library
- R. Z. Altucher and W. Landi. An extended form of must alias analysis for dynamic allocation. In PLDI, pages 74–84. ACM, 1995. Google ScholarDigital Library
- M. Attariyan, M. Chow, and J. Flinn. X-ray: Automating root-cause diagnosis of performance anomalies in production software. In OSDI, pages 307–320, 2012. Google ScholarDigital Library
- T. Ball and S. Rajamani. The SLAM project: debugging system software via static analysis. In POPL, pages 1–3, 2002. Google ScholarDigital Library
- W. Bush, J. Pincus, and D. Sielaff. A static analyzer for finding dynamic programming errors. Software: Practice and Experience, 30 (7):775–802, 2000. Google ScholarDigital Library
- C. Cadar, D. Dunbar, and D. R. Engler. Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, volume 8, pages 209–224, 2008. Google ScholarDigital Library
- P. ˇ Cern`y and R. Alur. Automated analysis of Java methods for confidentiality. In CAV, pages 173–187. Springer, 2009. Google ScholarDigital Library
- T. Chen, W. Shang, Z. Jiang, A. Hassan, M. Nasser, and P. Flora. Detecting performance anti-patterns for applications developed using object-relational mapping. In ICSE, pages 1013–1024. ACM, 2014. Google ScholarDigital Library
- P. Cousot, R. Cousot, M. Fähndrich, and F. Logozzo. Automatic inference of necessary preconditions. In Verification, Model Checking, and Abstract Interpretation, pages 128–148. Springer, 2013.Google Scholar
- A. Deutsch. Interprocedural may-alias analysis for pointers: Beyond k-limiting. In PLDI, pages 230–241. ACM, 1994. Google ScholarDigital Library
- I. Dillig, T. Dillig, and A. Aiken. Sound, complete and scalable pathsensitive analysis. In PLDI, volume 43, pages 270–280, 2008. Google ScholarDigital Library
- I. Dillig, T. Dillig, and A. Aiken. Fluid updates: Beyond strong vs. weak updates. In ESOP, pages 246–266. 2010. Google ScholarDigital Library
- I. Dillig, T. Dillig, and A. Aiken. Precise reasoning for programs using containers. In ACM SIGPLAN Notices, volume 46, pages 187–200. ACM, 2011. Google ScholarDigital Library
- I. Dillig, T. Dillig, A. Aiken, and M. Sagiv. Precise and compact modular procedure summaries for heap manipulating programs. In PLDI, pages 567–577. ACM, 2011. Google ScholarDigital Library
- A. Diwan, M. Hauswirth, T. Mytkowicz, and P. F. Sweeney. Traceanalyzer: a system for processing performance traces. Software: Practice and Experience, 41(3):267–282, 2011. Google ScholarDigital Library
- B. Dufour, B. G. Ryder, and G. Sevitsky. A scalable technique for characterizing the usage of temporaries in framework-intensive Java applications. In FSE, pages 59–70, 2008. Google ScholarDigital Library
- S. J. Fink, E. Yahav, N. Dor, G. Ramalingam, and E. Geay. Effective typestate verification in the presence of aliasing. ACM Transactions on Software Engineering and Methodology (TOSEM), 17(2):9, 2008. Google ScholarDigital Library
- P. Godefroid, N. Klarlund, and K. Sen. DART: directed automated random testing. In ACM SIGPLAN Notices, volume 40, pages 213– 223. ACM, 2005. Google ScholarDigital Library
- S. F. Goldsmith, A. S. Aiken, and D. S. Wilkerson. Measuring empirical computational complexity. In FSE, pages 395–404, 2007. Google ScholarDigital Library
- S. Gulwani. Speed: Symbolic complexity bound analysis. In Computer Aided Verification, pages 51–62. Springer, 2009. Google ScholarDigital Library
- S. Gulwani, K. K. Mehra, and T. Chilimbi. Speed: precise and efficient static estimation of program computational complexity. In ACM SIGPLAN Notices, volume 44, pages 127–139. ACM, 2009. Google ScholarDigital Library
- T. A. Henzinger, R. Jhala, R. Majumdar, and G. Sutre. Software verification with BLAST. In Model Checking Software, pages 235– 239. Springer, 2003. Google ScholarDigital Library
- J. Hoffmann, K. Aehlig, and M. Hofmann. Multivariate amortized resource analysis. In ACM SIGPLAN Notices, volume 46, pages 357– 370. ACM, 2011. Google ScholarDigital Library
- S. Jagannathan, P. Thiemann, S. Weeks, and A. Wright. Single and loving it: Must-alias analysis for higher-order languages. In PLDI, pages 329–341. ACM, 1998.Google Scholar
- G. Jin, L. Song, X. Shi, J. Scherpelz, and S. Lu. Understanding and detecting real-world performance bugs. ACM SIGPLAN Notices, 47 (6):77–88, 2012. Google ScholarDigital Library
- J. Knoop, L. Kovács, and J. Zwirchmayr. Symbolic loop bound computation for WCET analysis. In Perspectives of Systems Informatics, pages 227–242. Springer, 2012. Google ScholarDigital Library
- W. Landi and B. G. Ryder. A safe approximate algorithm for interprocedural aliasing. ACM SIGPLAN Notices, 27(7):235–248, 1992. Google ScholarDigital Library
- Y. Liu, C. Xu, and S. Cheung. Characterizing and detecting performance bugs for smartphone applications. In ICSE, pages 1013–1024, 2014. Google ScholarDigital Library
- D. Marinov and S. Khurshid. TestEra: a novel framework for automated testing of Java programs. In 16th IEEE Conference on Automated Software Engineering, page 22, 2001. Google ScholarDigital Library
- M. Naik and A. Aiken. Conditional must not aliasing for static race detection. ACM SIGPLAN Notices, 42(1):327–338, 2007. Google ScholarDigital Library
- K. Nguyen and G. Xu. Cachetor: Detecting cacheable data to remove bloat. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, pages 268–278. ACM, 2013. Google ScholarDigital Library
- A. Nistor, L. Song, D. Marinov, and S. Lu. Toddler: Detecting performance problems via similar memory-access patterns. In ICSE, pages 562–571. IEEE, 2013. Google ScholarDigital Library
- K. Sen, D. Marinov, and G. Agha. CUTE: a concolic unit testing engine for c. In Proceedings of the 10th European Software Engineering Conference, pages 263–272, 2005. Google ScholarDigital Library
- O. Shacham, M. Vechev, and E. Yahav. Chameleon: adaptive selection of collections. In ACM SIGPLAN Notices, volume 44, pages 408–418. ACM, 2009. Google ScholarDigital Library
- A. Shankar, M. Arnold, and R. Bodik. Jolt: lightweight dynamic analysis and removal of object churn. In ACM SIGPLAN Notices, volume 43, pages 127–142. ACM, 2008. Google ScholarDigital Library
- L. Song and S. Lu. Statistical debugging for real-world performance problems. In OOPSLA, NY, USA, 2014. ACM. Google ScholarDigital Library
- M. Sridharan and R. Bod´ık. Refinement-based context-sensitive points-to analysis for Java. ACM SIGPLAN Notices, 41(6):387–400, 2006. Google ScholarDigital Library
- B. Steensgaard. Points-to analysis in almost linear time. In PLDI, pages 32–41. ACM, 1996.Google Scholar
- R. Vallée-Rai, P. Co, E. Gagnon, L. Hendren, P. Lam, and V. Sundaresan. Soot-a Java bytecode optimization framework. In Proceedings of the 1999 Conference of the Centre for Advanced Studies on Collaborative research, page 13. IBM Press, 1999. Google ScholarDigital Library
- R. P. Wilson and M. S. Lam. Efficient context-sensitive pointer analysis for c programs. ACM SIGPLAN Notices, 30(6):1–12, 1995. Google ScholarDigital Library
- G. Xu. CoCo: sound and adaptive replacement of Java collections. In ECOOP 2013–Object-Oriented Programming, pages 1–26. Springer, 2013. Google ScholarDigital Library
- G. Xu and A. Rountev. Detecting inefficiently-used containers to avoid bloat. ACM SIGPLAN Notices, 45(6):160–173, 2010. Google ScholarDigital Library
- G. Xu, M. Arnold, N. Mitchell, A. Rountev, and G. Sevitsky. Go with the flow: profiling copies to find runtime bloat. In ACM SIGPLAN Notices, volume 44, pages 419–430. ACM, 2009. Google ScholarDigital Library
- G. Xu, M. Arnold, A. Rountev, and G. Sevitsky. Finding low utility data structures. In PLDI, pages 174–186. ACM, 2010. Google ScholarDigital Library
- X. Yu, S. Han, D. Zhang, and T. Xie. Comprehending performance from real-world execution traces: A device-driver case. In ASPLOS, pages 193–206. ACM, 2014. Google ScholarDigital Library
- S. Zaman, B. Adams, and A. E. Hassan. A qualitative study on performance bugs. In Mining Software Repositories. ACM, 2012. Google ScholarDigital Library
Index Terms
- Static detection of asymptotic performance bugs in collection traversals
Recommendations
Understanding and detecting real-world performance bugs
PLDI '12Developers frequently use inefficient code sequences that could be fixed by simple patches. These inefficient code sequences can cause significant performance degradation and resource waste, referred to as performance bugs. Meager increases in single ...
Static detection of asymptotic performance bugs in collection traversals
PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and ImplementationThis paper identifies and formalizes a prevalent class of asymptotic performance bugs called redundant traversal bugs and presents a novel static analysis for automatically detecting them. We evaluate our technique by implementing it in a tool called ...
A qualitative study on performance bugs
MSR '12: Proceedings of the 9th IEEE Working Conference on Mining Software RepositoriesSoftware performance is one of the important qualities that makes software stand out in a competitive market. However, in earlier work we found that performance bugs take more time to fix, need to be fixed by more experienced developers and require ...
Comments