research-article

On the recall of static call graph construction in practice

Authors:
Li Sui

Massey University, Palmerston North, New Zealand

Massey University, Palmerston North, New Zealand
View Profile

,
Jens Dietrich

Victoria University of Wellington, Wellington, New Zealand

Victoria University of Wellington, Wellington, New Zealand
View Profile

,
Amjed Tahir

Massey University, Palmerston North, New Zealand

Massey University, Palmerston North, New Zealand
View Profile

,
George Fourtounis

University of Athens, Athens, Greece

University of Athens, Athens, Greece
View Profile

ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software EngineeringJune 2020Pages 1049–1060https://doi.org/10.1145/3377811.3380441

Published:01 October 2020Publication History

ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering

Pages 1049–1060

ABSTRACT

Static analyses have problems modelling dynamic language features soundly while retaining acceptable precision. The problem is well-understood in theory, but there is little evidence on how this impacts the analysis of real-world programs. We have studied this issue for call graph construction on a set of 31 real-world Java programs using an oracle of actual program behaviour recorded from executions of built-in and synthesised test cases with high coverage, have measured the recall that is being achieved by various static analysis algorithms and configurations, and investigated which language features lead to static analysis false negatives.

We report that (1) the median recall is 0.884 suggesting that standard static analyses have significant gaps with respect to the proportion of the program modelled (2) built-in tests are significantly better to expose dynamic program behaviour than synthesised tests (3) adding precision to the static analysis has little impact on recall indicating that those are separate concerns (4) state-of-the-art support for dynamic language features can significantly improve recall (the median observed is 0.935), but it comes with a hefty performance penalty, and (5) the main sources of unsoundness are not reflective method invocations, but objects allocated or accessed via native methods, and invocations initiated by the JVM, without matching call sites in the program under analysis.

These results provide some novel insights into the interaction between static and dynamic program analyses that can be used to assess the utility of static analysis results and to guide the development of future static and hybrid analyses.

References

[n. d.]. Invokedynamic Rectifier / Project Serializer. http://www.opal-project.de/DeveloperTools.html, accessed 14 Jan 2019.Google Scholar
[n. d.]. Standard Performance Evaluation Corporation: SPEC JVM98 Benchmarks.Google Scholar
Karim Ali and Ondřej Lhoták. 2013. Application-only call graph construction. In Proc. ECOOP'13. Springer.Google Scholar
Karim Ali and Ondřej Lhoták. 2013. Averroes: Whole-program analysis without the whole program. In Proc. ECOOP'13. Springer.Google Scholar
Nicholas Allen, Padmanabhan Krishnan, and Bernhard Scholz. 2015. Combining type-analysis with points-to analysis for analyzing Java library source-code. In Proc. SOAP'15. ACM, 13--18.Google ScholarDigital Library
Glenn Ammons, Thomas Ball, and James R Larus. 1997. Exploiting hardware performance counters with flow and context sensitive profiling. ACM Sigplan Notices 32, 5 (1997), 85--96.Google ScholarDigital Library
Lance Andersen. 2006. JDBC™4.0 Specification. JSR 221 (2006), 1--126.Google Scholar
Andrea Arcuri, Gordon Fraser, and Juan Pablo Galeotti. 2014. Automated unit test generation for classes with environment dependencies. In Proc. ASE'12. ACM.Google ScholarDigital Library
David F Bacon and Peter F Sweeney. 1996. Fast static analysis of C++ virtual function calls. In Proc. OOPSLA'96. ACM.Google ScholarDigital Library
Stephen M Blackburn, Robin Garner, Chris Hoffmann, Asjad M Khang, Kathryn S McKinley, Rotem Bentzur, Amer Diwan, Daniel Feinberg, Daniel Frampton, Samuel Z Guyer, et al. 2006. The DaCapo benchmarks: Java benchmarking development and analysis. In Proc. OOPSLA'06. ACM.Google ScholarDigital Library
Eric Bodden. 2012. InvokeDynamic Support in Soot. In Proc. SOAP'12. ACM.Google ScholarDigital Library
Eric Bodden, Andreas Sewe, Jan Sinschek, Hela Oueslati, and Mira Mezini. 2011. Taming reflection: Aiding static analysis in the presence of reflection and custom class loaders. In Proc. ICSE'11. ACM.Google ScholarDigital Library
Martin Bravenboer and Yannis Smaragdakis. 2009. Strictly Declarative Specification of Sophisticated Points-to Analyses. In Proc. OOPSLA'09. ACM.Google ScholarDigital Library
Eric Bruneton, Romain Lenglet, and Thierry Coupaye. 2002. ASM: a code manipulation tool to implement adaptable systems. Adaptable and extensible component systems 30, 19 (2002).Google Scholar
Christoph Csallner and Yannis Smaragdakis. 2004. JCrasher: an automatic robustness tester for Java. Software: Practice and Experience 34, 11 (2004), 1025--1050.Google ScholarDigital Library
Jens Dietrich, Henrik Schole, Li Sui, and Ewan Tempero. 2017. XCorpus-An executable Corpus of Java Programs. JOT 16, 4 (2017), 1:1--24.Google Scholar
Jens Dietrich, Li Sui, Shawn Rasheed, and Amjed Tahir. 2017. On the construction of soundness oracles. In Proc. SOAP'17. ACM.Google ScholarDigital Library
Michael D Ernst. 2003. Static and dynamic analysis: Synergy and duality. In Proc. WODA'03.Google Scholar
Stephen Fink and Julian Dolby. 2012. WALA-The TJ Watson Libraries for Analysis. https://github.com/wala/WALA.Google Scholar
Brian Foote and Ralph E Johnson. 1989. Reflective facilities in Smalltalk-80. In Proc. OOPSLA'89. ACM.Google ScholarDigital Library
George Fourtounis, George Kastrinis, and Yannis Smaragdakis. 2018. Static Analysis of Java Dynamic Proxies. In Proc. ISSTA'18. ACM.Google ScholarDigital Library
George Fourtounis and Yannis Smaragdakis. 2020. Deep Static Modeling of invokedynamic. In Proc. ECOOP'19.Google Scholar
Gordon Fraser and Andrea Arcuri. 2011. Evosuite: automatic test suite generation for object-oriented software. In Proc. ESEC/FSE'11. ACM.Google ScholarDigital Library
Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. 1993. Design patterns: Abstraction and reuse of object-oriented design. In Proc. ECOOP'93. Springer.Google ScholarCross Ref
Brian Goetz. 2012. Translation of Lambda Expressions. http://cr.openjdk.java.net/~briangoetz/lambda/lambda-translation.html.Google Scholar
Neville Grech, George Fourtounis, Adrian Francalanza, and Yannis Smaragdakis. 2017. Heaps Don't Lie: Countering Unsoundness with Heap Snapshots. In Proc. OOPSLA'17. ACM.Google ScholarDigital Library
Neville Grech, George Fourtounis, Adrian Francalanza, and Yannis Smaragdakis. 2018. Shooting from the heap: Ultra-scalable static analysis with heap snapshots. In Proc. ISSTA'18. ACM.Google ScholarDigital Library
Neville Grech, George Kastrinis, and Yannis Smaragdakis. 2018. Efficient reflection string analysis via graph coloring. In Proc. ECOOP'18. Schloss Dagstuhl-Leibniz-Zentrum für Informatik.Google Scholar
Neville Grech and Yannis Smaragdakis. 2017. P/Taint: unified points-to and taint analysis. In Proc. OOPSLA'17. ACM.Google ScholarDigital Library
David Grove, Greg DeFouw, Jeffrey Dean, and Craig Chambers. 1997. Call graph construction in object-oriented languages. In Proc. OOPSLA'97. ACM.Google ScholarDigital Library
Jens Knoop, Oliver Rüthing, and Bernhard Steffen. 1994. Partial dead code elimination. In Proc. PLDI'94. ACM.Google ScholarDigital Library
Patrick Lam, Eric Bodden, Ondrej Lhoták, and Laurie Hendren. 2011. The Soot framework for Java program analysis: a retrospective. In Proc. CETUS'11.Google Scholar
Davy Landman, Alexander Serebrenik, and Jurgen J Vinju. 2017. Challenges for Static Analysis of Java Reflection-Literature Review and Empirical Study. In Proc. ICSE'17. IEEE.Google ScholarDigital Library
Thomas Lengauer and Robert Endre Tarjan. 1979. A fast algorithm for finding dominators in a flowgraph. ACM Transactions on Programming Languages and Systems (TOPLAS) 1, 1 (1979), 121--141.Google ScholarDigital Library
Ondrej Lhoták. 2007. Comparing call graphs. In Proc. PASTE '07. ACM.Google ScholarDigital Library
Yue Li, Tian Tan, Anders Møller, and Yannis Smaragdakis. 2018. Precision-guided context sensitivity for pointer analysis. In Proc. OOPSLA'18. ACM.Google ScholarDigital Library
Yue Li, Tian Tan, Anders Møller, and Yannis Smaragdakis. 2018. Scalability-first pointer analysis with self-tuning context-sensitivity. In Proc. ESEC/FSE'18. ACM.Google ScholarDigital Library
Yue Li, Tian Tan, Yulei Sui, and Jingling Xue. 2014. Self-inferencing reflection resolution for Java. In Proc. ECOOP'14. Springer.Google ScholarDigital Library
Yue Li, Tian Tan, and Jingling Xue. 2019. Understanding and Analyzing Java Reflection. ACM Transactions on Software Engineering and Methodology (TOSEM) 28, 2 (2019), 7.Google ScholarDigital Library
Tim Lindholm, Frank Yellin, Gilad Bracha, and Alex Buckley. 2015. The Java virtual machine specification: Java SE 8 edition, 2015. (2015). https://docs.oracle.com/javase/specs/jvms/se8/html/index.html.Google Scholar
Jie Liu, Yue Li, Tian Tan, and Jingling Xue. 2017. Reflection Analysis for Java: Uncovering More Reflective Targets Precisely. In Proc. ISSRE'17. IEEE.Google ScholarCross Ref
Benjamin Livshits, Manu Sridharan, Yannis Smaragdakis, Ondřej Lhoták, J Nelson Amaral, Bor-Yuh Evan Chang, Samuel Z Guyer, Uday P Khedker, Anders Møller, and Dimitrios Vardoulakis. 2015. In defense of soundiness: a manifesto. CACM 58, 2 (2015), 44--46.Google ScholarDigital Library
Benjamin Livshits, John Whaley, and Monica S Lam. 2005. Reflection analysis for Java. In Proc. APLAS'05. Springer.Google ScholarDigital Library
Qingzhou Luo, Farah Hariri, Lamyaa Eloussi, and Darko Marinov. 2014. An empirical analysis of flaky tests. In Proc. FSE'14. ACM.Google ScholarDigital Library
Luis Mastrangelo, Luca Ponzanelli, Andrea Mocci, Michele Lanza, Matthias Hauswirth, and Nathaniel Nystrom. 2015. Use at your own risk: the Java unsafe API in the wild. In Proc. OOPSLA'15. ACM.Google ScholarDigital Library
Gail C Murphy, David Notkin, William G Griswold, and Erica S Lan. 1998. An empirical study of static call graph extractors. ACM TOSEM 7, 2 (1998), 158--191.Google ScholarDigital Library
James Newsome and Dawn Xiaodong Song. 2005. Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software. In Proc. NDSS'05. Internet Society.Google Scholar
Carlos Pacheco and Michael D Ernst. 2007. Randoop: feedback-directed random testing for Java. In Proc. OOPSLA'07. ACM.Google ScholarDigital Library
Felipe Pontes, Rohit Gheyi, Sabrina Souto, Alessandro Garcia, and Márcio Ribeiro. 2019. Java reflection API: revealing the dark side of the mirror. In Proc. FSE'19. ACM.Google ScholarDigital Library
Michael Reif, Michael Eichberg, Ben Hermann, Johannes Lerch, and Mira Mezini. 2016. Call graph construction for java libraries. In Proc. FSE'16. ACM, 474--486.Google ScholarDigital Library
Michael Reif, Florian Kübler, Michael Eichberg, Dominik Helm, and Mira Mezini. 2019. Judge: Identifying, Understanding, and Evaluating Sources of Unsoundness in Call Graphs. In Proc. ISSTA'19. ACM.Google ScholarDigital Library
Michael Reif, Florian Kübler, Michael Eichberg, and Mira Mezini. 2018. Systematic Evaluation of the Unsoundness of Call Graph Construction Algorithms for Java. In Proc. SOAP'18. ACM.Google ScholarDigital Library
Henry Gordon Rice. 1953. Classes of recursively enumerable sets and their decision problems. Trans. Amer. Math. Soc. 74, 2 (1953), 358--366.Google ScholarCross Ref
August Shi, Milica Hadzi-Tanovic, Lingming Zhang, Marinov, and Owolabi Legunsen. 2019. Reflection-aware static regression test selection. In Proc. OOPSLA'19. ACM.Google ScholarDigital Library
Olin Shivers. 1991. Control-flow analysis of higher-order languages. Ph.D. Dissertation. Carnegie Mellon University.Google ScholarDigital Library
Yannis Smaragdakis, George Balatsouras, George Kastrinis, and Martin Bravenboer. 2015. More sound static handling of Java reflection. In Proc. APLAS'15. Springer.Google ScholarCross Ref
Yannis Smaragdakis and George Kastrinis. 2018. Defensive Points-To Analysis: Effective Soundness via Laziness. In Proc. ECOOP'18. Schloss Dagstuhl-Leibniz-Zentrum für Informatik.Google Scholar
Brian Cantwell Smith. 1984. Reflection and semantics in Lisp. In Proc. POPL'84. ACM.Google ScholarDigital Library
Li Sui, Jens Dietrich, Michael Emery, Shawn Rasheed, and Amjed Tahir. 2018. On the Soundness of Call Graph Construction in the Presence of Dynamic Language Features-A Benchmark and Tool Evaluation. In Proc. APLAS'18. Springer.Google ScholarCross Ref
Li Sui, Jens Dietrich, and Amjed Tahir. 2017. On the Use of Mined Stack Traces to Improve the Soundness of Statically Constructed Call Graphs. In Proc. APSEC'17. IEEE.Google ScholarCross Ref
Vijay Sundaresan, Laurie Hendren, Chrislain Razafimahefa, Raja Vallée-Rai, Patrick Lam, Etienne Gagnon, and Charles Godin. 2000. Practical virtual method call resolution for Java. In Proc. OOPSLA'00. ACM.Google ScholarDigital Library
Ewan Tempero, Craig Anslow, Jens Dietrich, Ted Han, Jing Li, Markus Lumpe, Hayden Melton, and James Noble. 2010. Qualitas Corpus: A Curated Collection of Java Code for Empirical Studies. In Proc. APSEC'10. IEEE.Google ScholarDigital Library
Frank Tip and Jens Palsberg. 2000. Scalable Propagation-based Call Graph Construction Algorithms. In Proc. OOPSLA'00. ACM.Google ScholarDigital Library

Index Terms

On the recall of static call graph construction in practice
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
  2. Software organization and properties
    1. Software functional properties
      1. Formal methods
        Automated static analysis
        Dynamic analysis

Recommendations

Call graph construction for Java libraries
FSE 2016: Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering

Today, every application uses software libraries. Yet, while a lot of research exists w.r.t. analyzing applications, research that targets the analysis of libraries independent of any application is scarce. This is unfortunate, because, for developers ...
Read More
Combined Static and Dynamic Analysis

Static analysis is usually faster than dynamic analysis but less precise. Therefore it is often desirable to retain information from static analysis for run-time verification, or to compare the results of both techniques. However, this requires writing ...
Read More
Static analysis for dynamic updates
CEE-SECR '13: Proceedings of the 9th Central & Eastern European Software Engineering Conference in Russia

Dynamic system update problem, DSU, is about replacing a running application with another version without interrupting its work or disrupting its users. Some solutions for DSU support only certain kinds of updates, which makes it dangerous to apply them ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering
June 2020
1640 pages
ISBN:9781450371216
DOI:10.1145/3377811
General Chairs:
Gregg Rothermel
North Carolina State University
,
Doo-Hwan Bae
KAIST, South Korea
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 October 2020
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Badges
- Artifacts Available / v1.1
- Artifacts Evaluated & Reusable / v1.1
Author Tags
Java
call graph construction
soundness
static program analysis
test case generation
testing
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate276of1,856submissions,15%

Upcoming Conference

ICSE 2025

2025 IEEE/ACM 46th International Conference on Software Engineering

April 26 - May 3, 2025

Ottawa , ON , Canada
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 21
  Total Citations
  View Citations
- 458
  Total Downloads
- Downloads (Last 12 months)171
- Downloads (Last 6 weeks)17
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

On the recall of static call graph construction in practice

ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering

ABSTRACT

References

Cited By

Index Terms

Recommendations

Call graph construction for Java libraries

Combined Static and Dynamic Analysis

Static analysis for dynamic updates