Abstract
Many program analysis problems can be formulated as graph reachability problems. In the literature, context-free language (CFL) reachability has been the most popular formulation and can be computed in subcubic time. The context-sensitive data-dependence analysis is a fundamental abstraction that can express a broad range of program analysis problems. It essentially describes an interleaved matched-parenthesis language reachability problem. The language is not context-free, and the problem is well-known to be undecidable. In practice, many program analyses adopt CFL-reachability to exactly model the matched parentheses for either context-sensitivity or structure-transmitted data-dependence, but not both. Thus, the CFL-reachability formulation for context-sensitive data-dependence analysis is inherently an approximation.
To support more precise and scalable analyses, this paper introduces linear conjunctive language (LCL) reachability, a new, expressive class of graph reachability. LCL not only contains the interleaved matched-parenthesis language, but is also closed under all set-theoretic operations. Given a graph with n nodes and m edges, we propose an O(mn) time approximation algorithm for solving all-pairs LCL-reachability, which is asymptotically better than known CFL-reachability algorithms. Our formulation and algorithm offer a new perspective on attacking the aforementioned undecidable problem - the LCL-reachability formulation is exact, while the LCL-reachability algorithm yields a sound approximation. We have applied the LCL-reachability framework to two existing client analyses. The experimental results show that the LCL-reachability framework is both more precise and scalable than the traditional CFL-reachability framework. This paper opens up the opportunity to exploit LCL-reachability in program analysis.
- DaCapo benchmark suite. http://dacapobench.org/.Google Scholar
- R. Alur and P. Madhusudan. Visibly pushdown languages. In STOC, pages 202–211, 2004. Google ScholarDigital Library
- O. Bastani, S. Anand, and A. Aiken. Specification inference using context-free language reachability. In POPL, pages 553–566, 2015. Google ScholarDigital Library
- S. Chaudhuri. Subcubic algorithms for recursive state machines. In POPL, pages 159–169, 2008. Google ScholarDigital Library
- K. Culik II, J. Gruska, and A. Salomaa. Systolic trellis automata I. International Journal of Computer Mathematics, 15:195–212, 1984.Google ScholarCross Ref
- K. Culik II, J. Gruska, and A. Salomaa. Systolic trellis automatat II. International Journal of Computer Mathematics, 16:3–22, 1984.Google ScholarCross Ref
- M. A. Harrison. Introduction to Formal Language Theory. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1978. Google ScholarDigital Library
- N. Hollingum and B. Scholz. Towards a scalable framework for contextfree language reachability. In CC, pages 193–211, 2015.Google Scholar
- J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages and Computation. Addison-Wesley, 1979. Google ScholarDigital Library
- W. Huang, Y. Dong, A. Milanova, and J. Dolby. Scalable and precise taint analysis for Android. In ISSTA, pages 106–117, 2015. Google ScholarDigital Library
- O. H. Ibarra and S. M. Kim. Characterizations and computational complexity of systolic trellis automata. Theor. Comput. Sci., 29:123– 153, 1984.Google ScholarCross Ref
- O. H. Ibarra, M. A. Palis, and S. M. Kim. Designing systolic algorithms using sequential machines. In FOCS, pages 46–55, 1984. Google ScholarDigital Library
- O. H. Ibarra, S. M. Kim, and S. Moran. Sequential machine characterizations of trellis and cellular automata and applications. SIAM J. Comput., 14(2):426–447, 1985.Google ScholarDigital Library
- A. K. Joshi, L. S. Levy, and M. Takahashi. Tree adjunct grammars. Journal of Computer and System Sciences, 10(1):136–163, 1975. Google ScholarDigital Library
- V. Kahlon. Boundedness vs. unboundedness of lock chains: Characterizing decidability of pairwise CFL-reachability for threads communicating via locks. In LICS, pages 27–36, 2009. Google ScholarDigital Library
- J. Kodumal and A. Aiken. The set constraint/CFL reachability connection in practice. In PLDI, pages 207–218, 2004. Google ScholarDigital Library
- J. Kodumal and A. Aiken. Regularly annotated set constraints. In PLDI, pages 331–341, 2007. Google ScholarDigital Library
- S. La Torre and M. Napoli. Reachability of multistack pushdown systems with scope-bounded matching relations. In CONCUR, pages 203–218, 2011. Google ScholarDigital Library
- S. La Torre, P. Madhusudan, and G. Parlato. A robust class of contextsensitive languages. In LICS, pages 161–170, 2007. Google ScholarDigital Library
- S. La Torre, M. Napoli, and G. Parlato. Scope-bounded pushdown languages. Int. J. Found. Comput. Sci., 27(2):215–234, 2016.Google ScholarCross Ref
- M. Might, Y. Smaragdakis, and D. V. Horn. Resolving and exploiting the k-CFA paradox: illuminating functional vs. object-oriented program analysis. In PLDI, pages 305–315, 2010. Google ScholarDigital Library
- A. Milanova, W. Huang, and Y. Dong. CFL-reachability and contextsensitive integrity types. In PPPJ, pages 99–109, 2014. Google ScholarCross Ref
- M. Nederhof. Practical experiments with regular approximation of context-free languages. Computational Linguistics, 26(1):17–44, 2000. Google ScholarDigital Library
- A. Okhotin. On the closure properties of linear conjunctive languages. Theor. Comput. Sci., 1-3(299):663–685, 2003. Google ScholarDigital Library
- A. Okhotin. On the equivalence of linear conjunctive grammars and trellis automata. Informatique Théorique et Applications, 38(1):69–88, 2004.Google ScholarCross Ref
- P. Pratikakis, J. S. Foster, and M. Hicks. Existential label flow inference via CFL reachability. In SAS, pages 88–106, 2006. Google ScholarDigital Library
- G. Ramalingam. Context-sensitive synchronization-sensitive analysis is undecidable. ACM Trans. Program. Lang. Syst., 22(2):416–430, 2000. Google ScholarDigital Library
- J. Rehof and M. Fähndrich. Type-base flow analysis: from polymorphic subtyping to CFL-reachability. In POPL, pages 54–66, 2001. Google ScholarDigital Library
- T. W. Reps. Shape analysis as a generalized path problem. In PEPM, pages 1–11, 1995. Google ScholarDigital Library
- T. W. Reps. Program analysis via graph reachability. Information & Software Technology, 40(11-12):701–726, 1998.Google ScholarCross Ref
- T. W. Reps. Undecidability of context-sensitive data-dependence analysis. ACM Trans. Program. Lang. Syst., 22(1):162–186, 2000. Google ScholarDigital Library
- T. W. Reps, S. Horwitz, S. Sagiv, and G. Rosay. Speeding up slicing. In SIGSOFT FSE, pages 11–20, 1994. Google ScholarDigital Library
- T. W. Reps, S. Horwitz, and S. Sagiv. Precise interprocedural dataflow analysis via graph reachability. In POPL, pages 49–61, 1995. Google ScholarDigital Library
- M. Sharir and A. Pnueli. Two approaches to interprocedural data flow analysis. In S. S. Muchnick and N. D. Jones, editors, Program Flow Analysis: Theory and Applications, pages 189–234. Prentice-Hall, 1981.Google Scholar
- M. Sridharan and R. Bod´ık. Refinement-based context-sensitive pointsto analysis for java. In PLDI, pages 387–400, 2006. Google ScholarDigital Library
- H. Tang, X. Wang, L. Zhang, B. Xie, L. Zhang, and H. Mei. Summarybased context-sensitive data-dependence analysis in presence of callbacks. In POPL, pages 83–95, 2015. Google ScholarDigital Library
- L. G. Valiant. Regularity and related problems for deterministic pushdown automata. J. ACM, 22(1):1–10, 1975. Google ScholarDigital Library
- X. Xiao, Q. Zhang, J. Zhou, and C. Zhang. Persistent pointer information. In PLDI, pages 463–474, 2014. Google ScholarDigital Library
- G. Xu, A. Rountev, and M. Sridharan. Scaling CFL-reachability-based points-to analysis using context-sensitive must-not-alias analysis. In ECOOP, pages 98–122, 2009. Google ScholarDigital Library
- D. Yan, G. H. Xu, and A. Rountev. Demand-driven context-sensitive alias analysis for Java. In ISSTA, pages 155–165, 2011. Google ScholarDigital Library
- Q. Zhang, M. R. Lyu, H. Yuan, and Z. Su. Fast algorithms for Dyck-CFL-reachability with applications to alias analysis. In PLDI, pages 435–446, 2013. Google ScholarDigital Library
- Q. Zhang, X. Xiao, C. Zhang, H. Yuan, and Z. Su. Efficient subcubic alias analysis for C. In OOPSLA, pages 829–845, 2014. Google ScholarDigital Library
- X. Zheng and R. Rugina. Demand-driven alias analysis for C. In POPL, pages 197–208, 2008. Google ScholarDigital Library
Index Terms
- Context-sensitive data-dependence analysis via linear conjunctive language reachability
Recommendations
Context-sensitive data-dependence analysis via linear conjunctive language reachability
POPL '17: Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming LanguagesMany program analysis problems can be formulated as graph reachability problems. In the literature, context-free language (CFL) reachability has been the most popular formulation and can be computed in subcubic time. The context-sensitive data-...
Interprocedural pointer alias analysis
We present practical approximation methods for computing and representing interprocedural aliases for a program written in a language that includes pointers, reference parameters, and recursion. We present the following contributions: (1) a framework ...
Undecidability of context-sensitive data-dependence analysis
A number of program-analysis problems can be tackled by transforming them into certain kinds of graph-reachability problems in labeled directed graphs. The edge labels can be used to filter out paths that are not interest: a path P from vertex s to ...
Comments