skip to main content
research-article

Context-sensitive data-dependence analysis via linear conjunctive language reachability

Published:01 January 2017Publication History
Skip Abstract Section

Abstract

Many program analysis problems can be formulated as graph reachability problems. In the literature, context-free language (CFL) reachability has been the most popular formulation and can be computed in subcubic time. The context-sensitive data-dependence analysis is a fundamental abstraction that can express a broad range of program analysis problems. It essentially describes an interleaved matched-parenthesis language reachability problem. The language is not context-free, and the problem is well-known to be undecidable. In practice, many program analyses adopt CFL-reachability to exactly model the matched parentheses for either context-sensitivity or structure-transmitted data-dependence, but not both. Thus, the CFL-reachability formulation for context-sensitive data-dependence analysis is inherently an approximation.

To support more precise and scalable analyses, this paper introduces linear conjunctive language (LCL) reachability, a new, expressive class of graph reachability. LCL not only contains the interleaved matched-parenthesis language, but is also closed under all set-theoretic operations. Given a graph with n nodes and m edges, we propose an O(mn) time approximation algorithm for solving all-pairs LCL-reachability, which is asymptotically better than known CFL-reachability algorithms. Our formulation and algorithm offer a new perspective on attacking the aforementioned undecidable problem - the LCL-reachability formulation is exact, while the LCL-reachability algorithm yields a sound approximation. We have applied the LCL-reachability framework to two existing client analyses. The experimental results show that the LCL-reachability framework is both more precise and scalable than the traditional CFL-reachability framework. This paper opens up the opportunity to exploit LCL-reachability in program analysis.

References

  1. DaCapo benchmark suite. http://dacapobench.org/.Google ScholarGoogle Scholar
  2. R. Alur and P. Madhusudan. Visibly pushdown languages. In STOC, pages 202–211, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. O. Bastani, S. Anand, and A. Aiken. Specification inference using context-free language reachability. In POPL, pages 553–566, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. S. Chaudhuri. Subcubic algorithms for recursive state machines. In POPL, pages 159–169, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. K. Culik II, J. Gruska, and A. Salomaa. Systolic trellis automata I. International Journal of Computer Mathematics, 15:195–212, 1984.Google ScholarGoogle ScholarCross RefCross Ref
  6. K. Culik II, J. Gruska, and A. Salomaa. Systolic trellis automatat II. International Journal of Computer Mathematics, 16:3–22, 1984.Google ScholarGoogle ScholarCross RefCross Ref
  7. M. A. Harrison. Introduction to Formal Language Theory. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1978. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. N. Hollingum and B. Scholz. Towards a scalable framework for contextfree language reachability. In CC, pages 193–211, 2015.Google ScholarGoogle Scholar
  9. J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages and Computation. Addison-Wesley, 1979. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. W. Huang, Y. Dong, A. Milanova, and J. Dolby. Scalable and precise taint analysis for Android. In ISSTA, pages 106–117, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. O. H. Ibarra and S. M. Kim. Characterizations and computational complexity of systolic trellis automata. Theor. Comput. Sci., 29:123– 153, 1984.Google ScholarGoogle ScholarCross RefCross Ref
  12. O. H. Ibarra, M. A. Palis, and S. M. Kim. Designing systolic algorithms using sequential machines. In FOCS, pages 46–55, 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. O. H. Ibarra, S. M. Kim, and S. Moran. Sequential machine characterizations of trellis and cellular automata and applications. SIAM J. Comput., 14(2):426–447, 1985.Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. A. K. Joshi, L. S. Levy, and M. Takahashi. Tree adjunct grammars. Journal of Computer and System Sciences, 10(1):136–163, 1975. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. V. Kahlon. Boundedness vs. unboundedness of lock chains: Characterizing decidability of pairwise CFL-reachability for threads communicating via locks. In LICS, pages 27–36, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. J. Kodumal and A. Aiken. The set constraint/CFL reachability connection in practice. In PLDI, pages 207–218, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. J. Kodumal and A. Aiken. Regularly annotated set constraints. In PLDI, pages 331–341, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. S. La Torre and M. Napoli. Reachability of multistack pushdown systems with scope-bounded matching relations. In CONCUR, pages 203–218, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. S. La Torre, P. Madhusudan, and G. Parlato. A robust class of contextsensitive languages. In LICS, pages 161–170, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. S. La Torre, M. Napoli, and G. Parlato. Scope-bounded pushdown languages. Int. J. Found. Comput. Sci., 27(2):215–234, 2016.Google ScholarGoogle ScholarCross RefCross Ref
  21. M. Might, Y. Smaragdakis, and D. V. Horn. Resolving and exploiting the k-CFA paradox: illuminating functional vs. object-oriented program analysis. In PLDI, pages 305–315, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. A. Milanova, W. Huang, and Y. Dong. CFL-reachability and contextsensitive integrity types. In PPPJ, pages 99–109, 2014. Google ScholarGoogle ScholarCross RefCross Ref
  23. M. Nederhof. Practical experiments with regular approximation of context-free languages. Computational Linguistics, 26(1):17–44, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. A. Okhotin. On the closure properties of linear conjunctive languages. Theor. Comput. Sci., 1-3(299):663–685, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. A. Okhotin. On the equivalence of linear conjunctive grammars and trellis automata. Informatique Théorique et Applications, 38(1):69–88, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  26. P. Pratikakis, J. S. Foster, and M. Hicks. Existential label flow inference via CFL reachability. In SAS, pages 88–106, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. G. Ramalingam. Context-sensitive synchronization-sensitive analysis is undecidable. ACM Trans. Program. Lang. Syst., 22(2):416–430, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. J. Rehof and M. Fähndrich. Type-base flow analysis: from polymorphic subtyping to CFL-reachability. In POPL, pages 54–66, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. T. W. Reps. Shape analysis as a generalized path problem. In PEPM, pages 1–11, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. T. W. Reps. Program analysis via graph reachability. Information & Software Technology, 40(11-12):701–726, 1998.Google ScholarGoogle ScholarCross RefCross Ref
  31. T. W. Reps. Undecidability of context-sensitive data-dependence analysis. ACM Trans. Program. Lang. Syst., 22(1):162–186, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. T. W. Reps, S. Horwitz, S. Sagiv, and G. Rosay. Speeding up slicing. In SIGSOFT FSE, pages 11–20, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. T. W. Reps, S. Horwitz, and S. Sagiv. Precise interprocedural dataflow analysis via graph reachability. In POPL, pages 49–61, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. M. Sharir and A. Pnueli. Two approaches to interprocedural data flow analysis. In S. S. Muchnick and N. D. Jones, editors, Program Flow Analysis: Theory and Applications, pages 189–234. Prentice-Hall, 1981.Google ScholarGoogle Scholar
  35. M. Sridharan and R. Bod´ık. Refinement-based context-sensitive pointsto analysis for java. In PLDI, pages 387–400, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. H. Tang, X. Wang, L. Zhang, B. Xie, L. Zhang, and H. Mei. Summarybased context-sensitive data-dependence analysis in presence of callbacks. In POPL, pages 83–95, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. L. G. Valiant. Regularity and related problems for deterministic pushdown automata. J. ACM, 22(1):1–10, 1975. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. X. Xiao, Q. Zhang, J. Zhou, and C. Zhang. Persistent pointer information. In PLDI, pages 463–474, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. G. Xu, A. Rountev, and M. Sridharan. Scaling CFL-reachability-based points-to analysis using context-sensitive must-not-alias analysis. In ECOOP, pages 98–122, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. D. Yan, G. H. Xu, and A. Rountev. Demand-driven context-sensitive alias analysis for Java. In ISSTA, pages 155–165, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Q. Zhang, M. R. Lyu, H. Yuan, and Z. Su. Fast algorithms for Dyck-CFL-reachability with applications to alias analysis. In PLDI, pages 435–446, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Q. Zhang, X. Xiao, C. Zhang, H. Yuan, and Z. Su. Efficient subcubic alias analysis for C. In OOPSLA, pages 829–845, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. X. Zheng and R. Rugina. Demand-driven alias analysis for C. In POPL, pages 197–208, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Context-sensitive data-dependence analysis via linear conjunctive language reachability

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in

            Full Access

            • Published in

              cover image ACM SIGPLAN Notices
              ACM SIGPLAN Notices  Volume 52, Issue 1
              POPL '17
              January 2017
              901 pages
              ISSN:0362-1340
              EISSN:1558-1160
              DOI:10.1145/3093333
              Issue’s Table of Contents
              • cover image ACM Conferences
                POPL '17: Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages
                January 2017
                901 pages
                ISBN:9781450346603
                DOI:10.1145/3009837

              Copyright © 2017 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 1 January 2017

              Check for updates

              Qualifiers

              • research-article

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader