ABSTRACT
We address the problem of computing semantic differences between a program and a patched version of the program. Our goal is to obtain a precise characterization of the difference between program versions, or establish their equivalence. We focus on infinite-state numerical programs, and use abstract interpretation to compute an over-approximation of program differences.
Computing differences and establishing equivalence under abstraction requires abstracting relationships between variables in the two programs. Towards that end, we use a correlating abstract domain to compute a sound approximation of these relationships which captures semantic difference. This approximation can be computed over any interleaving of the two programs. However, the choice of interleaving can significantly affect precision. We present a speculative search algorithm that aims to find an interleaving of the two programs with minimal abstract semantic difference. This method is unique as it allows the analysis to dynamically alternate between several interleavings.
We have implemented our approach and applied it to real-world examples including patches from Git, GNU Coreutils, as well as a few handpicked patches from the Linux kernel and the Mozilla Firefox web browser. Our evaluation shows that we compute precise approximations of semantic differences, and report few false differences.
- Github has surpassed sourceforge and google code in popularity. http://readwrite.com/2011/06/02/github-has-passed-sourceforge.Google Scholar
- D. Amit, N. Rinetzky, T. W. Reps, M. Sagiv, and E. Yahav. Comparison under abstraction for verifying linearizability. In CAV, pages 477--490, 2007. Google ScholarDigital Library
- R. Bagnara, P. M. Hill, and E. Zaffanella. Widening operators for powerset domains. STTT, 8(4-5):449--466, 2006. Google ScholarDigital Library
- M. Barnett, B.-Y. E. Chang, R. DeLine, B. Jacobs, and K. R. M. Leino. Boogie: A modular reusable verifier for object-oriented programs. In FMCO, pages 364--387, 2005. Google ScholarDigital Library
- N. Benton. Simple relational correctness proofs for static analyses and program transformations. In POPL, pages 14--25, 2004. Google ScholarDigital Library
- D. Brumley, P. Poosankam, D. X. Song, and J. Zheng. Automatic patch-based exploit generation is possible: Techniques and implications. In IEEE Symposium on Security and Privacy, pages 143--157, 2008. Google ScholarDigital Library
- C. Cadar, D. Dunbar, and D. R. Engler. Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, pages 209--224, 2008. Google ScholarDigital Library
- S. Chaki, A. Gurfinkel, and O. Strichman. Regression verification for multi-threaded programs. In VMCAI, pages 119--135, 2012. Google ScholarDigital Library
- E. M. Clarke and D. Kroening. Hardware verification using ansi-c programs as a reference. In ASP-DAC, pages 308--311, 2003. Google ScholarDigital Library
- E. M. Clarke, D. Kroening, N. Sharygina, and K. Yorav. Predicate abstraction of ansi-c programs using sat. Formal Methods in System Design, 25(2-3):105--127, 2004. Google ScholarDigital Library
- P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In POPL, pages 238--252, 1977. Google ScholarDigital Library
- P. Cousot and N. Halbwachs. Automatic discovery of linear restraints among variables of a program. In POPL, pages 84--96, 1978. Google ScholarDigital Library
- Y. David and E. Yahav. Tracelet-based code search in executables. In PLDI, page 37, 2014. Google ScholarDigital Library
- P. Godefroid, N. Klarlund, and K. Sen. Dart: directed automated random testing. In PLDI, pages 213--223, 2005. Google ScholarDigital Library
- B. Godlin and O. Strichman. Regression verification. In DAC, pages 466--471, 2009. Google ScholarDigital Library
- C. Hawblitzel, S. K. Lahiri, K. Pawar, H. Hashmi, S. Gokbulut, L. Fernando, D. Detlefs, and S. Wadsworth. Will you still compile me tomorrow? static cross-version compiler validation. In ESEC/FSE 2013, 2013. Google ScholarDigital Library
- C. A. R. Hoare. An axiomatic basis for computer programming. Commun. ACM, 12(10):576--580, 1969. Google ScholarDigital Library
- S. Horwitz. Identifying the semantic and textual differences between two versions of a program. In PLDI, pages 234--245, 1990. Google ScholarDigital Library
- S. Horwitz, J. Prins, and T. W. Reps. Integrating noninterfering versions of programs. ACM Trans. Program. Lang. Syst., 11(3):345--387, 1989. Google ScholarDigital Library
- J. W. Hunt and M. D. McIlroy. An algorithm for differential file comparison. Technical report, Bell Laboratories, 1975.Google Scholar
- D. Jackson and D. A. Ladd. Semantic diff: A tool for summarizing the effects of modifications. In ICSM, pages 243--252, 1994. Google ScholarDigital Library
- W. Jin, A. Orso, and T. Xie. Bert: a tool for behavioral regression testing. In SIGSOFT FSE, pages 361--362, 2010. Google ScholarDigital Library
- A. Kuehlmann and F. Krohm. Equivalence checking using cuts and heaps. In DAC, pages 263--268, 1997. Google ScholarDigital Library
- S. K. Lahiri, K. Vaswani, and C. A. R. Hoare. Differential static analysis: opportunities, applications, and challenges. In FoSER, pages 201--204, 2010. Google ScholarDigital Library
- S. K. Lahiri, C. Hawblitzel, M. Kawaguchi, and H. Rebêlo. Symdiff: A language-agnostic semantic diff tool for imperative programs. In CAV, pages 712--717, 2012. Google ScholarDigital Library
- S. K. Lahiri, K. L. McMillan, R. Sharma, and C. Hawblitzel. Differential assertion checking. In ESEC/FSE 2013, 2013. Google ScholarDigital Library
- A. Miné. The octagon abstract domain. Higher-Order and Symbolic Computation, 19(1):31--100, 2006. Google ScholarDigital Library
- A. Mishchenko, S. Chatterjee, R. K. Brayton, and N. Eén. Improvements to combinational equivalence checking. In ICCAD, pages 836--843, 2006. Google ScholarDigital Library
- G. C. Necula. Translation validation for an optimizing compiler. In PLDI, pages 83--94, 2000. Google ScholarDigital Library
- N. Partush and E. Yahav. Abstract semantic differencing for numerical programs. In SAS, pages 238--258, 2013.Google ScholarCross Ref
- D. Peled. All from one, one for all: on model checking using representatives. In CAV, pages 409--423, 1993. Google ScholarDigital Library
- S. Person, M. B. Dwyer, S. G. Elbaum, and C. S. Pasareanu. Differential symbolic execution. In SIGSOFT FSE, pages 226--237, 2008. Google ScholarDigital Library
- A. Pnueli, M. Siegel, and E. Singerman. Translation validation. In TACAS, pages 151--166, 1998. Google ScholarDigital Library
- D. A. Ramos and D. R. Engler. Practical, low-effort equivalence verification of real code. In CAV, pages 669--685, 2011. Google ScholarDigital Library
- X. Rival and L. Mauborgne. The trace partitioning abstract domain. ACM Trans. Program. Lang. Syst., 29(5), 2007. Google ScholarDigital Library
- R. Sharma, E. Schkufza, B. R. Churchill, and A. Aiken. Data-driven equivalence checking. In OOPSLA, pages 391--406, 2013. Google ScholarDigital Library
- Y. Song, Y. Zhang, and Y. Sun. Automatic vulnerability locating in binary patches. In CIS (2), pages 474--477, 2009. Google ScholarDigital Library
- T. Terauchi and A. Aiken. Secure information flow as a safety problem. In SAS, pages 352--367, 2005. Google ScholarDigital Library
- A. Valmari. Stubborn sets for reduced state space generation. In Applications and Theory of Petri Nets, pages 491--515, 1989. Google ScholarDigital Library
- P. Wolper and P. Godefroid. Partial-order methods for temporal verification. In CONCUR, pages 233--246, 1993. Google ScholarDigital Library
- L. D. Zuck, A. Pnueli, Y. Fang, B. Goldberg, and Y. Hu. Translation and run-time validation of optimized code. Electr. Notes Theor. Comput. Sci., 70(4):179--200, 2002.Google ScholarCross Ref
Index Terms
- Abstract semantic differencing via speculative correlation
Recommendations
Abstract semantic differencing via speculative correlation
OOPSLA '14We address the problem of computing semantic differences between a program and a patched version of the program. Our goal is to obtain a precise characterization of the difference between program versions, or establish their equivalence. We focus on ...
Numerical static analysis with Soot
SOAP '13: Proceedings of the 2nd ACM SIGPLAN International Workshop on State Of the Art in Java Program analysisNumerical static analysis computes an approximation of all the possible values that a numeric variable may assume, in any execution of the program. Many numerical static analyses have been proposed exploiting the theory of abstract interpretation, which ...
Redesigning Soot's data-flow analysis framework for abstract interpretation
ISSTA '18: Companion Proceedings for the ISSTA/ECOOP 2018 WorkshopsThe goal of a program analysis framework is to decrease the effort required of a program analysis developer to implement a new analysis. The Soot Java optimization framework provides analysis developers with several types of abstract analyses, which ...
Comments