ABSTRACT
We develop a method for matching dynamic histories of program executions of two program versions. The matches produced can be useful in many applications including software piracy detection and several debugging scenarios. Unlike some static approaches for matching program versions, our approach does not require access to source code of the two program versions because dynamic histories can be collected by running instrumented versions of program binaries. We base our matching algorithm on comparison of rich program execution histories which include: control flow taken, values produced, addresses referenced, as well as data dependences exercised. In developing a matching algorithm we had two goals: producing an accurate match and producing it quickly. By using rich execution history, we are able to compare the program versions across many behavioral dimensions. The result is a fast and highly precise matching algorithm. Our algorithm first uses individual histories of instructions to identify multiple potential matches and then it refines the set of matches by matching the data dependence structure established by the matching instructions. To test our algorithm we attempted matching of execution histories of unoptimized and optimized program versions. Our results show that our algorithm produces highly accurate matches which are highly effective when used in comparison checking approach to debugging optimized code.
- T. Apiwattanapong, A. Orso, M.J. Harrold, "A Differencing Algorithm for Object-Oriented Programs," IEEE International Conf. on Automated Software Engineering, pages 2--13, 2004. Google ScholarDigital Library
- C. Collberg, C. Thomborson, and D. Low, "Breaking Abstractions and Unstructuring Data Structures," IEEE International Conference on Computer Languages, pages 28--38, Chicago, IL, 1998. Google ScholarDigital Library
- D. Jackson and D.A. Ladd, "Semantic Diff: A Tool for Summarizing the Effects of Modifications," IEEE Conference on Software Maintenance, pages 243--252, Nov. 1994. Google ScholarDigital Library
- C. Jaramillo, R. Gupta, and M.L. Soffa, "Comparison Checking: An Approach to Avoid Debugging of Optimized Code," 7th European Software Engineering Conference and ACM SIGSOFT 7th Symposium on Foundations of Software Engineering, LNCS 1687, Springer Verlag, pages 268--284, Toulouse, France, September 1999. Google ScholarDigital Library
- C. Jaramillo, R. Gupta, and M.L. Soffa, "Debugging and Testing Optimizers through Comparison Checking," International Workshop on Compiler Optimization Meets Compiler Verification, Electronic Notes in Theoretical Computer Science 65 No. 2 (2002), held in conjunction with ETAPS, Grenoble, France, April 2002.Google Scholar
- R. Komondoor and S. Horwitz, "Semantics-Preserving Procedure Extraction," 27th ACM SIGPLAN-SIGACT on Principles of Programming Languages, pages 155--169, 2000. Google ScholarDigital Library
- J.R. Larus and E. Schnarr, "EEL: Machine-Independent Executable Editing," SIGPLAN Conference on Programming Language Design and Implementation, pages 291--300, 1995. Google ScholarDigital Library
- J. Laski and W. Szermer, "Identification of Program Modifications and its Applications to Software Maintenance," IEEE Conference on Software Maintenance, pages 282--290, Nov. 1992.Google Scholar
- E.W. Myers, "An O(ND) Difference Algorithm and its Variations," Algorithmica, 1(2):251--266, 1986.Google ScholarCross Ref
- T. Reps, T. Ball, M. Das, and J. Larus, "The Use of Program Profiling for Software Maintenance with Applications to the Year 2000 Problem," 6th European Software Engineering Conference and ACM SIGSOFT 5th Symposium on Foundations of Software Engineering, pages 432--449, 1997. Google ScholarDigital Library
- A. Srivastava and A. Eustace, "ATOM - A System for Building Customized Program Analysis Tools," SIGPLAN Conference on Programming Language Design and Implementation, pages 196--205, 1994. Google ScholarDigital Library
- Z. Wang, K. Pierce, and S. McFarling, "BMAT - A Binary Matching Tool for Stale Profile Propagation," The Journal of Instruction Level Parallelism, 2, May 2000.Google Scholar
- C. Wang, J. Davidson, J. hill, and J. Knight, "Protection of Software-based Survivability Mechansims," International Conference of Dependable Systems and Networks, pages 193--202, Goteborg, Sweden, July 2001. Google ScholarDigital Library
- N. Wilde, "Faster Reuse and Maintenance Using Software Reconnaissance," Technical Report SERC-TR-75F, SERC, Univ. of Florida, CIS Department, Gainesville, FL, July 1994.Google Scholar
- A. Zeller, "Isolating Cause-Effect Chains from Computer Programs," ACM SIGSOFT 10th International Symposium on the Foundations of Software Engineering, Charleston, South Carolina, November 2002. Google ScholarDigital Library
- X. Zhang and R. Gupta, "Whole Execution Traces," IEEE/ACM 37th International Symposium on Microarchitecture, Portland, Oregan, December 2004. Google ScholarDigital Library
- The Trimaran Compiler Research Infrastructure. Nov. 1997.Google Scholar
- S.S. Muchnick. Advanced Compiler Design and Implementation, Morgan Kaufmann, 1997. Google ScholarDigital Library
Index Terms
- Matching execution histories of program versions
Recommendations
Matching execution histories of program versions
We develop a method for matching dynamic histories of program executions of two program versions. The matches produced can be useful in many applications including software piracy detection and several debugging scenarios. Unlike some static approaches ...
DoubleTake: fast and precise error detection via evidence-based dynamic analysis
ICSE '16: Proceedings of the 38th International Conference on Software EngineeringPrograms written in unsafe languages like C and C++ often suffer from errors like buffer overflows, dangling pointers, and memory leaks. Dynamic analysis tools like Valgrind can detect these errors, but their overhead---primarily due to the cost of ...
Dynamic points-to sets: a comparison with static analyses and potential applications in program understanding and optimization
PASTE '01: Proceedings of the 2001 ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineeringIn this paper, we compare the behavior of pointers in C programs, as approximated by static pointer analysis algorithms, with the actual behavior of pointers when these programs are run. In order to perform this comparison, we have implemented several ...
Comments