Article

Program element matching for multi-version program analyses

Authors:
Miryung Kim

University of Washington, Seattle, WA

University of Washington, Seattle, WA
View Profile

,
David Notkin

University of Washington, Seattle, WA

University of Washington, Seattle, WA
View Profile

MSR '06: Proceedings of the 2006 international workshop on Mining software repositoriesMay 2006Pages 58–64https://doi.org/10.1145/1137983.1137999

Published:22 May 2006Publication History

MSR '06: Proceedings of the 2006 international workshop on Mining software repositories

Pages 58–64

ABSTRACT

Multi-version program analyses require that elements of one version of a program be mapped to the elements of other versions of that program. Matching program elements between two versions of a program is a fundamental building block for multi-version program analyses and other software evolution research such as profile propagation, regression testing, and software version merging.In this paper, we survey matching techniques that can be used for multi-version program analyses and evaluate them based on hypothetical change scenarios. This paper also lists challenges of the matching problem, identifies open problems, and proposes future directions.

References

subversion.tigris.org.]]Google Scholar
www.cvshome.org]]Google Scholar
A. Aiken. A system for detecting software plagiarism.]]Google Scholar
G. Antoniol, M. D. Penta, and E. Merlo. An automatic approach to identify class evolution discontinuities. In IWPSE, pages 31--40, 2004.]] Google ScholarDigital Library
T. Apiwattanapong, A. Orso, and M. J. Harrold. A differencing algorithm for object-oriented programs. In ASE, pages 2--13. IEEE Computer Society, 2004.]] Google ScholarDigital Library
T. Apiwattanapong, A. Orso, and M. J. Harrold. Efficient and precise dynamic impact analysis using execute-after sequences. In ICSE, pages 432--441, 2005.]] Google ScholarDigital Library
A. Apostolico and Z. Galil, editors. Pattern matching algorithms. Oxford University Press, UK, 1997.]] Google ScholarDigital Library
B. S. Baker. A program for identifying duplicated code. Computing Science and Statistics, 24:49--57, 1992.]]Google Scholar
I. D. Baxter, A. Yahin, L. M. de Moura, M. Sant'Anna, and L. Bier. Clone detection using abstract syntax trees. In ICSM, pages 368--377, 1998.]] Google ScholarDigital Library
J. Bevan, J. E. James Whitehead, S. Kim, and M. Godfrey. Facilitating software evolution research with Kenyon. In ESEC/FSE, pages 177--186, 2005.]] Google ScholarDigital Library
J. Bevan and E. J. W. Jr. Identification of software instabilities. In WCRE, pages 134--145, 2003.]] Google ScholarDigital Library
D. Binkley, S. Horwitz, and T. Reps. Program integration for languages with procedure calls. ACM TOSEM, 4(1):3--35, 1995.]] Google ScholarDigital Library
J. R. Cordy. Comprehending reality: Practical barriers to industrial adoption of software maintenance automation. In IWPC '03, page 196, 2003.]] Google ScholarDigital Library
S. Demeyer, S. Ducasse, and O. Nierstrasz. Finding refactorings via change metrics. In OOPSLA '00, pages 166--177, 2000.]] Google ScholarDigital Library
S. G. Eick, T. L. Graves, A. F. Karr, J. S. Marron, and A. Mockus. Does code decay? Assessing the evidence from change management data. IEEE Trans. Softw. Eng., 27(1):1--12, 2001.]] Google ScholarDigital Library
M. D. Ernst. Dynamically Discovering Likely Program Invariants. Ph.D. Disseratation, University of Washington, Seattle, Washington, Aug. 2000.]] Google ScholarDigital Library
H. Gall, K. Hajek, and M. Jazayeri. Detection of logical coupling based on product release history. In ICSM, pages 190--197, 1998.]] Google ScholarDigital Library
C. Görg and P. Weißgerber. Error detection by refactoring reconstruction. In MSR '05, pages 29--35.]]Google Scholar
C. Görg and P. Weißgerber. Detecting and visualizing refactorings from software archives. In IWPC, pages 205--214, 2005.]] Google ScholarDigital Library
T. L. Graves, A. F. Karr, J. S. Marron, and H. Siy. Predicting fault incidence using software change history. IEEE Trans. Softw. Eng., 26(7):653--661, 2000.]] Google ScholarDigital Library
M. J. Harrold, J. A. Jones, T. Li, D. Liang, A. Orso, M. Pennings, S. Sinha, S. A. Spoon, and A. Gujarathi. Regression test selection for Java software. In OOPSLA '01, pages 312--326, 2001.]] Google ScholarDigital Library
J. Henkel and A. Diwan. Catchup!: capturing and replaying refactorings to support API evolution. In ICSE '05, pages 274--283, 2005.]] Google ScholarDigital Library
S. Horwitz. Identifying the semantic and textual differences between two versions of a program. In PLDI'90, volume 25, pages 234--245, June 1990.]] Google ScholarDigital Library
J. J. Hunt and W. F. Tichy. Extensible language-aware merging. In ICSM, pages 511--520, 2002.]] Google ScholarDigital Library
J. W. Hunt and T. G. Szymanski. A fast algorithm for computing longest common subsequences. Commun. ACM, 20(5):350--353, 1977.]] Google ScholarDigital Library
D. Jackson and D. A. Ladd. SemanticDiff: A tool for summarizing the effects of modifications. In ICSM '94, pages 243--252, 1994.]] Google ScholarDigital Library
J. H. Johnson. Identifying redundancy in source code using fingerprints. In CASCON, pages 171--183. IBM Press, 1993.]] Google ScholarDigital Library
T. Kamiya, S. Kusumoto, and K. Inoue. CCFinder: A multilinguistic token-based code clone detection system for large scale source code. IEEE Trans. Softw. Eng., 28(7):654--670, 2002.]] Google ScholarDigital Library
M. Kim, V. Sazawal, D. Notkin, and G. C. Murphy. An empirical study of code clone genealogies. In ESEC/SIGSOFT FSE, pages 187--196, 2005.]] Google ScholarDigital Library
S. Kim, K. Pan, and J. E. James Whitehead. When functions change their names: Automatic detection of origin relationships. In WCRE, 2005.]] Google ScholarDigital Library
S. Kim, E. J. Whitehead, and J. Bevan. Analysis of signature change patterns. In MSR '05, pages 64--68.]] Google ScholarDigital Library
R. Komondoor and S. Horwitz. Using slicing to identify duplication in source code. In SAS, pages 40--56, 2001.]] Google ScholarDigital Library
J. Laski and W. Szermer. Identification of program modifications and its applications in software maintenance. In ICSM, 1992.]]Google ScholarCross Ref
E. Lippe and N. van Oosterom. Operation-based merging. In SDE'92, pages 78--87, 1992.]] Google ScholarDigital Library
G. Malpohl, J. J. Hunt, and W. F. Tichy. Renaming detection. Autom. Softw. Eng., 10(2):183--202, 2000.]] Google ScholarDigital Library
T. Mens. A state-of-the-art survey on software merging. IEEE Trans. Softw. Eng., 28(5):449--462, 2002.]] Google ScholarDigital Library
N. Nagappan and T. Ball. Use of relative code churn measures to predict system defect density. In ICSE, pages 284--292, 2005.]] Google ScholarDigital Library
I. Neamtiu, J. S. Foster, and M. Hicks. Understanding source code evolution using abstract syntax tree matching. In MSR'05, pages 2--6.]] Google ScholarDigital Library
G. C. Necula. Translation validation for an optimizing compiler. In PLDI '00, pages 83--94, 2000.]] Google ScholarDigital Library
A. Orso, N. Shi, and M. J. Harrold. Scaling regression testing to large software systems. In SIGSOFT '04/FSE-12, pages 241--251, 2004.]] Google ScholarDigital Library
D. L. Parnas. On the criteria to be used in decomposing systems into modules. Commun. ACM, 15(12):1053--1058, 1972.]] Google ScholarDigital Library
G. Rothermel and M. J. Harrold.A safe, efficient regression test selection technique. ACM TOSEM, 6(2):173--210, 1997.]] Google ScholarDigital Library
J. Sliwerski, T. Zimmermann, and A. Zeller. When do changes induce fixes? In MSR '05, pages 24--28, 2005.]] Google ScholarDigital Library
A. Srivastava and J. Thiagarajan. Effectively prioritizing tests in development environment. In ISSTA '02, pages 97--106, 2002.]] Google ScholarDigital Library
W. F. Tichy. The string-to-string correction problem with block moves. ACM Trans. Comput. Syst., 2(4):309--321, 1984.]] Google ScholarDigital Library
Z. Wang, K. Pierce, and S. McFarling. BMAT - a binary matching tool for stale profile propagation. J. Instruction-Level Parallelism, 2, 2000.]]Google Scholar
W. Yang. Identifying syntactic differences between two programs. Software - Practice and Experience, 21(7):739--755, 1991.]] Google ScholarDigital Library
A. T. T. Ying, G. C. Murphy, R. Ng, and M. Chu-Carroll. Predicting source code changes by mining change history. IEEE Trans. Softw. Eng., 30(9):574--586, 2004.]] Google ScholarDigital Library
X. Zhang and R. Gupta. Matching execution histories of program versions. In ESEC/SIGSOFT FSE, pages 197--206, 2005.]] Google ScholarDigital Library
T. Zimmermann and P. Weißgerber. Preprocessing CVS data for fine-grained analysis. In MSR'04, pages 2--6.]]Google Scholar
T. Zimmermann, P. Weiß;gerber, S. Diehl, and A. Zeller. Mining version histories to guide software changes. IEEE Trans. Softw. Eng., 31(6):429--445, 2005.]] Google ScholarDigital Library
L. Zou and M. W. Godfrey. Using origin analysis to detect merging and splitting of source code entities. IEEE Trans. Softw. Eng., 31(2):166--181, 2005.]] Google ScholarDigital Library

Index Terms

Program element matching for multi-version program analyses
1. Software and its engineering
  1. Software creation and management
    1. Software post-development issues
      1. Software reverse engineering

Recommendations

Precise Version Control of Trees with Line-Based Version Control Systems
Proceedings of the 20th International Conference on Fundamental Approaches to Software Engineering - Volume 10202

Version control of tree structures, ubiquitous in software engineering, is typically performed on a textual encoding of the trees, rather than the trees directly. Applying standard line-based diff and merge algorithms to such encodings leads to ...
Read More
Matching execution histories of program versions

We develop a method for matching dynamic histories of program executions of two program versions. The matches produced can be useful in many applications including software piracy detection and several debugging scenarios. Unlike some static approaches ...
Read More
Matching execution histories of program versions
ESEC/FSE-13: Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering

We develop a method for matching dynamic histories of program executions of two program versions. The matches produced can be useful in many applications including software piracy detection and several debugging scenarios. Unlike some static approaches ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
MSR '06: Proceedings of the 2006 international workshop on Mining software repositories
May 2006
191 pages
ISBN:1595933972
DOI:10.1145/1137983
General Chairs:
Stephan Diehl
University Trier, Germany
,
Harald Gall
University of Zurich, Switzerland
,
Ahmed E. Hassan
Research in Motion RIM, Canada
Copyright © 2006 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 22 May 2006
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
matching
multi-version analysis
software evolution
Qualifiers
- Article
Conference

Upcoming Conference

ICSE 2025

2025 IEEE/ACM 46th International Conference on Software Engineering

April 26 - May 3, 2025

Ottawa , ON , Canada
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 76
  Total Citations
  View Citations
- 968
  Total Downloads
- Downloads (Last 12 months)14
- Downloads (Last 6 weeks)1
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.