YAP3: improved detection of similarities in computer program and other texts

Published: 1 March 1996

Abstract

In spite of years of effort, plagiarism in student assignment submissions still causes considerable difficulties for course designers; if students' work is not their own, how can anyone be certain they have learnt anything? YAP is a system for detecting suspected plagiarism in computer programs and other texts submitted by students. The paper reviews YAP3, the third version of YAP, focusing on its novel underlying algorithm, Running-Karp-Rabin Greedy-String-Tiling (RKR-GST), whose development arose from the observation, with YAP and other systems, that students shuffle independent code segments. YAP3 is able to detect transposed subsequences and is less perturbed by spurious additional statements. The paper concludes with a discussion of a recent extension of YAP to English texts, further illustrating the flexibility of the YAP approach.
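
To make the tiling idea concrete, here is a minimal Python sketch of basic greedy string tiling over two token sequences. It is an illustration under stated assumptions, not YAP3's implementation: it omits the running Karp-Rabin hashing that the paper's algorithm uses for speed, and the function name `greedy_string_tiling`, the `min_match_len` parameter, and the sample inputs are all hypothetical.

```python
def greedy_string_tiling(pattern, text, min_match_len=3):
    """Cover two sequences with non-overlapping common substrings (tiles).

    Brute-force sketch: no Karp-Rabin hashing, so it is slow on long inputs.
    Returns a list of (pattern_start, text_start, length) tuples.
    """
    p_marked = [False] * len(pattern)  # tokens of pattern already tiled
    t_marked = [False] * len(text)     # tokens of text already tiled
    tiles = []
    while True:
        max_len = min_match_len
        matches = []
        # Find all maximal matches of unmarked tokens with the greatest length.
        for p in range(len(pattern)):
            for t in range(len(text)):
                k = 0
                while (p + k < len(pattern) and t + k < len(text)
                       and pattern[p + k] == text[t + k]
                       and not p_marked[p + k] and not t_marked[t + k]):
                    k += 1
                if k > max_len:
                    matches = [(p, t, k)]
                    max_len = k
                elif k == max_len:
                    matches.append((p, t, k))
        # Turn the longest matches into tiles, skipping any that overlap
        # a tile already placed in this round.
        placed = False
        for p, t, k in matches:
            if any(p_marked[p:p + k]) or any(t_marked[t:t + k]):
                continue
            for i in range(k):
                p_marked[p + i] = True
                t_marked[t + i] = True
            tiles.append((p, t, k))
            placed = True
        if not placed:
            break
    return tiles


def covered_tokens(tiles):
    """Total number of tokens covered by the tiles in each sequence."""
    return sum(length for _, _, length in tiles)
```

Because tiles may be placed in any order, two code segments that have been swapped still match in full, and an inserted statement only splits a match into two tiles. For example, `greedy_string_tiling("xyzpqr", "pqrxyz")` covers all six tokens, and `greedy_string_tiling("abcdef", "abcXdef")` still covers six tokens despite the extra `X`.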



        • Published in

          ACM SIGCSE Bulletin, Volume 28, Issue 1
          March 1996
          379 pages
          ISSN: 0097-8418
          DOI: 10.1145/236462

          SIGCSE '96: Proceedings of the Twenty-Seventh SIGCSE Technical Symposium on Computer Science Education
          March 1996
          447 pages
          ISBN: 089791757X
          DOI: 10.1145/236452

          Copyright © 1996 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

          Publisher

          Association for Computing Machinery

          New York, NY, United States
