DOI: 10.1145/3293882.3338996
short-paper

Go-clone: graph-embedding based clone detector for Golang

Published: 10 July 2019

ABSTRACT

Golang (short for the Go programming language) is a fast, compiled language that is increasingly used in industry because of its excellent performance in concurrent programming. Golang redefines the grammar of concurrent programming, which poses a challenge for traditional clone detection tools and techniques. At the same time, few tools exist for detecting duplicates or copy-paste-related bugs in Golang. An effective and efficient code clone detector for Golang is therefore especially needed.

In this paper, we present Go-Clone, a learning-based clone detector for Golang. Go-Clone contains two modules: the training module and the user interaction module. In the training module, we first parse Golang source code into LLVM IR (Intermediate Representation). Second, we automatically compute an LSFG (labeled semantic flow graph) for each program function. Go-Clone then trains a deep neural network model to encode LSFGs for similarity classification. In the user interaction module, users can choose one or more Golang projects, and Go-Clone identifies and presents a list of the function pairs most likely to be clones for user inspection. To evaluate Go-Clone's performance, we collect 6,110 commit versions from 48 GitHub projects to construct a Golang clone detection data set. Go-Clone reaches an AUC (Area Under Curve) of 89.61% and an ACC (Accuracy) of 83.80% in clone detection. By testing several groups of unfamiliar data, we also demonstrate the generality of Go-Clone. The demo video is available at: https://youtu.be/o5DogtYGbeo

Published in

ISSTA 2019: Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis
July 2019
451 pages
ISBN: 9781450362245
DOI: 10.1145/3293882

Copyright © 2019 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 58 of 213 submissions, 27%
