2017 | OriginalPaper | Chapter

SCVD: A New Semantics-Based Approach for Cloned Vulnerable Code Detection

Authors: Deqing Zou, Hanchao Qi, Zhen Li, Song Wu, Hai Jin, Guozhong Sun, Sujuan Wang, Yuyi Zhong

Published in: Detection of Intrusions and Malware, and Vulnerability Assessment

Publisher: Springer International Publishing

Abstract

Copying existing code to reuse or modify its functionality is very common in software development. However, when developers clone existing code, they also clone any vulnerabilities it contains, which seriously affects the security of the system. In this paper, we propose SCVD, a novel semantics-based approach for cloned vulnerable code detection. We use a full path traversal algorithm to transform the Program Dependency Graph (PDG) into a tree structure that preserves all the semantic information carried by the PDG, and we apply this tree to cloned vulnerable code detection. We use an identifier name mapping technique to eliminate the impact of identifier name modification. Our key insights are converting the complex graph similarity problem into a simpler tree similarity problem and using identifier name mapping to improve the effectiveness of semantics-based cloned vulnerable code detection. We have developed a practical tool based on our approach and performed a large number of experiments to evaluate its performance in three respects: false positive rate, false negative rate, and time cost. The experimental results show that our approach significantly improves vulnerability detection effectiveness compared with existing approaches and has a lower time cost than subgraph isomorphism approaches.
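The two ideas named in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not specify the algorithms, so the sketch assumes that "full path traversal" means unfolding the graph along every cycle-free path from a root node, and that "identifier name mapping" means renaming identifiers to placeholders in order of first appearance. The toy graph, the `KEYWORDS` set, and the function names `pdg_to_tree` and `map_identifiers` are all hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy PDG: node id -> successor node ids
# (control and data dependence edges are not distinguished here).
Graph = Dict[str, List[str]]
Tree = Tuple[str, list]  # (pdg_node_id, list of child trees)


def pdg_to_tree(graph: Graph, root: str) -> Tree:
    """Unfold a graph into a tree by following every simple path from root.

    A node reachable along several paths is duplicated once per path.
    A node already on the current path is skipped, so cycles terminate.
    """
    def expand(node: str, on_path: frozenset) -> Tree:
        children = [
            expand(nxt, on_path | {nxt})
            for nxt in graph.get(node, [])
            if nxt not in on_path
        ]
        return (node, children)

    return expand(root, frozenset([root]))


# Assumed, deliberately tiny keyword list; a real tool would use a lexer.
KEYWORDS = {"if", "else", "while", "for", "return", "int", "char", "sizeof"}
IDENT = re.compile(r"[A-Za-z_]\w*")


def map_identifiers(statement: str, mapping: Dict[str, str]) -> str:
    """Replace each identifier with a placeholder ID0, ID1, ... in order of
    first appearance, so that renamed clones normalize to the same text.
    This simplification also renames function names such as strlen."""
    def rename(match: re.Match) -> str:
        name = match.group(0)
        if name in KEYWORDS:
            return name
        return mapping.setdefault(name, f"ID{len(mapping)}")

    return IDENT.sub(rename, statement)


if __name__ == "__main__":
    diamond = {"entry": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
    print(pdg_to_tree(diamond, "entry"))
    print(map_identifiers("len = strlen(src);", {}))
    print(map_identifiers("n = strlen(buf);", {}))
```

In the diamond-shaped example, node `c` appears twice in the resulting tree, once under `a` and once under `b`, which is the price of trading a graph comparison for a tree comparison. The two statements in the usage lines differ only in variable names and normalize to the same string.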

Metadata
Title
SCVD: A New Semantics-Based Approach for Cloned Vulnerable Code Detection
Authors
Deqing Zou
Hanchao Qi
Zhen Li
Song Wu
Hai Jin
Guozhong Sun
Sujuan Wang
Yuyi Zhong
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-60876-1_15