
2017 | Original Paper | Book Chapter

SCVD: A New Semantics-Based Approach for Cloned Vulnerable Code Detection

Authors: Deqing Zou, Hanchao Qi, Zhen Li, Song Wu, Hai Jin, Guozhong Sun, Sujuan Wang, Yuyi Zhong

Published in: Detection of Intrusions and Malware, and Vulnerability Assessment

Publisher: Springer International Publishing


Abstract

Copying existing code to reuse or modify its functionality is very common in software development. However, when developers clone existing code, they also clone any vulnerabilities in it, which seriously affects the security of the system. In this paper, we propose a novel semantics-based approach called SCVD for cloned vulnerable code detection. We use a full path traversal algorithm to transform the Program Dependency Graph (PDG) into a tree structure that preserves all the semantic information carried by the PDG, and we apply the tree to cloned vulnerable code detection. We use an identifier name mapping technique to eliminate the impact of identifier name modification. Our key insights are converting the complex graph similarity problem into a simpler tree similarity problem and using identifier name mapping to improve the effectiveness of semantics-based cloned vulnerable code detection. We have developed a practical tool based on our approach and performed a large number of experiments to evaluate its performance in three respects: false positive rate, false negative rate, and time cost. The experimental results show that our approach significantly improves vulnerability detection effectiveness compared with existing approaches and has a lower time cost than subgraph isomorphism approaches.
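The abstract names its two techniques without giving details, so the following is only a minimal sketch of the general ideas, not the authors' implementation. It assumes that "full path traversal" can be approximated by unrolling a graph along every path from a root (duplicating nodes that are reachable by more than one path), and that "identifier name mapping" can be approximated by renaming identifiers to positional placeholders. The toy graph encoding, the function names `unroll_to_tree` and `normalize_identifiers`, and the small keyword set are all hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy PDG: node id -> successor node ids. Control- and
# data-dependence edges are not distinguished in this sketch.
Graph = Dict[str, List[str]]
Tree = Tuple[str, list]  # (node id, list of child subtrees)


def unroll_to_tree(graph: Graph, root: str) -> Tree:
    """Unroll a graph into a tree by following every path from root.

    A node reachable along several paths is duplicated once per path.
    A node already on the current path is not expanded again, so
    cycles terminate.
    """
    def visit(node: str, on_path: frozenset) -> Tree:
        children = [
            visit(succ, on_path | {succ})
            for succ in graph.get(node, [])
            if succ not in on_path
        ]
        return (node, children)

    return visit(root, frozenset({root}))


# Assumption: a tiny keyword set, just enough for the examples here.
C_KEYWORDS = {"int", "char", "if", "else", "for", "while", "return", "sizeof"}


def normalize_identifiers(statement: str) -> str:
    """Rename identifiers to ID0, ID1, ... in order of first appearance."""
    mapping: Dict[str, str] = {}

    def rename(match: "re.Match[str]") -> str:
        name = match.group(0)
        if name in C_KEYWORDS:
            return name
        if name not in mapping:
            mapping[name] = f"ID{len(mapping)}"
        return mapping[name]

    # \b keeps the pattern from matching inside numeric literals.
    return re.sub(r"\b[A-Za-z_]\w*\b", rename, statement)


# Diamond-shaped graph: "d" is reachable through both "b" and "c".
pdg = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
tree = unroll_to_tree(pdg, "a")

# Two statements that differ only in identifier names.
original = normalize_identifiers("len = strlen(buf);")
renamed = normalize_identifiers("n = strlen(data);")
```

In the diamond example, node `d` appears twice in the resulting tree, once under `b` and once under `c`, which is the price of trading graph matching for tree matching. The two statements normalize to the same string, so a renamed clone would still compare as equal. This sketch also renames called function names such as `strlen`; whether SCVD does so is not stated in the abstract.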


Metadata
Title
SCVD: A New Semantics-Based Approach for Cloned Vulnerable Code Detection
Authors
Deqing Zou
Hanchao Qi
Zhen Li
Song Wu
Hai Jin
Guozhong Sun
Sujuan Wang
Yuyi Zhong
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-60876-1_15