
2017 | Original Paper | Book Chapter

SibStCNN and TBCNN + kNN-TED: New Models over Tree Structures for Source Code Classification

Authors: Anh Viet Phan, Minh Le Nguyen, Lam Thu Bui

Published in: Intelligent Data Engineering and Automated Learning – IDEAL 2017

Publisher: Springer International Publishing


Abstract

This paper addresses source code classification, a software engineering problem, through several approaches that exploit tree representations of programs. First, we propose a new sibling-subtree convolutional neural network (SibStCNN) and models that combine tree-based neural networks with k-Nearest Neighbors (kNN). Second, we present a tree pruning technique that reduces data dimensionality and strengthens the classifiers. Experiments show that the proposed models outperform other methods, and that pruning yields both a substantial reduction in execution time and an increase in accuracy.
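The abstract does not spell out how kNN is applied to trees. As a rough illustration only, the sketch below assumes that "TED" in the title stands for tree edit distance and shows a kNN vote over toy labelled trees using an ordered tree edit distance with unit costs. The tree encoding, the function names (`forest_distance`, `ted`, `knn_predict`), and the sample data are all hypothetical; this is not the authors' implementation and it omits the neural network component entirely.

```python
from collections import Counter
from functools import lru_cache

# Assumption: a tree is a pair (label, children), where children is a
# tuple of trees. A forest is a tuple of trees. Tuples keep everything
# hashable so the recursion can be memoised.


def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Ordered edit distance between two forests with unit costs."""
    if not f and not g:
        return 0
    if not f:
        return size(g)  # insert every node of g
    if not g:
        return size(f)  # delete every node of f
    (v_label, v_kids), (w_label, w_kids) = f[-1], g[-1]
    # Delete the rightmost root of f; its children join the forest.
    delete = forest_distance(f[:-1] + v_kids, g) + 1
    # Insert the rightmost root of g (symmetric case).
    insert = forest_distance(f, g[:-1] + w_kids) + 1
    # Match the two rightmost roots, relabelling if the labels differ.
    match = (forest_distance(v_kids, w_kids)
             + forest_distance(f[:-1], g[:-1])
             + (0 if v_label == w_label else 1))
    return min(delete, insert, match)


def ted(t1, t2):
    """Tree edit distance between two single trees."""
    return forest_distance((t1,), (t2,))


def knn_predict(query, labelled_trees, k=3):
    """Majority label among the k trees closest to query under ted."""
    neighbours = sorted(labelled_trees, key=lambda item: ted(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): three tiny "programs" with class labels.
tree_a = ("f", (("a", ()), ("b", ())))
tree_b = ("f", (("a", ()), ("c", ())))
tree_d = ("g", ())
training = [(tree_a, "x"), (tree_b, "x"), (tree_d, "y")]
query = ("f", (("a", ()),))
prediction = knn_predict(query, training, k=3)
```

The memoised recursion is exponential-free but still far slower than the specialised algorithms used in practice, so it is only suitable for small trees such as these.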

Metadata
Title
SibStCNN and TBCNN + kNN-TED: New Models over Tree Structures for Source Code Classification
Authors
Anh Viet Phan
Minh Le Nguyen
Lam Thu Bui
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-68935-7_14
