2025 | OriginalPaper | Book chapter

A Graph-Based Approach for Software Functionality Classification on the Web

Written by: Yinhao Jiang, Michael Bewong, Arash Mahboubi, Sajal Halder, Rafiqul Islam, Md Zahidul Islam, Ryan H. L. Ip, Praveen Gauravaram, Jason Xue

Published in: Web Information Systems Engineering – WISE 2024

Publisher: Springer Nature Singapore

Abstract

In the context of rising cybersecurity threats within software supply chains, the precise classification of software package functionalities is essential for mitigating risks posed by the exploitation of third-party libraries in web-based systems. This paper introduces a novel approach employing a Heterogeneous Information Network (HIN) and the Metapath2Vec algorithm to elevate the security and reliability of software package classification within the NPM repository, which is crucial for web application development. Our methodology capitalises on intricate package dependencies and metadata to not only enhance classification accuracy but also effectively utilise the complex and dynamic relationships widespread in web ecosystems. Comparative analyses underscore that our framework outstrips conventional methods such as DeepWalk and Node2Vec, with substantial improvements in precision and recall across a majority of functionality classes assessed. This research significantly advances web information systems engineering by providing a robust framework for the dynamic analysis of relationships and functionalities in software packages, thereby strengthening the security resilience of web-based software ecosystems.
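The abstract names a Heterogeneous Information Network (HIN) and Metapath2Vec but gives no implementation detail. As a rough illustration of the underlying idea only, the sketch below generates metapath-guided random walks over a toy network. The node types (`package`, `keyword`), the edges, the metapath package-keyword-package, and the helper names `build_adjacency` and `metapath_walk` are all assumptions made for this example, not the schema or code used in the chapter.

```python
import random
from collections import defaultdict

# Hypothetical toy HIN. Node types and edges are illustrative assumptions,
# not the network built in the chapter.
NODE_TYPE = {
    "express": "package", "koa": "package", "lodash": "package",
    "http": "keyword", "utility": "keyword",
}
EDGES = [
    ("express", "http"), ("koa", "http"),
    ("lodash", "utility"), ("express", "utility"),
]


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v) pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def metapath_walk(adj, start, metapath, walk_length, rng):
    """Random walk that only steps to neighbours of the type the metapath
    prescribes next. The metapath must start and end with the same type,
    e.g. ("package", "keyword", "package"), so it can be repeated."""
    if metapath[0] != metapath[-1]:
        raise ValueError("metapath must start and end with the same node type")
    if NODE_TYPE[start] != metapath[0]:
        raise ValueError("start node does not match the first metapath type")
    cycle = metapath[:-1]  # repeating part of the pattern
    walk = [start]
    while len(walk) < walk_length:
        wanted = cycle[len(walk) % len(cycle)]
        candidates = [n for n in adj[walk[-1]] if NODE_TYPE[n] == wanted]
        if not candidates:  # dead end: stop the walk early
            break
        walk.append(rng.choice(candidates))
    return walk


adj = build_adjacency(EDGES)
walk = metapath_walk(adj, "express", ("package", "keyword", "package"),
                     walk_length=5, rng=random.Random(0))
print(walk)
```

In metapath2vec, walks of this kind are then passed to a skip-gram model to learn node embeddings, and a classifier can be trained on the resulting vectors. That step is omitted here to keep the sketch dependency-free.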

Metadata
Title
A Graph-Based Approach for Software Functionality Classification on the Web
Written by
Yinhao Jiang
Michael Bewong
Arash Mahboubi
Sajal Halder
Rafiqul Islam
Md Zahidul Islam
Ryan H. L. Ip
Praveen Gauravaram
Jason Xue
Copyright year
2025
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-96-0576-7_5