Published in: Pattern Analysis and Applications 3/2023

19.02.2023 | Theoretical Advances

ABSLearn: a GNN-based framework for aliasing and buffer-size information retrieval

Authors: Ke Liang, Jim Tan, Dongrui Zeng, Yongzhe Huang, Xiaolei Huang, Gang Tan

Abstract

Inferring aliasing and buffer-size information is important for understanding a C program's memory layout, which is critical to program analysis and security-related tasks. However, traditional static and dynamic program analysis methods have limitations. Static alias analysis loses precision and scales poorly. Dynamic analysis can achieve high precision, but it offers no soundness guarantee, and an online analysis may incur non-negligible runtime overhead. Besides, current methods capture only aliasing information. Buffer-size relational information, that is, which specific variable stores the size of the buffer a pointer points to, is hard to analyze with traditional methods. Moreover, we observe that most methods are designed for one specific kind of information. To address these limitations, we present ABSLearn, a deep learning framework capable of retrieving both aliasing and buffer-size information from C programs. The core idea of ABSLearn is to formulate the information retrieval as a link prediction problem, which a Graph Neural Network (GNN) model is applied to solve. We developed the first related dataset, containing 285 C program samples, to train ABSLearn. The trained model is then applied to infer the information on three practical benchmarks: Gzip-1.2.4, Make-3.80, and Tar-1.15.1. The results show that ABSLearn achieves comparable performance with excellent runtime efficiency. As the first attempt at applying GNNs to infer aliasing and buffer-size information, ABSLearn can potentially benefit future program analysis frameworks.
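To make the two kinds of information in the abstract concrete, the following is a minimal C sketch. It is not taken from the paper or its dataset: the `packet` struct and the helpers `is_alias`, `packet_init`, and `write_through_alias` are hypothetical names chosen for illustration. It shows one alias pair (`view` and `pkt->data`) and one buffer-size relation (`pkt->len` stores the size of the buffer `pkt->data` points to).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: `len` stores the size of the buffer that `data`
   points to. This pairing is a buffer-size relation. */
struct packet {
    size_t len;
    char *data;
};

/* Two pointers alias when they refer to the same memory location. */
static int is_alias(const void *p, const void *q) {
    return p == q;
}

/* Allocates the buffer and records its size in the paired variable.
   Returns 0 on success, -1 on allocation failure. */
static int packet_init(struct packet *pkt, size_t len) {
    pkt->data = malloc(len);
    if (pkt->data == NULL) {
        pkt->len = 0;
        return -1;
    }
    pkt->len = len;
    return 0;
}

/* Writes a string through a second pointer, bounded by the size variable.
   Returns 1 if the write made through the alias is visible via pkt->data. */
static int write_through_alias(struct packet *pkt) {
    assert(pkt != NULL && pkt->data != NULL);
    char *view = pkt->data;            /* `view` aliases `pkt->data` */
    if (pkt->len == 0) {
        return 0;
    }
    memset(view, 'a', pkt->len - 1);   /* the bound comes from `pkt->len` */
    view[pkt->len - 1] = '\0';
    return is_alias(view, pkt->data) && strlen(pkt->data) == pkt->len - 1;
}
```

In a link-prediction reading of this sketch, `view` and `pkt->data` would form a candidate pair for an alias link, and `pkt->len` and `pkt->data` a candidate pair for a buffer-size link. How ABSLearn actually constructs its graph and features is not stated in the abstract.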

Footnotes
1
The C library function void *realloc(void *ptr, size_t size) attempts to resize the memory block pointed to by ptr that was previously allocated with a call to malloc or calloc.
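As a hedged illustration of why `realloc` matters for buffer-size information, the sketch below resizes a buffer and updates its paired size variable in the same step. The helper name `grow_buffer` and its parameters are assumptions made for this example, not something defined in the article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: resizes *buf to new_size bytes and keeps the size
   variable *size in step with it. Returns 0 on success, -1 on failure. */
static int grow_buffer(char **buf, size_t *size, size_t new_size) {
    assert(buf != NULL && size != NULL && new_size > 0);
    char *tmp = realloc(*buf, new_size);   /* may move the block */
    if (tmp == NULL) {
        return -1;   /* the original block is untouched and still valid */
    }
    *buf = tmp;        /* older copies of the previous pointer may now dangle */
    *size = new_size;  /* keep the size variable consistent with the buffer */
    return 0;
}
```

Assigning the result to a temporary first avoids losing the original block when `realloc` fails. Because `realloc` may move the block, any other pointer that aliased the old buffer should be treated as invalid after a successful call.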
 
Metadata
Title
ABSLearn: a GNN-based framework for aliasing and buffer-size information retrieval
Authors
Ke Liang
Jim Tan
Dongrui Zeng
Yongzhe Huang
Xiaolei Huang
Gang Tan
Publication date
19.02.2023
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 3/2023
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-023-01142-2
