
2018 | OriginalPaper | Book chapter

On Leveraging Coding Habits for Effective Binary Authorship Attribution

Written by: Saed Alrabaee, Paria Shirani, Lingyu Wang, Mourad Debbabi, Aiman Hanna

Published in: Computer Security

Publisher: Springer International Publishing


Abstract

We propose BinAuthor, the first compiler-agnostic method for identifying the authors of program binaries. After filtering out unrelated (compiler and library) functions to isolate user-related functions, it converts the user-related functions into a canonical form to eliminate the effects of the compiler and compilation settings. It then leverages a set of features based on collections of the choices authors make during coding; these features capture an author's coding habits. Our evaluation demonstrates that BinAuthor outperforms existing methods in several respects. First, when tested on large datasets extracted from selected open-source C/C++ projects on GitHub, Google Code Jam events, and Planet Source Code contests, it successfully attributed a larger number of authors with significantly higher accuracy: around 90% when the number of authors is 1000. Second, when the code was subjected to refactoring techniques, code transformation, or processing with different compilers or compilation settings, there was no significant drop in accuracy, indicating that BinAuthor is more robust than previous methods.
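The three stages named in the abstract (filtering, canonicalization, feature-based attribution) can be illustrated with a toy sketch. This is not BinAuthor's implementation: the prefix list, the operand abstraction, the instruction-count features, and the cosine-similarity matching are all assumptions chosen only to make the pipeline shape concrete, and every function name below is hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical prefixes marking compiler- or library-generated functions.
NON_USER_PREFIXES = ("__libc_", "__cxa_", "_start")


def is_user_function(name):
    """Keep only functions that do not look compiler- or library-related."""
    return not name.startswith(NON_USER_PREFIXES)


def canonicalize(instruction):
    """Toy canonical form: keep the mnemonic, abstract every operand."""
    mnemonic, _, operands = instruction.partition(" ")
    kinds = []
    for op in (o.strip() for o in operands.split(",")):
        if not op:
            continue
        if op.startswith("["):
            kinds.append("MEM")
        elif op.lstrip("-").isdigit() or op.startswith("0x"):
            kinds.append("CONST")
        else:
            kinds.append("REG")  # anything else is treated as a register
    return " ".join([mnemonic] + kinds)


def profile(functions):
    """Count canonical instructions over the user-related functions."""
    counts = Counter()
    for name, body in functions.items():
        if is_user_function(name):
            counts.update(canonicalize(ins) for ins in body)
    return counts


def cosine(a, b):
    """Cosine similarity between two count vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def attribute(unknown, known_profiles):
    """Return the candidate author whose profile is most similar."""
    return max(known_profiles, key=lambda author: cosine(unknown, known_profiles[author]))
```

Given one profile per candidate author, `attribute(profile(unknown_functions), known_profiles)` returns the closest author. Because operands are abstracted before counting, two samples that differ only in register allocation or constants map to the same features, which is the intuition behind converting code to a canonical form.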

Metadata
Title
On Leveraging Coding Habits for Effective Binary Authorship Attribution
Written by
Saed Alrabaee
Paria Shirani
Lingyu Wang
Mourad Debbabi
Aiman Hanna
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-99073-6_2