2017 | Original Paper | Book Chapter

Identifying Multiple Authors in a Binary Program

Authors: Xiaozhu Meng, Barton P. Miller, Kwang-Sung Jun

Published in: Computer Security – ESORICS 2017

Publisher: Springer International Publishing

Abstract

Knowing the authors of a binary program has significant applications in forensics of malicious software (malware), software supply chain risk management, and software plagiarism detection. Existing techniques assume that a binary is written by a single author, which does not hold in the real world because most modern software, including malware, contains code from multiple authors. In this paper, we take the first step toward identifying multiple authors in a binary. We present new fine-grained techniques to address the tougher problem of determining the author of each basic block. The decision to attribute authors at the basic block level is based on an empirical study of three large open source software projects, in which we find that a large fraction of basic blocks can be well attributed to a single author. We present new code features that capture programming style at the basic block level, our approach for identifying external template library code, and a new approach to capture correlations between the authors of basic blocks in a binary. Our experiments show strong evidence that programming styles can be recovered at the basic block level and that it is practical to identify multiple authors in a binary.
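
The abstract does not say how the empirical study measured whether a basic block belongs to a single author. As a rough illustration only, the Python sketch below assumes each basic block has already been mapped to the authors of the source lines it was compiled from, and reports the share contributed by the block's major author. The function names, the 0.9 threshold, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def major_author_share(line_authors):
    """Return (author, share) for the author with the most lines in one block.

    `line_authors` is a list with one author name per source line that the
    basic block was compiled from (an assumed input format).
    """
    if not line_authors:
        raise ValueError("a basic block needs at least one attributed line")
    counts = Counter(line_authors)
    author, lines_written = counts.most_common(1)[0]
    return author, lines_written / len(line_authors)


def fraction_single_author(blocks, threshold=0.9):
    """Fraction of blocks whose major author's share is at least `threshold`."""
    shares = [major_author_share(lines)[1] for lines in blocks.values()]
    return sum(share >= threshold for share in shares) / len(shares)


# Hypothetical toy data: block id -> author of each contributing source line.
blocks = {
    "bb0": ["alice", "alice", "alice"],
    "bb1": ["alice", "bob"],
    "bb2": ["bob", "bob", "bob", "bob"],
    "bb3": ["carol", "bob", "bob", "bob", "bob"],
}

if __name__ == "__main__":
    for name, lines in blocks.items():
        print(name, major_author_share(lines))
    print("single-author fraction:", fraction_single_author(blocks))
```

A high value from `fraction_single_author` on real data would support labeling each basic block with one author, which is the granularity the abstract argues for.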

Metadata
Title
Identifying Multiple Authors in a Binary Program
Authors
Xiaozhu Meng
Barton P. Miller
Kwang-Sung Jun
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-66399-9_16