ABSTRACT
Binary code is ubiquitous, and reusing it is desirable in many security applications, such as malware analysis and software patching. Prior approaches have shown that binary code can be extracted and reused, but they are often based on static analysis and struggle with obfuscated binaries. This paper introduces trace-oriented programming (TOP), a general framework for generating new software from existing binary code by elevating the low-level binary code to C code with templates and inlined assembly. Unlike existing work, TOP builds on dynamic analysis, which makes it resilient against obfuscation and removes the need for points-to analysis. TOP can therefore be used for malware analysis, especially for malware function analysis and identification. We have implemented a proof-of-concept of TOP, and our evaluation on a range of benign and malicious software indicates that TOP can reconstruct source code from binary execution traces for malware analysis and identification and for binary function transplanting.
Index Terms
- Obfuscation resilient binary code reuse through trace-oriented programming