Abstract
With the growing sophistication of malware, improved malware detection schemes are crucial. The packing of executable files, one of the most common techniques for code protection, has been repurposed by malware authors for code obfuscation as a means of evading malware detectors (mainly those based on static analysis). This paper provides statistics on the use of packers, based on an extensive analysis of 24,000 portable executable (PE) files, both malicious and benign, from the past 10 years. The analysis allowed us to observe trends in packing use during that time and showed that packing is still widely used in malware. The paper then surveys 23 methods proposed in academic research for the detection and classification of packed PE files and highlights various trends in malware packing. It compares the methods and their abilities to detect and identify various aspects of packing. A taxonomy is presented that classifies the methods as static, dynamic, or hybrid analysis-based. The paper also sheds light on the increasing role of machine learning in the development of modern packing detection methods. Finally, we analyzed and mapped the different packing methods and identified which of them can be countered by the detection methods surveyed in this paper.
Index Terms
- File Packing from the Malware Perspective: Techniques, Analysis Approaches, and Directions for Enhancements