Abstract
This paper proposes a malware detection model based on a negative selection algorithm with penalty factor (NSAPF). Through deep instruction analysis with respect to instruction frequency and file frequency, the model extracts a malware instruction library (MIL) containing instructions that tend to appear in malware. Using the MIL, the model creates a malware candidate signature library (MCSL) and a benign program malware-like signature library (BPMSL) by splitting programs, in order, into short bit strings. Depending on whether a signature matches “self”, the NSAPF further divides the MCSL into two malware detection signature libraries (MDSL1 and MDSL2), which together serve as a two-dimensional reference for detecting suspicious programs. The model classifies a suspicious program as malware or benign according to its matching values against MDSL1 and MDSL2. Traditional negative selection algorithms judge harmfulness only by whether a signature is “self” or “nonself”. Introducing a penalty factor C into the negative selection algorithm overcomes this drawback and lets the model focus on the harmfulness of the code itself, which greatly improves its effectiveness and allows it to satisfy different user requirements for true positive and false positive rates. Experimental results confirm that the proposed model achieves a better true positive rate on completely unknown malware and better generalization ability while keeping the false positive rate low. Adjusting the penalty factor C balances the true positive and false positive rates to achieve better performance.
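The library split and the role of the penalty factor can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for illustration only: the MIL instruction-analysis step is skipped, signatures are taken as fixed-length sliding windows over raw bytes, the matching value is the fraction of a program's signatures found in a library, and the scoring rule `m1 + (1 - c) * m2` and the `threshold` parameter are hypothetical. The abstract does not give the actual formula for C.

```python
from typing import Iterable, Set, Tuple


def split_signatures(data: bytes, length: int = 4) -> Set[bytes]:
    """Collect every fixed-length window of the program (assumed signature form)."""
    return {data[i:i + length] for i in range(len(data) - length + 1)}


def build_libraries(
    malware: Iterable[bytes], benign: Iterable[bytes], length: int = 4
) -> Tuple[Set[bytes], Set[bytes]]:
    """Split the malware candidates by whether they also match "self"."""
    mcsl: Set[bytes] = set()
    for program in malware:
        mcsl |= split_signatures(program, length)
    bpmsl: Set[bytes] = set()
    for program in benign:
        bpmsl |= split_signatures(program, length)
    mdsl1 = mcsl - bpmsl  # candidates that never appear in benign programs
    mdsl2 = mcsl & bpmsl  # candidates that also appear in benign programs
    return mdsl1, mdsl2


def matching_values(
    program: bytes, mdsl1: Set[bytes], mdsl2: Set[bytes], length: int = 4
) -> Tuple[float, float]:
    """Fraction of the program's signatures found in each library (assumed measure)."""
    sigs = split_signatures(program, length)
    if not sigs:
        return 0.0, 0.0
    return len(sigs & mdsl1) / len(sigs), len(sigs & mdsl2) / len(sigs)


def classify(
    program: bytes,
    mdsl1: Set[bytes],
    mdsl2: Set[bytes],
    c: float = 0.5,
    threshold: float = 0.1,
    length: int = 4,
) -> bool:
    """Return True for "malware". The scoring rule is hypothetical:
    c = 1 ignores self-matching signatures, as classical negative selection
    would, while c = 0 counts them like any other candidate."""
    m1, m2 = matching_values(program, mdsl1, mdsl2, length)
    return m1 + (1.0 - c) * m2 >= threshold


mdsl1, mdsl2 = build_libraries([b"ABCDEFGH"], [b"EFGHIJKL"])
print(classify(b"EFGHIJKL", mdsl1, mdsl2, c=0.0))  # benign sample, no penalty
print(classify(b"EFGHIJKL", mdsl1, mdsl2, c=1.0))  # benign sample, full penalty
```

Under these assumptions, raising `c` discounts signatures shared with benign programs, which lowers the false positive rate at some cost in true positives. This mirrors the trade-off the abstract attributes to C without reproducing its exact definition.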
Cite this article
Zhang, P., Wang, W. & Tan, Y. A malware detection model based on a negative selection algorithm with penalty factor. Sci. China Inf. Sci. 53, 2461–2471 (2010). https://doi.org/10.1007/s11432-010-4123-5
DOI: https://doi.org/10.1007/s11432-010-4123-5