2008 | Original Paper | Book Chapter
Embedded Malware Detection Using Markov n-Grams
Authors: M. Zubair Shafiq, Syed Ali Khayam, Muddassar Farooq
Published in: Detection of Intrusions and Malware, and Vulnerability Assessment
Publisher: Springer Berlin Heidelberg
Embedded malware is a recently discovered security threat that allows malcode to be hidden inside a benign file. It has been shown that embedded malware is not detected by commercial antivirus software even when the malware signature is present in the antivirus database. In this paper, we present a novel anomaly detection scheme to detect embedded malware. We first analyze byte sequences in benign files to show that benign files' data generally exhibit a first-order dependence structure. Consequently, conditional n-grams provide a more meaningful representation of a file's statistical properties than traditional n-grams. To capture and leverage this correlation structure for embedded malware detection, we model the conditional distributions as Markov n-grams. For embedded malware detection, we use an information-theoretic measure, called entropy rate, to quantify changes in the Markov n-gram distributions observed in a file. We show that the entropy rate of Markov n-grams is significantly perturbed at malcode embedding locations and can therefore act as a robust feature for embedded malware detection. We evaluate the proposed Markov n-gram detector on a comprehensive malware dataset consisting of more than 37,000 malware samples and 1,800 benign samples of six well-known filetypes. We show that the Markov n-gram detector provides better detection and false positive rates than the only existing embedded malware detection scheme.
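
To make the entropy-rate feature concrete, the following is a minimal Python sketch of an empirical first-order (Markov) entropy rate computed over fixed-size blocks of a file. It is an illustration under stated assumptions, not the authors' implementation: the function names `entropy_rate` and `windowed_entropy_rate`, the 1000-byte block size, and the use of raw transition counts as probability estimates are all hypothetical choices made here, and no detection threshold is implied.

```python
import math
from collections import Counter
from typing import List


def entropy_rate(block: bytes) -> float:
    """Empirical first-order entropy rate of a byte block, in bits per byte.

    Estimates H = -sum_{i,j} P(i, j) * log2 P(j | i), where (i, j) ranges
    over consecutive byte pairs observed in the block.
    """
    if len(block) < 2:
        return 0.0
    pair_counts = Counter(zip(block, block[1:]))  # counts of (prev, cur) pairs
    prev_counts = Counter(block[:-1])             # counts of each byte as "prev"
    total = len(block) - 1                        # number of transitions
    h = 0.0
    for (prev, _cur), count in pair_counts.items():
        p_joint = count / total              # estimate of P(prev, cur)
        p_cond = count / prev_counts[prev]   # estimate of P(cur | prev)
        h -= p_joint * math.log2(p_cond)
    return h


def windowed_entropy_rate(data: bytes, window: int = 1000) -> List[float]:
    """Entropy rate of each consecutive block (assumed block size: `window`)."""
    return [entropy_rate(data[i:i + window]) for i in range(0, len(data), window)]
```

Applied to a file's bytes, `windowed_entropy_rate` returns one value per block. A block whose value departs sharply from its neighbours is the kind of perturbation the abstract describes at embedding locations; how large a departure should count as anomalous is left open in this sketch.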