Detecting Malicious PowerShell Commands using Deep Neural Networks

Authors:
Danny Hendler, Ben-Gurion University of the Negev, Beer-Sheva, Israel
Shay Kels, Microsoft, Hertzlia, Israel
Amir Rubin, Ben-Gurion University of the Negev, Beer-Sheva, Israel

ASIACCS '18: Proceedings of the 2018 on Asia Conference on Computer and Communications Security, May 2018, Pages 187–197
https://doi.org/10.1145/3196494.3196511

Published: 29 May 2018

ABSTRACT

Microsoft's PowerShell is a command-line shell and scripting language that is installed by default on Windows machines. Based on Microsoft's .NET framework, it includes an interface that allows programmers to access operating system services. While administrators can configure PowerShell to restrict access and reduce vulnerabilities, these restrictions can be bypassed. Moreover, PowerShell commands can easily be generated dynamically, executed from memory, encoded, and obfuscated, which makes logging and forensic analysis of code executed by PowerShell challenging. For all these reasons, PowerShell is increasingly used by cybercriminals as part of their attack tool chains, mainly for downloading malicious content and for lateral movement. Indeed, a recent comprehensive technical report by Symantec dedicated to PowerShell's abuse by cybercriminals [52] reported a sharp increase in the number of malicious PowerShell samples they received and in the number of penetration tools and frameworks that use PowerShell. This highlights the urgent need to develop effective methods for detecting malicious PowerShell commands. In this work, we address this challenge by implementing several novel detectors of malicious PowerShell commands and evaluating their performance. We implemented both "traditional" natural language processing (NLP) based detectors and detectors based on character-level convolutional neural networks (CNNs). Detectors' performance was evaluated using a large real-world dataset. Our evaluation results show that, although our detectors (and especially the traditional NLP-based ones) individually yield high performance, an ensemble detector that combines an NLP-based classifier with a CNN-based classifier provides the best performance, since the latter classifier is able to detect malicious commands that succeed in evading the former. Our analysis of these evasive commands reveals that some obfuscation patterns automatically detected by the CNN classifier are intrinsically difficult to detect using the NLP techniques we applied. Our detectors provide high recall values while maintaining a very low false positive rate, making us cautiously optimistic that they can be of practical value.
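
To make the ensemble idea concrete, the following minimal Python sketch shows one way the scores of two detectors could be combined so that a command flagged by either detector is flagged by the ensemble. It is an illustration under stated assumptions, not the authors' implementation: the function names (`char_ngrams`, `ensemble_score`, `is_malicious`), the `max` combination rule, and the 0.5 threshold are all hypothetical, and the two scores are taken as given inputs rather than produced by trained models.

```python
from collections import Counter


def char_ngrams(command: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams of a command.

    A bag of character n-grams is one common input representation for
    "traditional" NLP-style classifiers (assumed here for illustration).
    """
    text = command.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ensemble_score(nlp_score: float, cnn_score: float) -> float:
    """Combine two detector scores in [0, 1] by taking the maximum.

    Hypothetical rule: either detector alone can raise the alarm, so a
    command that evades one detector can still be caught by the other.
    """
    return max(nlp_score, cnn_score)


def is_malicious(nlp_score: float, cnn_score: float,
                 threshold: float = 0.5) -> bool:
    """Flag a command when the combined score reaches the threshold."""
    return ensemble_score(nlp_score, cnn_score) >= threshold
```

With this rule, a command that receives a low score from the NLP-based detector but a high score from the CNN-based one is still flagged, which mirrors the abstract's observation that the CNN catches commands that evade the NLP-based classifier. In practice the threshold would be tuned on held-out data to meet a target false positive rate.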

References

[1] Ben Athiwaratkun and Jack W Stokes. 2017. Malware classification with LSTM and GRU language models and a character-level CNN. In Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE International Conference on. IEEE, 2482--2486.
[2] Pierre Baldi, Søren Brunak, Paolo Frasconi, Giovanni Soda, and Gianluca Pollastri. 1999. Exploiting the past and the future in protein secondary structure prediction. Bioinformatics 15, 11 (1999), 937--946.
[3] Daniel Bohannon. 2016. The Invoke-Obfuscation module. https://github.com/danielbohannon/Invoke-Obfuscation. (2016).
[4] Léon Bottou. 2010. Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT'2010. Springer, 177--186.
[5] Y-Lan Boureau, Francis Bach, Yann LeCun, and Jean Ponce. 2010. Learning mid-level features for recognition. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2559--2566.
[6] Microsoft Corporation. 2017. Antimalware Scan Interface. https://msdn.microsoft.com/he-il/library/windows/desktop/dn889587(v=vs.85).aspx. (2017).
[7] Microsoft Corporation. 2017. PowerShell. https://docs.microsoft.com/en-us/powershell/scripting/powershell-scripting?view=powershell-5.1. (2017).
[8] Microsoft Corporation. 2017. Trojan:Win32/Kovter. https://www.microsoft.com/en-us/wdsi/threats/malware-encyclopedia-description?Name=Trojan:Win32/Kovter. (2017).
[9] Marco Cova, Christopher Kruegel, and Giovanni Vigna. 2010. Detection and analysis of drive-by-download attacks and malicious JavaScript code. In Proceedings of the 19th international conference on World wide web. ACM, 281--290.
[10] Charlie Curtsinger, Benjamin Livshits, Benjamin G Zorn, and Christian Seifert. 2011. ZOZZLE: Fast and Precise In-Browser JavaScript Malware Detection. In USENIX Security Symposium. USENIX Association, 33--48.
[11] George E Dahl, Jack W Stokes, Li Deng, and Dong Yu. 2013. Large-scale malware classification using random projections and neural networks. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 3422--3426.
[12] Jeffrey L Elman. 1990. Finding structure in time. Cognitive Science 14, 2 (1990), 179--211.
[13] Kunihiko Fukushima and Sei Miyake. 1982. Neocognitron: A self-organizing neural network model for a mechanism of visual pattern recognition. In Competition and cooperation in neural nets. Springer, 267--285.
[14] Ian J. Goodfellow, Yoshua Bengio, and Aaron C. Courville. 2016. Deep Learning. MIT Press. http://www.deeplearningbook.org/
[15] Alex Graves. 2013. Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850 (2013).
[16] Alex Graves, Santiago Fernández, and Jürgen Schmidhuber. 2005. Bidirectional LSTM Networks for Improved Phoneme Classification and Recognition. In Artificial Neural Networks: Formal Models and Their Applications - ICANN 2005, 15th International Conference, Warsaw, Poland, September 11--15, 2005, Proceedings, Part II (Lecture Notes in Computer Science), Wlodzislaw Duch, Janusz Kacprzyk, Erkki Oja, and Slawomir Zadrozny (Eds.), Vol. 3697. Springer, 799--804.
[17] Alex Graves and Navdeep Jaitly. 2014. Towards end-to-end speech recognition with recurrent neural networks. In Proceedings of the 31st International Conference on Machine Learning (ICML-14). JMLR.org, 1764--1772.
[18] Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. 2013. Speech recognition with deep recurrent neural networks. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 6645--6649.
[19] Douglas M Hawkins. 2004. The problem of overfitting. Journal of Chemical Information and Computer Sciences 44, 1 (2004), 1--12.
[20] Geoffrey E Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan R Salakhutdinov. 2012. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012).
[21] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735--1780.
[22] Michael Hopkins and Ali Dehghantanha. 2015. Exploit kits: the production line of the cybercrime economy? In Information Security and Cyber Forensics (InfoSec), 2015 Second International Conference on. IEEE, 23--27.
[23] Rafal Jozefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, and Yonghui Wu. 2016. Exploring the limits of language modeling. arXiv preprint arXiv:1602.02410 (2016).
[24] Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, and Li Fei-Fei. 2014. Large-scale video classification with convolutional neural networks. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition. IEEE, 1725--1732.
[25] Shachar Kaufman, Saharon Rosset, Claudia Perlich, and Ori Stitelman. 2012. Leakage in data mining: Formulation, detection, and avoidance. ACM Transactions on Knowledge Discovery from Data (TKDD) 6, 4 (2012), 15.
[26] Siwei Lai, Liheng Xu, Kang Liu, and Jun Zhao. 2015. Recurrent Convolutional Neural Networks for Text Classification. In AAAI, Vol. 333. AAAI Press, 2267--2273.
[27] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 2015. Deep learning. Nature 521, 7553 (2015), 436--444.
[28] Yann LeCun, Bernhard Boser, John S Denker, Donnie Henderson, Richard E Howard, Wayne Hubbard, and Lawrence D Jackel. 1989. Backpropagation applied to handwritten zip code recognition. Neural Computation 1, 4 (1989), 541--551.
[29] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86, 11 (1998), 2278--2324.
[30] Peter Likarish, Eunjin Jung, and Insoon Jo. 2009. Obfuscated malicious javascript detection using classification techniques. In Malicious and Unwanted Software (MALWARE), 2009 4th International Conference on. IEEE, 47--54.
[31] Bing Liu and Lei Zhang. 2012. A survey of opinion mining and sentiment analysis. In Mining text data. Springer, 415--463.
[32] Christopher D Manning, Hinrich Schütze, et al. 1999. Foundations of statistical natural language processing. Vol. 999. MIT Press.
[33] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).
[34] Tomas Mikolov, Martin Karafiàt, Lukas Burget, Jan Cernocky, and Sanjeev Khudanpur. 2010. Recurrent neural network based language model. In Interspeech, Vol. 2. ISCA, 3.
[35] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems. NIPS, 3111--3119.
[36] Vinod Nair and Geoffrey E Hinton. 2010. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th international conference on machine learning (ICML-10). Omnipress, 807--814.
[37] PaloAlto. 2017. Pulling Back the Curtains on EncodedCommand PowerShell Attacks. https://researchcenter.paloaltonetworks.com/2017/03/unit42-pulling-back-the-curtains-on-encodedcommand-powershell-attacks/. (2017).
[38] Razvan Pascanu, Jack W Stokes, Hermineh Sanossian, Mady Marinescu, and Anil Thomas. 2015. Malware classification with recurrent networks. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 1916--1920.
[39] Santiago M Pontiroli and F Roberto Martinez. 2015. The Tao of .NET and PowerShell Malware Analysis. In Virus Bulletin Conference.
[40] Joseph D. Prusa and Taghi M. Khoshgoftaar. 2017. Deep Neural Network Architecture for Character-Level Learning on Short Text. In Proceedings of the Thirtieth International Florida Artificial Intelligence Research Society Conference, FLAIRS 2017, Marco Island, Florida, USA, May 22--24, 2017. AAAI Press, 353--358.
[41] Amanda Rousseau. 2017. Hijacking .NET to Defend PowerShell. arXiv preprint arXiv:1709.07508 (2017).
[42] Sam T Roweis and Lawrence K Saul. 2000. Nonlinear dimensionality reduction by locally linear embedding. Science 290, 5500 (2000), 2323--2326.
[43] David E Rumelhart, Geoffrey E Hinton, and Ronald J Williams. 1986. Learning representations by back-propagating errors. Nature 323, 6088 (1986), 533--536.
[44] Haşim Sak, Andrew Senior, and Françoise Beaufays. 2014. Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In Fifteenth Annual Conference of the International Speech Communication Association. ISCA.
[45] Joshua Saxe and Konstantin Berlin. 2015. Deep neural network based malware detection using two dimensional binary program features. In Malicious and Unwanted Software (MALWARE), 2015 10th International Conference on. IEEE, 11--20.
[46] Joshua Saxe and Konstantin Berlin. 2017. eXpose: A Character-Level Convolutional Neural Network with Embeddings For Detecting Malicious URLs, File Paths and Registry Keys. arXiv preprint arXiv:1702.08568 (2017).
[47] Robert J Schalkoff. 1997. Artificial neural networks. Vol. 1. McGraw-Hill New York.
[48] Mike Schuster and Kuldip K Paliwal. 1997. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing 45, 11 (1997), 2673--2681.
[49] Michael R Smith, Joe B Ingram, Christopher C Lamb, Timothy J Draelos, Justin E Doak, James B Aimone, and Conrad D James. 2017. Dynamic Analysis of Executables to Detect and Characterize Malware. arXiv preprint arXiv:1711.03947 (2017).
[50] Brett Stone-Gross, Marco Cova, Lorenzo Cavallaro, Bob Gilbert, Martin Szydlowski, Richard Kemmerer, Christopher Kruegel, and Giovanni Vigna. 2009. Your botnet is my botnet: analysis of a botnet takeover. In Proceedings of the 16th ACM conference on Computer and communications security. ACM, 635--647.
[51] Martin Sundermeyer, Tamer Alkhouli, Joern Wuebker, and Hermann Ney. 2014. Translation Modeling with Bidirectional Recurrent Neural Networks. In EMNLP. ACL, 14--25.
[52] Symantec. 2016. The increased use of Powershell in attacks. https://www.symantec.com/content/dam/symantec/docs/security-center/white-papers/increased-use-of-powershell-in-attacks-16-en.pdf. (2016).
[53] Yao Wang, Wan-dong Cai, and Peng-cheng Wei. 2016. A deep learning approach for detecting malicious JavaScript code. Security and Communication Networks 9, 11 (2016), 1520--1534.
[54] B Yegnanarayana. 2009. Artificial neural networks. PHI Learning Pvt. Ltd.
[55] Xiang Zhang and Yann LeCun. 2015. Text understanding from scratch. arXiv preprint arXiv:1502.01710 (2015).
[56] Xiang Zhang, Junbo Zhao, and Yann LeCun. 2015. Character-level convolutional networks for text classification. In Advances in neural information processing systems. NIPS, 649--657.

Published in
ASIACCS '18: Proceedings of the 2018 on Asia Conference on Computer and Communications Security
May 2018
866 pages
ISBN: 9781450355766
DOI: 10.1145/3196494
General Chairs:
Jong Kim, Pohang University of Science and Technology, South Korea
Gail-Joon Ahn, Arizona State University, USA & Samsung Electronics, South Korea
Seungjoo Kim, Korea University, South Korea
Program Chairs:
Yongdae Kim, KAIST, South Korea
Javier Lopez, University of Malaga, Spain
Taesoo Kim, Georgia Tech, USA
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
deep learning
malware detection
natural language processing
neural networks
powershell
Qualifiers
- research-article
Acceptance Rates
ASIACCS '18 Paper Acceptance Rate: 52 of 310 submissions, 17%
Overall Acceptance Rate: 418 of 2,322 submissions, 18%
Article Metrics
- Total Citations: 50
- Total Downloads: 1,093
- Downloads (Last 12 months): 97
- Downloads (Last 6 weeks): 10