Abstract
In this article, we propose Segugio, a novel defense system that allows for efficiently tracking the occurrence of new malware-control domain names in very large ISP networks. Segugio passively monitors the DNS traffic to build a machine-domain bipartite graph representing who is querying what. After the nodes in this query-behavior graph that are known to be either benign or malware-related have been labeled, Segugio applies a novel approach to accurately detect previously unknown malware-control domains.
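The graph construction and labeling step described above can be illustrated with a minimal sketch. It is not Segugio's implementation: the input format (pairs of machine identifier and queried domain), the machine-labeling rule, the `infected_fraction` score, and all function and domain names are assumptions made for illustration only.

```python
from collections import defaultdict


def build_query_graph(dns_queries):
    """Build a machine-domain bipartite graph from (machine, domain) pairs."""
    machines_of = defaultdict(set)  # domain  -> machines that queried it
    domains_of = defaultdict(set)   # machine -> domains it queried
    for machine, domain in dns_queries:
        machines_of[domain].add(machine)
        domains_of[machine].add(domain)
    return machines_of, domains_of


def label_machines(domains_of, known_malware, known_benign):
    """Assumed rule: a machine is 'malware' if it queried any known
    malware-control domain, 'benign' if it queried only known benign
    domains, and 'unknown' otherwise."""
    labels = {}
    for machine, domains in domains_of.items():
        if domains & known_malware:
            labels[machine] = "malware"
        elif domains <= known_benign:
            labels[machine] = "benign"
        else:
            labels[machine] = "unknown"
    return labels


def infected_fraction(domain, machines_of, machine_labels):
    """Hypothetical score: share of a domain's querying machines that are
    labeled 'malware'. Returns 0.0 for a domain nobody queried."""
    machines = machines_of.get(domain, set())
    if not machines:
        return 0.0
    infected = sum(1 for m in machines if machine_labels.get(m) == "malware")
    return infected / len(machines)


# Toy traffic; every identifier below is made up.
queries = [
    ("m1", "evil.example"), ("m1", "new-c2.example"),
    ("m2", "evil.example"), ("m2", "new-c2.example"),
    ("m3", "good.example"), ("m3", "new-c2.example"),
    ("m4", "good.example"),
]
machines_of, domains_of = build_query_graph(queries)
machine_labels = label_machines(
    domains_of, known_malware={"evil.example"}, known_benign={"good.example"}
)
score = infected_fraction("new-c2.example", machines_of, machine_labels)
```

In this toy data, two of the three machines that query the unlabeled domain also query a known malware-control domain, so the domain receives a high score. A real system would combine several such signals rather than rely on a single ratio.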
We implemented a proof-of-concept version of Segugio and deployed it in large ISP networks that serve millions of users. Our experimental results show that Segugio can track the occurrence of new malware-control domains with up to 94% true positives (TPs) at less than 0.1% false positives (FPs). In addition, we show that (1) Segugio can also detect control domains related to new, previously unseen malware families, with 85% TPs at 0.1% FPs; (2) Segugio’s detection models learned on traffic from a given ISP network can be deployed in a different ISP network and still achieve very high detection accuracy; (3) new malware-control domains can be detected days or even weeks before they appear in a large commercial domain-name blacklist; (4) Segugio can be used to detect previously unknown malware-infected machines in ISP networks; and (5) Segugio clearly outperforms domain-reputation systems based on Belief Propagation.
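Figures such as "94% TPs at less than 0.1% FPs" are most naturally read as a true-positive rate and a false-positive rate measured at one detection threshold. The sketch below shows the standard computation of both rates. The scores, labels, and threshold are made-up values and the function name is hypothetical; the abstract does not describe the evaluation code.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (TP rate, FP rate) when domains scoring >= threshold are flagged.

    labels: True for a malware-control domain, False for a benign one.
    """
    tp = fp = positives = negatives = 0
    for score, is_malware in zip(scores, labels):
        flagged = score >= threshold
        if is_malware:
            positives += 1
            tp += flagged
        else:
            negatives += 1
            fp += flagged
    tp_rate = tp / positives if positives else 0.0
    fp_rate = fp / negatives if negatives else 0.0
    return tp_rate, fp_rate


# Illustrative values only, not results from the paper.
scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
labels = [True, True, True, False, False, False]
tp_rate, fp_rate = rates_at_threshold(scores, labels, threshold=0.5)
```

Raising the threshold lowers the FP rate at the cost of the TP rate, which is why a detection result is reported as a pair of rates at a fixed operating point.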
Index Terms
- Efficient and Accurate Behavior-Based Tracking of Malware-Control Domains in Large ISP Networks