Identifying botnets by capturing group activities in DNS traffic
Introduction
A botnet is a network of computers compromised by malicious software. The botnet is operated by a criminal entity to perform Internet attacks, such as identity theft, spam distribution, and DDoS attacks. The infected hosts are unwilling victims that perform malicious tasks unbeknownst to their owners.
Researchers have focused on detecting bot traffic using incidental traits of prevalent bots. However, such detection approaches can be quickly overcome by evasion techniques. Moreover, some approaches must process an overwhelming amount of data, which makes them ineffective on high-speed networks.
We have proposed a botnet detection mechanism, BotGAD, that uses a fundamental property of botnets [1], [2]. We focused on the underlying common association among infected hosts, command and control (C&C) servers, and victims, and found that a botnet generally acts as a coordinated group. Using this “group activity” property, it is possible to detect unknown botnets irrespective of their communication protocol and structure. BotGAD detects botnets using DNS traffic for two reasons: botnet group activities can be captured by monitoring DNS traffic, and monitoring DNS traffic incurs less overhead than monitoring the entire network traffic. Moreover, DNS monitoring enables botnets to be detected at an early stage, since botnet DNS traffic is often sent before attacks are performed.
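To make the group activity idea concrete, the following minimal sketch compares the sets of client IP addresses that query the same domain in consecutive time windows. It is an illustration only: the Jaccard similarity, the `window` length, the input tuple layout, and all function names are assumptions chosen for the example, not the measure or parameters that BotGAD uses.

```python
from collections import defaultdict


def hosts_per_window(queries, window=600):
    """Group querying hosts by (domain, time-window index).

    `queries` is an iterable of (timestamp_seconds, client_ip, domain) tuples;
    `window` is a hypothetical window length in seconds.
    """
    groups = defaultdict(set)
    for ts, ip, domain in queries:
        groups[(domain, int(ts // window))].add(ip)
    return groups


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_activity_scores(queries, window=600):
    """Average similarity of host sets in consecutive windows, per domain."""
    groups = hosts_per_window(queries, window)
    by_domain = defaultdict(dict)
    for (domain, idx), hosts in groups.items():
        by_domain[domain][idx] = hosts
    scores = {}
    for domain, windows in by_domain.items():
        # Only compare windows that are directly adjacent in time.
        sims = [
            jaccard(windows[i], windows[i + 1])
            for i in sorted(windows)
            if i + 1 in windows
        ]
        if sims:
            scores[domain] = sum(sims) / len(sims)
    return scores
```

In this sketch, a domain queried by the same hosts in every window scores close to 1, while a domain whose querying population changes from window to window scores close to 0.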
However, our previous mechanism has the following three limitations:
- The mechanism may generate false negatives when the set of infected hosts in a botnet changes frequently. For example, part of a botnet may appear only for a short time because the bots are removed by users or temporarily deactivated. Such changes can decrease the detection accuracy of our previous mechanism.
- The previous mechanism is too sensitive to its detection parameters. For example, if the time window parameter is misconfigured, the mechanism can generate a significant number of false positives or false negatives.
- The mechanism cannot detect recently introduced evasive botnets that use a domain generation algorithm (DGA) for C&C. For example, Kraken/Bobax [3], Srizbi [4], Torpig [5] and Conficker [6] use a DGA to evade detection.
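The difficulty posed by a DGA can be illustrated with a toy generator. The sketch below is hypothetical and does not reproduce the algorithm of any botnet named above; the seed string, the hash function, the label length, and the `.example` suffix are all assumptions. It shows only that bots sharing a seed and the current date derive the same fresh candidate domains each day, so a fixed list of C&C domains goes stale quickly.

```python
import hashlib
from datetime import date


def toy_dga(seed, day, count=5, length=12):
    """Return `count` pseudo-random domain names for the given day."""
    domains = []
    for i in range(count):
        material = f"{seed}:{day.isoformat()}:{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map hex digits to the letters a-p so the label looks like a hostname.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:length])
        domains.append(label + ".example")
    return domains


if __name__ == "__main__":
    # Two bots with the same seed agree on the candidates for a given day.
    print(toy_dga("shared-seed", date(2011, 1, 1)))
    print(toy_dga("shared-seed", date(2011, 1, 2)))
```

Because every bot holding the seed would query the same generated names on the same day, the names change but the set of hosts querying them together does not.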
To overcome these limitations, we improve the mechanism by applying three methods: error correction, cluster analysis, and hypothesis testing.
- Error correction. We develop an error correction method to alleviate errors that arise when analyzing group activities. Error correction can decrease false alarms caused by unexpected changes in a botnet or by misconfigured detection parameters. We devise column filtering and row filtering operations to correct the errors.
- Cluster analysis. We develop a clustering method based on unsupervised machine learning to detect a set of correlated botnets. We devise several features to classify correlated clusters, and each cluster is then analyzed to detect botnet clusters.
- Hypothesis test. We adopt the Sequential Probability Ratio Test (SPRT) [7], a hypothesis test for sequential analysis in which a decision is made within a small number of rounds with bounded false alarm rates. The SPRT method provides a higher level of confidence in its decisions than the simple threshold-based detection used in our previous mechanism.
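The SPRT decision rule can be sketched as follows. This is a generic Wald test over binary observations, not BotGAD's implementation: the per-round probabilities `theta0` and `theta1`, the error bounds `alpha` and `beta`, and the meaning of an observation (for instance, whether a domain showed group activity in a round) are illustrative assumptions.

```python
import math


def sprt(observations, theta0=0.2, theta1=0.8, alpha=0.01, beta=0.01):
    """Wald's sequential probability ratio test on 0/1 observations.

    H0: P(obs == 1) = theta0 (benign); H1: P(obs == 1) = theta1 (botnet).
    Returns (decision, rounds_used), where decision is "H1", "H0", or
    "undecided" if the data ran out before a threshold was crossed.
    """
    upper = math.log((1 - beta) / alpha)  # accept H1 at or above this value
    lower = math.log(beta / (1 - alpha))  # accept H0 at or below this value
    llr = 0.0  # running log-likelihood ratio
    rounds = 0
    for obs in observations:
        rounds += 1
        if obs:
            llr += math.log(theta1 / theta0)
        else:
            llr += math.log((1 - theta1) / (1 - theta0))
        if llr >= upper:
            return "H1", rounds
        if llr <= lower:
            return "H0", rounds
    return "undecided", rounds
```

With these example parameters, four consecutive positive rounds are enough to accept H1, and four consecutive negative rounds are enough to accept H0. Unlike a fixed threshold on a single score, the number of rounds adapts to how strongly the evidence points one way.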
The three methods are added to BotGAD to enhance its accuracy and its robustness against evasion.
We evaluate BotGAD using real-life DNS traces collected from several networks, including a campus network and a large ISP network. BotGAD reports hundreds of botnet domains and correlated botnet domain clusters. It takes only a few minutes to analyze an hour-long DNS trace from a large ISP network. The evaluation shows that BotGAD can automatically detect botnets in real time, even when they apply evasion techniques such as DGAs.
The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 describes the botnet group activity, the detection algorithms, and the BotGAD framework. Section 4 evaluates the performance of BotGAD, analyzes the results, and discusses possible evasion techniques. Section 5 draws conclusions.
Related work
In this section, we review several network-based botnet detection approaches, which can be classified into machine learning approaches and non-machine learning approaches. We further classify the approaches according to their detection target, such as botnet traffic, cooperative behavior, and spamming botnets. We also distinguish the approaches by the source data they analyze (DNS traffic or other network traffic).
Botnet group activity and detection scheme
In this section, we illustrate the concept of our mechanism and a botnet detection scheme.
Evaluation result and analysis
We test BotGAD on three different real-life traces to evaluate its performance. We also evaluate the performance of its three modules, i.e., error correction, correlated cluster analysis, and hypothesis test. Finally, we discuss how BotGAD might be evaded and compare it to other mechanisms.
Conclusion
Botnets are a major threat to network security and major contributors to unwanted network traffic. Thus, it is necessary to provide appropriate countermeasures against botnets. We propose BotGAD to reveal both unknown domain names of C&C servers and IP addresses of hidden infected hosts. We define an inherent property of botnets, termed group activity. Using this property, we propose a light-weight mechanism to detect botnets. Our mechanism needs a small amount of data from DNS traffic, not all
Acknowledgments
This research was supported by the MKE, Korea, under the ITRC support program supervised by the NIPA (NIPA-2011-C1090-1131-0005) and the Seoul R&BD Program (WR080951). The preliminary version of this paper was presented in IEEE CIT [1] and COMSWARE [2].
Hyunsang Choi received the B.S. and M.S. degrees in computer science and engineering from Korea University in Seoul, Korea, in 2007 and 2009, respectively. He is currently working toward a doctorate degree in computer and communication security at Korea University in Seoul, Korea. He was an intern at Microsoft Research Asia (Beijing, China) from 2009 to 2010.
References (55)
- et al., Matrices and the structure of random number sequences, Linear Algebra and its Applications (1985).
- H. Choi, H. Lee, H. Lee, H. Kim, Botnet Detection by monitoring group activities in dns traffic, in: Proceedings of the...
- H. Choi, H. Lee, H. Kim, BotGAD: detecting botnets by capturing group activities in network traffic, in: Proceedings of...
- S. Shevchenko, Kraken changes tactics. <http://www.blog.threatexpert.com/2008/04/kraken-changes-tactics.html>,...
- J. Wolf, Technical details of Srizbi’s domain generation algorithm....
- B. Stone-Gross, M. Cova, L. Cavallaro, B. Gilbert, M. Szydlowski, R. Kemmerer, C. Kruegel, G. Vigna, Your Botnet is My...
- P. Porras, H. Saidi, V. Yegneswaran, A foray into Conficker’s logic and rendezvous points, in: Proceedings of the...
- Sequential tests of statistical hypotheses, The Annals of Mathematical Statistics (1945).
- A. Ramachandran, N. Feamster, D. Dagon, Revealing botnet membership using DNSBL counter-intelligence, in: Proceedings...
- R.V.-Salomon, J.C. Brustoloni, Bayesian bot detection based on DNS traffic similarity, in: Proceedings of the ACM...
- Botnet Detection by Monitoring Similar Communication Patterns, International Journal of Computer Science and Information Security (IJCSIS).
- Detecting Botnet Activities Based on Abnormal DNS traffic, International Journal of Computer Science and Information Security (IJCSIS).
Heejo Lee is an associate professor at the Division of Computer and Communication Engineering, Korea University, Seoul, Korea. Before joining Korea University, he was at AhnLab, Inc. as a CTO from 2001 to 2003. From 2000 to 2001, he was a postdoc at the Department of Computer Sciences and the security center CERIAS, Purdue University. Dr. Lee received his BS, MS, and PhD degrees in Computer Science and Engineering from POSTECH, Pohang, Korea. Dr. Lee serves as an editor of the Journal of Communications and Networks. He has been an advisory member of the Korea Information Security Agency and the Korea Supreme Prosecutor’s Office.