Identifying botnets by capturing group activities in DNS traffic

doi:10.1016/j.comnet.2011.07.018

Computer Networks

Volume 56, Issue 1, 12 January 2012, Pages 20-33

https://doi.org/10.1016/j.comnet.2011.07.018

Abstract

Botnets have become the main vehicle for conducting online crimes such as DDoS, spam, phishing and identity theft. Even though numerous efforts have been directed towards the detection of botnets, evolving evasion techniques easily thwart detection. Moreover, existing approaches can be overwhelmed by the large amount of data that needs to be analyzed. In this paper, we propose a light-weight mechanism to detect botnets using their fundamental characteristic, i.e., group activity. The proposed mechanism, referred to as BotGAD (botnet group activity detector), needs only a small amount of data from DNS traffic to detect botnets, rather than all network traffic content or known signatures. BotGAD can detect botnets in a large-scale network in real time even when the botnet performs encrypted communications. Moreover, BotGAD can detect botnets that adopt recent evasion techniques. We evaluate BotGAD using multiple DNS traces collected from different sources, including a campus network and large ISP networks. The evaluation shows that BotGAD can automatically detect botnets while providing real-time monitoring in large-scale networks.

Introduction

A botnet is a network of computers compromised by malicious software. The botnet is operated by a criminal entity to perform Internet attacks, such as identity theft, spam distribution, and DDoS attack. All of these infected hosts are unwilling victims, performing malicious tasks unbeknownst to their owners.

Researchers have focused on bot traffic detection using incidental traits of prevalent bots. However, the detection approaches can be quickly overcome by evasion techniques. Moreover, some approaches need to run with an overwhelming amount of data, which becomes ineffective for high speed networks.

We have proposed a botnet detection mechanism using a fundamental property of botnets [1], [2]. We focused on an underlying common association among infected hosts, command and control (C&C) servers and victims, and found that a botnet generally acts as a coordinated group. Using this “group activity” property, it is possible to detect unknown botnets, irrespective of their communication protocol and structure. BotGAD detects botnets using DNS traffic for two reasons: botnet group activities can be captured by monitoring DNS traffic, and monitoring DNS traffic incurs less overhead than monitoring the entire network traffic. Moreover, DNS monitoring enables detection of botnets at an early stage, since botnet DNS traffic is often sent before attacks are performed.
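To make the group activity idea concrete, the following is a minimal sketch and not BotGAD's actual algorithm. It assumes DNS queries are available as (timestamp, client_ip, domain) tuples, buckets them into fixed time windows, and scores each domain by how much the set of querying hosts overlaps between consecutive windows. The Jaccard measure, the function names and the window length are all illustrative assumptions that do not come from the paper.

```python
from collections import defaultdict


def hosts_per_window(queries, window_seconds):
    """Map domain -> window index -> set of client IPs that queried it.

    `queries` is an iterable of (timestamp, client_ip, domain) tuples;
    this record layout is an assumption made for the sketch.
    """
    table = defaultdict(lambda: defaultdict(set))
    for timestamp, client_ip, domain in queries:
        window = int(timestamp // window_seconds)
        table[domain][window].add(client_ip)
    return table


def jaccard(a, b):
    """Jaccard similarity of two sets (illustrative choice of measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_activity_score(windows):
    """Mean similarity between host sets of consecutive observed windows."""
    ordered = [windows[w] for w in sorted(windows)]
    if len(ordered) < 2:
        return 0.0  # a single window cannot show repeated group behavior
    sims = [jaccard(a, b) for a, b in zip(ordered, ordered[1:])]
    return sum(sims) / len(sims)
```

A domain that is queried by the same hosts in window after window scores close to 1, while a domain whose querying hosts change in every window scores close to 0. A score like this needs only the client address, query name and timestamp of each DNS query, which illustrates why the data requirement stays small.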

However, our previous mechanism has three limitations as follows:

• The mechanism may generate false negatives when the set of infected hosts in a botnet changes frequently. For example, part of a botnet may appear only for a short time because the bots are removed by users or temporarily deactivated. Such changes in the botnet can decrease the detection accuracy of our previous mechanism.
• The previous mechanism is too sensitive to its detection parameters. For example, if the time window parameter of the previous mechanism is misconfigured, the mechanism can generate a significant number of false positives or false negatives.
• The mechanism cannot detect recently introduced evasive botnets that use a domain generation algorithm (DGA) for C&C. For example, Kraken/Bobax [3], Srizbi [4], Torpig [5] and Conficker [6] use DGAs to evade detection.

To overcome the limitations, we improve the mechanism by applying three methods: error correction, cluster analysis, and hypothesis test.

• Error correction. We develop the error correction method to alleviate errors caused when analyzing group activities. Error correction can decrease false alarms caused by unexpected changes in a botnet or by misconfigured detection parameters. We devise column filtering and row filtering operations to correct the errors.
• Cluster analysis. We develop a clustering method using unsupervised machine learning to detect a set of correlated botnets. Several features are devised to classify correlated clusters. Each cluster is analyzed to detect botnet clusters.
• Hypothesis test. We adopt Sequential Probability Ratio Testing (SPRT) [7], as a hypothesis test for sequential analysis, where a decision is made within a small number of rounds with bounded false alarm rates. The SPRT method guarantees a higher level of confidence to make decisions than simple threshold based detection used in our previous mechanism.

The three methods are added to BotGAD to enhance its accuracy and its robustness against evasion.
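As an illustration of the hypothesis test, the following is a minimal sketch of Wald's SPRT [7] for binary observations. It shows the general technique and not BotGAD's configuration. It assumes each round yields one Boolean observation, for example whether a domain showed group activity in that round. The probabilities p0 and p1, the error bounds alpha and beta, and the function name are hypothetical values chosen for the example.

```python
import math


def sprt(observations, p0=0.2, p1=0.8, alpha=0.01, beta=0.01):
    """Wald's sequential probability ratio test for Bernoulli observations.

    H0: each observation is True with probability p0 (e.g. benign domain).
    H1: each observation is True with probability p1 (e.g. botnet domain).
    alpha and beta are the target false positive and false negative rates.
    All default values are illustrative assumptions.
    Returns (decision, rounds_used), where decision is "H1", "H0" or
    "undecided" if the observations run out before a threshold is crossed.
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    llr = 0.0                              # cumulative log-likelihood ratio
    rounds = 0
    for observed in observations:
        rounds += 1
        if observed:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "H1", rounds
        if llr <= lower:
            return "H0", rounds
    return "undecided", rounds
```

With these example parameters, four consecutive positive rounds are enough to accept H1 and four consecutive negative rounds are enough to accept H0, while alternating observations cancel out and leave the test undecided. A fixed threshold applied to a single round offers no comparable control over the error rates.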

We evaluate BotGAD using real-life DNS traces collected from several networks, such as a campus network and a large ISP network. BotGAD can report hundreds of botnet domains and correlated botnet domain clusters. It takes only a few minutes to analyze an hour’s DNS trace of a large ISP network. The evaluation shows that BotGAD can automatically detect botnets in real time, even when they apply evasion techniques such as DGAs.

The remainder of this paper is organized as follows. Section 2 reviews related work. We describe the botnet group activity, detection algorithms and a framework of BotGAD in Section 3. We evaluate BotGAD performance in Section 4 and analyze results. We also discuss possible evasion techniques in Section 4. We draw conclusions in Section 5.

Section snippets

Related work

In this section, we review several network-based botnet detection approaches, which can be classified into machine learning approaches and non-machine learning approaches. We further classify the approaches according to their detection object, such as botnet traffic, cooperative behavior, and spamming botnets. We also distinguish the approaches by the source data they analyze (DNS traffic or other network traffic).

Botnet group activity and detection scheme

In this section, we illustrate the concept of our mechanism and a botnet detection scheme.

Evaluation result and analysis

We test BotGAD on three different real-life traces to evaluate its performance. We also evaluate the performance of the three modules, i.e., error correction, correlated cluster analysis and hypothesis test. Finally, we discuss how BotGAD could be evaded and compare BotGAD to other mechanisms.

Conclusion

Botnets are a major threat to network security and a major contributor to unwanted network traffic. Thus, it is necessary to provide appropriate countermeasures against botnets. We propose BotGAD to reveal both unknown domain names of C&C servers and IP addresses of hidden infected hosts. We define an inherent property of botnets, termed group activity. Using this property, we propose a light-weight mechanism to detect botnets. Our mechanism needs a small amount of data from DNS traffic, not all

Acknowledgments

This research was supported by the MKE, Korea, under the ITRC support program supervised by the NIPA (NIPA-2011-C1090-1131-0005) and the Seoul R&BD Program (WR080951). The preliminary version of this paper was presented in IEEE CIT [1] and COMSWARE [2].

Hyunsang Choi received the B.S. and M.S. degrees in computer science and engineering from Korea University in Seoul, Korea, in 2007 and 2009, respectively. He is currently working toward a doctorate degree in computer and communication security at Korea University in Seoul, Korea. He was an intern at Microsoft Research Asia (Beijing, China) from 2009 to 2010.

References (55)

G. Marsaglia et al., Matrices and the structure of random number sequences, Linear Algebra and its Applications (1985)
H. Choi, H. Lee, H. Lee, H. Kim, Botnet Detection by monitoring group activities in dns traffic, in: Proceedings of the...
H. Choi, H. Lee, H. Kim, BotGAD: detecting botnets by capturing group activities in network traffic, in: Proceedings of...
S. Shevchenko, Kraken changes tactics. <http://www.blog.threatexpert.com/2008/04/kraken-changes-tactics.html>,...
J. Wolf, Technical details of Srizbi’s domain generation algorithm....
B. Stone-Gross, M. Cova, L. Cavallaro, B. Gilbert, M. Szydlowski, R. Kemmerer, C. Kruegel, G. Vigna, Your Botnet is My...
P. Porras, H. Saidi, V. Yegneswaran, A foray into Conficker’s logic and rendezvous points, in: Proceedings of the...
A. Wald, Sequential tests of statistical hypotheses, The Annals of Mathematical Statistics (1945)
A. Ramachandran, N. Feamster, D. Dagon, Revealing botnet membership using DNSBL counter-intelligence, in: Proceedings...
R.V.-Salomon, J.C. Brustoloni, Bayesian bot detection based on DNS traffic similarity, in: Proceedings of the ACM...

K. Sato, K. Ishibashi, T. Toyono, N. Miyake, Extending black domain name list by using co-occurrence relation between...

J. Brustoloni, N. Farnan, R. Villamarin-Salomon, D. Kyle, Efficient detection of bots in subscribers’ computers, in:...

G. Gu, P. Porras, V. Yegneswaran, M. Fong, W. Lee, BotHunter: detecting malware infection through ids-driven dialog...

A. Karasaridis, B. Rexroad, D. Hoeflin, Wide-scale botnet detection and characterization, in: Proceedings of the...

W. Lu, M. Tavallaee, G. Rammidi, A.A. Ghorbani, BotCop: an online botnets traffic classifier, in: Proceedings of the...

X. Hu, M. Knyz, K.G. Shin, RB-Seeker: auto-detection of redirection botnets, in: Proceedings of the Annual Network and...

H.R. Zeidanloo et al., Botnet Detection by Monitoring Similar Communication Patterns, International Journal of Computer Science and Information Security (IJCSIS) (2010)

A.M. Manasrah et al., Detecting Botnet Activities Based on Abnormal DNS traffic, International Journal of Computer Science and Information Security (IJCSIS) (2009)

M. Reiter, T. Yen, Traffic aggregation for malware detection, in: Proceedings of the Conference on Detection of...

G. Gu, J. Zhang, W. Lee, BotSniffer: detecting botnet command and control channels in network traffic, in: Proceedings...

X. Yu, X. Dong, G. Yu, Y. Qin, D. Yue, Y. Zhao, Online botnet detection by continuous similarity monitoring, in:...

H. Husna, S. Phithakkitnukoon, S. Palla, R. Dantu, Behavior analysis of spam botnets, in: Proceedings of the...

L. Zhuang, J. Dunagan, D.R. Simon, H.J. Wang, I. Osipkov, G. Hulten, J.D. Tygar, Characterizing botnets from email spam...

Z. Duan, P. Chen, F. Sanchez, Y. Dong, M. Stephenson, J. Barker, Detecting spam zombies by monitoring outgoing...

Y. Zhao, Y. Xie, F. Yu, Q. Ke, Y. Yu, Y. Chen, E. Gillum, Botgraph: Large scale spamming botnet detection, in:...

M. Antonakakis, R. Perdisci, D. Dagon, W. Lee, N. Feamster, Building a dynamic reputation system for DNS, in:...

G. Gu, R. Perdisci, J. Zhang, W. Lee, BotMiner: clustering analysis of network traffic for protocol- and...


Heejo Lee is an associate professor at the Division of Computer and Communication Engineering, Korea University, Seoul, Korea. Before joining Korea University, he was at AhnLab, Inc. as a CTO from 2001 to 2003. From 2000 to 2001, he was a postdoc at the Department of Computer Sciences and the security center CERIAS, Purdue University. Dr. Lee received his BS, MS and PhD degrees in Computer Science and Engineering from POSTECH, Pohang, Korea. Dr. Lee serves as an editor of the Journal of Communications and Networks. He has been an advisory member of the Korea Information Security Agency and the Korea Supreme Prosecutor’s Office.
