ABSTRACT
Botnets continue to be a significant problem on the Internet. Accordingly, a great deal of research has focused on methods for detecting and mitigating the effects of botnets. Two of the primary factors preventing the development of effective large-scale, wide-area botnet detection systems are seemingly contradictory. On the one hand, technical and administrative restrictions result in a general unavailability of raw network data that would facilitate botnet detection on a large scale. On the other hand, were this data available, real-time processing at that scale would be a formidable challenge. In contrast to raw network data, NetFlow data is widely available. However, NetFlow data poses several challenges for accurate botnet detection.
In this paper, we present Disclosure, a large-scale, wide-area botnet detection system that combines several novel techniques to overcome the challenges posed by the use of NetFlow data. In particular, we identify several groups of features, namely flow sizes, client access patterns, and temporal behavior, that allow Disclosure to reliably distinguish C&C channels from benign traffic using NetFlow records. To reduce Disclosure's false positive rate, we incorporate a number of external reputation scores into our system's detection procedure. Finally, we provide an extensive evaluation of Disclosure over two large, real-world networks. Our evaluation demonstrates that Disclosure is able to perform real-time detection of botnet C&C channels over datasets on the order of billions of flows per day.
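As a rough illustration of the three feature groups named above, the following Python sketch aggregates flow records per server endpoint and computes one or two simple statistics for each group. The record layout (`Flow`), the function name `server_features`, and the particular statistics chosen are assumptions made for illustration only; they are not taken from Disclosure's implementation, and the abstract does not specify how the features are actually computed.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Flow:
    """Hypothetical, simplified flow record (not the NetFlow wire format)."""
    start: float     # flow start time in seconds
    client: str      # source address
    server: str      # destination address
    port: int        # destination port
    num_bytes: int   # bytes transferred in the flow


def server_features(flows):
    """Compute illustrative per-endpoint features from a list of flows."""
    by_endpoint = defaultdict(list)
    for flow in flows:
        by_endpoint[(flow.server, flow.port)].append(flow)

    features = {}
    for endpoint, group in by_endpoint.items():
        group.sort(key=lambda f: f.start)
        sizes = [f.num_bytes for f in group]
        # Gaps between consecutive flow start times at this endpoint.
        gaps = [b.start - a.start for a, b in zip(group, group[1:])]
        features[endpoint] = {
            # Flow sizes: near-identical sizes give a low deviation.
            "mean_flow_size": mean(sizes),
            "flow_size_stdev": pstdev(sizes),
            # Client access patterns: how many distinct clients connect.
            "unique_clients": len({f.client for f in group}),
            # Temporal behavior: regular contact gives a low gap deviation.
            "mean_gap": mean(gaps) if gaps else 0.0,
            "gap_stdev": pstdev(gaps) if gaps else 0.0,
        }
    return features


example_flows = [
    Flow(start=0, client="10.0.0.1", server="203.0.113.5", port=6667, num_bytes=100),
    Flow(start=60, client="10.0.0.2", server="203.0.113.5", port=6667, num_bytes=100),
    Flow(start=120, client="10.0.0.1", server="203.0.113.5", port=6667, num_bytes=100),
]
example_features = server_features(example_flows)
```

In this toy input, the endpoint receives equally sized flows at a fixed 60-second interval from two clients, so both deviation values are zero. A real system would feed many such features, together with the external reputation scores mentioned above, into a trained classifier rather than inspecting them directly.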
Index Terms
- Disclosure: detecting botnet command and control servers through large-scale NetFlow analysis