DOI: 10.1145/1963405.1963437
research-article

Heat-seeking honeypots: design and experience

Published: 28 March 2011

ABSTRACT

Many malicious activities on the Web today make use of compromised Web servers, because these servers often have high pageranks and provide free resources. Attackers are therefore constantly searching for vulnerable servers. In this work, we aim to understand how attackers find, compromise, and misuse vulnerable servers. Specifically, we present heat-seeking honeypots that actively attract attackers, dynamically generate and deploy honeypot pages, then analyze logs to identify attack patterns.

Over a period of three months, our deployed honeypots, despite their obscure location on a university network, attracted more than 44,000 attacker visits from close to 6,000 distinct IP addresses. By analyzing these visits, we characterize attacker behavior and develop simple techniques to identify attack traffic. Applying these techniques to more than 100 regular Web servers as an example, we identified malicious queries in almost all of their logs.
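As a rough illustration of the log analysis behind figures such as visit counts and distinct IP addresses, the following sketch tallies requests per client from Web server logs. It is not the authors' tooling: the Common Log Format input, the `summarize_visits` helper, the sample lines, and the choice to count every request as one visit are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumed input: Common Log Format lines, i.e.
# client IP, identity, user, [timestamp], "METHOD target PROTOCOL", status, size
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*"')


def summarize_visits(lines):
    """Return (total_requests, distinct_ips, per_ip_counts) for parseable lines.

    Assumption: each request counts as one visit. The paper may define a
    visit differently (for example, by grouping requests into sessions).
    """
    per_ip = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            per_ip[match.group(1)] += 1
    return sum(per_ip.values()), len(per_ip), per_ip


# Hypothetical sample data using documentation-reserved IP addresses.
sample = [
    '203.0.113.5 - - [01/Jan/2011:00:00:01 +0000] "GET /phpmyadmin/index.php HTTP/1.1" 404 0',
    '203.0.113.5 - - [01/Jan/2011:00:00:02 +0000] "GET /admin/setup.php HTTP/1.1" 404 0',
    '198.51.100.7 - - [01/Jan/2011:00:05:00 +0000] "GET /forum/index.php HTTP/1.1" 200 512',
    'this line is malformed and is skipped',
]

total, distinct, per_ip = summarize_visits(sample)
print(total, distinct)
```

Lines that do not match the assumed format are skipped silently, so a real analysis would want to count and inspect them as well.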

    Reviews

    André C. M. Mariën

    The mere existence of honeypots is a blessing. Suddenly, attackers need to worry: "This vulnerable application on a powerful server looks too good to be true. Could it be that I am the victim of a honeypot?" This paper describes an almost-risk-free honeypot setup.

    The prototypical attack sequence starts with intelligence on possible targets, often via search engines. The sites are completely unaware of this investigation. However, such searches stand out and are identifiable. The authors had the good fortune to have access to search engine information. Queries identifying possible targets are not like most queries, so data mining is possible. (What constitutes a good query to fingerprint vulnerable software?) This phase is largely automated, which is important for a practical approach. Inserting honeypot page hits into search engine results to match the specific queries attracts predators to the honeypot.

    Once an attacker identifies a possible vulnerable target, the attacker will test whether this is indeed the case. The predator's requests to the honeypot site will deviate from any normal traffic that hits these prepared pages. First, the pages are not widely advertised, but are constructed to show up only in target-identifying queries. Second, even if users were to find links to the pages, one can expect only requests for those pages, and nothing else. In that sense, the pages are a whitelist for normal traffic. What is important here is that whitelisting is much easier and less error-prone than blacklisting (attack pattern detection).

    A problem to address is how to make sure search engines rank honeypot pages high enough to show up in results, while maintaining a low false negative rate by avoiding most of the innocent visitors. The pages must have inbound links, and they must contain the right keywords for crawlers to pick them up. Ideally, this is the only nonmalicious traffic. The authors report an impressive false negative rate of, at most, one percent.

    This type of honeypot has very attractive properties. One does not need to understand the attack or set up complex and vulnerable configurations, unlike with other honeypots. The concern of being compromised and thereby assisting the attacker does not exist. As each attack fails, one can expect the attacker to try a full range of attacks. This may include attacks under development, long before they are mature.

    A drawback of the method is that it is slower than the typical honeypot method. Search engines must first notice the pages and place them high enough in their ranking to attract the attackers. Search engines are an Internet power that we can use with either good or bad intentions. It is refreshing to see that we can turn a badly intentioned use into a mighty defense: a honeypot with no vulnerabilities.

    Online Computing Reviews Service
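The whitelisting idea described in the review can be sketched in a few lines: any request whose path is not one of the deliberately published honeypot pages is treated as suspicious. This is a minimal sketch under stated assumptions, not the authors' implementation; the `is_suspicious` helper, the `HONEYPOT_PAGES` set, and the example request targets are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical set of pages the honeypot deliberately exposes to crawlers.
HONEYPOT_PAGES = frozenset({"/forum/index.php", "/wiki/Main_Page"})


def is_suspicious(request_target, whitelist=HONEYPOT_PAGES):
    """Return True when the requested path is not a published honeypot page.

    Only the path is compared; the query string is ignored, so a malicious
    query sent to a whitelisted page is NOT flagged by this sketch.
    """
    path = urlsplit(request_target).path
    return path not in whitelist


# Hypothetical request targets as they might appear in an access log.
requests = [
    "/forum/index.php",
    "/forum/index.php?topic=1",
    "/forum/admin/config.php",
    "/wiki/../etc/passwd",
]

flagged = [target for target in requests if is_suspicious(target)]
print(flagged)
```

The design choice mirrors the reviewer's point: enumerating the few legitimate pages is simpler and less error-prone than enumerating attack patterns, at the cost of missing attacks that stay within whitelisted paths.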

    • Published in

      WWW '11: Proceedings of the 20th international conference on World wide web
      March 2011
      840 pages
      ISBN: 9781450306324
      DOI: 10.1145/1963405

      Copyright © 2011 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
