ABSTRACT
In the last few years, spammers have increasingly used obfuscation to get spam emails past filters. The standard method is to use images that look like text, since typical spam filters are unable to parse such messages; this is the technique used in so-called "rock phishing". To fight image-based spam, many spam filters use heuristic rules that flag emails containing images; since few legitimate emails consist mainly of one large image, this helps detect image-based spam. Spammers are therefore interested in circumventing these methods. Unicode transliteration is a convenient tool for them, since it allows a spammer to create a large number of homomorphic clones of a message that all look the same: because Unicode contains many characters that are distinct but appear very similar, spammers can replace a message's characters at random to hide blacklisted words and so bypass filters. To defend against these Unicode-obfuscated spam emails, we developed a prototype tool that can be used with SpamAssassin to block spam obfuscated in this way by mapping polymorphic messages to a common, more homogeneous representation. This representation can then be filtered using traditional methods. We demonstrate the ease with which Unicode polymorphism can be used to circumvent spam filters such as SpamAssassin, and then describe a de-obfuscation technique that can be used to catch messages that have been obfuscated in this fashion.
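The idea of mapping polymorphic messages to a common representation can be illustrated with a minimal Python sketch. It is not the prototype tool described above, whose mapping is not given in this abstract. The function name `to_common_form` and the tiny `LOOKALIKES` table are hypothetical and chosen only for illustration; a practical table would need to cover far more characters.

```python
import unicodedata

# Hypothetical, deliberately tiny table of look-alike characters.
# Assumption: each entry maps a non-Latin code point to the Latin
# letter it visually resembles; a real table would be much larger.
LOOKALIKES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    "\u03b1": "a",  # GREEK SMALL LETTER ALPHA
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
}


def to_common_form(text: str) -> str:
    """Map visually similar variants of a message to one representation."""
    # NFKD folds compatibility forms (such as fullwidth letters) into
    # their base characters and splits accented letters into a base
    # letter followed by combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    result = []
    for ch in decomposed:
        # Drop combining marks (accents) left over from decomposition.
        if unicodedata.combining(ch):
            continue
        # Replace known look-alikes; keep every other character as-is.
        result.append(LOOKALIKES.get(ch, ch))
    return "".join(result)


if __name__ == "__main__":
    # "Viagra" written with Cyrillic look-alikes for "i" and "a".
    obfuscated = "V\u0456\u0430gr\u0430"
    print(to_common_form(obfuscated))
```

After this step, variants of a word that differ only in look-alike characters collapse to the same string, which a conventional keyword or Bayesian filter can then process as usual.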
- Fighting unicode-obfuscated spam