DOI: 10.1145/1299015.1299020

Fighting unicode-obfuscated spam

Published: 4 October 2007

ABSTRACT

In recent years, spammers have increasingly used obfuscation to get spam emails past filters. The standard method is to use images that look like text, since typical spam filters cannot parse such messages; this is the technique used in so-called "rock phishing". To fight image-based spam, many spam filters apply heuristic rules that flag emails containing images; because few legitimate emails consist mainly of one large image, these rules help detect image-based spam. Spammers therefore have an incentive to circumvent them. Unicode transliteration is a convenient tool for this purpose: Unicode contains many characters that are distinct code points but look very similar, so a spammer can replace a message's characters at random with look-alikes, generating a large number of polymorphic clones that look the same to the reader while hiding blacklisted words from filters. To defend against such Unicode-obfuscated spam, we developed a prototype tool that works with SpamAssassin and blocks spam obfuscated in this way by mapping polymorphic messages to a common, more homogeneous representation, which can then be filtered using traditional methods. We demonstrate how easily Unicode polymorphism can be used to circumvent spam filters such as SpamAssassin, and then describe a de-obfuscation technique that catches messages obfuscated in this fashion.
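
The idea of mapping polymorphic messages to a common representation can be illustrated with a minimal Python sketch. This is not the prototype tool described in the abstract: the function name `canonicalize` and the tiny `HOMOGLYPHS` table are hypothetical, chosen only for illustration, and a usable table would need to cover far more look-alike characters. The sketch assumes that Unicode compatibility decomposition (NFKD) plus a look-alike lookup is an acceptable stand-in for the de-obfuscation step.

```python
import unicodedata

# Hypothetical, hand-picked look-alike table (an assumption for illustration;
# a practical table would be much larger).
HOMOGLYPHS = {
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0456": "i",  # Cyrillic small Byelorussian-Ukrainian i
    "\u03b1": "a",  # Greek small alpha
    "\u03bf": "o",  # Greek small omicron
    "\u0131": "i",  # Latin small dotless i
}


def canonicalize(text: str) -> str:
    """Map visually similar variants of a message to one lowercase form."""
    # NFKD splits accented letters into base letter + combining marks and
    # folds compatibility forms such as fullwidth letters.
    decomposed = unicodedata.normalize("NFKD", text)
    out = []
    for ch in decomposed.lower():
        if unicodedata.combining(ch):
            continue  # drop accents and other combining marks
        out.append(HOMOGLYPHS.get(ch, ch))
    return "".join(out)


# Two differently obfuscated spellings collapse to the same string, so an
# ordinary blacklist check can run on the canonical form.
variants = ["V\u0456\u0430gr\u0430", "\uff36\u00ef\u00e0gra"]
canonical_forms = [canonicalize(v) for v in variants]
blacklisted = ["viagra" in form for form in canonical_forms]
```

The canonical text, rather than the raw message, would then be handed to a conventional filter, which is the division of labor the abstract describes.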

Published in

eCrime '07: Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit
October 2007, 90 pages
ISBN: 9781595939395
DOI: 10.1145/1299015

Copyright © 2007 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States
