2017 | OriginalPaper | Book chapter

A Domain-Agnostic Approach to Spam-URL Detection via Redirects

Written by: Heeyoung Kwon, Mirza Basim Baig, Leman Akoglu

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer International Publishing

Abstract

Web services such as social networks and video streaming sites draw numerous viewers daily. This popularity makes them attractive targets for spammers distributing hyperlinks to malicious content. In this work we propose a new approach for detecting spam URLs on the Web. Our key idea is to leverage the properties of URL redirections, which spammers deploy widely. We combine the redirect chains into a redirection graph that reveals the underlying infrastructure in which the spammers operate, and design our method to build on key characteristics closely associated with the spammers’ modus operandi. Unlike previous work, our approach exhibits three key characteristics: (1) domain independence, which enables it to generalize across different Web services; (2) adversarial robustness: because the method is tightly coupled with spammers’ operational behavior, evading it imposes difficulty, risk, or cost on them; and (3) semi-supervised detection, which produces competitive results from only a few labeled examples by exploiting the redundancy in spammers’ operations. Evaluation on large Twitter datasets shows that we achieve above 0.96 recall and 0.70 precision, with a false positive rate below 0.07, using only 1% labeled data.
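To make the idea of combining redirect chains into a redirection graph concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the graph construction or the features used, and the function names (`build_redirection_graph`, `shared_nodes`), the `min_initial` threshold and the example URLs are all hypothetical. The sketch merges chains into one directed graph and lists the URLs that several distinct initial URLs lead to, which is one simple way the redundancy in an operation might surface.

```python
from collections import defaultdict


def build_redirection_graph(chains):
    """Merge redirect chains into one directed graph (illustrative only).

    Each chain is a list of URLs that starts with the initial (posted) URL
    and ends with the landing URL. Returns (edges, sources):
      edges[u]   -> set of URLs that u redirects to
      sources[u] -> set of initial URLs whose chains pass through u
    """
    edges = defaultdict(set)
    sources = defaultdict(set)
    for chain in chains:
        if not chain:
            continue
        initial = chain[0]
        for url in chain:
            sources[url].add(initial)
        # Consecutive URLs in a chain form one redirect edge.
        for src, dst in zip(chain, chain[1:]):
            edges[src].add(dst)
    return edges, sources


def shared_nodes(sources, min_initial=2):
    """URLs reached from at least `min_initial` distinct initial URLs."""
    return sorted(u for u, s in sources.items() if len(s) >= min_initial)


# Hypothetical chains: two posted URLs funnel into the same hop and landing page.
chains = [
    ["http://short.example/a", "http://hop.example/x", "http://land.example/"],
    ["http://short.example/b", "http://hop.example/x", "http://land.example/"],
    ["http://short.example/c", "http://news.example/story"],
]
edges, sources = build_redirection_graph(chains)
print(shared_nodes(sources))
# prints ['http://hop.example/x', 'http://land.example/']
```

In this toy input the intermediate hop and the landing page are each reached from two different initial URLs, while the third chain shares nothing with the others. A real detector would need far richer signals than this count.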

Footnotes
1
Note that what is posted on a Web service are the initial URLs. We run a crawler to go through the redirects to extract the chains.
 
2
Note that the given URLs are the observed ones posted on the Web, also referred to as the initial URLs in this work.
 
Metadata
Title
A Domain-Agnostic Approach to Spam-URL Detection via Redirects
Written by
Heeyoung Kwon
Mirza Basim Baig
Leman Akoglu
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-57529-2_18