Distrust seed set propagation algorithm to detect web spam

Authors: Kwang Leng Goh, Ravi Kumar Patchmuthu, Ashutosh Kumar Singh

Published in: Journal of Intelligent Information Systems | Issue 2/2017

Published: 10 January 2017

Abstract

Web spam uses numerous techniques to mislead Web search engines for financial gain. Many semi-automatic propagation models have been proposed to combat Web spam. In this paper, distrust propagation is used to detect Web spam. An automatic distrust seed set propagation algorithm (DSP) is proposed, which acts as an extension to the seed set so that distrust propagates further and more Web spam is detected. Experiments were conducted on the WEBSPAM-UK2006 and WEBSPAM-UK2007 datasets; the results show that DSP enhanced the baseline algorithms, detecting 17.73 % more spam hosts in the former dataset and 8.59 % more spam hosts in the latter.
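
To make the idea of distrust propagation concrete, the following is a minimal sketch and not the DSP algorithm from the paper, whose details the abstract does not give. It assumes a generic Anti-TrustRank-style scheme: distrust starts at a seed set of known spam hosts and flows backwards along links, so hosts that link to distrusted hosts accumulate distrust. The function names `propagate_distrust` and `expand_seeds`, the `damping` and `iterations` parameters, and the top-k expansion rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def propagate_distrust(links, spam_seeds, damping=0.85, iterations=50):
    """Anti-TrustRank-style distrust propagation (illustrative assumption).

    links: dict mapping each host to the list of hosts it links to.
    spam_seeds: set of hosts labelled as spam.
    Returns a dict mapping each host to its distrust score.
    """
    hosts = set(links) | {t for targets in links.values() for t in targets}

    # Reverse graph: for each host, the hosts that link to it.
    inlinks = defaultdict(list)
    for src, targets in links.items():
        for dst in targets:
            inlinks[dst].append(src)

    # Static distribution concentrated uniformly on the spam seeds.
    seed = {h: (1.0 / len(spam_seeds) if h in spam_seeds else 0.0)
            for h in hosts}
    score = dict(seed)

    for _ in range(iterations):
        nxt = {h: (1.0 - damping) * seed[h] for h in hosts}
        for dst in hosts:
            sources = inlinks[dst]
            if not sources:
                continue
            # Each distrusted host splits its score among its in-linkers.
            share = damping * score[dst] / len(sources)
            for src in sources:
                nxt[src] += share
        score = nxt
    return score


def expand_seeds(score, spam_seeds, k=1):
    """Hypothetical seed extension: add the k most distrusted non-seed hosts."""
    candidates = [h for h in score if h not in spam_seeds and score[h] > 0.0]
    candidates.sort(key=lambda h: score[h], reverse=True)
    return set(spam_seeds) | set(candidates[:k])
```

On a toy graph where host `a` links to a spam seed and host `b` links to `a`, this sketch ranks `a` as more distrusted than `b`, while hosts with no path to the seed keep a score of zero. Re-running the propagation from the expanded seed set is one simple way to push distrust further, which is the general motivation the abstract describes.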

Title: Distrust seed set propagation algorithm to detect web spam
Authors: Kwang Leng Goh, Ravi Kumar Patchmuthu, Ashutosh Kumar Singh
Publication date: 10 January 2017
Publisher: Springer US
Published in: Journal of Intelligent Information Systems, Issue 2/2017
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI: https://doi.org/10.1007/s10844-016-0439-y
