Arabian Journal for Science and Engineering

15 October 2020 | Research Article - Computer Engineering and Computer Science

GT2FS-SMOTE: An Intelligent Oversampling Approach Based Upon General Type-2 Fuzzy Sets to Detect Web Spam

Authors: Prabhjot Kaur, Anjana Gosain

Published in: Arabian Journal for Science and Engineering | Issue 4/2021

Abstract

As the internet grows, web spam is also increasing, and it significantly degrades users' experience with search engines. Web spam methods target a search engine's internal programs to push targeted websites to higher positions. This paper proposes an intelligent oversampling approach based upon general type-2 fuzzy sets to balance the class distribution and hence enhance classification performance for web spam detection. The proposed method is validated on the real-world benchmark dataset WEBSPAM-UK 2007, and its performance is assessed with AUC (area under the ROC curve), F-measure, and G-mean. It is compared with SMOTE in combination with 11 well-known base classifiers available in the WEKA tool. The computational complexity of the proposed method is the same as that of SMOTE. When the proposed method is combined with the base classifiers, it boosts each classifier's performance and outperforms SMOTE in every case. The proposed combinations are also statistically analyzed using the Friedman, Holm, and Wilcoxon tests to identify the best combination among the 11 base classifiers. The analysis shows that the proposed method in combination with random forest (GT2FS-SMOTE+RF) performed best among all combinations.
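For orientation, the following is a minimal sketch of the SMOTE baseline that the abstract compares against, together with two of the reported metrics (F-measure and G-mean). It is not the proposed GT2FS-SMOTE, whose fuzzy-set details are not given in the abstract. The function names `smote` and `f_measure_and_g_mean`, the parameter defaults, and the toy data are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Plain SMOTE sketch (hypothetical helper, not the paper's method).

    Creates n_new synthetic points, each lying on the line segment between
    a randomly chosen minority sample and one of its k nearest
    minority-class neighbours. Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def f_measure_and_g_mean(tp, fn, fp, tn):
    """F-measure and G-mean for the positive (minority, e.g. spam) class."""
    recall = tp / (tp + fn)        # sensitivity on the minority class
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)   # recall on the majority class
    f_measure = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(recall * specificity)
    return f_measure, g_mean


if __name__ == "__main__":
    # Toy minority class: four points at the corners of the unit square.
    spam = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(smote(spam, n_new=3, k=2))
    print(f_measure_and_g_mean(tp=8, fn=2, fp=2, tn=88))
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already covered by the minority class. G-mean is the geometric mean of the per-class recalls, so a classifier that ignores the minority class scores zero regardless of its overall accuracy.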

Title: GT2FS-SMOTE: An Intelligent Oversampling Approach Based Upon General Type-2 Fuzzy Sets to Detect Web Spam
Authors: Prabhjot Kaur, Anjana Gosain
Publication date: 15 October 2020
Publisher: Springer Berlin Heidelberg
Published in: Arabian Journal for Science and Engineering, Issue 4/2021
Print ISSN: 2193-567X
Electronic ISSN: 2191-4281
DOI: https://doi.org/10.1007/s13369-020-04995-5
