2015 | Original Paper | Book Chapter

A Method for Class Noise Detection Based on K-means and SVM Algorithms

Written by: Zahra Nematzadeh, Roliana Ibrahim, Ali Selamat

Published in: Intelligent Software Methodologies, Tools and Techniques

Publisher: Springer International Publishing

Abstract

Noise filtering is one technique for improving the accuracy of an induced classifier. A classifier's prediction performance degrades when the dataset used to induce it is noisy, so detecting and removing noise is important for increasing classification accuracy. This paper proposes a model for detecting noise in datasets using k-means and support vector machine (SVM) techniques. The proposed model was tested on datasets from the University of California, Irvine (UCI) Machine Learning Repository. Experimental results show that the proposed model can improve data quality and increase classification accuracy.
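
The abstract names k-means and SVM but does not say how the two are combined, so the following is only a minimal sketch of one plausible arrangement, not the authors' procedure. It assumes a two-class problem with labels -1 and +1, flags an instance as suspected class noise when its label disagrees with both the majority label of its k-means cluster and the prediction of a linear SVM, and uses a small hand-written SVM trained by subgradient descent in place of a library implementation. All function names, parameters, and the toy data are hypothetical.

```python
# Illustrative only: the consensus rule, helper names, hyperparameters and
# toy data are assumptions, not the procedure described in the paper.

def kmeans_two_clusters(points, iters=10):
    """Plain Lloyd's algorithm for k=2 with a deterministic farthest-point start."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    first = points[0]
    centers = [first, max(points, key=lambda p: d2(p, first))]
    assign = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = [0 if d2(p, centers[0]) <= d2(p, centers[1]) else 1 for p in points]
        # Update step: each center moves to the mean of its members.
        for c in (0, 1):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign

def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    n, dim = len(points), len(points[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wi for wi in w], 0.0
        for x, t in zip(points, labels):
            # Only margin violators contribute to the hinge-loss subgradient.
            if t * (sum(wi * xi for wi, xi in zip(w, x)) + b) < 1:
                grad_w = [g - t * xi / n for g, xi in zip(grad_w, x)]
                grad_b -= t / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def detect_class_noise(points, labels):
    """Return sorted indices flagged by BOTH the cluster vote and the SVM."""
    assign = kmeans_two_clusters(points)
    cluster_suspects = set()
    for c in (0, 1):
        idx = [i for i, a in enumerate(assign) if a == c]
        majority = 1 if sum(labels[i] for i in idx) >= 0 else -1
        cluster_suspects.update(i for i in idx if labels[i] != majority)
    w, b = train_linear_svm(points, labels)
    svm_suspects = set()
    for i, (x, t) in enumerate(zip(points, labels)):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if (1 if score >= 0 else -1) != t:
            svm_suspects.add(i)
    return sorted(cluster_suspects & svm_suspects)

# Toy data: two well-separated 5x5 grids of points, one per class.
offsets = [-0.4, -0.2, 0.0, 0.2, 0.4]
points = [(-3 + dx, -3 + dy) for dx in offsets for dy in offsets]
points += [(3 + dx, 3 + dy) for dx in offsets for dy in offsets]
clean_labels = [-1] * 25 + [1] * 25

# Inject class noise by flipping four labels.
noisy_labels = list(clean_labels)
for i in (3, 12, 30, 41):
    noisy_labels[i] = -noisy_labels[i]

suspects = detect_class_noise(points, noisy_labels)
print(suspects)
```

Requiring agreement between the two detectors is a conservative choice that trades recall for fewer false alarms; flagging the union instead would be the more aggressive alternative. On real data, such as the UCI datasets the abstract mentions, the number of clusters, the kernel, and the decision rule would all need to be chosen and validated.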

Metadata
Title
A Method for Class Noise Detection Based on K-means and SVM Algorithms
Written by
Zahra Nematzadeh
Roliana Ibrahim
Ali Selamat
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-22689-7_23